Matching Forensic Sketches to Mug Shot Photos Using a Population of Sketches Generated by Combining Geometrical Facial Changes and Genetic Algorithms
Introduction
Forensic science is to apply science for collecting, preserving, and analyzing scientific evidence during the course of a criminal investigation. One of its main applications is to help law enforcement agencies to detect the identity of criminals. Considering the advent of automated biometric identification, it is possible to determine the culprit’s identity using a variety of information taken from the crime scene. These may include the suspect’s DNA, fingerprint, or an image of his/her face taken by a surveillance camera. There are nonetheless situations where none of the aforementioned information is available at the crime scene except an eyewitness who was present during the course of the criminal action. In these conditions, a forensic artist is normally employed to communicate with the witness in order to draw a sketch of the suspect’s face according to the verbal description. The completed image which we call a forensic sketch is then circulated to law enforcement officers and media outlets in the hope that someone can identify the suspect.
Forensic sketches, regardless of being composite by using a library of facial features or drawn by an artist, bear a significant challenge due to their reliance on the verbal description of a witness. Often witnesses are unable to exactly remember the appearance of a suspect and their subjective account of the description would result in imprecise and inadequate forensic sketches.
This paper addresses the problem of matching forensic sketches to photos by developing a novel hybrid approach. Having inspired from the work by Kukharev et al.,4 we first generate a population of sketches out of each initial forensic sketch by applying geometrical changes in facial areas. The quality of sketches is then optimized using Genetic Algorithms (GA). We adopt the Structural SIMilarity (SSIM) index introduced in5 as the fitness function of our proposed GA. Finally, the matching is applied to the best sketch produced by GA by employing the Local Feature-based Discriminant Analysis (LFDA) framework proposed by Kalre et al.,6 The efficiency of our proposed hybrid approach in attaining correct matchings is evaluated against 88 sketch/photo pairs provided by the Michigan State Police Department7 and Forensic Art Essentials,8 and 100 sketch/black-and-white photo pairs from FERET database. As a further analysis, we investigate the contribution of cropping sketch/photo pairs to central facial areas in increasing the retrieval rate. We can summarize the main contributions of our work as follows
A novel hybrid approach is proposed as the pre-processing phase of any matching algorithm by combining geometrical changes and genetic algorithms to generate high quality sketches out of original forensic sketches.
A substantial improvement in the retrieval rate is observed when initial sketch/image pairs are cropped to central facial areas restricted to the eyebrows above and to the chin below.
The rest of the paper is organized as follows. In Section 2, we conduct a critical review of the related work. Section 3 presents our proposed hybrid approach. It first describes the procedure of generating a population of sketches out of each original forensic sketch. Next, the structure of our proposed genetic algorithm by the aim of optimizing the sketches is introduced. The section ends by explaining the process of identifying the best sketch using the LFDA framework. Having introduced two mug shot databases, we present our experimental results and analysis in Section 4. Finally, Section 5 concludes the paper and highlights our future research directions.
Related Work
Most studies on sketch to photo matching have focused on viewed sketches where the original photograph is available. The common approach was based on generating a synthetic photograph from a sketch and then applying conventional face recognition algorithms to match the synthetic photographs to gallery photos.1-3 Nonetheless, the pioneering work on automatically matching a forensic sketch to a set of true paragraphs dates back to two decades ago.9,10 Uhl and Lobo9 developed a framework for matching a police sketch to a true image by first locating facial features in the sketch as well as the photograph images and then employing eigen analysis. Koren10 used neural face recognition algorithms as the matching algorithm.
Later studies adopted effective feature descriptors like the Scale Invariant Feature Transform (SIFT)11 and Local Binary Patterns (LBP)12 that have been successful in other heterogeneous face recognition scenarios in their method of sketch matching.6,16Jain et al.,13discussed recent developments in automated face recognition that affected the forensic face recognition community including research in facial aging, facial marks, forensic sketch recognition, face recognition in video, near-infrared face recognition, and use of soft biometrics. Han et al.,14 particularly addressed the problem of matching composite sketches to facial photographs by proposing a Component-Based Representation (CBR) approach to measure the similarity between a composite sketch and mugshot photograph. Klum et al.,15 compared the effectiveness of forensic sketches with composite sketches and showed that the latter were matched with higher accuracy. They also conducted an analysis of the factor of filtering a mugshot gallery using three sources of demographic information (age, gender and race/ethnicity). Kukharev et al.4 developed methods for generating a population of sketches from the initial sketch to improve the performance of sketch-based photo image retrieval systems.
This paper extends the work of Brandan et al.,6 by incorporating a preprocessing phase in the LFDA framework to generate high quality sketches out of initial sketches. This involves the generation of a population of sketches out of each initial one by applying geometrical changes in facial areas. The approach was inspired from the work by Kukharev et al.,4. The population is then optimized using genetic algorithms. We describe our proposed matching approach in the next section.
The Proposed Approach
The objective of this study is to develop a method for improving the automated identification of forensic sketches relative to a gallery of mug shot images primarily in heterogeneous face recognition scenarios. The proposed approach is presented in two phases of pre-processing and matching. The contribution of this paper is on the pre-processing phase by aiming at enhancing the resulting forensic sketches before being fed to the matching phase.
The Pre-processing Phase
Given a set of forensic sketches and their corresponding photographs, a population of sketch images is generated out of each initial forensic sketch based on an algorithm proposed by Kukharev et al.,4 The algorithm is introduced in the next section.
Sketch Population Generation
The objective of the algorithm introduced by Kukharev et al.,4 is to generate a population of new sketches from a given sketch by applying geometrical changes in facial areas. We can briefly describe their algorithm as follows.
At first, it is required to represent the original sketch image in a matrix, let say S, in a grayscale format. For each new sketch to be generated, the geometrical changes are applied to the original sketch by changing the number of rows, the number of columns and carrying out the cyclic shift of columns, respectively. This is done by generating three random parameters, let say p1, p2, and p3, whose sign and magnitude determine the way the change is applied. Kukharev et al.,4 developed their algorithm in three steps where in each step, one operation of changing geometry is executed.
Step 1. Remove the first |p1-1| rows from matrix S, if p1>0; else if p1<0, extend matrix S by adding the first |p1-1| rows above S. The resultant matrix is denoted as S.1
Step 2. Remove the first |p2-1| columns from matrix S(1), if p2>0; else if p2<0, remove the last |p2-1| columns from S(1). The resultant matrix is denoted as S.2
Step 3. Carry out the cyclic shift of matrix S(2) to the left by |p3-1| columns, if p3>0; else if p3<0, to the right. The resultant matrix is recorded to form a new sketch.
The preceding three steps are then repeated with new values of p1, p2, and p3 to generate the next new sketch until all specified sketches are produced. Fig. 1 depicts an example where a population of 20 sketches are generated after applying geometrical changes in facial areas of the input sketch.
|
Figure 1: An example of the input sketch and population created using the geometric variation algorithm in the face
Click here to View figure
|
Genetic Algorithms
The general steps of genetic algorithms include initial population generation, selection, crossover, mutation and replacement. Note that the last four steps are carried out iteratively until some stopping criteria are met.
In our proposed GA, each new sketch generated in the previous section represents a chromosome (see Fig. 2(a)). In other words, a chromosome is encoded as a matrix of size M×N in a grayscale format. The whole sketches form the initial population. The fitness function employed to evaluate each chromosome is based on the SSIM index which is described in Section 3.1.3. Roulette wheel selection is used to give higher priorities to fittest individuals for mating. For each pair of individuals in the mating pool, one-point crossover is applied with a given probability (here 0.7) on the columns of their representative matrices to produce two offspring (see Fig. 2(b)). The offspring are then combined with the current population to form the next population. The above steps continue until no more improvement is attained and the best individual (sketch) whose SSIM index relative to its corresponding photograph is the highest is chosen as the final sketch for the matching phase. Note that for each initial forensic sketch, GA should be applied to optimize its corresponding population of sketches produced in Section 3.1.1.
|
Figure 2a): Representation of sketch/photo in a matrix or array of numerical values (i.e., in a gray-scale format)b) Applying the one-point crossover operator (here for crossover point equal to 4) on two arrays of sketches to produce two new sketches as the offspring
Click here to View figure
|
SSIM Index
The SSIM index was proposed by.5 as a kind of image quality assessment measure based on the theory of structure similarity. As already stated, the measure is adopted as the fitness function of our proposed GA. Given a chromosome representing a sketch and the original photograph as shown in Fig. 2(a), the SSIM index calculates the resemblance of the given chromosome to the original image. SSIM defines the image distortion model as a mixture of three features including luminance, contrast and structure. Let x be the original image signal and y be the signal to be measured (i.e., the sketch represented by a chromosome), the average value (µx,µy) is defined as the brightness value, the standard deviation (σx,σy) is used as the contrast and the covariance σxy is considered as the structure similarity measure. The following three equations are then used for the calculation of luminance l(x,y), contrast c(x,y) and structure s(x,y), respectively:
To avoid unsteadiness when the denominators of the above functions tend to zero, C1, C2 and C3 are set to be small enough positive constant. By merging Equations (1)-(3), the structure similarity of the two images x and y is calculated according to the following formula:
Where α,β,γ>0 are parameters used to adjust the weights. Provided that α=β=γ=1 and C3=C2/2, Equation (4) can be simplified as follows :
In other words, the value of equation (5) represents the fitness of the chromosome representing sketch y relative to the original image x. It is important to note that in this paper, we define C1=(K1L)2 and C2=(K2L)2 where parameters K1 and K2 are equal to 0.01 and 0.03, respectively, and L equals the number of colors in the image which is set to 255.
The Matching Phase
Sketch to photo matching is carried out according to the LFDA framework developed by Klare et al.,6 It consists of two steps of training and recognition. Given the sketches produced by GA in the pre-processing phase, the training set is formed by choosing some sketch/photo correspondences. (It should be noted that the k-fold cross validation technique is used for training and recognition steps.) All training sketch/photo pairs are first broken into a set of overlapping patches. Next, each sketch and photo are represented by the Scale-Invariant Feature Transform (SIFT) and Multiscale Local Binary Pattern (MLBP) feature descriptors extracted from overlapping patches. In the LFDA framework, each image feature vector is first divided into slices of smaller dimensionality, where slices correspond to the concatenation of feature descriptor vectors from each column of image patches. After grouping slices of patches together into feature vectors, a discriminant projection is learned for each slice. Finally, the Principal Component Analysis (PCA) procedure is applied to the new feature vector to remove redundant information among the feature slices to extract the final feature vector.
With each local feature-based discriminant trained, sketches to photos are matched using the nearest neighbor matching on the concatenated slice vectors. The recognition step is carried out after combining each projected vector slice into a single vector and measuring the normed distance between a probe sketch and a gallery photo. The gallery photo with the minimum distance to each probe sketch is then selected and the sketch is labeled accordingly.
Experimental Evaluation and Analysis
The aim of our experimental study is to evaluate the performance of our proposed hybrid Geometrical
Changes Genetic Algorithms (denoted as GCGA) approach relative to the LFDA framework developed in.6 For our assessment, we used two widely well-known forensic sketch databases which are introduced in the next section.
Forensic Sketch Databases
The first database was provided by.7,8 Each sketch was accompanied with a corresponding photograph of the subject who was later identified by the police. The sketches were drawn by forensic sketch artists working with witnesses who provided verbal descriptions after crimes were committed by unknown culprits. Fig. 3 depicts two pairs of forensic sketches and their corresponding photographs. As just mentioned, the corresponding photographs (mug shots) are the result of the subject later being identified. Given the quality of drawn sketches, we only chose 88 sketch/photo pairs.
|
Figure 3: Two pairs of forensic sketches and their corresponding photographs taken from our database7,8
Click here to View figure
|
Due to accessibility restrictions to real-world forensic sketches, our second database was formed by 100 black-and-white images chosen from FERET database. For each image, according to verbal descriptions provided by the first author of this paper, their corresponding forensic sketches were drawn by a talented sketch artist. Fig. 4 shows two examples of true images and their corresponding drawn forensic sketches. So, in total, 188 photo/forensic sketch pairs were used for our experimentation.
|
Figure 4: Two pairs of images taken from FERET database and their corresponding sketches drawn according to the verbal description of the author of the paper
Click here to View figure
|
Analysis of Results
In this section, we empirically evaluate the quality of matches retrieved by our proposed GCGA approach relative to the basic LFDA framework. There are various tools for comparing recognition methods among which cumulative match characteristic (CMC) curves are commonly used. A CMC curve is a 2D plot of the accuracy (as the vertical axis) at rank-k (as the horizontal axis). A probe sketch is given rank-k when the actual photograph is ranked in position k by a face recognition system. Thus, a rank-1 outcome is considered a correct identification because the recognition system is able to retrieve the correct image as the first rank based on the probe sketch. The accuracy is an estimate of the probability that a subject is identified correctly at least at rank-k. For example, an accuracy of 80% at rank-10 indicates that for 80 percent of retrievals, the recognition system is able to find the actual photographs among the first 10 retrieved images based on the given probe sketches. It is hence obvious that the accuracy is necessarily an increasing function of k.
We used the k-fold cross validation technique for training and testing of our compared methods. Considering the size of our test data, we changed the value of k from 4 to 10. As Fig. 5 indicates, the accuracy of both LFDA and GCGA increases with respect to larger values of k.
We conducted extensive experiments to evaluate the performance of our GCGA approach with respect to our available forensic sketch databases. Our preliminary tests strongly demonstrated that the effectiveness of our proposed algorithm seriously depends on the quality of the original forensic sketches drawn based on verbal descriptions of eyewitnesses. This is shown in Fig. 6 where no significant improvement in the accuracy was attained relative to LFDA even if our approach benefits from a pre-processing phase. We thus carried out further experiments by slightly modifying probe sketches and the gallery of mug shot images. We cropped them to central facial areas restricted to the eyebrows above and the chin below. Fig. 7 depicts an example of two pairs of cropped sketches and their corresponding photographs. According to Fig. 8, it can be noticed that in this condition, the retrieval rate of both LFDA and GCGA substantially increases even for rank-1. This suggests that the central facial areas play a crucial role in matching sketches to photographs.
|
Figure 7: Two pairs of cropped sketches and their corresponding photographs (to notice the difference, compare the two pairs with those in Fig. 3)
Click here to View figure
|
|
Figure 8: The contribution of cropping sketch/image pairs to central facial areas in increasing the retrieval rate
Click here to View figure
|
Finally, Table.1 summarizes the accuracy of LFDA and GCGA methods without/with cropping. According to the table, our GCGA method slightly outperforms LFDA even if cropping is not applied, especially for rank-10. This is mainly due to the incorporation of our proposed pre-processing phase. Furthermore, although the accuracy of both algorithms remarkably increases after cropping, it seems that GCGA performs fairly better than LFDA, particularly for rank-1 and rank-5.
Conclusions
This paper addressed the problem of matching forensic sketches to a gallery of mug shot images. The proposed algorithmic approach was based on applying geometrical changes in the facial areas of the original forensic sketch to generate a population of sketches. The population was then optimized by genetic algorithms according to the SSIM index as the fitness function. The matching process was next carried out by applying the LFDA framework. The experimental results in terms of accuracy indicated that our proposed GCGA approach performs fairly better than the LFDA framework when both images and sketches are cropped to central facial areas restricted to the eyebrows above and the chin below. As the effectiveness of any forensic sketch to photo matching approach significantly depends on the quality of sketches drawn according to verbal descriptions of eyewitnesses, our findings suggest that instructing people to concentrate on the central facial areas of suspects rather than surrounding facial areas would meaningfully contribute the accuracy of matching.
References
- X. Tang and X. Wang, Face Sketch Recognition. IEEE Transactions on Circuits and Systems for Video Technology. 2004;14(1):50-57.
CrossRef
- D. Lin and X. Tang, Inter-Modality Face Recognition, in Proc. European Conference Computer Vision. 2006.
CrossRef
- X. Wang and X. Tang, Face Photo-Sketch Synthesis and Recognition, IEEE Transactions Pattern Analysis and Machine Intelligence. 2009;31(11):1955-1967.
CrossRef
- G. A. Kukhareva, Y.N.M., and N. L. Shchegolevac, New Solutions for Face Photo Retrieval Based on Sketches,. Pattern Recognition and Image Analysis. 2016;26(1):165–175.
CrossRef
- Z. Wang and A. C. Bovik, A universal image quality index,. IEEE Signal Processing Lett., 2002. 9(3):81–84.
CrossRef
- Brendan F. Klare, Z.L.a.A.K.J., Matching Forensic Sketches to Mug Shot Photos, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011. 33(3):639-646.
CrossRef
- Michigan State Police Department, http://www.michigan.gov/msp/.
- L. Gibson, Forensic Art Essentials. Elsevier;2008.
- R. Uhl and N. Lobo, A Framework for Recognizing a Facial Image from a Police Sketch, in Proc. IEEE Conf. Computer Vision and Pattern Recognition. 1996.
CrossRef
- W. Konen, Comparing facial line drawings with gray-level images: A case study on PHANTOMAS, in International Conference Bochum, Germany. 1996;727–734.
CrossRef
- D. Lowe, Distinctive Image Features from Scale-Invariant Keypoints. Int’l J. Computer Vision. 2004;60(2):91-110.
CrossRef
- Timo Ahonen, A.H., and Matti Pietika¨ inen, Face Description with Local Binary Patterns: Application to Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2006;28(12).
- Anil K. Jain, B.K.a.U.P., Face Recognition: Some Challenges in Forensics, in IEEE Int’l Conference on Automatic Face and Gesture Recognition. 2011.
- Hu Han, B.F.K., Kathryn Bonnen, and Anil K. Jain, Matching Composite Sketches to Face Photos: A Component-Based Approach. IEEE Transactions on Information Forensics and Security. 2013;8(1):191-204.
CrossRef
- S. Klum, H.H., Anil K. Jain and B. Klare, Sketch Based Face Recognition: Forensic vs. Composite Sketches, in Biometrics (ICB), International Conf Madrid, Spain. 2013.
- B. Klare and A. Jain, Sketch to Photo Matching: A Feature-Based Approach, Proc. SPIE Conf. Biometric Technology for Human Identification VII. 2010.
CrossRef
This work is licensed under a Creative Commons Attribution 4.0 International License.