Team:Dundee/Modeling/Fingerprints

Fingerprint Ageing


Analysis and Modelling

Overview


Principal component analysis (PCA) was applied to a set of fingerprint data obtained from an Australian PhD thesis. The two aims were to find which lipid compound has the most distinct degradation rate and to characterise the general degradation behaviour of compounds within fingerprints. PCA is a statistical tool that reduces multivariate data to its principal components, making it possible to visualise correlations that are otherwise hidden because the data has too many dimensions to plot directly. The results of PCA were visualised through plots and graphs produced in MATLAB.


Using a method similar to that of the FluID models, the binding of lanosterol synthase and squalene epoxide to form lanosterol was investigated. The aim of the model describing lanosterol formation is to find the optimum binding rates and the optimum initial concentration of lanosterol synthase required in the fingerprint ageing device. Ordinary differential equations were used to follow the concentration of each substance over time, and sensitivity analysis was used to find the optimum conditions. The two sections below cover principal component analysis and how it was implemented, followed by the binding model for squalene epoxide and lanosterol synthase.

Principal Component Analysis


Aim

The aim of this section was to find the principal lipid compounds that account for the most variance across individuals, and therefore the lipid with the most distinct degradation rate across all tested individuals. We also wanted to determine how the quantities of fingerprint compounds differ between individuals and over time. From the principal lipid, age curves can then be produced. These could be used to deduce the age of a fingerprint if the concentration of the principal compounds is known.


Background

A dataset from a recently published PhD thesis was made available to us. The data consisted of 336 fingerprint samples taken from 8 donors in a single day. The samples from every donor were then analysed in triplicate for lipid content across 28 days using gas chromatography mass spectrometry. The raw chromatogram values were then normalised to take into account the quantitative changes between compounds (1). The data has many dimensions to consider, such as sample, age and donor. These can obscure correlations, making it difficult to pinpoint where a correlation comes from. Principal component analysis is a statistical tool that reduces multivariate data to its principal components in order to visualise correlations that would otherwise be hidden by the number of dimensions (2,3). It was chosen for its ability to encapsulate and reduce the dimensions of the data. A more detailed explanation of how PCA works can be found in the Method section.


Results

Using PCA, it was found that squalene contributes 80% of the first principal component. The first principal component, in turn, was found to account for 90% of the total variance of the data. In other words, out of the 15 compounds measured, squalene alone accounts for 72% (0.80 × 0.90) of the total variance, highlighting that squalene has a distinct pattern within the dataset.


Figure 1: Biplot: Factor loading lipid compounds to first two principal components.

To determine how fingerprint ageing relates both to variation between donors (interdonal variation) and to variation within each donor's samples (intradonal variation), the data was plotted against the first two principal components.


Figure 2: Scoreplot: Compound degradation against time with respect to the first two principal components.
Figure 3: Scoreplot: Interdonal variation with respect to first two PCs.

Figure 2 shows the interdonal variation between all the donors and their aged fingerprint samples from all ageing intervals. Along the first principal component (PC), both interdonal and intradonal variation appear to increase. Moreover, the samples from individual donors do not group together distinctly, which means that it would not be possible to deduce which fingerprint corresponds to which donor from their PCs. The donor points appear to follow faintly separate trajectories along the first PC axis, which suggests distinct intradonal variation across all donors. This partly supports the feasibility of a presumptive test for fingerprint age.

Figure 3 is difficult to interpret because of the wide range of colours. The graph shows that the older a fingerprint is, the more difficult it is to pinpoint its age. The lack of distinct groupings between consecutive day intervals shows that fingerprints do not age at a constant rate across all donors. This suggests that an age curve would have to be constructed for each individual donor (intraspecificity). It also allows us to reject the hypothesis that a 'one size fits all' age curve could be used to deduce the age of any random fingerprint.


Figure 4: Age curve of principal compounds for a single donor.

Figure 4 is an age curve for a specific donor, focusing on three compounds from the dataset. Squalene, palmitic acid and hexadecenoic acid were found to be the biggest contributors to the first two principal components. Hence, only these three were plotted. Squalene appears to decrease significantly around day 15, which is later than the degradation at around day 7 expected from the current literature (4).

Figure 5: PC1 alignment for compound degradation.

Building on the conclusions drawn from Figure 3, and in order to visualise the behaviour of fingerprint degradation across all donors more clearly, only the first principal component was used to align the data. The result, shown in Figure 5, is slightly clearer but still difficult to interpret. There is a strong change in the composition of the fingerprint between Day 0 and Day 2 that is consistent across all donors. Apart from this observation, variance appears to increase as the fingerprints become older.

Method

Principal Component Analysis is a broad term used to describe the general procedure of reducing the number of variables in a multivariate set of data while still keeping the correlations intact. PCA has many possible applications, including neuroscience and image compression. The technique builds on an older, established result in linear algebra called Singular Value Decomposition (2,3). The matrix operations from this decomposition were carried out directly in MATLAB, rather than through the PCA package, to gain a better understanding of each step of the process.
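The general procedure can be illustrated with a minimal sketch. This is not the team's MATLAB code: it is written in plain Python, it uses made-up toy data, and it swaps the Singular Value Decomposition for power iteration on the covariance matrix, which recovers only the first principal component. All function and variable names are hypothetical.

```python
import math

def center(rows):
    """Subtract each column's mean so every variable has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of already-centred data."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iters=500):
    """Leading eigenvector and eigenvalue of cov by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                 for i in range(p))
    return v, eigval

# Hypothetical toy data: rows are samples, columns are three made-up compounds.
data = [
    [10.0, 2.0, 1.0],
    [8.0, 2.1, 0.9],
    [6.0, 1.9, 1.1],
    [4.0, 2.0, 1.0],
    [2.0, 2.2, 0.8],
]
centred = center(data)
cov = covariance(centred)
loadings, eigval = first_component(cov)

total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = eigval / total_variance        # share of total variance carried by PC1
contribution = [l * l for l in loadings]   # squared loadings of PC1, summing to 1
scores = [sum(l * x for l, x in zip(loadings, row)) for row in centred]
```

Here `explained` plays the role of the share of variance carried by the first principal component, `contribution` the share of that component attributable to each compound, and `scores` the coordinates that a score plot would show along the first PC axis.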


Remarks

Some useful conclusions have been drawn from the analysis, namely the relevance of squalene as the key target for an ageing device. It is important to highlight that these results are a long way from having a practical use. The major barrier to overcome is understanding the complexity of how sex, age, ethnicity and diet, as well as substrate properties, interact (2). The results already show great variation in a controlled environment, so taking into account the different types of surface on which fingerprints can be left would add further difficulty in attempting to draw any firm conclusions. The major aim of this research was to show that it is possible to date fingerprints. This work has helped support that possibility while also rejecting the notion of a “one size fits all” age curve. From the PCA results, we propose the following procedure for a case in which there is a known suspect and a fingermark found at the crime scene. We would take forty or so samples from the suspect and construct an intraspecific age curve that would work in conjunction with the biosensor. The set of fingerprints collected from the suspect would then be used to deduce the age of the fingermark at the crime scene. This would be a laborious procedure and would probably be reserved for high profile cases.


References
  1. Thesis!
  2. PCA definition
  3. Mountfort KA, Bronstein H, Archer N, Jickells SM. Identification of oxidation products of squalene in solution and in latent fingerprints by ESI-MS and LC/APCI-MS. Anal. Chem. 2007; 79(7): 2650-2657.


Squalene Epoxide and Lanosterol Synthase Binding Model


Aim

The aim of a model describing the binding between squalene epoxide and lanosterol synthase is to find the optimum concentration and binding rates required for visual detection of squalene epoxide in the fingermark sample from the crime scene. The more squalene epoxide and lanosterol synthase that bind, the more likely it is that squalene epoxide will be visually detected.


Results

Squalene epoxide is an intermediate in cholesterol synthesis. It is converted by lanosterol synthase into lanosterol, a precursor of cholesterol. These reactions can be described by the schematic:

$$ \ce{LS + SE<=>[K_{1}][K_{2}] PC ->[K_{3}] La}. $$

Here \(LS\) is the concentration of lanosterol synthase, \(SE\) is the concentration of squalene epoxide, \(PC\) is the concentration of the first intermediate, the protosterol cation, and \(La\) is the concentration of lanosterol, the full complex. \(K_{1}\) and \(K_{3}\) are the forward reaction rates, and \(K_{2}\) is the reverse reaction rate.

The initial concentration of squalene epoxide was defined to be \(SE_{0}\) and two parameters of the system were defined as:

$$ \large{ \begin{equation*} \lambda=\frac{K_{1}}{K_{2}} SE_{0}, \qquad \gamma=\frac{K_{3}}{K_{2}}. \end{equation*} } $$

The initial concentration of lanosterol synthase was defined to be \(LS_{0}\) and the ratio between initial concentrations was defined as:

$$ \large{ \begin{equation*} v_{0}=\frac{LS_{0}}{SE_{0}}. \end{equation*} } $$

Sensitivity analysis was performed to find the optimum values for the two parameters, \(\gamma\) and \(\lambda\), and the ratio, \(v_{0}\), which give the highest concentration of the final complex, lanosterol. Consider \(\lambda\) and \(\gamma\) first by setting \(v_{0}\) to the suggested value from (Eq 7) and setting a range of values for \(\gamma\) and \(\lambda\). The range chosen for each parameter has a maximum of twice the suggested value from (Eq 11).


Figure 6: Sensitivity analysis for the binding parameters of squalene epoxide and lanosterol synthase binding.

From Figure 6, it can be seen that there are threshold values for both parameters above which increasing them has no further effect on lanosterol formation. This was investigated further by looking at each parameter individually and its effect on lanosterol formation over time. Consider the effect of \(\lambda\) by setting \(v_{0}\) and \(\gamma\) to the suggested values, (Eq 7) and (Eq 11).


Figure 7: Lanosterol formation with increasing \(\lambda\).

Figure 7 suggests that the concentration of lanosterol does not increase beyond \(\lambda = 0.28\). Therefore, if the value of \(\lambda\) is smaller than the suggested value, \(\lambda = 0.64\), but larger than \(\lambda = 0.28\), the same concentration of lanosterol will be formed. Recall that \(\large{\lambda = \frac{K_{1}SE_{0}}{K_{2}}}\), therefore the initial concentration of squalene epoxide or the binding rate ratio \(\frac{K_{1}}{K_{2}}\) can be less than the suggested value. The effect of \(\gamma\) on lanosterol formation can be considered by setting \(v_{0}\) and \(\lambda\) to the suggested values, (Eq 7) and (Eq 11).


Figure 8: Lanosterol formation with increasing \(\gamma\).

Figure 8, like Figure 7, suggests that the concentration of lanosterol does not increase beyond \(\gamma = 0.4\). Therefore, if the value of \(\gamma\) is smaller than the suggested value, \(\gamma = 1\), but larger than \(\gamma = 0.4\), the same concentration of lanosterol will be formed. Recall that \(\large{\gamma = \frac{K_{3}}{K_{2}}}\), therefore the binding rate ratio can be less than the suggested value. The effect of \(v_{0}\) on lanosterol formation can be considered by setting \(\gamma\) and \(\lambda\) to the suggested values, (Eq 11).


Figure 9: Lanosterol formation with increasing \(v_{0}\).

Figure 9, like Figures 7 and 8, suggests that the concentration of lanosterol does not increase beyond \(v_{0} = 1.5\). Therefore, if the value of \(v_{0}\) is smaller than the suggested value, \(v_{0} = 2.56\), but larger than \(v_{0} = 1.5\), the same concentration of lanosterol will be formed. Recall that \(\large{v_{0} = \frac{LS_{0}}{SE_{0}}}\) and \(SE_{0}=0.82 \mu M\). From this, the best initial concentration of lanosterol synthase to have in the fingerprint ageing device is:

$$ \large{ \begin{equation*} LS_{0}=v_{0} \cdot SE_{0}=1.5 \times 0.82 \mu M=1.23 \mu M. \end{equation*} } $$

The results of the model describing lanosterol formation have been passed on to the lab to be used in future decision making.


Method

Using the law of mass action (1) the binding reaction schematic was written as a system of ordinary differential equations (ODEs):

$$ \large{ \begin{eqnarray} \frac{dLS}{dt}&=&K_{2}PC - K_{1}LS\cdot SE, \nonumber \\ \frac{dSE}{dt}&=&K_{2}PC - K_{1}LS\cdot SE, \nonumber \\ \frac{dPC}{dt}&=&K_{1}LS \cdot SE - K_{2} PC- K_{3}PC,\tag{Eq 1}\\ \frac{dLa}{dt}&=&K_{3} PC. \nonumber \end{eqnarray} } $$

with initial conditions:

$$ \large{ \begin{eqnarray} LS(0)&=&LS_{0}, \quad \mu M \nonumber \\ \nonumber SE(0)&=&SE_{0}, \quad \mu M\\ PC(0)&=&0, \quad \mu M \tag{Eq 2}\\ La(0)&=&0. \quad \mu M \nonumber \end{eqnarray} } $$
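As an illustration of how (Eq 1) with (Eq 2) can be integrated numerically, the following is a minimal Python sketch. It is not the team's MATLAB code. It assumes a scaling that is not stated in the text: concentrations are divided by \(SE_{0}\) and time is multiplied by \(K_{2}\), so that only \(\lambda\), \(\gamma\) and \(v_{0}\) appear. It also uses a fixed-step fourth-order Runge-Kutta scheme in place of MATLAB's ode23, and the end time and step size are arbitrary choices.

```python
def rhs(state, lam, gam):
    """Right-hand side of (Eq 1) under the assumed scaling:
    concentrations in units of SE_0, time in units of 1/K_2."""
    ls, se, pc, la = state
    bind = lam * ls * se              # scaled K_1 * LS * SE term
    return (pc - bind,                # d(LS)/dt
            pc - bind,                # d(SE)/dt
            bind - pc - gam * pc,     # d(PC)/dt
            gam * pc)                 # d(La)/dt

def rk4_step(state, dt, lam, gam):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(s + h * dk for s, dk in zip(state, k))
    k1 = rhs(state, lam, gam)
    k2 = rhs(shifted(k1, dt / 2), lam, gam)
    k3 = rhs(shifted(k2, dt / 2), lam, gam)
    k4 = rhs(shifted(k3, dt), lam, gam)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(lam=0.64, gam=1.0, v0=2.56, t_end=20.0, dt=0.01):
    """Return the list of (LS, SE, PC, La) states from scaled time 0 to t_end.

    Initial conditions follow (Eq 2) after scaling: LS = v0, SE = 1, PC = La = 0.
    t_end and dt are illustrative choices, not values from the original work.
    """
    state = (v0, 1.0, 0.0, 0.0)
    history = [state]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, lam, gam)
        history.append(state)
    return history

ls, se, pc, la = simulate()[-1]
```

Two quantities are conserved by (Eq 1) and give a quick sanity check on any solver: \(SE + PC + La\) stays equal to its initial value, and \(LS - SE\) stays constant.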

The parameters were estimated by considering the steady state of the system. Setting the left hand side of (Eq 1) to zero gives:

$$ \large{ \begin{eqnarray} K_{2} PC&=&K_{1} LS \cdot SE, \nonumber \\ K_{1} LS \cdot SE&=&K_{2} PC + K_{3} PC. \tag{Eq 3} \end{eqnarray} } $$

Rearranging (Eq 3) gives:

$$ \large{ \begin{equation} \frac{PC}{LS \cdot SE}=\frac{K_{1}}{K_{2}}. \tag{Eq 4} \end{equation} } $$

Considering the first binding reaction, it was found that the total concentration of lanosterol synthase, \(LST\), will be equal to:

$$ \large{ \begin{equation} LST=LS+PC. \tag{Eq 5} \end{equation} } $$

Now using (Eq 4) and (Eq 5) it can be written that:

$$ \large{ \begin{equation} \frac{LS}{LST}=\frac{1}{\frac{K_{1}}{K_{2}} SE_{0} + 1}. \tag{Eq 6} \end{equation} } $$
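The step from (Eq 4) and (Eq 5) to (Eq 6) can be sketched as follows. The sketch assumes that the free squalene epoxide concentration is approximated by its initial value, \(SE \approx SE_{0}\); this assumption is inferred from the appearance of \(SE_{0}\) in (Eq 6) and is not stated explicitly in the text.

$$ \large{ \begin{eqnarray*} PC &=& \frac{K_{1}}{K_{2}} LS \cdot SE, \\ LST &=& LS + PC = LS \left( 1 + \frac{K_{1}}{K_{2}} SE \right) \approx LS \left( 1 + \frac{K_{1}}{K_{2}} SE_{0} \right), \\ \frac{LS}{LST} &=& \frac{1}{\frac{K_{1}}{K_{2}} SE_{0} + 1}. \end{eqnarray*} } $$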

It is known that the ratio between lanosterol synthase and squalene epoxide is:

$$ \large{ \begin{equation} v_{0}= 2.56, \tag{Eq 7} \end{equation} } $$

and that they bind at a 1:1 ratio (2). Therefore the ratio of free lanosterol synthase to total lanosterol synthase will be:

$$ \large{ \begin{equation} \frac{LS}{LST}=\frac{1.56}{2.56}. \tag{Eq 8} \end{equation} } $$

By substituting (Eq 8) into equation (Eq 6) the ratio between \(K_{1}\) and \(K_{2}\) can be found:

$$ \large{ \begin{equation} \frac{K_{1}}{K_{2}}= 0.78\quad \mu M^{-1}. \tag{Eq 9} \end{equation} } $$

From (Eq 3), (Eq 6) and (Eq 8), the ratio between \(K_{3}\) and \(K_{2}\) can be found:

$$ \large{ \begin{equation} \frac{K_{3}}{K_{2}}=1. \tag{Eq 10} \end{equation} } $$

From Goodman's 1964 paper (3), it can be calculated that the suggested concentration of squalene is 0.82 \(\mu M\). It is then assumed that this is a reasonable estimate for the initial concentration of squalene epoxide, so that \(SE_{0}=0.82 \mu M\). Therefore, from (Eq 9) and (Eq 10), the estimated values for \(\lambda\) and \(\gamma\) are found to be:

$$ \large{ \begin{equation} \lambda=0.64, \qquad \gamma=1.\tag{Eq 11} \end{equation} } $$
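The arithmetic behind (Eq 9) and the value of \(\lambda\) in (Eq 11) can be checked with a few lines of Python. This is only a sketch of the substitution of (Eq 8) into (Eq 6); the variable names are hypothetical.

```python
v0 = 2.56    # ratio LS_0 / SE_0 from (Eq 7)
se0 = 0.82   # initial squalene epoxide concentration in micromolar

# (Eq 8): with 1:1 binding, the free fraction of lanosterol synthase is 1.56 / 2.56.
free_fraction = (v0 - 1.0) / v0

# Rearranging (Eq 6): (K_1 / K_2) * SE_0 = 1 / free_fraction - 1, which is lambda.
lam = 1.0 / free_fraction - 1.0

# Dividing by SE_0 gives the rate ratio K_1 / K_2 in inverse micromolar, as in (Eq 9).
k1_over_k2 = lam / se0

print(round(lam, 2), round(k1_over_k2, 2))  # prints: 0.64 0.78
```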

Sensitivity analysis was performed by running the ode23 solver over one hundred different values for each of the two parameters and the ratio \(v_{0}\). Each range of values is centred on the estimated values, (Eq 7) and (Eq 11). The results are shown in Figure 6, where the centre of the plot represents the concentration of complex formed when the suggested binding rates are used.
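A one-parameter-at-a-time version of this sweep can be sketched in Python as below. It is an illustration under stated assumptions rather than the team's MATLAB code: the model is scaled by \(SE_{0}\) and \(K_{2}\) so that only \(\lambda\), \(\gamma\) and \(v_{0}\) appear, a fixed-step Runge-Kutta scheme stands in for ode23, and the end time `t_end` and step `dt` are arbitrary choices. All names are hypothetical.

```python
def final_lanosterol(lam, gam, v0, t_end=5.0, dt=0.02):
    """Scaled lanosterol at scaled time t_end, integrating (Eq 1) with RK4."""
    def f(s):
        ls, se, pc, la = s
        bind = lam * ls * se
        return (pc - bind, pc - bind, bind - (1.0 + gam) * pc, gam * pc)

    def shift(s, k, h):
        return tuple(x + h * dx for x, dx in zip(s, k))

    s = (v0, 1.0, 0.0, 0.0)   # scaled form of the initial conditions (Eq 2)
    for _ in range(int(round(t_end / dt))):
        k1 = f(s)
        k2 = f(shift(s, k1, dt / 2))
        k3 = f(shift(s, k2, dt / 2))
        k4 = f(shift(s, k3, dt))
        s = tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                  for x, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s[3]

def sweep(name, centre, n=100, **fixed):
    """Vary one parameter over n evenly spaced values from 0 to 2 * centre,
    so that the mean of the range equals the estimated value."""
    values = [2.0 * centre * i / (n - 1) for i in range(n)]
    return [(v, final_lanosterol(**{**fixed, name: v})) for v in values]

# Each sweep holds the other two quantities at their estimated values.
lam_sweep = sweep("lam", 0.64, gam=1.0, v0=2.56)
gam_sweep = sweep("gam", 1.0, lam=0.64, v0=2.56)
v0_sweep = sweep("v0", 2.56, lam=0.64, gam=1.0)
```

Each sweep returns pairs of parameter value and scaled lanosterol, which could be plotted against each other to look for the kind of plateau described for Figures 7 to 9. Whether a plateau appears, and where, depends on the chosen end time, so this sketch makes no claim to reproduce the reported thresholds.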


References
  1. Guldberg CM, Waage P. Concerning chemical affinity. Erdmanns Journal für Practische Chemie 1879; 127: 69-114.
  2. Boutaud O, Dolis D, & Schuber F. Preferential cyclization of 2, 3 (S): 22 (S), 23-dioxidosqualene by mammalian 2, 3-oxidosqualene-lanosterol cyclase. Biochemical and Biophysical Research Communications 1992; 188(2): 898-904.
  3. Goodman DS. Squalene in human and rat blood plasma. Journal of Clinical Investigation 1964; 43(7): 1480.
