Modeling
1. How the various factors change over time
2. Identification of the factors that influence the output
3. How the output changes when the input changes
The model is designed to check whether our criticality detection works properly. We therefore set up the model as follows. The system is a smart genetic circuit: when the input (for example, light stimulation) changes, the output still remains stable, which means the system can transform signals of different intensity and duration into a signal impulse.
Our model tests whether the circuit behaves as intended. First of all, we identify all the factors that have an impact on the circuit and the connections among them. By fitting the optimal conditions from the experimental data, we can give feedback on our project and offer constructive suggestions.
1. First of all, we list all the parameters and factors in List 1, and plot the curve of each of the four factors that vary with time in Plots 1-4.
List 1: parameters and values (the parameter names did not survive extraction)
Parameter | Value
 | 1
 | 1
 | 10
 | 5
 | 1
 | 0.1
 | 1
 | 1
 | 2
 | 100
 | 0.1
Plot 3: CI changing with operating time
Plot 4: M_GFP changing with operating time (short period)
2. Principal factor analysis on the basis of principal component analysis
We can identify the components that most strongly influence the output by modeling on the basis of principal factor analysis [1].

Principal component analysis is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. As each variable reflects some information about the object of study, and there is some correlation between the indicators, the information we obtain from the statistics overlaps to some extent. Principal component analysis therefore makes use of the idea of dimension reduction, which helps retain the original data while minimizing the loss of information. It reduces the dimension of the high-dimensional variable space so that the original variable system can be integrated and simplified to the largest extent. In addition, it objectively determines the weight of each indicator parameter, which avoids the randomness brought by subjective judgment.

On the basis of the original data, we can use principal component analysis to discard some information through linear transformation and then find composite indicators, each a combination of several of the original indicators, which are also called the main components. These components reflect the characteristics of the original indicators and are independent of each other.

Set the original vector as $X = (x_1, x_2, \dots, x_p)^T$ and the principal components obtained through principal component analysis as $Z_1, Z_2, \dots, Z_m$, which are linear combinations of $x_1, x_2, \dots, x_p$. The coordinate system composed of $Z_1, Z_2, \dots, Z_m$ is obtained from the original coordinate system by translation and orthogonal rotation. We call $m$ the space dimension of the primary hyperplane.
On this main hyperplane, the direction of the largest variation in the data is the first principal component $Z_1$; for $Z_2, \dots, Z_m$ we have $\operatorname{Var}(Z_1) \ge \operatorname{Var}(Z_2) \ge \dots \ge \operatorname{Var}(Z_m)$ in turn. As a result, $Z_1, Z_2, \dots, Z_m$ can reflect most of the information of the original data, which implies that the main hyperplane with $m$ dimensions is the very $m$-dimensional subspace that retains the original data information to the largest extent.
The following are the steps of principal component analysis.
(1) First, normalize the original data to eliminate the impact of different magnitudes and dimensions. The formula used is as follows:
$$\tilde{x}_{ij} = \frac{x_{ij} - \bar{x}_i}{s_i},$$
where $x_{ij}$ is the original datum of the jth sample of the ith indicator, and $\bar{x}_i$ and $s_i$ are respectively the mean and standard deviation of the samples of the ith indicator.

(2) From the normalized data sheet $\tilde{X} = (\tilde{x}_{ij})$, we can further calculate the correlation coefficient matrix $R = (r_{ij})$ with the formula
$$r_{ij} = \frac{1}{n-1}\sum_{k=1}^{n} \tilde{x}_{ik}\,\tilde{x}_{jk},$$
where $n$ is the number of samples.

(3) Calculate the eigenvalues and eigenvectors of $R$. From the characteristic equation $|R - \lambda I| = 0$ we obtain the characteristic roots; placing them in descending order $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0$, we get the corresponding eigenvectors $a_1, a_2, \dots, a_p$, which are standard orthogonal and which we call the spindles (principal axes). Here $I$ is the unit matrix.

(4) Calculate the contribution rate and the cumulative contribution rate:
$$\eta_i = \frac{\lambda_i}{\sum_{k=1}^{p}\lambda_k}, \qquad E_m = \sum_{i=1}^{m}\eta_i.$$

(5) Calculate the principal components
$$Z_i = a_i^{T}\tilde{X}, \quad i = 1, 2, \dots, m,$$
where the principal components are independent from each other.
(6) Comprehensive analysis.
In order to retain as much of the original data information as possible, we should consider how much precision is needed when replacing the original variable system with the m-dimensional hyperplane. We make this judgment through the cumulative contribution rate, choosing the smallest dimension m whose cumulative contribution rate reaches the chosen threshold (commonly around 85%). We can then make further analysis of the extracted principal components.

(7) Finally, according to the principal components obtained from the analysis and their corresponding weights, we can calculate the comprehensive indicator W, which reflects the characteristics of the system output.
As the data information carried by the indicator parameters has basically been reflected in the principal components, we can say that the comprehensive indicator W already includes the basic characteristics captured by these parameters in their different aspects.
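As a concrete illustration of steps (1)-(7), the following Python sketch applies them to a generic data matrix. The names used here (`data`, `principal_component_analysis`, `threshold`) are placeholders for illustration, not part of our actual analysis scripts.

```python
# A minimal sketch of PCA steps (1)-(7), assuming the simulated data are stored
# in a NumPy array `data` of shape (n_samples, n_indicators).
import numpy as np

def principal_component_analysis(data, threshold=0.85):
    n, p = data.shape

    # (1) Normalize each indicator to zero mean and unit standard deviation.
    x_tilde = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

    # (2) Correlation coefficient matrix R of the normalized data.
    R = (x_tilde.T @ x_tilde) / (n - 1)

    # (3) Characteristic roots and orthonormal eigenvectors of R,
    #     sorted in descending order of the eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # (4) Contribution rate of each component and the cumulative contribution rate.
    contribution = eigvals / eigvals.sum()
    cumulative = np.cumsum(contribution)

    # (5) Principal components: project the normalized data onto the principal axes.
    Z_all = x_tilde @ eigvecs

    # (6) Keep the smallest m whose cumulative contribution rate reaches the threshold.
    m = int(np.searchsorted(cumulative, threshold) + 1)
    Z = Z_all[:, :m]

    # (7) Comprehensive indicator W: weight each retained component by its contribution rate.
    W = Z @ contribution[:m]
    return Z, contribution, cumulative, W
```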
Statistical test system for principal component analysis
There is no denying that principal component analysis is not appropriate for all sample data. Generally speaking, the method is used to simplify the data structure before further analysis of practical problems, so it has certain preconditions. Principal component analysis is a good choice only when the variables of the original data have a strong linear relationship; when the degree of linear correlation among the original variables is insufficient, it is not reasonable to use principal component analysis, as we cannot simplify the data structure in that case. Therefore, we should test the applicability of the raw data before applying principal component analysis.

1. Bartlett test of sphericity
The Bartlett test of sphericity is one of the commonly used statistical tests. It is a test of the entire correlation matrix, and the null hypothesis is that the correlation matrix is the unit matrix. When we cannot reject the null hypothesis, the original variables may be regarded as independent of each other, which means that it is not suitable to use the principal component analysis method [4]. We can also calculate the significance level (P value) from the test statistic. When the P value is less than 0.05, we reject the null hypothesis, indicating that principal component analysis can be applied to the original data; on the contrary, if the P value is larger than 0.05, principal component analysis is no longer applicable.
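A minimal sketch of how the Bartlett statistic and its P value can be computed from a correlation matrix R estimated from n samples; the function name and the use of SciPy are illustrative choices rather than part of the original analysis scripts.

```python
# Bartlett's test of sphericity for a p x p correlation matrix R and sample size n.
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(R, n):
    p = R.shape[0]
    # Test statistic: -(n - 1 - (2p + 5)/6) * ln|R|, chi-square with p(p-1)/2 df.
    statistic = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) // 2
    p_value = chi2.sf(statistic, df)
    return statistic, df, p_value
```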
2. KMO (Kaiser-Meyer-Olkin) test
The KMO (Kaiser-Meyer-Olkin) test compares the simple correlation coefficients between the variables with the partial correlation coefficients. It is used mainly in the factor analysis of multivariate statistics. The KMO statistic is a value between 0 and 1. When the sum of the squares of the simple correlation coefficients between the variables is much larger than that of the partial correlation coefficients, the KMO value is close to 1, which means that the correlation between variables is strong and the original variables are suitable for factor analysis. Similarly, when the sum of the squares of the simple correlation coefficients between the variables is close to zero, the KMO value is close to zero, which means that the correlation between variables is weak and the original variables are not suitable for factor analysis. The formula for the KMO test is as follows:
$$\mathrm{KMO} = \frac{\sum\sum_{i \ne j} r_{ij}^{2}}{\sum\sum_{i \ne j} r_{ij}^{2} + \sum\sum_{i \ne j} p_{ij}^{2}},$$
where $r_{ij}$ represents the simple correlation coefficients and $p_{ij}$ the partial correlation coefficients. The KMO value is always between 0 and 1, and when it is sufficiently large (values above 0.5 are commonly taken as acceptable) we can use principal component analysis.
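The sketch below computes the overall KMO measure from a correlation matrix, obtaining the partial correlation coefficients from the inverse of the correlation matrix; this is a standard route, but the function name and details are illustrative rather than taken from our analysis scripts.

```python
# Overall KMO measure from a correlation matrix R.
import numpy as np

def kmo(R):
    inv_R = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(inv_R), np.diag(inv_R)))
    partial = -inv_R / d                 # partial correlation coefficients
    np.fill_diagonal(partial, 0.0)

    R_off = R.copy()
    np.fill_diagonal(R_off, 0.0)         # drop the diagonal (r_ii = 1)

    simple_sq = np.sum(R_off ** 2)
    partial_sq = np.sum(partial ** 2)
    # KMO = sum of squared simple correlations /
    #       (sum of squared simple + squared partial correlations)
    return simple_sq / (simple_sq + partial_sq)
```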
3. Simulation and analysis of the model
We conducted principal component analysis on the approximately 6000 groups of data obtained from the simulation of the system. The correlation coefficient matrix is as follows.

List 2: the correlation coefficient matrix of each factor
 | RStotal | taRNA | CI | M_GFP
RStotal | 1.000 | 0.183 | 0.999 | 0.115
taRNA | 0.183 | 1.000 | 0.165 | 0.631
CI | 0.999 | 0.165 | 1.000 | 0.084
M_GFP | 0.115 | 0.631 | 0.084 | 1.000
The results of the KMO and Bartlett tests:
KMO test | .386
Bartlett test | Approximate chi-square | 48881.528
 | df | 6
 | Sig. | .000
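For illustration, the kmo() and bartlett_sphericity() sketches above can be applied to the correlation matrix of List 2, with the sample size taken as the roughly 6000 simulated groups; since the exact underlying data are not reproduced here, the printed values are only expected to be close to, not identical with, those in the table.

```python
# Illustrative check using the sketch functions defined above.
import numpy as np

R = np.array([
    [1.000, 0.183, 0.999, 0.115],   # RStotal
    [0.183, 1.000, 0.165, 0.631],   # taRNA
    [0.999, 0.165, 1.000, 0.084],   # CI
    [0.115, 0.631, 0.084, 1.000],   # M_GFP
])

print("KMO:", kmo(R))
print("Bartlett (statistic, df, p):", bartlett_sphericity(R, n=6000))
```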
Factor | Proportion |
RStotal | 0.920 |
CI | 0.909 |
M_GFP | 0.447 |
taRNA | 0.524 |
So far, we have finished the analysis of the primary factors contributing to the output; the plots of the relationship with RStotal and CI are displayed as follows:
Plot 5: relational graph of RStotal, GFP, and CI
Plot 6: 3D surface plot of the output as a function of these two factors, obtained by interpolating the simulation data.
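A minimal sketch of how such an interpolated surface could be produced, assuming the simulation results are available as one-dimensional arrays of RStotal, CI, and GFP values; the array and function names are placeholders.

```python
# Interpolate scattered (RStotal, CI, GFP) simulation points onto a regular
# grid and draw a 3D surface of the output.
import numpy as np
from scipy.interpolate import griddata
import matplotlib.pyplot as plt

def surface_plot(rs_total, ci, gfp, resolution=100):
    # Regular grid spanning the observed range of the two principal factors.
    xi = np.linspace(rs_total.min(), rs_total.max(), resolution)
    yi = np.linspace(ci.min(), ci.max(), resolution)
    X, Y = np.meshgrid(xi, yi)

    # Interpolate the scattered simulation points onto the grid.
    Z = griddata((rs_total, ci), gfp, (X, Y), method="cubic")

    fig = plt.figure()
    ax = fig.add_subplot(111, projection="3d")
    ax.plot_surface(X, Y, Z, cmap="viridis")
    ax.set_xlabel("RStotal")
    ax.set_ylabel("CI")
    ax.set_zlabel("GFP output")
    plt.show()
```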
3. Variation of the output when X is assigned 1, 10, and 100
Plot 7: X equals 1
Plot 8: X equals 10
Plot 9: X equals 100
Summary: by fitting the model, we show that the output does not vary with the input. As displayed in Plots 7, 8, and 9, when the input is assigned 1, 10, or 100, the GFP output stays at 8 ng. We can therefore claim that the output remains stable as the input changes, and the effectiveness of the model is confirmed.
[1] Yuan Peng. Teaching evaluation model research based on students' evaluation [J]. Journal of Wuhan University of Science and Technology (Social Science Edition) (in Chinese), 2005, 7(3): 67-69.
[2] Hotelling H. Analysis of a complex of statistical variables into principal components [J]. Journal of Educational Psychology, 1933, 24: 417-441.
[3] Wei Wang, Qinzhong Ma, Mingzhou Lin, et al. Principal component analysis and seismic activity parameters reduction [J]. Acta Seismologica Sinica (in Chinese), 2005, 27(5): 524-531.
[4] Deyin Fu. Principal component analysis in the statistical test problems [C]. // Proceedings of the 14th National Statistical Scientific Symposium (in Chinese), 2007: 483-488.
[5] Qiyuan Jiang, Jinxing Xie, Jun Ye. Mathematical Models. Beijing: Higher Education Press, 2003 (in Chinese).
[6] N. R. Draper, H. Smith. Applied Regression Analysis (third edition). John Wiley & Sons, Inc., 1998.