Line 207: |
Line 207: |
| <p>Because the signals from our Paired dCas9 Reporter System have unknown distributions and unequal variances, we chose a non-parametric method, the Wilcoxon Rank Sum Test of Block Design, which works on the ranks of the data, instead of ANOVA. <br> | | <p>Because the signals from our Paired dCas9 Reporter System have unknown distributions and unequal variances, we chose a non-parametric method, the Wilcoxon Rank Sum Test of Block Design, which works on the ranks of the data, instead of ANOVA. <br> |
| In the Block Design, we regarded the detections made with the same gRNA under the two treatments, i.e. target and mismatch DNA, as a block. To test for a difference between the two treatments, we test the null hypothesis that the two treatments do not differ. The Wilcoxon Rank Sum statistic <img alt='Peking-Analysis-W_j.gif' src="https://static.igem.org/mediawiki/2015/1/15/Peking-Analysis-W_j.gif" class='formula-inline'> of each block is calculated first by | | In the Block Design, we regarded the detections made with the same gRNA under the two treatments, i.e. target and mismatch DNA, as a block. To test for a difference between the two treatments, we test the null hypothesis that the two treatments do not differ. The Wilcoxon Rank Sum statistic <img alt='Peking-Analysis-W_j.gif' src="https://static.igem.org/mediawiki/2015/1/15/Peking-Analysis-W_j.gif" class='formula-inline'> of each block is calculated first by |
− | </p> | + | |
− | <div id="Modeling_Fm4"> | + | <div align="center" class='row'> |
− | <div align="center" class='row'>
| + | <img class='formula-line' alt="Peking-Analysis-Wj%3DsumR_i" src="https://static.igem.org/mediawiki/2015/0/0e/Peking-Analysis-Wj%3DsumR_i.gif"> |
− | <img class='formula-line' alt="Peking-Analysis-Wj%3DsumR_i" src="https://static.igem.org/mediawiki/2015/0/0e/Peking-Analysis-Wj%3DsumR_i.gif">
| + | <img class='formula-line' alt="Peking-Analysis-W_j_range" src="https://static.igem.org/mediawiki/2015/5/5a/Peking-Analysis-W_j_range.gif"> |
− | <img class='formula-line' alt="Peking-Analysis-W_j_range" src="https://static.igem.org/mediawiki/2015/5/5a/Peking-Analysis-W_j_range.gif">
| + | |
− | </div>
| + | |
− | <p>where <i>R</i><sub>i</sub> indicates the serial number of <i>X</i><sub>i</sub> in the population of both <i>X</i><sub>j</sub> and <i>Y</i><sub>j</sub>. Note that Wilcoxon Rank Sum statistics <i>W</i><sub>j</sub> are distribution free and its distribution is known as long as the sample number is known.</p>
| + | |
| </div> | | </div> |
| + | <p>where <img alt='Peking-Analysis-R_i.gif' src="https://static.igem.org/mediawiki/2015/1/16/Peking-Analysis-R_i.gif" class='formula-inline'> denotes the rank of the i-th observation of <img alt='Peking-Analysis-X_j.gif' src="https://static.igem.org/mediawiki/2015/6/60/Peking-Analysis-X_j.gif" class='formula-inline'> in the pooled sample of <img alt='Peking-Analysis-X_j.gif' src="https://static.igem.org/mediawiki/2015/6/60/Peking-Analysis-X_j.gif" class='formula-inline'> and <img alt='Peking-Analysis-Y_j.gif' src="https://static.igem.org/mediawiki/2015/7/79/Peking-Analysis-Y_j.gif" class='formula-inline'>. Note that the Wilcoxon Rank Sum statistic <img alt='Peking-Analysis-W_j.gif' src="https://static.igem.org/mediawiki/2015/1/15/Peking-Analysis-W_j.gif" class='formula-inline'> is distribution-free: its null distribution is known as long as the sample sizes are known.</p> |
| + | </p> |
| <div> | | <div> |
− | <p>For example, if n=3, {x<sub>1</sub>,x<sub>2</sub>,x<sub>3</sub>}={3,3,5}, {y<sub>1</sub>,y<sub>2</sub>,y<sub>3</sub>}={1,4,2}, so {x<sub>1</sub>,x<sub>2</sub>,x<sub>3</sub>,y<sub>1</sub>,y<sub>2</sub>,y<sub>3</sub>}={1,2,3,3,4,5}, which implies that {<i>R</i><sub>1</sub>,<i>R</i><sub>2</sub>,<i>R</i><sub>3</sub>}={2,2,3}<br> | + | <p>For example, if <img alt='Peking-Analysis-n%3D3.gif' src="https://static.igem.org/mediawiki/2015/4/48/Peking-Analysis-n%3D3.gif" class='formula-inline'>, |
− | Under the null hypothesis, after calculate all the possible order of two sample sets, the distributions of the statistics are shown as below: | + | <img alt='Peking-Analysis-x_sample.gif' src="https://static.igem.org/mediawiki/2015/5/50/Peking-Analysis-x_sample.gif" class='formula-inline'>, |
| + | <img alt='Peking-Analysis-y_sample.gif' src="https://static.igem.org/mediawiki/2015/1/15/Peking-Analysis-y_sample.gif" class='formula-inline'>, <br> |
| + | so <img alt='Peking-Analysis-xy_sample.gif' src="https://static.igem.org/mediawiki/2015/e/ef/Peking-Analysis-xy_sample.gif" class='formula-inline'>, which implies that <img alt='Peking-Analysis-R_sample.gif' src="https://static.igem.org/mediawiki/2015/8/81/Peking-Analysis-R_sample.gif" class='formula-inline'><br> |
| + | Under the null hypothesis, after enumerating all possible orderings of the two sample sets, the distribution of the statistic <img alt='Peking-Analysis-W_j.gif' src="https://static.igem.org/mediawiki/2015/1/15/Peking-Analysis-W_j.gif" class='formula-inline'> is as shown below: |
| </p> | | </p> |
| <table border='1' style='margin:10px;padding:10px' class='col-md-12'> | | <table border='1' style='margin:10px;padding:10px' class='col-md-12'> |
| <tr> | | <tr> |
− | <th>W</th> | + | <th>W<sub>j</sub></th> |
| <td>6</td><td>7</td><td>8</td><td>9</td><td>10</td><td>11</td><td>12</td><td>13</td><td>14</td><td>15</td> | | <td>6</td><td>7</td><td>8</td><td>9</td><td>10</td><td>11</td><td>12</td><td>13</td><td>14</td><td>15</td> |
| </tr> | | </tr> |
| <tr> | | <tr> |
− | <th>f(W)</th> | + | <th>f(W<sub>j</sub>)</th> |
| <td>0.05</td><td>0.05</td><td>0.10</td><td>0.15</td><td>0.15</td><td>0.15</td><td>0.15</td><td>0.10</td><td>0.05</td><td>0.05</td> | | <td>0.05</td><td>0.05</td><td>0.10</td><td>0.15</td><td>0.15</td><td>0.15</td><td>0.15</td><td>0.10</td><td>0.05</td><td>0.05</td> |
| </tr> | | </tr> |
Line 231: |
Line 233: |
| </div> | | </div> |
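The null distribution tabulated above can be reproduced by brute-force enumeration. The following is a minimal sketch, not the team's own script: the function name `rank_sum_null_distribution` is hypothetical, and it assumes no ties, so that under the null hypothesis every choice of ranks for the first sample is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def rank_sum_null_distribution(n_x, n_y):
    """Exact null distribution of the rank sum of the first sample.

    Assumes no ties: under the null hypothesis, each subset of n_x ranks
    drawn from 1..(n_x + n_y) is equally likely to belong to the first sample.
    """
    total = n_x + n_y
    # Count how many rank subsets produce each possible rank sum.
    counts = Counter(sum(ranks) for ranks in combinations(range(1, total + 1), n_x))
    n_orderings = sum(counts.values())
    return {w: Fraction(c, n_orderings) for w, c in sorted(counts.items())}


# Three replicates per treatment, as in the example above.
for w, prob in rank_sum_null_distribution(3, 3).items():
    print(w, float(prob))
```

Exact fractions are used so that the probabilities can be checked against the table without floating-point rounding.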
| <div> | | <div> |
− | <p>Due to the small sample size, the minimal significance level is 0.05, which means only if <i>W</i><sub>j</sub>=15 leads to a rejection of the null hypothesis, in other words only when the minimum value of <i>X</i><sub>j</sub> was greater than the maximum value of <i>Y</i><sub>j</sub> to accept the alternative hypothesis instead of the null hypothesis, the two sets of data is significantly different. So the Wilcoxon Rank Sum Test may face challenge in single block test when the experimental and control group are slightly different.However, by using Block Design, we can integrate data from m blocks similar to the idea of ANOVA. We calculated the sum of <i>W</i><sub>j</sub>(1<=j<=m) as the statistics. | + | <p>Due to the small sample size, the smallest attainable significance level is 0.05, which means that only <img alt='Peking-Analysis-Wj%3D15.gif' src="https://static.igem.org/mediawiki/2015/e/e0/Peking-Analysis-Wj%3D15.gif" class='formula-inline'> leads to rejection of the null hypothesis. In other words, the two sets of data are significantly different only when the minimum value of <img alt='Peking-Analysis-X_j.gif' src="https://static.igem.org/mediawiki/2015/6/60/Peking-Analysis-X_j.gif" class='formula-inline'> is greater than the maximum value of <img alt='Peking-Analysis-Y_j.gif' src="https://static.igem.org/mediawiki/2015/7/79/Peking-Analysis-Y_j.gif" class='formula-inline'>. The Wilcoxon Rank Sum Test may therefore have difficulty in a single-block test when the experimental and control groups differ only slightly. However, by using Block Design, we can integrate data from m blocks, similar to the idea of ANOVA. We calculated the sum of <img alt='Peking-Analysis-W_j.gif' src="https://static.igem.org/mediawiki/2015/1/15/Peking-Analysis-W_j.gif" class='formula-inline'> as the statistic. |
| </p> | | </p> |
| <img class='col-md-12' alt="Modeling_Fm5" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif"> | | <img class='col-md-12' alt="Modeling_Fm5" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif"> |
| </div> | | </div> |
| <div> | | <div> |
− | <p>The Wilcoxon Rank Sum <i>W</i><sub>j</sub>(1<=j<=m) from m blocks are independent and identically distributed (i.i.d), according to the central limit theorem (CLT), as m approaches infinity, the random variable <img alt="Modeling_Fm6" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif"> converges in distribution to a standard normal distribution <i>N</i>(0,1)</p> | + | <p>The Wilcoxon Rank Sums <img alt='Peking-Analysis-W_j.gif' src="https://static.igem.org/mediawiki/2015/1/15/Peking-Analysis-W_j.gif" class='formula-inline'> from the m blocks are independent and identically distributed (i.i.d.). According to the central limit theorem (CLT), as m approaches infinity, the random variable <img class='formula-inline' alt="Peking-Analysis-W_BD_statistics.gif" src="https://static.igem.org/mediawiki/2015/6/68/Peking-Analysis-W_BD_statistics.gif"> converges in distribution to the standard normal distribution <i>N</i>(0,1).</p> |
| <p class='col-md-3'></p> | | <p class='col-md-3'></p> |
− | <img class='col-md-6' alt="Modeling_Fm7" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif"> | + | <img align="center" class='formula-line' alt="Peking-Analysis-W_BD_statistics_CLT.gif" src="https://static.igem.org/mediawiki/2015/7/73/Peking-Analysis-W_BD_statistics_CLT.gif"> |
− | <p>So actually we use the statistics <img alt="Modeling_Fm6" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif">, also we can calculate the p-value <img alt="Modeling_Fm8" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif">, where <img alt="Modeling_Fm9" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif"> is the distribution function of the standard normal distribution. If p-value is less than 0.01 or <img alt="Modeling_Fm10" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif">, then we accept the alternative hypothesis that the two treatment, i.e. target and mismatch DNA, is highly statistic significantly.</p> | + | <p>We therefore use the statistic <img class='formula-inline' alt="Peking-Analysis-W_BD_statistics.gif" src="https://static.igem.org/mediawiki/2015/6/68/Peking-Analysis-W_BD_statistics.gif">, and we can calculate the p-value <img class='formula-inline' alt="Peking-Analysis-W_BD_statistics_p_value.gif" src="https://static.igem.org/mediawiki/2015/1/1f/Peking-Analysis-W_BD_statistics_p_value.gif">, where <img class='formula-inline' alt="Peking-Analysis-Phi%28x%29.gif" src="https://static.igem.org/mediawiki/2015/1/19/Peking-Analysis-Phi%28x%29.gif"> is the distribution function of the standard normal distribution. If the p-value is less than 0.01, or equivalently <img class='formula-inline' alt="Peking-Analysis-W_BD_gt_2.33.gif" src="https://static.igem.org/mediawiki/2015/2/20/Peking-Analysis-W_BD_gt_2.33.gif">, we accept the alternative hypothesis that the two treatments, i.e. target and mismatch DNA, differ with high statistical significance.</p> |
| </div> | | </div> |
| </div> | | </div> |
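The block-design procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's actual analysis code: the helper names (`midranks`, `block_rank_sum`, `block_design_test`) are hypothetical, the standardized statistic is assumed to be the sum of the per-block rank sums minus its null mean, divided by its null standard deviation, the p-value is assumed to be one-sided (1 − Φ of the statistic), and tied values are assumed to receive average ranks.

```python
import math


def midranks(values):
    """Return 1-based ranks of `values`; tied values share their average rank (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def block_rank_sum(x, y):
    """Rank sum of sample x within the pooled sample of x and y (one block)."""
    ranks = midranks(list(x) + list(y))
    return sum(ranks[:len(x)])


def block_design_test(blocks):
    """Standardized sum of block rank sums and its one-sided normal p-value.

    `blocks` is a list of (x, y) pairs; every block is assumed to have the
    same sample sizes, so the per-block rank sums are i.i.d. under the null.
    """
    n_x, n_y = len(blocks[0][0]), len(blocks[0][1])
    total = n_x + n_y
    mean_w = n_x * (total + 1) / 2        # null mean of one block's rank sum
    var_w = n_x * n_y * (total + 1) / 12  # null variance (no ties)
    m = len(blocks)
    w_sum = sum(block_rank_sum(x, y) for x, y in blocks)
    z = (w_sum - m * mean_w) / math.sqrt(m * var_w)
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # equals 1 - Phi(z)
    return z, p_value


# Hypothetical data: 10 blocks in which every target reading exceeds every mismatch reading.
example_blocks = [([4.0, 5.0, 6.0], [1.0, 2.0, 3.0])] * 10
z, p_value = block_design_test(example_blocks)
print(z, p_value)
```

With three replicates per treatment, the null mean and variance of a single block's rank sum used here are 10.5 and 5.25. The normal approximation relies on the number of blocks m being reasonably large.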
Line 245: |
Line 247: |
| <h4><em>Result</em></h4> | | <h4><em>Result</em></h4> |
| <div> | | <div> |
− | <img class='col-md-6' alt="Modeling_Analysis_Figure1" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif"> | + | <img class='col-md-12' alt="Peking-CRISPR-Figure13.png" src="https://static.igem.org/mediawiki/2015/4/4e/Peking-CRISPR-Figure13.png"> |
− | <img class='col-md-6' alt="Modeling_Analysis_Figure2" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif">
| + | <p> Fig. 1 Results of high-throughput assay for MTB and control strain. F denotes fragments obtained from MTB genome (a) or control strain (b); P denotes markers from each fragment.</p> |
− | <p> Fig. 1 Heatmaps of target and mismatch target. (a) Heatmap of target DNA assay. (b) Heatmap of mismatch target DNA assay.</p> | + | |
| </div> | | </div> |
| <div> | | <div> |
− | <p>In our experiment, n=3, so</p> | + | <p>In our experiment, <img alt='Peking-Analysis-n%3D3.gif' src="https://static.igem.org/mediawiki/2015/4/48/Peking-Analysis-n%3D3.gif" class='formula-inline'>, so |
− | <img alt="Modeling_Fm11" src="https://static.igem.org/mediawiki/2015/0/06/All_score.gif"> | + | <img alt='Peking-Analysis-E%28W_j%29.gif' src="https://static.igem.org/mediawiki/2015/a/a5/Peking-Analysis-E%28W_j%29.gif" class='formula-inline'> |
− | <p>(p-value = 7.868593384510819e-23) so target and mismatch DNA, are highly significantly different in signal.</p> | + | <img alt='Peking-Analysis-Var%28W_j%29.gif' src="https://static.igem.org/mediawiki/2015/a/ae/Peking-Analysis-Var%28W_j%29.gif" class='formula-inline'> |
− | | + | <img class='formula-line' alt="Peking-Analysis-W_BD_gt_2.33.gif" src="https://static.igem.org/mediawiki/2015/a/a5/Peking-Analysis-E%28W_j%29.gif"> |
| + | so the signals from target and mismatch DNA differ with high statistical significance.</p> |
| </div> | | </div> |
| </div> | | </div> |