Difference between revisions of "Team:SJTU-Software/project"

 
(22 intermediate revisions by 2 users not shown)
Line 51: Line 51:
  
 
<br/>
 
<br/>
<nav class="scrollspy-nav" data-am-scrollspy-nav="{offsetTop: 100}" data-am-sticky>
+
<nav data-am-sticky>
 
   <ul class="am-avg-sm-7 am-thumbnails mainNav">
 
   <ul class="am-avg-sm-7 am-thumbnails mainNav">
 
     <li class = "navItem">
 
     <li class = "navItem">
 
       <a href="home">
 
       <a href="home">
         <img class="am-thumbnails navPhoto" id = "1" src="https://static.igem.org/mediawiki/2015/4/4a/SJTU-SOFTWARE.1-1.png" />
+
         <img class="am-thumbnails navPhoto" id = "1" src="https://static.igem.org/mediawiki/2015/a/a0/SJTU-SOFTWARE.nav1-1.png" />
 
         <p class = "navPhoto">Home</p>
 
         <p class = "navPhoto">Home</p>
 
       </a>
 
       </a>
Line 61: Line 61:
 
     <li class = "navItem">
 
     <li class = "navItem">
 
       <a href="project">
 
       <a href="project">
         <img class="am-thumbnails navPhoto" id = "2" src="https://static.igem.org/mediawiki/2015/4/4c/SJTU-SOFTWARE.2-2.png" />
+
         <img class="am-thumbnails navPhoto" id = "2" src="https://static.igem.org/mediawiki/2015/0/06/SJTU-SOFTWARE.nav2-2.png" />
 
         <p class = "navPhoto">Project</p>
 
         <p class = "navPhoto">Project</p>
 
       </a>
 
       </a>
Line 67: Line 67:
 
     <li class = "navItem">
 
     <li class = "navItem">
 
       <a href="document">
 
       <a href="document">
         <img class="am-thumbnails navPhoto" id = "3" src="https://static.igem.org/mediawiki/2015/6/63/SJTU-SOFTWARE.3-1.png" />
+
         <img class="am-thumbnails navPhoto" id = "3" src="https://static.igem.org/mediawiki/2015/8/84/SJTU-SOFTWARE.nav3-1.png" />
         <p class = "navPhoto">&nbsp;Document</p>
+
         <p class = "navPhoto">Document</p>
 
       </a>
 
       </a>
 
     </li>
 
     </li>
 
     <li class = "navItem">
 
     <li class = "navItem">
 
       <a href="requirement">&nbsp;&nbsp;
 
       <a href="requirement">&nbsp;&nbsp;
         <img class="am-thumbnails navPhoto" id = "4" src="https://static.igem.org/mediawiki/2015/d/d0/SJTU-SOFTWARE.4-1.png" />
+
         <img class="am-thumbnails navPhoto" id = "4" src="https://static.igem.org/mediawiki/2015/4/49/SJTU-SOFTWARE.nav4-1.png" />
 
         <p class = "navPhoto">Requirement</p>
 
         <p class = "navPhoto">Requirement</p>
 
       </a>
 
       </a>
Line 79: Line 79:
 
     <li class = "navItem">
 
     <li class = "navItem">
 
       <a href="saftyPolicyconcern">
 
       <a href="saftyPolicyconcern">
         <img class="am-thumbnails navPhoto" id = "5" src="https://static.igem.org/mediawiki/2015/2/2a/SJTU-SOFTWARE.5-1.png" />
+
         <img class="am-thumbnails navPhoto" id = "5" src="https://static.igem.org/mediawiki/2015/b/b1/SJTU-SOFTWARE.nav7-1.png" />
         <p class = "navPhoto">Safty<br/>policy<br/>concern</p>
+
         <p class = "navPhoto" style = "font-size:14px">Safety&amp;policy<br/>concern</p>
 
       </a>
 
       </a>
 
     </li>
 
     </li>
 
     <li class = "navItem">
 
     <li class = "navItem">
       <a href="humanPractice">
+
       <a href="Practices">
         <img class="am-thumbnails navPhoto" id = "6" src="https://static.igem.org/mediawiki/2015/0/08/SJTU-SOFTWARE.6-1.png" />
+
         <img class="am-thumbnails navPhoto" id = "6" src="https://static.igem.org/mediawiki/2015/d/d0/SJTU-SOFTWARE.nav5-1.png" />
 
         <p class = "navPhoto">Human<br/>practice</p>
 
         <p class = "navPhoto">Human<br/>practice</p>
 
       </a>
 
       </a>
Line 91: Line 91:
 
     <li class = "navItem">
 
     <li class = "navItem">
 
       <a href="team">
 
       <a href="team">
         <img class="am-thumbnails navPhoto" id = "7" src="https://static.igem.org/mediawiki/2015/b/b7/SJTU-SOFTWARE.7-1.png" />
+
         <img class="am-thumbnails navPhoto" id = "7" src="https://static.igem.org/mediawiki/2015/d/dd/SJTU-SOFTWARE.nav6-1.png" />
 
         <p class = "navPhoto">Team</p>
 
         <p class = "navPhoto">Team</p>
 
       </a>
 
       </a>
Line 108: Line 108:
 
         <h4 class = "am-panel-title" data-am-collapse="{parent: '#block1', target: '#do-not-say-1'}"><b>Background</b></h4>
 
         <h4 class = "am-panel-title" data-am-collapse="{parent: '#block1', target: '#do-not-say-1'}"><b>Background</b></h4>
 
       </div>
 
       </div>
       <div class="am-panel-collapse am-collapse backgroundContent" id = "do-not-say-1">
+
       <div class="am-panel-collapse am-collapse backgroundContent am-in" id = "do-not-say-1">
 
         <div class = "panel-hd">
 
         <div class = "panel-hd">
           <p class = " Background">
+
           <p class = " Background" style = "line-height:30px">
             As we know, there are more than 20,000 biobricks in iGEM official standard database and the number of biobrick keeps increasing every year. Based on the fact that sequencing technology gave birth to bioinformatics, we assumed that with the explosive increase of biobricks, it will be harder for synthetic biologists to manually find good biobricks which meets the requirements when they are trying to create new devices with existing parts. This issue will inevitably lead to the birth of softwares and databases especially related to synthetic biology and these intelligent tools will further promote the rapid development of synthetic biology. So we integrated biobricks data before September from the iGEM official standard database and then developed a visual online device-designing system for synthetic biology researchers.<br/>
+
             As we know, there are more than 20,000 biobricks in the official iGEM standard database, and the number keeps increasing every year. With this explosive growth, it will become harder for synthetic biologists to manually find good biobricks that meet their requirements when they create new devices from existing parts. This problem will inevitably lead to software and databases built specifically for synthetic biology, and these intelligent tools will further promote the rapid development of the field. We therefore integrated the biobrick data available in the official iGEM standard database before September 3, 2015, and created a visual online device-designing system for synthetic biology researchers.<br/>
            Meanwhile, in order to facilitate the researchers to look for better biobricks, we combined the search function and scoring system from the 2014 SJTU Software’s EasyBBK with our own system. In this way, users can find biobricks in line with their requirements more quickly.<br/>
+
Meanwhile, we drew ideas for our system from EasyBBK, the 2014 SJTU Software project, which helps users find good parts but leaves much room for improvement. As a result, users can find biobricks that meet their requirements more quickly.<br/>
 
           </p>
 
           </p>
 
         </div>
 
         </div>
Line 123: Line 123:
 
       <hr class = "border-line"/>
 
       <hr class = "border-line"/>
 
       <div class = "designContent am-panel-hd">
 
       <div class = "designContent am-panel-hd">
         <h4 class = "am-panel-title" data-am-collapse="{parent: '#block7', target: '#do-not-say-2'}"><b>Design&Algorithm</b></h4>
+
         <h4 class = "am-panel-title" data-am-collapse="{parent: '#block7', target: '#do-not-say-2'}"><b>Design</b></h4>
 
       </div>
 
       </div>
 
       <div class="am-panel-collapse am-collapse backgroundContent" id = "do-not-say-2">
 
       <div class="am-panel-collapse am-collapse backgroundContent" id = "do-not-say-2">
 
         <div class = "panel-hd">
 
         <div class = "panel-hd">
 
           <p class = "Introduction"><b>Introduction<br/></b>
 
           <p class = "Introduction"><b>Introduction<br/></b>
           Our software, BASE, has four functions: search, recommendation, evaluation and upload. Via search function, users can search for parts or devices using IDs or features as keywords. In recommendation interface, users can draw their devices. They can also give some keywords when drag an icon to the chain to get a list of parts which fit the require and other parts best. When using evaluation, users firstly enter a device that they designed, then our software can give advice for each part to improve their devices.Finally, users can upload their device to the IGEM part registry and BASE’s database. For the first three functions, we develop a set of scoring system to evaluate the effectiveness and ease of use of the parts and devices.
+
           Our software, BASE, has four functions: search, recommendation, evaluation and upload. With the search function, users can look up parts or devices using IDs or features as keywords. In the recommendation interface, users can draw their devices. When they drag an icon onto the chain, they can also give some keywords to get a list of the parts that best fit both the requirements and the other parts. In evaluation, users first enter a device that they have designed, and our software then gives a score and advice for each part to help improve the device. Finally, users can upload their devices to BASE’s database. For the first three functions, we developed a scoring system to evaluate the effectiveness and ease of use of parts and devices. The last three functions together provide a complete support system for device design.
 
           </p>
 
           </p>
 
           <p class = "Methods"><b>Method<br/></b>
 
           <p class = "Methods"><b>Method<br/></b>
           We get the data of parts from IGEM part registry. A total of 14971 bio-bricks are recorded in the database. Then we divide them into two groups, parts and devices, according to whether the biobrick has subparts. Among them, ??? are parts and ??? are devices. For each bio-brick, there’re four different websites:
+
           We get the biobrick data from the iGEM parts registry. A total of 28,637 biobricks are recorded in the database. We divide them into two groups, parts and devices, according to whether the biobrick has subparts: 14,744 are parts and 13,893 are devices. For each biobrick, there are four different web pages:<br/>
          http://parts.igem.org/cgi/xml/part.cgi?part=BBa_???
+
http://parts.igem.org/cgi/xml/part.cgi?part=BBa_B0034<br/>
          http://parts.igem.org/cgi/partsdb/part_info.cgi?part_name=BBa_???
+
http://parts.igem.org/cgi/partsdb/part_info.cgi?part_name=BBa_B0034<br/>
          http://parts.igem.org/partsdb/get_part.cgi?part=BBa_??? 
+
http://parts.igem.org/partsdb/get_part.cgi?part=BBa_B0034 <br/>
          http://parts.igem.org/Part:BBa_???:Experience
+
http://parts.igem.org/Part:BBa_B0034:Experience<br/>
          When collecting data, we simply replace the ??? with the bricks’ ID.
+
  When collecting data, we simply replace the ID “B0034” with the ID of another biobrick.<br/>
          We then extract information from the websites. The information include Part_status, Sample_status, Part_results, Uses, DNA_status, Qualitative_experience, Group_favorite, Star_rating, Del, Groups, Number_comments, Ave_rating. And we take most of the above factors into account when scoring bio-bricks.
+
We then extract information from these pages. The information includes Com_id, Author, Enter_time, Ctype, Part_status, Sample_status, Part_results, Star_rating, Uses, DNA_status, Qualitative_experience, Group_favorite, Del, Groups, Confirmed_times, Number_comments, Ave_rating and Des. We take 12 of these factors into account when scoring biobricks.<br/>
          As for optimizing the weight of these factors, we firstly analyze the distribution of value of the factors to choose the factors that can distinguish the parts most effectively. Then we select 40 parts and 40 devices as the training sets. Finally we get the weight by combining results of several methods.
+
To optimize the weights of these factors, we first analyze the distribution of each factor's values to choose the factors that distinguish the parts most effectively. We then select 40 parts and 40 devices as the training sets. Finally, we obtain the weights by combining the results of several methods. The optimized weights are set as the defaults.<br/>
 
           </p>
 
           </p>
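<p>The ID substitution described in the Method section can be sketched in Python as follows. This is only an illustration built from the four URL patterns quoted above; the function name <code>part_urls</code> and the dictionary keys are hypothetical and are not taken from the BASE source code.</p>

```python
# Hypothetical helper: the names below are illustrative, not BASE's own code.
# The four templates are the URL patterns listed in the Method section.
URL_TEMPLATES = {
    "xml": "http://parts.igem.org/cgi/xml/part.cgi?part={pid}",
    "part_info": "http://parts.igem.org/cgi/partsdb/part_info.cgi?part_name={pid}",
    "get_part": "http://parts.igem.org/partsdb/get_part.cgi?part={pid}",
    "experience": "http://parts.igem.org/Part:{pid}:Experience",
}


def part_urls(brick_id):
    """Return the four registry URLs for one biobrick ID.

    Accepts either the short form ("B0034") or the full form ("BBa_B0034").
    """
    pid = brick_id if brick_id.startswith("BBa_") else "BBa_" + brick_id
    return {name: template.format(pid=pid) for name, template in URL_TEMPLATES.items()}


if __name__ == "__main__":
    for name, url in part_urls("B0034").items():
        print(name, url)
```

<p>Keeping the templates in one dictionary means a crawler only has to loop over the collected IDs; downloading and parsing the pages is left out of this sketch.</p>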
 
           <p class = " Results"><b>Results<br/></b>
 
           <p class = " Results"><b>Results<br/></b>
           1.scores for different values of factors
+
           <b>1.Scores for different values of factors<br/></b>
 
           To build a scoring system, we start by giving scores to the values of these factors. With the help of wet lab researchers, we rank the values of discrete factors according to their effect on research, and choose a suitable method to transform continuous values into values between 0 and 1.
 
           To build a scoring system, we start by giving scores to the values of these factors. With the help of wet lab researchers, we rank the values of discrete factors according to their effect on research, and choose a suitable method to transform continuous values into values between 0 and 1.
 
             For discrete values, we have a scoring table as below.
 
             For discrete values, we have a scoring table as below.
 
           </p>
 
           </p>
 
           <div class = "am-g">
 
           <div class = "am-g">
             <p> Table1:<br/>
+
             <p><b>Table 1: Scores for factor values<br/></b></p>
 
             <table class = "am-u-sm-5 "  >
 
             <table class = "am-u-sm-5 "  >
 
               <tr>
 
               <tr>
Line 204: Line 204:
 
                 <td>Null</td>
 
                 <td>Null</td>
 
                 <td>0</td>
 
                 <td>0</td>
 +
              </tr>
 +
              <tr>
 +
                <th rowspan = "2">Del</th>
 +
                <td>No</td>
 +
                <td>1</td>
 +
              </tr>
 +
              <tr>
 +
                  <td>Yes</td>
 +
                  <td>0</td>
 
               </tr>
 
               </tr>
 
               <tr>
 
               <tr>
Line 217: Line 226:
 
                 <td>other</td>
 
                 <td>other</td>
 
                 <td>0</td>
 
                 <td>0</td>
 +
              </tr>
 +
              <tr>
 +
                <th>Used Times</th>
 +
                <td></td>
 +
                <td>0-1</td>
 +
              </tr>
 +
              <tr>
 +
                <th>Average Rating</th>
 +
                <td></td>
 +
                <td>0-1</td>
 +
              </tr>
 +
              <tr>
 +
                <th>Confirmed Times</th>
 +
                <td></td>
 +
                <td>0-1</td>
 +
              </tr>
 +
              <tr>
 +
                <th>Number of comments</th>
 +
                <td></td>
 +
                <td>0-1</td>
 
               </tr>
 
               </tr>
 
             </table>
 
             </table>
Line 226: Line 255:
 
           </div>
 
           </div>
 
           <p>The optimized weights of the factors are shown in the table below.<br/>
 
           <p>The optimized weights of the factors are shown in the table below.<br/>
            Table2:<br/>
+
            <b>Table 2: Weight of each factor<br/></b>
 
           </p>
 
           </p>
 
           <div class = "am-g">
 
           <div class = "am-g">
Line 232: Line 261:
 
               <tr>
 
               <tr>
 
                 <th width = "40%">Part Status</th>
 
                 <th width = "40%">Part Status</th>
                 <td width = "60%">10</td>
+
                 <td width = "60%">7.5</td>
 
               </tr>
 
               </tr>
 
               <tr>
 
               <tr>
 
                 <th>Sample Status</th>
 
                 <th>Sample Status</th>
                 <td>10</td>
+
                 <td>6.8</td>
 
               </tr>
 
               </tr>
 
               <tr>
 
               <tr>
 
                 <th>DNA Status</th>
 
                 <th>DNA Status</th>
                 <td>10</td>
+
                 <td>6.7</td>
 
               </tr>
 
               </tr>
 
               <tr>
 
               <tr>
 
                 <th>Part Results</th>
 
                 <th>Part Results</th>
                 <td>15</td>
+
                 <td>11.9</td>
 
               </tr>
 
               </tr>
 
               <tr>
 
               <tr>
 
                 <th>Star Rating</th>
 
                 <th>Star Rating</th>
                 <td>10</td>
+
                 <td>7.5</td>
 
               </tr>
 
               </tr>
 
               <tr>
 
               <tr>
 
                 <th>Qualitative_<br/>experience</th>
 
                 <th>Qualitative_<br/>experience</th>
                 <td>5</td>
+
                 <td>3.5</td>
 
               </tr>
 
               </tr>
 
               <tr>
 
               <tr>
 
                 <th>Used Times</th>
 
                 <th>Used Times</th>
                 <td>15</td>
+
                 <td>13.7</td>
 
               </tr>
 
               </tr>
 
               <tr>
 
               <tr>
 
                 <th>Average Rating</th>
 
                 <th>Average Rating</th>
                 <td>20</td>
+
                 <td>13.7</td>
 +
              </tr>
 +
<tr>
 +
                <th>Del</th>
 +
                <td>5.5</td>
 +
              </tr>
 +
<tr>
 +
                <th>Group favorite</th>
 +
                <td>2.7</td>
 +
              </tr>
 +
<tr>
 +
                <th>Confirmed Times</th>
 +
                <td>10</td>
 
               </tr>
 
               </tr>
 
               <tr>
 
               <tr>
 
                 <th>Number<br/>of Comments</th>
 
                 <th>Number<br/>of Comments</th>
                 <td>5</td>
+
                 <td>10.5</td>
 
               </tr>
 
               </tr>
 
             </table>
 
             </table>
Line 271: Line 312:
 
           </div>
 
           </div>
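<p>The revised weights in Table 2 add up to 100, which suggests that a biobrick's score is a weighted sum of its normalized factor values. The Python sketch below works under that assumption. The function name <code>brick_score</code>, the use of the table's row labels as keys, and the treatment of a missing factor as 0 are illustrative choices, not details confirmed by the text.</p>

```python
# Default weights copied from the revised Table 2 (they sum to 100).
DEFAULT_WEIGHTS = {
    "Part Status": 7.5,
    "Sample Status": 6.8,
    "DNA Status": 6.7,
    "Part Results": 11.9,
    "Star Rating": 7.5,
    "Qualitative_experience": 3.5,
    "Used Times": 13.7,
    "Average Rating": 13.7,
    "Del": 5.5,
    "Group favorite": 2.7,
    "Confirmed Times": 10.0,
    "Number of Comments": 10.5,
}


def brick_score(factor_values, weights=DEFAULT_WEIGHTS):
    """Weighted sum of factor values that are already scaled to [0, 1].

    Assumption: the score is sum(weight * value); a factor missing from
    factor_values counts as 0.
    """
    total = 0.0
    for name, weight in weights.items():
        value = factor_values.get(name, 0.0)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
        total += weight * value
    return total


if __name__ == "__main__":
    example = {"Part Status": 1.0, "Del": 1.0, "Used Times": 0.5}
    print(brick_score(example))
```

<p>Passing a different <code>weights</code> dictionary corresponds to changing the weights with the ‘Advanced’ button in the search interface.</p>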
 
           <p class = "Results">
 
           <p class = "Results">
           2.devices scoring method with relationship between parts
+
           <b>2.Devices scoring method with relationship between parts<br/></b>
           This method is mainly used in the evaluation function. For a device which is just designed by users, the score we get through the first method actually mean nothing, as there’re no information for the device on the registry. So we need to develop a new evaluation system based on its composing parts and relationships between the parts.  
+
           This method is mainly used in the evaluation function. For a device that a user has just designed, the score from the first method means nothing, because the registry holds no information about the device. So we need a separate evaluation system based on the device's component parts and the relationships between them.<br/>
          When evaluating the relationships between parts, we take several factors into consideration, such as the frequency and the average score when the parts are used together and so on.
+
When evaluating the relationships between parts, we take several factors into consideration, such as the frequency and the average score when the parts are used together and so on.<br/>
          Firstly the weight of the two aspects is confirmed. The default ratio is 65% for the parts and 35% for the relationships. In the first aspect, the weight of different types is dynamic. It’s influenced by the number and type of the parts. However it still shows the different significance of the parts. But in the second aspect, all relationships share the same weight.
+
First, the weights of the two aspects are set. The default ratio is 65% for the parts and 35% for the relationships. Within the first aspect, given that wet lab researchers care more about the outcome of a device, coding parts receive a weight of 80% and all other types 20%. Within the second aspect, all relationships between coding parts and other parts share the same weight.<br/>
          Then the scoring begins. Given that wet lab researchers care more about the outcome of a device, we search for functional coding parts in the device, and optimize it in the first place. After the user locks the functional parts, we start to optimize other parts. The order is decided according to their type and location in the device.
+
Since there are two scoring systems for devices, the weights in the second one are adjusted so that the scores produced by the two methods are close, which makes the scores of new devices comparable with those of devices already in the database.<br/>
          Since there’re two scoring system for devices, the weight in the second one is adjusted to make the scores made by different method close so that scores for new devices can have the comparability with those already in the database.
+
           <b>3.Adding parts one by one<br/></b>
 
+
         This method is mainly used in the recommendation function. It is similar to the second one, but when making recommendations it considers only the newly added part and its relationships. Its weights are also adjusted to fit the other two methods.<br/>
           3.Adding parts one by one
+
          This method is mainly used in recommendation function. It’s similar to the second one, but it only cares about the new adding part and relationships when doing the recommendations. The weight’s also adjusted to fit the other two method.
+
 
           </p>
 
           </p>
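<p>The default ratios of the device scoring method (65% parts, 35% relationships, and within the parts 80% coding, 20% other) can be combined as in the Python sketch below. The text does not say how scores are aggregated within each group, so averaging is an assumption made here for illustration, as are the function and parameter names.</p>

```python
def _mean(values):
    """Average of a list, or 0.0 for an empty list (illustrative convention)."""
    return sum(values) / len(values) if values else 0.0


def device_score(coding_scores, other_scores, relation_scores,
                 part_ratio=0.65, coding_ratio=0.80):
    """Combine part scores and relationship scores into one device score.

    Assumptions (not stated in the text): scores within a group are averaged,
    and every relationship carries the same weight.
    """
    part_component = (coding_ratio * _mean(coding_scores)
                      + (1.0 - coding_ratio) * _mean(other_scores))
    relation_component = _mean(relation_scores)
    return part_ratio * part_component + (1.0 - part_ratio) * relation_component


if __name__ == "__main__":
    # One coding part, two other parts, two relationships, all on a 0-100 scale.
    print(device_score([80.0], [40.0, 60.0], [50.0, 70.0]))
```

<p>With these defaults a coding part contributes 0.65 × 0.80 = 52% of the device score, which matches the statement that the outcome of a device matters most to wet lab researchers.</p>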
 
           <p class = "Reference"><b>Reference<br/><br/></b>
 
           <p class = "Reference"><b>Reference<br/><br/></b>
 
             Morgan Madec, Yves Gendrault, Christophe Lallement, Member, IEEE, Jacques Haiech. A game-of-life like simulator for design-oriented modeling of BioBricks in synthetic biology, 34th Annual International Conference of the IEEE EMBS, San Diego, California USA, 28 August - 1 September, 2012<br/><br/>
 
             Morgan Madec, Yves Gendrault, Christophe Lallement, Member, IEEE, Jacques Haiech. A game-of-life like simulator for design-oriented modeling of BioBricks in synthetic biology, 34th Annual International Conference of the IEEE EMBS, San Diego, California USA, 28 August - 1 September, 2012<br/><br/>
             Suvi Santala, * Matti Karp, and Ville Santala, Monitoring Alkane Degradation by Single BioBrick Integration to an Optimal Cellular Framework, Synth. Biol. 2012, 1, 60 −64<br/><br/>
+
             Suvi Santala, Matti Karp, and Ville Santala, Monitoring Alkane Degradation by Single BioBrick Integration to an Optimal Cellular Framework, Synth. Biol. 2012, 1, 60 −64<br/><br/>
             Patrick M Boyle1, Devin R Burrill1, Mara C Inniss1, Christina M Agapakis1, Aaron Deardon, Jonathan G DeWerd, Michael A Gedeon, Jacqueline Y Quinn, Morgan L Paull, Anugraha M Raman, Mark R Theilmann, Lu Wang, Julia C Winn, Oliver Medvedik, Kurt Schellenberg, Karmella A Haynes,Alain Viel, Tamara J Brenner, George M Church, Jagesh V Shah1 and Pamela A Silver, A BioBrick compatible strategy for genetic modification of plants, Journal of Biological Engineering 2012, 6:8<br/><br/>
+
             Patrick M Boyle, Devin R Burrill, Mara C Inniss, Christina M Agapakis, Aaron Deardon, Jonathan G DeWerd, Michael A Gedeon, Jacqueline Y Quinn, Morgan L Paull, Anugraha M Raman, Mark R Theilmann, Lu Wang, Julia C Winn, Oliver Medvedik, Kurt Schellenberg, Karmella A Haynes, Alain Viel, Tamara J Brenner, George M Church, Jagesh V Shah and Pamela A Silver, A BioBrick compatible strategy for genetic modification of plants, Journal of Biological Engineering 2012, 6:8<br/><br/>
 
             Ilya B. Tikh & Mark Held & Claudia Schmidt-Dannert, BioBrickTM compatible vector system for protein expression in Rhodobacter sphaeroides, Appl Microbiol Biotechnol (2014) 98:3111–3119<br/><br/>
 
             Ilya B. Tikh & Mark Held & Claudia Schmidt-Dannert, BioBrickTM compatible vector system for protein expression in Rhodobacter sphaeroides, Appl Microbiol Biotechnol (2014) 98:3111–3119<br/><br/>
 
             Jacob E. Vick & Ethan T. Johnson & Swati Choudhary & Sarah E. Bloch & Fernando Lopez-Gallego & Poonam Srivastava & Ilya B. Tikh & Grayson T. Wawrzyn & Claudia Schmidt-Dannert, Optimized compatible set of BioBrick™ vectors for metabolic pathway engineering, Appl Microbiol Biotechnol (2011) 92:1275–1286<br/><br/>
 
             Jacob E. Vick & Ethan T. Johnson & Swati Choudhary & Sarah E. Bloch & Fernando Lopez-Gallego & Poonam Srivastava & Ilya B. Tikh & Grayson T. Wawrzyn & Claudia Schmidt-Dannert, Optimized compatible set of BioBrick™ vectors for metabolic pathway engineering, Appl Microbiol Biotechnol (2011) 92:1275–1286<br/><br/>
             Methods in Molecular Biology: Synthetic+Gene+Networks, John M. Walker, Wilfried Weber, Martin Fussenegger, Humana Press, 2012<br/><br/>
+
             John M. Walker, Wilfried Weber, Martin Fussenegger, Methods in Molecular Biology: Synthetic Gene Networks, Humana Press, 2012<br/><br/>
 
           </p>
 
           </p>
 
         </div>
 
         </div>
Line 297: Line 336:
 
       <hr class = "border-line"/>
 
       <hr class = "border-line"/>
 
       <div class = "validactionContent am-panel-hd">
 
       <div class = "validactionContent am-panel-hd">
         <h4 class = "am-panel-title" data-am-collapse="{parent: '#block2', target: '#do-not-say-3'}"><b>Validaction</b></h4>
+
         <h4 class = "am-panel-title" data-am-collapse="{parent: '#block2', target: '#do-not-say-3'}"><b>Validation</b></h4>
 
       </div>
 
       </div>
 
       <div class="am-panel-collapse am-collapse validactionContent" id = "do-not-say-3">
 
       <div class="am-panel-collapse am-collapse validactionContent" id = "do-not-say-3">
 
         <div class = "panel-hd">
 
         <div class = "panel-hd">
           <p class = "  validaction">勾勾丑丑哒</p>
+
           <p class = "  validaction" style = "line-height:30px">1.<b>Training of the built-in weight for the algorithm including only the information from the websites</b><br/>
 +
For the biobricks (parts and devices) in the database, we give each a score based on its 12 features. We transform the value of each feature into a value between 0 and 1 and give each feature a weight. To set an appropriate weight for each feature, we choose 40 parts or devices with high feature values as positive samples and 40 with low feature values as negative samples. We then change the weight of one feature while fixing the others, so as to widen the gap between the average score of the positive samples and that of the negative samples. In this way, we adjust the weight of each feature to improve the accuracy of distinguishing good biobricks from poor ones.<br/>
 +
Using this method, we find several important features that should have higher weights than others. We also give each feature a weight that reflects its effect on research.<br/>
 +
<b>2.Adjusting the scores from the device scoring method to approach those of the method based on biobrick features<br/></b>
 +
For devices that are not in our database, we compute a score from the scores of their parts and the scores of the connections between their parts. To test the validity of this algorithm, we choose 24 devices with high scores (score&gt;55) in the database as positive samples and 18 devices with low scores (score&lt;20) as negative samples, and score both groups with this algorithm. To improve the accuracy of the algorithm's predictions and balance the error rates of the two groups, we regard devices scoring above 55 as great devices and devices scoring below 30 as poor devices. For devices scoring between 30 and 55, our algorithm cannot tell whether they are great devices or not. By that standard, the accuracy of the algorithm's predictions is 58.3% for the positive samples and 55.6% for the negative samples. Over all samples, the accuracy is 57.1%.<br/>
 +
<b>3.The practical application of our algorithm <br/></b>
 +
We collaborated with the iGEM team SJTU-BioX-Shanghai. Our goal was to evaluate their biobricks by scoring the devices from two parallel experiments. We obtained the new devices and their structures from SJTU-BioX-Shanghai and noticed that the team had built the parts only recently. When we downloaded the database data, they had not yet uploaded their new biobricks, so we could not find the IDs of the new devices' parts in our database.<br/>
 +
To evaluate the new parts, we use BLAST to find the parts in our database whose sequences are most similar to those of the new parts. We then use these database parts to evaluate the new devices. Because we found that the coding parts in a device are the most important, our algorithm gives coding parts the highest weight. <br/>
 +
The outcome of this collaboration was very encouraging. The two devices received clearly different scores: 32.98 for one and 3.542 for the other. This difference is also reflected in their experimental results: the device with the higher score was chosen as their final biobrick to control the expression of the iron pump.<br/>
 +
So this collaboration shows that our algorithm can roughly evaluate a new device on the basis of existing parts.<br/></p>
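<p>The decision rule used in the second validation step (above 55 is great, below 30 is poor, anything in between is undecided) can be written as a short Python sketch. The function names and the treatment of scores exactly at 30 or 55 as undecided are assumptions; the text gives only the two thresholds.</p>

```python
def classify(score, great_above=55.0, poor_below=30.0):
    """Label a device score using the two thresholds from the validation text.

    Assumption: scores exactly equal to a threshold count as undecided.
    """
    if score > great_above:
        return "great"
    if score < poor_below:
        return "poor"
    return "undecided"


def accuracy(scores, expected_label):
    """Fraction of scores whose predicted label equals the expected label."""
    if not scores:
        raise ValueError("scores must not be empty")
    hits = sum(1 for s in scores if classify(s) == expected_label)
    return hits / len(scores)


if __name__ == "__main__":
    # Made-up scores for illustration only, not the team's validation data.
    print(accuracy([60.0, 58.0, 41.0, 70.0], "great"))
```

<p>The reported percentages are consistent with 14 of the 24 positive samples and 10 of the 18 negative samples being labelled correctly (14/24 ≈ 58.3%, 10/18 ≈ 55.6%, 24/42 ≈ 57.1%). These counts are inferred from the percentages and are not stated in the text.</p>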
 
         </div>
 
         </div>
 
       </div>
 
       </div>
Line 307: Line 355:
 
   </div>
 
   </div>
  
   <div id = "block6" class = "am-panel-group">
+
    
    <div class = "project-content am-panel am-panel-default" data-am-scrollspy="{animation: 'scale-down'}">
+
      <a name = "Improvement" id = "Demo"></a>
+
      <hr class = "border-line"  />
+
      <div class = "demoContent am-panel-hd">
+
        <h4 class = "am-panel-title" data-am-collapse="{parent: '#block6', target: '#do-not-say-8'}"><b>Improvement</b></h4>
+
      </div>
+
      <div class="am-panel-collapse am-collapse demoContent" id = "do-not-say-8">
+
        <div class = "panel-hd">
+
        <p class = "improvement"><br/>The School of Life Sciences and Biotechnology of Shanghai Jiao Tong University is one of the best in China, and we have a long tradition of taking part in the iGEM competition. However, our school had only wet lab teams until 2014, when our seniors organized the first software team of SJTU. Thanks to the experience shared by last year’s software team, we have a much more mature software team this year.<br/>
+
Before deciding on this year's project, we carefully investigated the projects of former software teams. In doing so, we found a common problem: a project was maintained only during the year in which its team competed. A few years later, the people responsible for the software can no longer be contacted, even though some of the software is excellent. To prevent this from happening to our project, we consulted the members of last year’s software team after we settled on our main idea. Consequently, we absorbed the most important results of easyBBK into our software. First of all, the search function is a highlight of easyBBK, so we added a search function to our push system (which helps users to construct and design devices). Moreover, we made the scoring system of easyBBK more flexible by allowing users to choose the weight of different parts and to filter by score range.<br/>
+
Meanwhile, we use the usage count of every existing biobrick to distinguish their performance. Once the format of biological data is standardized, better and more concrete measures, such as expression level, can be used to evaluate the properties of biobricks. Finally, we hope future SJTU software teams will continue to inherit and develop the work of former teams, which would better meet the requirements of the iGEM competition.<br/></p>
+
        </div>
+
      </div>
+
    </div>
+
  </div>
+
  <div id = "block3" class = "am-panel-group">
+
    <div class = "project-content am-panel am-panel-default" data-am-scrollspy="{animation: 'scale-down'}">
+
      <a name = "Demo" id = "Demo"></a>
+
      <hr class = "border-line"  />
+
      <div class = "demoContent am-panel-hd">
+
        <h4 class = "am-panel-title" data-am-collapse="{parent: '#block3', target: '#do-not-say-4'}"><b>Demo</b></h4>
+
      </div>
+
      <div class="am-panel-collapse am-collapse demoContent" id = "do-not-say-4">
+
        <div class = "panel-hd">
+
        <p class = ""><b>Search for device</b></p>
+
        <p>Enter keyword  “protein”<br/>
+
          Click button ‘Device’<br/>
+
        </p>
+
        <img src = "https://static.igem.org/mediawiki/2015/d/d4/SJTU-SOFTWARE.demo1.png"/>
+
        <p>Use the ‘Advanced’ button to change the weights.<br/></p>
+
        <img src = "https://static.igem.org/mediawiki/2015/3/32/SJTU-SOFTWARE.demo2.png"/>
+
        <p>
+
          Set Uses to 1<br/>
+
          Set Part Results to 20<br/>
+
          Set Confirmed_times to 1<br/>
+
          Set Number_comments to 1<br/>
+
        </p>
+
        <img src = "https://static.igem.org/mediawiki/2015/c/c1/SJTU-SOFTWARE.demo3.png"/>
+
        <p>
+
          Leave the rest as the defaults and click button ‘Sure’.<br/>
+
          You will see a result like this:<br/>
+
        </p>
+
        <img src = "https://static.igem.org/mediawiki/2015/4/49/SJTU-SOFTWARE.demo4.png"/>
+
        <p><b>Search for part<br/></b></p>
+
        <p>Searching for a part is a similar process.<br/>
+
          Enter keyword “DNA”<br/>
+
          Click button ‘Part’.<br/>
+
          Change the weights in the same way as in the device search example above<br/>
+
          (set Uses to 1; set Part Results to 20; set Confirmed_times to 1; set Number_comments to 1).<br/>
+
          The result is as below:<br/>
+
        </p>
+
        <img src = "https://static.igem.org/mediawiki/2015/2/2d/SJTU-SOFTWARE.demo5.png"/>
+
        <p><b>Construct</b></p>
+
        <p>Let us see a simple example:<br/>
+
          Drag the icons to construct a simple device with a regulator, a coding area, a reporter and a terminator.<br/>
+
        </p>
+
        <img src = "https://static.igem.org/mediawiki/2015/2/2b/SJTU-SOFTWARE.demo6.png"/>
+
        <p>Click the ‘Advise’ button and select biobricks for your device. You can fill in functions if you want.<br/><br/>
+
          ‘History id’ shows the IDs of all biobricks in your device.<br/>
+
        </p>
+
        <img src = "https://static.igem.org/mediawiki/2015/1/12/SJTU-SOFTWARE.demo7.png"/>
+
        <p>When you have finished a device, you can click button </p><img src = "https://static.igem.org/mediawiki/2015/d/d8/SJTU-SOFTWARE.demo8.png"/>
+
        <p>in the bottom right corner to evaluate this device.</p>
+
        </div>
+
      </div>
+
    </div>
+
  </div>
+
 
   <div id = "block4" class = "am-panel-group">
 
   <div id = "block4" class = "am-panel-group">
 
     <div class = "project-content am-panel am-panel-default" data-am-scrollspy="{animation: 'scale-down'}">
 
     <div class = "project-content am-panel am-panel-default" data-am-scrollspy="{animation: 'scale-down'}">
Line 384: Line 365:
 
       <div class="am-panel-collapse am-collapse downloadContent" id = "do-not-say-5">
 
       <div class="am-panel-collapse am-collapse downloadContent" id = "do-not-say-5">
 
         <div class = "panel-hd">
 
         <div class = "panel-hd">
         <a href = "http://www.igembase.com" target = "_blank">Base mainpage</a>
+
         <a href = "http://www.igembase.com" target = "_blank">Base mainpage</a><br/>
 
         <a href = "https://github.com/igemsoftware/SJTU-Software2015" target = "_blank">Github mainpage</a>
 
         <a href = "https://github.com/igemsoftware/SJTU-Software2015" target = "_blank">Github mainpage</a>
 
         </div>
 
         </div>
Line 395: Line 376:
 
       <hr class = "border-line"/>
 
       <hr class = "border-line"/>
 
       <div class = "achivementContent am-panel-hd">
 
       <div class = "achivementContent am-panel-hd">
         <h4 class = "am-panel-title" data-am-collapse="{parent: '#block5', target: '#do-not-say-6'}"><b>Achivement</b></h4>
+
         <h4 class = "am-panel-title" data-am-collapse="{parent: '#block5', target: '#do-not-say-6'}"><b>Achievement</b></h4>
 
       </div>
 
       </div>
 
       <div class="am-panel-collapse am-collapse achivementContent" id = "do-not-say-6">
 
       <div class="am-panel-collapse am-collapse achivementContent" id = "do-not-say-6">
 
         <div class = "panel-hd">
 
         <div class = "panel-hd">
 
           <p class = "achivement">1.We firstly include relationships between parts to the evaluation of devices.<br/>
 
           <p class = "achivement">1.We firstly include relationships between parts to the evaluation of devices.<br/>
          2.Our software can help in the whole process that a user designs a new device: the optimization of one part and another, and the visualization of the device.<br/>
+
2.Our software can help throughout the whole process of designing a new device: the optimization of one part or the whole device, and the visualization of the device.<br/>
          3.Our software enable users to design their personalised weight for part evaluation.<br/>
+
3.Our software enables users to design their self-defined weights for part evaluation.<br/>
          4.Our software help users upload their parts more easily and can expand its own database.<br/>
+
4.The database of our software can optimize itself by enabling users to upload their parts and devices to it.<br/>
          5.Our software is web-based, which is more convenient for users to use.<br/>
+
5.Our software is web-based, which is more convenient for users to use.<br/>
 +
 
 
           </p>
 
           </p>
 
         </div>
 
         </div>

Latest revision as of 01:00, 19 September 2015

Project





Background

The official iGEM standard database holds more than 20,000 biobricks, and the number keeps increasing every year. With this explosive growth, it becomes harder for synthetic biologists to manually find good biobricks that meet their requirements when they try to create new devices from existing parts. This problem inevitably leads to software and databases built specifically for synthetic biology, and these intelligent tools will in turn promote the rapid development of the field. We therefore integrated the biobrick data entered in the official iGEM standard database before September 3, 2015, and created a visual online device-design system for synthetic biology researchers.
We also drew ideas for our system from EasyBBK, the 2014 SJTU-Software project, which can find good parts but leaves a lot of room for improvement. With our system, users can find biobricks that meet their requirements more quickly.


Design

Introduction
Our software, BASE, has four functions: search, recommendation, evaluation and upload. With the search function, users can search for parts or devices using IDs or features as keywords. In the recommendation interface, users can draw their devices. When dragging an icon onto the chain, they can also give keywords to get a list of the parts that best fit both the requirement and the other parts. For evaluation, users first enter a device they have designed; our software then gives a score and advice for each part to help improve the device. Finally, users can upload their devices to BASE’s database. For the first three functions, we developed a scoring system that evaluates the effectiveness and ease of use of parts and devices. The last three functions together provide a complete support system for device design.

Method
We get the data on bricks from the iGEM parts registry. A total of 28,637 biobricks are recorded in the database. We divide them into two groups, parts and devices, according to whether the biobrick has subparts: 14,744 are parts and 13,893 are devices. Each biobrick has four different web pages:
http://parts.igem.org/cgi/xml/part.cgi?part=BBa_B0034
http://parts.igem.org/cgi/partsdb/part_info.cgi?part_name=BBa_B0034
http://parts.igem.org/partsdb/get_part.cgi?part=BBa_B0034 
http://parts.igem.org/Part:BBa_B0034:Experience
When collecting data, we simply replace the ID “B0034” with the ID of each other brick.
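The ID substitution described above can be sketched as follows. This is only an illustration: the four URL patterns are the ones listed above, while the names `URL_TEMPLATES` and `part_urls` are hypothetical and not taken from the BASE source code. The sketch only builds the URLs and does not download anything.

```python
from urllib.parse import quote

# The four page patterns listed above, with the part ID as a placeholder.
URL_TEMPLATES = (
    "http://parts.igem.org/cgi/xml/part.cgi?part={part_id}",
    "http://parts.igem.org/cgi/partsdb/part_info.cgi?part_name={part_id}",
    "http://parts.igem.org/partsdb/get_part.cgi?part={part_id}",
    "http://parts.igem.org/Part:{part_id}:Experience",
)


def part_urls(part_id):
    """Return the four registry URLs for one biobrick ID such as 'BBa_B0034'."""
    # Percent-encode anything unusual so the ID cannot break the URL.
    safe_id = quote(part_id, safe="_")
    return [template.format(part_id=safe_id) for template in URL_TEMPLATES]


for url in part_urls("BBa_B0034"):
    print(url)
```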
We then extract information from these pages. The information includes Com_id, Author, Enter_time, Ctype, Part_status, Sample_status, Part_results, Star_rating, Uses, DNA_status, Qualitative_experience, Group_favorite, Del, Groups, Confirmed_times, Number_comments, Ave_rating, Des. We take 12 of these factors into account when scoring biobricks.
To optimize the weights of these factors, we first analyze the distribution of each factor's values and choose the factors that distinguish the parts most effectively. Then we select 40 parts and 40 devices as training sets. Finally, we obtain the weights by combining the results of several methods. The optimized weights are set as the default weights.

Results
1. Scores for different values of factors
To build a scoring system, we start by giving scores to the values of these factors. With the help of wet lab researchers, we rank the discrete values according to their effect on research, and choose a suitable method to transform continuous values into values between 0 and 1. For discrete values, the scoring table is given below.

Table 1: Scores for factor values

Factor                    Value                 Score
Part status               Released HQ 2013      1
                          other                 0
Sample status             In Stock              1
                          It's complicated      0.5
                          For Reference Only    0.25
                          other                 0
DNA Status                Available             1
                          other                 0
Part Results              Works                 1
                          Issues                0.25
                          Fails; None; Null     0
Star Rating               1                     1
                          Null                  0
Del                       No                    1
                          Yes                   0
Qualitative_experience    Works                 1
                          Issues                0.25
                          other                 0
Used Times                continuous            0-1
Average Rating            continuous            0-1
Confirmed Times           continuous            0-1
Number of comments        continuous            0-1

For continuous values such as used times, average rating and number of comments, we developed two scoring methods. The “average rating” factor has only 5 values, so we simply score it as an arithmetic progression. For the other two factors, the distribution of values is very unbalanced. We can be confident that a brick is good once it has been used several tens of times with good feedback, so there is no need to require a brick to be used a thousand times before it is recommended to other users, even though some parts have actually been used hundreds or even thousands of times. We therefore calculate the score with the expression
Score = log(n + 1) / log(n_max + 1)
where n is the value of the factor and n_max is its maximum value. This expression reduces the effect of extreme values and makes the scores more convincing.
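A minimal sketch of this logarithmic scoring, for illustration only. The function name `log_score` and the clamping of out-of-range inputs are assumptions, and the example value of 1000 for the maximum is invented rather than taken from the database.

```python
import math


def log_score(n, n_max):
    """Map a count n in [0, n_max] to a score in [0, 1] on a log scale."""
    if n_max <= 0:
        return 0.0  # assumption: with no usable maximum, score everything 0
    n = min(max(n, 0), n_max)  # assumption: clamp out-of-range counts
    return math.log(n + 1) / math.log(n_max + 1)


# With an invented maximum of 1000 uses, a brick used 30 times already
# scores about 0.50, whereas a linear scale n / n_max would give only 0.03.
print(round(log_score(30, 1000), 2))
```

The ratio of two logarithms does not depend on the base, so the natural logarithm used here gives the same score as base 10.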

The optimized weights of the factors are shown in the table below.
Table 2: Weight for each factor

Factor                    Weight
Part Status               7.5
Sample Status             6.8
DNA Status                6.7
Part Results              11.9
Star Rating               7.5
Qualitative_experience    3.5
Used Times                13.7
Average Rating            13.7
Del                       5.5
Group favorite            2.7
Confirmed Times           10
Number of Comments        10.5

The scoring system above is used to evaluate all the bricks in our database. It takes effect in every function except upload. However, we also have another scoring system for devices.
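The default weights in Table 2 add up to 100, which suggests that a brick's score is a weighted sum of its per-factor scores on a 0 to 100 scale. The sketch below works under that assumption, which the text does not state explicitly; the dictionary keys and the function name `brick_score` are hypothetical.

```python
# Default weights copied from Table 2; the key names are illustrative.
DEFAULT_WEIGHTS = {
    "part_status": 7.5,
    "sample_status": 6.8,
    "dna_status": 6.7,
    "part_results": 11.9,
    "star_rating": 7.5,
    "qualitative_experience": 3.5,
    "used_times": 13.7,
    "average_rating": 13.7,
    "del": 5.5,
    "group_favorite": 2.7,
    "confirmed_times": 10.0,
    "number_of_comments": 10.5,
}


def brick_score(factor_scores, weights=DEFAULT_WEIGHTS):
    """Weighted sum of per-factor scores, each expected to lie in [0, 1].

    Assumption: factors missing from factor_scores count as 0.
    """
    return sum(w * factor_scores.get(name, 0.0) for name, w in weights.items())


# A brick that scores 1 on every factor reaches the maximum of about 100.
perfect = {name: 1.0 for name in DEFAULT_WEIGHTS}
print(round(brick_score(perfect), 1))
```

Passing a different `weights` dictionary corresponds to the user-defined weights that the ‘Advanced’ option exposes.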

2. Device scoring method with relationships between parts
This method is mainly used in the evaluation function. For a device that a user has just designed, the score from the first method means nothing, as the registry holds no information about the device. So we need a separate evaluation system based on the device's component parts and the relationships between those parts.
When evaluating the relationships between parts, we take several factors into consideration, such as how often the parts are used together and the average score when they are used together.
First, the weights of the two aspects are fixed. The default ratio is 65% for the parts and 35% for the relationships. Within the first aspect, given that wet lab researchers care more about the outcome of a device, coding parts carry 80% of the weight and the other parts 20%. Within the second aspect, all relationships between coding parts and other parts share the same weight.
Since there are two scoring systems for devices, the weights in the second one are adjusted so that the scores produced by the two methods are close, which makes the scores of new devices comparable with those of devices already in the database.
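One way to read the default ratios above is sketched here. The 65/35 and 80/20 splits come from the text, but how scores are aggregated within each group is not specified, so the plain averages below are an assumption, as are the function names. The later adjustment that brings the two scoring systems close together is not modelled.

```python
def mean(values):
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def device_score(coding_scores, other_scores, relationship_scores,
                 part_share=0.65, coding_share=0.80):
    """Combine part scores and relationship scores into one device score.

    part_share:   weight of the parts versus the relationships (65% vs 35%).
    coding_share: weight of coding parts versus other parts (80% vs 20%).
    Assumption: scores inside each group are simply averaged, and every
    relationship carries the same weight.
    """
    parts = (coding_share * mean(coding_scores)
             + (1.0 - coding_share) * mean(other_scores))
    relationships = mean(relationship_scores)
    return part_share * parts + (1.0 - part_share) * relationships


# Invented example: one coding part scoring 100, one other part scoring 50,
# and one relationship scoring 40 give 0.65 * 90 + 0.35 * 40 = 72.5.
print(round(device_score([100], [50], [40]), 1))
```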
3. Adding parts one by one
This method is mainly used in the recommendation function. It is similar to the second method, but when making recommendations it considers only the newly added part and its relationships. The weights are also adjusted to fit the other two methods.


Validation

1. Training the built-in weights for the algorithm that uses only the information from the web pages
For the biobricks (parts and devices) in the database, we give a score based on 12 features. We transform the value of each feature into a value between 0 and 1 and give each feature a weight. To set an appropriate weight for each feature, we choose 40 parts or devices with high feature values as positive samples and 40 parts or devices with low feature values as negative samples. Then we change the weight of one feature while fixing the others, so as to widen the gap between the average score of the positive samples and that of the negative samples. In this way, we adjust the weight of each feature to improve the accuracy of distinguishing great biobricks from poor ones.
Using this method, we find several important features that should have higher weights than the others. In addition, we give each feature an appropriate weight in consideration of its effect on research.
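If the score is a weighted sum of feature values (an assumption, as above), the gap between the average positive score and the average negative score is linear in the weights: changing the weight of one feature changes the gap in proportion to the difference between that feature's mean value in the two groups. The sketch below computes that per-feature difference as a way of ranking features. It illustrates the idea rather than the team's actual procedure, and the sample data are invented.

```python
def feature_gaps(positives, negatives, features):
    """Mean value among positives minus mean value among negatives, per feature.

    Each sample is a dict mapping a feature name to a value in [0, 1].
    A larger gap means that raising this feature's weight widens the
    difference between the two groups' average scores more strongly.
    """
    def mean_of(samples, name):
        return sum(sample[name] for sample in samples) / len(samples)

    return {name: mean_of(positives, name) - mean_of(negatives, name)
            for name in features}


# Invented toy data: "used_times" separates the groups well, "del" barely does.
positives = [{"used_times": 1.0, "del": 0.5}, {"used_times": 0.8, "del": 0.5}]
negatives = [{"used_times": 0.2, "del": 0.5}, {"used_times": 0.0, "del": 0.3}]

gaps = feature_gaps(positives, negatives, ["used_times", "del"])
for name in sorted(gaps, key=gaps.get, reverse=True):
    print(name, round(gaps[name], 2))
```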
2. Adjusting the scores from the device scoring method to get close to the method based on biobrick features
For devices that are not in our database, we score them based on the scores of their parts and the scores of the connections between their parts. To test the validity of this algorithm, we choose 24 devices with high scores (score > 55) in the database as positive samples and 18 devices with low scores (score < 20) as negative samples, and score both groups with this algorithm. To improve the accuracy of the algorithm's predictions and balance the error rates of the two groups, we regard devices scoring above 55 as great devices and devices scoring below 30 as poor devices. For devices scoring between 30 and 55, our algorithm cannot tell whether they are great devices or not. By that standard, the accuracy of the algorithm's predictions is 58.3% for positive samples and 55.6% for negative samples. Over all samples, the accuracy is 57.1%.
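The thresholds and percentages above can be checked with a short calculation. The counts of correctly classified samples (14 of 24 and 10 of 18) are not stated in the text; they are inferred here because they are the whole numbers that reproduce the reported percentages. The function names are illustrative.

```python
GREAT_THRESHOLD = 55  # scores above this are regarded as great devices
POOR_THRESHOLD = 30   # scores below this are regarded as poor devices


def classify(score):
    """Label a device score using the two thresholds from the text."""
    if score > GREAT_THRESHOLD:
        return "great"
    if score < POOR_THRESHOLD:
        return "poor"
    return "undecided"


def accuracy_percent(correct, total):
    """Share of correctly classified samples, in percent with one decimal."""
    return round(100.0 * correct / total, 1)


N_POSITIVE, N_NEGATIVE = 24, 18
CORRECT_POSITIVE, CORRECT_NEGATIVE = 14, 10  # inferred, not stated in the text

print(accuracy_percent(CORRECT_POSITIVE, N_POSITIVE))
print(accuracy_percent(CORRECT_NEGATIVE, N_NEGATIVE))
print(accuracy_percent(CORRECT_POSITIVE + CORRECT_NEGATIVE,
                       N_POSITIVE + N_NEGATIVE))
```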
3. The practical application of our algorithm
We collaborated with the iGEM team SJTU-BioX-Shanghai. Our goal was to evaluate their biobricks by scoring the results of two parallel experiments. We got the new devices and their structures from SJTU-BioX-Shanghai and noticed that the team had built the parts themselves not long before. When we downloaded the database data, they had not yet uploaded their new biobricks, so we could not find the IDs of the new devices' parts in our database.
To evaluate the new parts, we used BLAST to find the parts in our database whose sequences are most similar to those of the new parts. We then used the parts found in our database to evaluate the new devices. We found that the coding parts in a device are the most important, so our algorithm gives coding parts the highest weight.
The outcome of this collaboration was very good. The two devices received quite different scores: 32.98 for one and 3.542 for the other. This difference is also reflected in their experimental results: the device with the higher score was chosen as their final biobrick to control the expression of the iron pump.
This collaboration shows that our algorithm can roughly evaluate a new device on the basis of existing parts.


Achievement

1. We are the first to include relationships between parts in the evaluation of devices.
2. Our software can help throughout the whole process of designing a new device: the optimization of one part or the whole device, and the visualization of the device.
3. Our software enables users to define their own weights for part evaluation.
4. The database of our software can optimize itself by enabling users to upload their parts and devices to it.
5. Our software is web-based, which makes it more convenient to use.