Difference between revisions of "Team:USTC-Software/Software"

Line 61: Line 61:
 
                                 <h1><a id="Algorithms">Algorithms</a></h1>
 
                                 <h1><a id="Algorithms">Algorithms</a></h1>
 
                                 <h2>Biocircuit (Transformation from truth table to logic circuit)</h2>
 
                                 <h2>Biocircuit (Transformation from truth table to logic circuit)</h2>
                                 <p>The Quine-McCluskey algorithm is a method used for minimization of boolean functions that was developed by W.V.Quine and extended by Edward J.McCluskey.</p>
+
                                 <p>The Quine-McCluskey algorithm is a method used for minimizing of boolean functions that was developed by W.V.Quine and extended by Edward J.McCluskey.</p>
  
 
                                 <p>Espresso heuristic logic minimizer is a computer program using heuristic and specific algorithms for efficiently reducing the complexity of digital electronic gate circuits.</p>
 
                                 <p>Espresso heuristic logic minimizer is a computer program using heuristic and specific algorithms for efficiently reducing the complexity of digital electronic gate circuits.</p>
 
                                 <p>Because Espresso is more efficient than Q-M algorithm, especially when inputs are more than 6, we choose Espresso for our software.</p>
 
                                 <p>Because Espresso is more efficient than Q-M algorithm, especially when inputs are more than 6, we choose Espresso for our software.</p>
  
                                 <p>NetworkX is a Python language software package for the creation, manipulation, and studyof the structure, dynamics, and functions of complex networks. </p>
+
                                 <p>NetworkX is a Python language software package for the creation, manipulation and study of the structure, dynamics and functions of complex networks. </p>
  
 
                                 <p>We use directed graph to characterize biocircuit, and NetworkX has make it more convenient to operate with graphs.</p>
 
                                 <p>We use directed graph to characterize biocircuit, and NetworkX has make it more convenient to operate with graphs.</p>
Line 73: Line 73:
  
 
                                 <p>
 
                                 <p>
                                     Pyeda is a Python library for electronic design automation. We once used pyeda to simplify Boolean formulas. However, we found bugs in the process, so we got contact with author Chris Drake and submitted the issue. Therefore, we didn’t use pyeda directly but make some change to his espresso model. We are very grateful for his favor.
+
                                     Pyeda is a Python library for electronic design automation. We once used pyeda to simplify Boolean formulas. However, we found bugs in the process, so we got in touch with its author Chris Drake and submitted the issue. Therefore, we didn't use pyeda directly but make some changes to his espresso model. We are very grateful to his excellent work.
 
                                 </p>
 
                                 </p>
  
 
                                 <h2>Score Calculation for Logic Circuits:</h2>
 
                                 <h2>Score Calculation for Logic Circuits:</h2>
 
                                 <p>
 
                                 <p>
                                     The complexity of electronic circuits essentially depends only on the number of gates,
+
                                     The complexity of electronic circuits essentially depends only on the number of gates, and it can be minimized by the Espresso algorithm. Synthetic biology, however, is distinct from electronics and a gene circuit is much more complex than a logic circuit. We first simplify the truth table input by users to obtain several simplified logic circuits, and then correspond them to a gene circuit. We estimate the complexity of a circuit solution as a function of the number of the regulatory factors involved in the network. Considering the complexity involved in the promoters and RBS, we give a score function.
                                    and it can be minimized by the Espresso algorithm. Synthetic biology, however, is distinct from electronics and a gene circuit is much more complex than a logic circuit. We first simplify the truth table input by users to obtain several simplified logic circuit, and then correspond them to a gene circuit. We estimate the complexity of a circuit solution as a function of the number of the regulatory factors involved in the network. Considering the complexity involved in the promoters and RBS, we give a score function
+
 
                                 </p>
 
                                 </p>
 
                                 <p>&nbsp;&nbsp;&nbsp;S = 2^(N_R - 1) + 2^(N_A - 1) + N_l + N_k</p>
 
                                 <p>&nbsp;&nbsp;&nbsp;S = 2^(N_R - 1) + 2^(N_A - 1) + N_l + N_k</p>
                                 <p>N_R, N_A, N l and N k represent the total number of repressors, activators, locks and keys present in the circuit. S does not reflect the quality of a circuit design, the score is intended to characterize its practical realizability. There might be millions of circuits generated automatically by computers, but we choose most reliable and representative results for users according to the score.</p>
+
                                 <p>N_R, N_A, N_l and N_k represent the total number of repressors, activators, locks and keys present in the circuit. S does not reflect the quality of a circuit design. The score is intended to characterize its practical realizability. There might be millions of circuits generated automatically by computers, but we only choose the most reliable and representative results for users according to the score.</p>
  
  
 
                                 <h2>Promotion of Simulation Methods</h2>
 
                                 <h2>Promotion of Simulation Methods</h2>
                                 <p>We use the Tao Algorithm based on probability .We have optimized Tao according to some specific characteristics of biology reaction, and use cache to avoid repeated calculating. </p>
+
                                 <p>We use the Tao Algorithm based on probability. We have optimized Tao according to some specific characteristics of biology reaction, and have used cache to avoid repeated calculating. </p>
  
 
                                 <h2>Cache of Simulation Results</h2>
 
                                 <h2>Cache of Simulation Results</h2>
                                 <p>Our bio-system is a kind of graph. The core idea of cache is to find out whether two graphs are same (isomorphic exactly). As we may know, to make sure that two graphs are isomorphic is NP-Complete, which means it is very hard to calculate it on any computer. But our bio-system is more than just a graph: it has a flow-like direction, on which topological sort can be applied. But such kinds of sorts are far from making out whether they are isomorphic. So I tried to resolve the graph into chains. It can be proved that these chains can fully describe the graph in our cases. For the depth of our graph is tiny, and the algorithm becomes effective. Then we hash the chains, make it easier and faster to sort. After the sort we hash then again to get a number which is big enough to avoid the collision. Then we convert a bio-system into a number. The same numbers imply the same bio-systems. So we’ll compare the numbers of bio-systems to make sure whether a cache is needed. We store the cache in a folder, and limit the total cache size for safety. Then you can safely use it.</p>
+
                                 <p>Our bio-system is a kind of graph. The core idea of cache is to find out whether two graphs are the same (isomorphic exactly). As we may know, it is NP-Complete to make sure that two graphs are isomorphic, which means it is very hard to calculate on any computer. But our bio-system is more than just a graph: it has a flow-like direction, to which topological sorts can be applied. However, such kinds of sorts are far from figuring out whether they are isomorphic. So we tried to resolve the graph into chains. It can be proved that these chains can fully describe the graph in our cases. For the depth of our graph is tiny, the algorithm becomes effective. Then we hash the chains, make it easier and faster to sort. After the sort we hash them again to get a number which is big enough to avoid the collision. Then we convert a bio-system into a number. The same numbers imply the same bio-systems. So we will compare the numbers of bio-systems to make sure whether a cache is needed. We store the cache in a folder and limit the total cache size for safety. After all the process above, you can finally use it securely.</p>
  
 
                             </div>
 
                             </div>

Revision as of 00:51, 19 September 2015




Front End

WebGL

To develop a cross-platform software, we use HTML5 and WebGL for the frontend. We use pixi.js as the framework of our software, and decorate the 2D-Canvas in the html <canvas>
for compatibility on browsers which don’t support WebGL.
The aim of this project is to provide a fast lightweight 2D library that works across all devices. The Pixi renderer allows everyone to enjoy the power of hardware acceleration without prior knowledge of WebGL. It is also very fast.

Unit Test

We use mocha, chai and phantomjs for unit test.
Mocha is a simple, flexible, fun JavaScript test framework for node.js and browsers. For more information please view the documentation.
Chai is a BDD / TDD assertion library for node and browsers that can be delightfully paired with any JavaScript testing framework.
Phantomjs is a headless WebKit scriptable with a JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas and SVG.

Automated Documentation Generation

We use jsdoc to generate the code documentation and API documentation generator for JavaScript.




Back End

Separation of Front End & Back End

Unlike most websites using MVC architect, we decide to separate front end and back end completely and use REST API as the interface between them. We have made it easier for other biologists and developers to continue work on our basis by decoupling. As a consequence, developers may design their own interface style based on our back end codes and no longer worry about any conflict with present front end codes.

Django & Django REST Framework

Django is a very popular web framework with MVC architect in python, which is efficient, safe and extendable. Based on Django, we also add Django REST Framework and the function of documents generating, thus we can obtain automatically generated API documentation on ustc.software/docs

Cython & Ctypes

In order to further shorten the time of simulating, we use C programming language to rewrite our simulator. Meanwhile we use the techniques of cython and ctypes to connect the simulator with diango framework. As a result, the speed of simulation is approximately 30 times faster now.

Automatic Unit Test & CI

Our project is deposited on github to support work division and efficient cooperation. During the development of the project, we use travis-CI for continuous integration and use coverall for coverage test. Thus our software is most likely to operate well as long as the unit test is good. With this method, we can ensure the quality of software and prevent programming bugs.




Algorithms

Biocircuit (Transformation from truth table to logic circuit)

The Quine-McCluskey algorithm is a method used for minimizing of boolean functions that was developed by W.V.Quine and extended by Edward J.McCluskey.

Espresso heuristic logic minimizer is a computer program using heuristic and specific algorithms for efficiently reducing the complexity of digital electronic gate circuits.

Because Espresso is more efficient than Q-M algorithm, especially when inputs are more than 6, we choose Espresso for our software.

NetworkX is a Python language software package for the creation, manipulation and study of the structure, dynamics and functions of complex networks.

We use directed graph to characterize biocircuit, and NetworkX has make it more convenient to operate with graphs.

NumPy is the fundamental package for scientific computing with Python.

Pyeda is a Python library for electronic design automation. We once used pyeda to simplify Boolean formulas. However, we found bugs in the process, so we got in touch with its author Chris Drake and submitted the issue. Therefore, we didn't use pyeda directly but make some changes to his espresso model. We are very grateful to his excellent work.

Score Calculation for Logic Circuits:

The complexity of electronic circuits essentially depends only on the number of gates, and it can be minimized by the Espresso algorithm. Synthetic biology, however, is distinct from electronics and a gene circuit is much more complex than a logic circuit. We first simplify the truth table input by users to obtain several simplified logic circuits, and then correspond them to a gene circuit. We estimate the complexity of a circuit solution as a function of the number of the regulatory factors involved in the network. Considering the complexity involved in the promoters and RBS, we give a score function.

   S = 2^(N_R - 1) + 2^(N_A - 1) + N_l + N_k

N_R, N_A, N_l and N_k represent the total number of repressors, activators, locks and keys present in the circuit. S does not reflect the quality of a circuit design. The score is intended to characterize its practical realizability. There might be millions of circuits generated automatically by computers, but we only choose the most reliable and representative results for users according to the score.

Promotion of Simulation Methods

We use the Tao Algorithm based on probability. We have optimized Tao according to some specific characteristics of biology reaction, and have used cache to avoid repeated calculating.

Cache of Simulation Results

Our bio-system is a kind of graph. The core idea of cache is to find out whether two graphs are the same (isomorphic exactly). As we may know, it is NP-Complete to make sure that two graphs are isomorphic, which means it is very hard to calculate on any computer. But our bio-system is more than just a graph: it has a flow-like direction, to which topological sorts can be applied. However, such kinds of sorts are far from figuring out whether they are isomorphic. So we tried to resolve the graph into chains. It can be proved that these chains can fully describe the graph in our cases. For the depth of our graph is tiny, the algorithm becomes effective. Then we hash the chains, make it easier and faster to sort. After the sort we hash them again to get a number which is big enough to avoid the collision. Then we convert a bio-system into a number. The same numbers imply the same bio-systems. So we will compare the numbers of bio-systems to make sure whether a cache is needed. We store the cache in a folder and limit the total cache size for safety. After all the process above, you can finally use it securely.

University of Science and Technology of China
No.96, Jinzhai Road, Hefei, Anhui, P.R.China
USTC-Software 2015

University of Science and Technology of China

We are here for you

Sponsered by