Team:Technion HS Israel/Practices/SearchEngine
Search Engine – Previous iGEM Projects
Abstract
Every iGEM team finds it difficult to come up with a project. A common step towards the decision is searching previous iGEM projects. Our search engine is designed especially for this purpose: helping iGEM novices through this process and making it easier. In addition, it can serve any iGEMer who wants to find out about previous work, get inspiration or just learn new and exciting things.
Introduction and Overview
As part of our human practices and software work, we designed a search engine for previous iGEM projects. It consists mainly of two parts: a crawler and a searcher. The crawler is responsible for obtaining the data from all the iGEM wikis, abstracts, etc. The searcher is the user-friendly search platform, in which the user can type queries, sort results and get direct and convenient access to all of the listed websites.
Methods
Crawler
We wrote the crawler using the Java library crawler4j.
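To illustrate the kind of decision a crawler makes on every page, here is a minimal Python sketch of link extraction and filtering. It is not our crawler4j code: the host suffix `igem.org`, the skipped file extensions, and the names `LinkExtractor`, `should_visit` and `extract_links` are assumptions made for the example. (crawler4j exposes comparable hooks, `shouldVisit` and `visit`, on its `WebCrawler` class.)

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: team wikis are served from igem.org hosts (e.g. a year subdomain).
ALLOWED_SUFFIX = "igem.org"
# Assumption: these extensions carry no searchable text and can be skipped.
SKIPPED_EXTENSIONS = (".png", ".jpg", ".gif", ".pdf", ".css", ".js")


class LinkExtractor(HTMLParser):
    """Collects the absolute target of every <a href=...> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the current page.
                self.links.append(urljoin(self.base_url, value))


def should_visit(url):
    """Keep http(s) links on igem.org hosts that do not point at binary/static files."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = parsed.hostname or ""
    if host != ALLOWED_SUFFIX and not host.endswith("." + ALLOWED_SUFFIX):
        return False
    return not parsed.path.lower().endswith(SKIPPED_EXTENSIONS)


def extract_links(base_url, html):
    """Return the links found in `html` that the crawler should follow."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return [link for link in parser.links if should_visit(link)]
```

A real crawl would also need a queue of pending URLs, a set of already visited pages and polite rate limiting, all of which crawler4j handles for us.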
Searcher
We used an open-source searching platform called Solr. By using this platform, we avoided some technical problems that others have already solved. Thus, we were able to focus on improving the functionality of the engine, and making it more suitable for searching iGEM projects.
Our modifications to Solr were made mainly through the solrconfig.xml file, which determines the configuration and functionality of the engine (e.g. searching algorithm, sorting order, features).
Features
Beyond the database we built, our search engine has several features that distinguish it from other engines and help the user by organizing the results in convenient ways: searching inside wikis, sorting, results grouping, a spell checker and highlighting.
Searching inside Wikis
Unlike similar existing search engines for iGEM projects, our database includes information from all the wiki pages. Including all the wikis in the database wasn't an easy task at all, but it ensures that all the information each team posted online is searchable. Thus, even details and terms that were not included in the abstracts, but are still important enough to appear in the wikis, are not omitted from the search.
Sorting and Ordering
After typing the query, the user can sort and order the results by name, year, region, prizes and awards, number of instances, etc. This feature makes it easy to find award-winning teams or teams from a specific year or region.
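As a rough illustration of how a sort order can be requested from Solr, the sketch below builds a query URL with Solr's standard `q`, `sort` and `wt` parameters. The core name `igem`, the field name `year` and the helper `build_select_url` are hypothetical; the actual field names in our schema may differ.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, sort_field, descending=True):
    """Build a Solr /select URL that sorts the matches by one field."""
    direction = "desc" if descending else "asc"
    params = {
        "q": query,                          # the user's search terms
        "sort": f"{sort_field} {direction}",  # e.g. "year desc"
        "wt": "json",                        # ask Solr for a JSON response
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical local core named "igem", newest projects first.
url = build_select_url("http://localhost:8983/solr/igem", "biosensor", "year")
```

Several sort criteria can be combined by separating them with commas in the `sort` value, for example `year desc, name asc`.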
Results Grouping
Our search engine searches all the wikis as well as the abstracts. Therefore, a results grouping mechanism is needed to organize the results conveniently. This is an advantage of ours over the Google search engine. In Google, searching for an iGEM team or a term while limiting the search to the iGEM site returns many duplicate results from the same team's wiki (e.g. 3 results from the same wiki but from different pages that contain the searched term). In our search engine, the results grouping mechanism ensures that no duplicates are shown, thus presenting the results in a concise, convenient form.
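The idea behind grouping can be sketched in a few lines of Python. This is only an illustration under assumptions: the hit fields `team` and `url` and the function `group_by_team` are hypothetical, and the hits are assumed to arrive already ordered by relevance. In Solr itself, grouping is switched on with the `group=true` and `group.field` request parameters rather than written by hand.

```python
def group_by_team(hits):
    """Collapse page-level hits into one entry per team.

    `hits` is a list of dicts with the keys 'team' and 'url',
    assumed to be ordered from most to least relevant.
    """
    groups = {}
    for hit in hits:
        # dicts keep insertion order, so teams stay ranked by their best hit.
        groups.setdefault(hit["team"], []).append(hit)
    return [
        {
            "team": team,
            "top_url": pages[0]["url"],    # most relevant page of this team
            "matching_pages": len(pages),  # how many pages matched in total
        }
        for team, pages in groups.items()
    ]


hits = [
    {"team": "Team A", "url": "/Team:A/Project"},
    {"team": "Team B", "url": "/Team:B"},
    {"team": "Team A", "url": "/Team:A/Notebook"},
    {"team": "Team A", "url": "/Team:A/Parts"},
]
grouped = group_by_team(hits)
```

Here the three pages of the hypothetical "Team A" collapse into a single result that links to its best-matching page.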
Spell Checker
We all make mistakes once in a while… The spell checker makes sure that even if the user doesn't know the spelling of a team name or a professional term, the search engine is fault-tolerant enough to produce appropriate results. It also suggests a few possible corrections, based on the database.
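A minimal sketch of suggesting corrections from the indexed vocabulary is shown below. It uses Python's standard `difflib` module as a stand-in and is not how Solr's spell checker works internally; the vocabulary, the similarity cutoff of 0.75 and the function `suggest` are assumptions chosen for the example.

```python
import difflib


def suggest(term, vocabulary, max_suggestions=3):
    """Suggest known words that are close to a possibly misspelled term."""
    # Compare case-insensitively, but return words in their original casing.
    by_lowercase = {word.lower(): word for word in vocabulary}
    if term.lower() in by_lowercase:
        return []  # the term is already known, nothing to correct
    matches = difflib.get_close_matches(
        term.lower(), list(by_lowercase), n=max_suggestions, cutoff=0.75
    )
    return [by_lowercase[match] for match in matches]


# Hypothetical vocabulary, standing in for terms taken from the database.
vocabulary = ["plasmid", "promoter", "biosensor", "Technion"]
suggestions = suggest("plasmd", vocabulary)
```

Drawing the vocabulary from the indexed wikis, rather than from a general dictionary, is what lets team names and technical terms be suggested.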
Highlighting
This feature highlights the query in each result in order to help the user find the searched term in the abstract and the wiki pages.
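As an illustration of the principle only, highlighting can be sketched as wrapping every occurrence of a query term in an HTML tag. The function `highlight` and the choice of the `em` tag are assumptions for this example; in Solr, highlighting is requested with the `hl=true` parameter and performed by the engine.

```python
import html
import re


def highlight(text, query, tag="em"):
    """Wrap each case-insensitive occurrence of a query term in <tag>...</tag>."""
    # Longer terms first, so that they win over shorter terms they contain.
    terms = sorted({t for t in query.split() if t}, key=len, reverse=True)
    if not terms:
        return html.escape(text)
    pattern = re.compile("(" + "|".join(re.escape(t) for t in terms) + ")", re.IGNORECASE)
    # With one capturing group, split() alternates: plain text, match, plain text, ...
    pieces = pattern.split(text)
    result = []
    for index, piece in enumerate(pieces):
        escaped = html.escape(piece)  # keep the snippet safe to embed in a page
        result.append(f"<{tag}>{escaped}</{tag}>" if index % 2 == 1 else escaped)
    return "".join(result)


snippet = highlight("A biosensor for lead & Biosensor design", "biosensor")
```

Escaping the text before adding the tags keeps characters such as `&` or `<` in a wiki snippet from being interpreted as markup.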
Conclusions
This search engine is meant to help iGEM newcomers in the process of searching previous projects. The features that we added to the engine are drawn from our own experience, in order to make searching easier and more user-friendly. We included wiki pages in our database to allow a more detailed search, unlike a few search engines that were previously presented in iGEM (e.g. iGEM42 - Heidelberg 2013, Team Seeker - Aalto-Helsinki 2014).
Future Work
We want this search engine to serve the iGEM community for years, and even generations, to come. We hope it will become a part of every iGEM team's journey throughout the competition, answering the team's needs for knowledge about the competition's history and previous projects.
In the future, we will extend this search engine by improving the user interface, making it Android/iOS compatible, and adding more features to the searcher, for the convenience of iGEM users.
Take me to the iGEM Explorer!
Source code
Our source code is released under the MIT license. You can find it in our GitHub repository or here:
The iGEM Explorer's crawler source code
The iGEM Explorer's Solr and Velocity config files
Here is a UML diagram of the programme.
You can find usage instructions in the GitHub repository.