Team:Technion HS Israel/Practices/SearchEngine

Technion 2015 HS Team's Wiki

Search Engine – Previous iGEM Project

Abstract

Every iGEM team struggles to come up with a project idea. A common step towards the decision is searching through previous iGEM projects. Our search engine is designed especially for this purpose: helping iGEM novices through this process and making it easier. In addition, it can serve any iGEMer who wants to find out about previous work, get inspiration, or just learn new and exciting things.

Introduction and Overview

As part of our human practices and software work, we designed a search engine for previous iGEM projects. It consists of two main parts: a crawler and a searcher. The crawler is responsible for obtaining the data from all the iGEM wikis, abstracts, etc. The searcher is the user-friendly search platform, in which the user can type queries, sort results, and get direct and convenient access to all of the listed websites.

Methods

Crawler

We wrote the crawler using the Java library crawler4j.
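The core step of any such crawler is to extract the links from a fetched page and decide which ones to follow. The sketch below illustrates that idea in Python using only the standard library; it is not the team's Java code. The `igem.org` domain filter and the names `LinkExtractor`, `should_visit` and `extract_wiki_links` are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from the <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def should_visit(url, allowed_suffix="igem.org"):
    """Follow a link only if it stays inside the assumed iGEM domain."""
    host = urlparse(url).hostname or ""
    return host == allowed_suffix or host.endswith("." + allowed_suffix)


def extract_wiki_links(base_url, html):
    """Return the links on a page that the crawler would follow next."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return [link for link in parser.links if should_visit(link)]
```

A real crawler would also keep a queue of pages still to fetch and a set of pages already visited, which a library such as crawler4j manages for you.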

Searcher

We used an open-source search platform called Solr. By building on this platform, we avoided technical problems that others have already solved. We were therefore able to focus on improving the functionality of the engine and making it more suitable for searching iGEM projects.

Our modifications to Solr were made mainly through its configuration file, solrconfig.xml, which determines the configuration and functionality of the engine (e.g. searching algorithm, sorting order, features).
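Most of the behaviour set in the Solr configuration file can also be expressed as request parameters, which makes it easier to see what each feature does. The sketch below builds such a query string. The parameter names (`q`, `sort`, `group`, `group.field`, `hl`, `hl.fl`, `spellcheck`, `wt`) are standard Solr parameters, but the field names `team`, `year` and `content` are assumptions, not the team's actual schema.

```python
from urllib.parse import urlencode


def build_solr_query(text, sort_field="year", descending=True):
    """Build the query string for a Solr select request.

    The field names used here (team, year, content) are hypothetical.
    """
    direction = "desc" if descending else "asc"
    params = {
        "q": text,                         # the user's search terms
        "sort": f"{sort_field} {direction}",
        "group": "true",                   # collapse hits into groups
        "group.field": "team",             # one group per team (assumed field)
        "hl": "true",                      # return highlighted snippets
        "hl.fl": "content",                # field to highlight (assumed field)
        "spellcheck": "true",              # ask for spelling suggestions
        "wt": "json",                      # response format
    }
    return urlencode(params)
```

Spell checking only works if a spellcheck component is configured on the request handler, so the parameter alone is not enough on a default installation.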

Features

Besides the database we built, our search engine has several features that distinguish it from other engines and help the user by organizing the results in convenient ways: searching inside wikis, sorting, results grouping, a spell checker, and highlighting.

Searching inside Wikis

Unlike similar existing search engines for iGEM projects, our database also includes the content of all the wiki pages. Including all the wikis in the database was not an easy task, but it ensures that all the information each team posted online is searchable. Thus, even details and terms that were not included in the abstracts, but are still important enough to appear in the wikis, are not omitted from the search.

Sorting and Ordering

After typing the query, the user can sort and order the results by name, year, region, prizes and awards, number of instances, etc. This feature makes it easy to find award-winning teams or teams from a specific year or region.
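Sorting by a user-chosen field can be sketched in a few lines. In the example below each result is a plain dictionary; the keys `team` and `year` are assumed for illustration and do not come from the team's schema.

```python
from operator import itemgetter


def sort_results(results, field, descending=False):
    """Return a new list of result dicts ordered by the given field."""
    return sorted(results, key=itemgetter(field), reverse=descending)
```

For example, `sort_results(results, "year", descending=True)` lists the most recent teams first without modifying the original list.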

Results Grouping

Since our search engine searches all the wikis as well as the abstracts, a results grouping mechanism is needed to keep the results organized. This is an advantage over the Google search engine: searching for an iGEM team or a term and limiting the search to the iGEM site returns many duplicate hits from the same team's wiki (e.g. 3 results from different pages of the same wiki that all contain the searched term). In our search engine, the results grouping mechanism ensures that no duplicates are shown, presenting the results in a concise, convenient form.
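The grouping idea can be sketched as follows: collect all hits that belong to the same team, then show only the best page from each group. This is an illustration of the concept, not Solr's implementation, and the keys `team`, `year` and `score` are assumptions.

```python
def group_by_team(hits):
    """Collapse page-level hits so that each team appears only once.

    Each hit is a dict with the assumed keys "team", "year" and "score".
    """
    groups = {}
    for hit in hits:
        # A team name can recur across years, so the key includes the year.
        groups.setdefault((hit["team"], hit["year"]), []).append(hit)
    # Keep the highest-scoring page as the representative of each group.
    return [max(pages, key=lambda h: h["score"]) for pages in groups.values()]
```

Groups come back in the order in which each team first appeared in the input, so a list that was already ranked keeps its overall ranking.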

Spell Checker

We all make mistakes once in a while… The spell checker makes sure that even if the user does not know the exact spelling of a team name or a professional term, the search engine is fault-tolerant enough to produce appropriate results. It also suggests a few possible corrections, based on the terms in the database.
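Suggesting corrections from the database's own vocabulary can be sketched with Python's standard `difflib` module. This uses a generic string-similarity measure as a stand-in and is not how Solr's spell checker works internally; the function name `suggest` and the sample vocabulary are hypothetical.

```python
import difflib


def suggest(word, vocabulary, max_suggestions=3):
    """Return the indexed terms that most closely resemble a misspelled word."""
    return difflib.get_close_matches(
        word.lower(), vocabulary, n=max_suggestions, cutoff=0.6
    )
```

With a vocabulary of `["biosensor", "biofilm", "quorum"]`, the query `biosensr` yields `["biosensor"]`, because only that term is similar enough to pass the cutoff.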

Highlighting

This feature highlights the query in each result in order to help the user find the searched term in the abstract and the wiki pages.
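A minimal sketch of highlighting wraps every occurrence of a query term in a pair of marker tags, ignoring case. The `<em>` tags and the function name `highlight` are assumptions for illustration.

```python
import re


def highlight(text, query, open_tag="<em>", close_tag="</em>"):
    """Wrap each occurrence of any query term in the given marker tags."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return text
    pattern = re.compile("|".join(terms), re.IGNORECASE)
    # Reuse the matched text so the original capitalization is preserved.
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", text)
```

Escaping each term keeps characters such as `+` or `.` in a query from being interpreted as regular-expression syntax.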

Conclusions

This search engine is meant to help iGEM newcomers search through previous projects. The features we added are drawn from our own experience and aim to make searching easier and more user-friendly. We included wiki pages in our database to allow a more detailed search, unlike some search engines previously presented in iGEM (e.g. iGEM42 - Heidelberg 2013, Team Seeker - Aalto-Helsinki 2014).

Future Work

We want this search engine to serve the iGEM community for years, and even generations, to come. We hope it will become part of every iGEM team's journey through the competition, answering the team's need for knowledge about the competition's history and previous projects.

In the future, we will extend this search engine by improving the user interface, making it Android/iOS compatible, and adding more features to the searcher, for the convenience of iGEM users.

Source code

Our source code is released under the MIT license. You can find it in our GitHub repository or here:

The iGEM Explorer's crawler source code

The iGEM Explorer's Solr and Velocity config files

Here is a UML diagram of the programme

You can find usage instructions in the GitHub repository.