18 January, 2011

New Search - Lucene


Ensembl is in the process of moving its site search to the open source Apache Lucene framework. This change should bring several advantages, both to us and to all users, the main one being added flexibility. In the short term it will have little impact on web site users, except for making life easier for those maintaining local instances.

From Ensembl release 62 (due out this spring) we will incorporate more data into the search (for example help and documentation) and start to improve how we display results. For developers, note that whilst we are not releasing the webcode for Lucene immediately, we are aiming to do so for release 62.

This powerful platform allows searching of over 3 million genes and gene symbols, over 6 million oligo probes, and over 67 million variations! Our implementation uses software designed and developed by our colleagues at the European Bioinformatics Institute, where it powers the EB-eye search and has proven to be fast and flexible.

Lucene is open-source technology that has also been implemented to provide searches of our mailing lists (i.e. announce and dev), thanks to our colleagues at the Wellcome Trust Sanger Institute.

We hope these improvements will help make browsing Ensembl a more user-friendly experience. Please give your feedback at helpdesk@ensembl.org.

2 comments:

Unknown said...

SearchBlox is another free Lucene-based solution which may be suitable for your search.

Steve Trevanion said...

We are also considering Solr as a way to extend the capabilities of Ensembl search.