20 June, 2008

Ensembl 50 - technical requirements


Development of the new Ensembl 50 website is progressing well. Some of you may already have seen the test sites if you signed up to be part of our testing team.

One of the complaints about the current site (hardware failures aside) is the performance of the web pages. We are addressing this in a number of ways in the Ensembl 50 web code:

  • Tuning the Apache web server configuration:
    Compressing all HTML/JavaScript/CSS files using mod_deflate;
    Minimizing the number and size of JavaScript/CSS files by stripping unnecessary white space and comments and merging the files together;
    Setting headers to improve browser caching of content.
  • Aggressively caching content on the server side using a modified version of memcached (Linux users will need a 2.6.x kernel, as it relies on epoll).
  • Increasing the use of asynchronous HTTP requests (AJAX), so that the page can respond more immediately while other content is still being generated, and so that less content is sent up front (initially hidden content can be retrieved later).
  • Reducing page size: rather than single pages containing lots of disparate information, there will be more pages, each containing a smaller amount of information. This not only helps with page size but also makes it easier to discover content on the site that people do not currently find easily, especially comparative genomics, variational genomics and regulatory information.
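As a rough illustration of the "strip, merge, compress" idea in the first bullet above, here is a minimal Python sketch. It is not the Ensembl code: the function names and the sample stylesheets are made up for this example, the comment stripping is deliberately naive (it would mangle a `/*` inside a quoted string), and Python's zlib module stands in for the DEFLATE-based compression that mod_deflate applies on the server.

```python
import re
import zlib
from typing import List, Tuple


def strip_css(source: str) -> str:
    """Remove /* ... */ comments and collapse runs of white space (naive)."""
    no_comments = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    return re.sub(r"\s+", " ", no_comments).strip()


def merge_and_compress(sources: List[str]) -> Tuple[str, bytes]:
    """Merge several stylesheets into one and return (text, compressed bytes)."""
    merged = "\n".join(strip_css(s) for s in sources)
    return merged, zlib.compress(merged.encode("utf-8"))


# Hypothetical sample files, standing in for the site's real stylesheets.
css_a = "/* layout */\nbody {\n    margin: 0;\n}\n"
css_b = "/* colours */\nh1 {\n    color: navy;\n}\n"

merged, compressed = merge_and_compress([css_a, css_b])
print(merged)
print(len(css_a) + len(css_b), "characters before,", len(merged), "after stripping")
```

One merged file means one HTTP request instead of several, and the stripped text is smaller before compression is even applied.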
For those who will be running local copies of the Ensembl 50 code, the new code will additionally:
  • Make configuration easier - the pages will configure most of the tracks directly from the contents of the databases;
  • Make code more pluggable:
    ConfigPacker - the SpeciesDefs database parsing; and
    ImageConfig - replacement for UserConfig;
  • Make caching and AJAX implementation easier.
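For readers unfamiliar with server-side caching of the kind mentioned above, the following Python sketch shows the general "look in the cache first, render only on a miss" pattern. It is an assumption-laden illustration, not how the Ensembl web code does it: the `PageCache` class, the cache key and the `render_gene_panel` function are hypothetical, and a plain in-process dictionary stands in for a memcached server.

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """In-memory stand-in for a memcached-style key/value store (hypothetical)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        # key -> (expiry time, cached content)
        self._store: Dict[str, Tuple[float, str]] = {}

    def get_or_render(self, key: str, render: Callable[[], str]) -> str:
        """Return cached content if still fresh, otherwise render and store it."""
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        content = render()
        self._store[key] = (now + self.ttl_seconds, content)
        return content


render_calls = []


def render_gene_panel() -> str:
    """Pretend to do expensive database work to build a page fragment."""
    render_calls.append(1)
    return "<div>gene summary</div>"


cache = PageCache(ttl_seconds=60.0)
first = cache.get_or_render("panel:gene:summary", render_gene_panel)
second = cache.get_or_render("panel:gene:summary", render_gene_panel)
print("renders performed:", len(render_calls))
```

The second request is served from the cache, so the expensive rendering step runs only once until the entry expires.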
There are a number of changes to the code, so if you have written your own components or track-drawing code there will be work to do, but in most cases the modifications are easy to implement (e.g. moving code between modules).

Finally, here are some additional system recommendations:
  • Perl 5.8.8 or newer;
  • MySQL 5.0 server;
  • 64 bit architecture;
  • large memory machine;
  • optionally, our modified "memcached" code, which you can compile to get a significant speed-up (on Linux this needs a 2.6.x kernel).
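If you want a quick check of the kernel requirement above before compiling anything, a small script along these lines may help. It is only a sketch: the helper names are invented for this example, it looks only at the major and minor numbers of the kernel release string, and whether Python exposes `select.epoll` is used merely as a hint that epoll is available on the machine.

```python
import platform
import re
import select
from typing import Tuple


def kernel_version(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '2.6.18-92.el5'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_epoll_requirement(release: str) -> bool:
    """True if the kernel is 2.6 or newer, the series that introduced epoll."""
    return kernel_version(release) >= (2, 6)


if platform.system() == "Linux":
    release = platform.release()
    print("kernel:", release)
    print("2.6 or newer:", meets_epoll_requirement(release))
    print("epoll exposed by Python:", hasattr(select, "epoll"))
else:
    print("not running on Linux; the 2.6.x kernel note does not apply")
```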
