21 January, 2009

How to get all the orthologous genes between two species

Many users ask us how to download data from Ensembl. Usually, the answer is to use BioMart. Comparative genomics data are also available in the standard Mart for your favorite species. For instance, to get all the human-mouse orthologs, one can select the human dataset, filter out the genes that have no mouse ortholog, and output the mouse orthologs for the remaining genes.

Here is how to get these data in 10 simple steps:
1. Go to: http://www.ensembl.org/biomart/martview
2. Choose "Ensembl 52"
3. Choose "Homo sapiens genes (NCBI36)"
4. Click on "Filters" in the left menu
5. Unfold the "MULTI SPECIES COMPARISONS" box, tick the "Homolog filters" option and choose "Orthologous Mouse Genes" from the drop-down menu.
6. Click on "Attributes" in the left menu
7. Click on "Homologs"
8. Unfold the "MOUSE ORTHOLOGS" box and select the data you want to get (most probably the gene ID and maybe the orthology type as well).
9. Click on the "Results" button (top left)
10. Choose your favorite output format

Here is the preview of the results:



For those who prefer a programmatic route, the same data are also available through our Compara Perl API or directly from the Compara DB.
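
The ten BioMart steps above can also be scripted rather than clicked through. The Python sketch below is one way to do that, not an official Ensembl client: it builds a BioMart XML query and the matching request URL. Nothing in it comes from the post itself. The martservice endpoint and the dataset, filter and attribute names (hsapiens_gene_ensembl, with_mmusculus_homolog, mmusculus_homolog_ensembl_gene, mmusculus_homolog_orthology_type) are assumptions based on more recent Ensembl releases and may well differ in Ensembl 52. MartView can show the XML for a query built in the web interface, which is the safest way to confirm the names for a given release.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumed endpoint: the "martservice" counterpart of the martview URL in step 1.
MART_URL = "http://www.ensembl.org/biomart/martservice"


def build_ortholog_query(dataset, homolog_filter, attributes):
    """Build a BioMart XML query that keeps only genes passing homolog_filter."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",      # tab-separated output (step 10)
        "header": "1",           # include a header row
        "uniqueRows": "1",       # drop duplicate rows
        "datasetConfigVersion": "0.6",
    })
    # Step 3: choose the dataset.
    dataset_el = ET.SubElement(
        query, "Dataset", {"name": dataset, "interface": "default"}
    )
    # Steps 4-5: excluded="0" keeps the genes that DO have an ortholog.
    ET.SubElement(dataset_el, "Filter", {"name": homolog_filter, "excluded": "0"})
    # Steps 6-8: choose the columns to output.
    for name in attributes:
        ET.SubElement(dataset_el, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


def build_url(xml_query, base_url=MART_URL):
    """URL-encode the XML query into a GET request URL."""
    return base_url + "?" + urllib.parse.urlencode({"query": xml_query})


# All names below are assumptions; check them against the release you query.
query_xml = build_ortholog_query(
    dataset="hsapiens_gene_ensembl",
    homolog_filter="with_mmusculus_homolog",
    attributes=[
        "ensembl_gene_id",
        "mmusculus_homolog_ensembl_gene",
        "mmusculus_homolog_orthology_type",
    ],
)
request_url = build_url(query_xml)
print(request_url)
```

Fetching request_url (for example with urllib.request.urlopen) should return tab-separated rows, one per human gene and mouse ortholog pair, provided the names match the release being queried.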

2 comments:

Max said...

BioMart does not let you choose the version of Ensembl. So MartView works only for the most current Ensembl version, right?

Giulietta said...

In response to Max's comment... BioMart on the live site works for the current version of Ensembl. We do maintain archive Marts for a selection of archive releases:

http://www.ensembl.org/info/website/archives/index.html

Information for older versions of Ensembl that do not have a functioning BioMart can be obtained directly from our databases using MySQL or the Perl API:

http://www.ensembl.org/info/docs/api/index.html

This is the sort of question best sent to Ensembl Helpdesk:
helpdesk@ensembl.org