Search by gene primary symbol (which must be listed in Ensembl)
or click here to search by region, disease or database identifier

e.g. HD or AUTS2

News

December / January 2005

FAQ

What is Prospectr?

It can be shown that genes implicated in disease share certain patterns of sequence-based features, such as greater gene length and broader conservation through evolution.

Prospectr (PRiOrization by Sequence & Phylogenetic Extent of CandidaTe Regions) is an alternating decision tree which has been trained to differentiate between genes likely to be involved in disease and genes unlikely to be involved in disease. Given sequence-based features such as gene length, protein length and the percent identity of homologs in other species as input, it returns a classification for a gene of interest.

The alternating decision tree outputs a classification ("likely to be involved in disease" or "unlikely to be involved in disease"), a score (which is a measure of confidence in the classification) and a breakdown of which factors contributed most to that score.
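As a rough illustration of how an alternating decision tree produces a score, a classification and a per-factor breakdown, here is a minimal Python sketch. It is not Prospectr's trained model: the feature names (gene_length, protein_length, mouse_identity), the thresholds and the prediction values are all invented for the example, and the sign convention (a positive score means "likely") is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Splitter:
    """A decision node: tests one feature against a threshold."""
    feature: str
    threshold: float
    below: "Prediction"  # followed when the feature value is < threshold
    above: "Prediction"  # followed when the feature value is >= threshold


@dataclass
class Prediction:
    """A prediction node: contributes `value` to the score when reached."""
    value: float
    splitters: list = field(default_factory=list)


def score(node, gene, contributions=None, label="root"):
    """Sum the prediction values on every path the gene satisfies.

    Returns (total score, {condition: contribution}).
    """
    if contributions is None:
        contributions = {}
    contributions[label] = contributions.get(label, 0.0) + node.value
    total = node.value
    # All splitters under a reached prediction node are evaluated,
    # which is what distinguishes this from an ordinary decision tree.
    for s in node.splitters:
        if gene[s.feature] < s.threshold:
            child, child_label = s.below, f"{s.feature} < {s.threshold}"
        else:
            child, child_label = s.above, f"{s.feature} >= {s.threshold}"
        sub_total, _ = score(child, gene, contributions, child_label)
        total += sub_total
    return total, contributions


def classify(total):
    """Assumed convention: the sign of the score gives the class."""
    if total > 0:
        return "likely to be involved in disease"
    return "unlikely to be involved in disease"


# Hypothetical toy tree; every number below is made up.
toy_tree = Prediction(-0.1, [
    Splitter("gene_length", 50000,
             Prediction(-0.3),
             Prediction(0.4, [
                 Splitter("mouse_identity", 80.0,
                          Prediction(-0.2), Prediction(0.3)),
             ])),
    Splitter("protein_length", 500, Prediction(-0.1), Prediction(0.2)),
])

# Hypothetical feature values for one gene.
example_gene = {"gene_length": 120000, "protein_length": 300,
                "mouse_identity": 90.0}

total, breakdown = score(toy_tree, example_gene)
label = classify(total)
```

The magnitude of the total plays the role of the confidence measure, and the breakdown dictionary records how much each satisfied condition added to or subtracted from it.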

Given this score we can also roughly estimate how much more or less likely it is that a particular gene is involved in human hereditary disease.

What can it be used for?

Prospectr can be used to enrich lists of genes found at a suspected disease locus. Given a list of genes, Prospectr will return a ranked list ordered by the likelihood of involvement in disease.
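The ranking step can be sketched in a few lines of Python. The gene names and scores below are placeholders rather than real Prospectr output, and the sketch assumes that a higher score means a gene is more likely to be involved in disease.

```python
def rank_genes(scores):
    """Return (gene, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder genes at a hypothetical locus, with made-up scores.
locus_scores = {"GENE_A": 0.12, "GENE_B": -0.40, "GENE_C": 0.65, "GENE_D": 0.03}

ranked = rank_genes(locus_scores)
top_candidate = ranked[0][0]
```

Working down the ranked list from the top is then a way of deciding which candidates to examine first.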

Tests on an independent data set of genes taken from the Human Gene Mutation Database suggest that Prospectr will, on average, enrich a list of ~200 genes two-fold 74% of the time, five-fold 33% of the time and twenty-fold 8% of the time. 95% of the time the list was enriched one-and-a-half-fold; that is to say, the target gene was in the top three-quarters of the ranked list.
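To make the idea of fold enrichment concrete, the sketch below assumes it is defined as the size of the list divided by the rank of the target gene, so that a target in the top half of the list counts as two-fold enrichment. That definition is an assumption made for illustration, and the gene names are placeholders.

```python
def fold_enrichment(ranked_genes, target):
    """List size divided by the 1-based rank of the target gene.

    Assumed definition, for illustration only.
    """
    rank = ranked_genes.index(target) + 1
    return len(ranked_genes) / rank


# A hypothetical ranked list of 200 placeholder genes, best candidate first.
ranked_genes = [f"GENE_{i}" for i in range(1, 201)]

# If the true disease gene is ranked 40th of 200, only the top fifth of the
# list has to be examined to find it.
enrichment = fold_enrichment(ranked_genes, "GENE_40")
```

Under this definition, a target gene ranked 100th of 200 gives two-fold enrichment and one ranked 10th gives twenty-fold.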

How can I use it?

To search a particular locus, use the search page. To download a flat file containing gene IDs and scores, go to the download section, where you can also download standalone binaries and Perl scripts.

More information & citing Prospectr

For more information, refer to

Speeding Disease Gene Discovery by Sequence Based Candidate Prioritization
Euan J Adie, Richard R Adams, Kathryn L Evans, David J Porteous and Ben S Pickard
BMC Bioinformatics 2005, 6:55 doi:10.1186/1471-2105-6-55
(Free full-text available)

© 2004 University of Edinburgh