Motivation: MEDLINE®/PubMed® currently indexes over 18 million biomedical articles, providing unprecedented opportunity and challenges for text analysis. Using Medical Subject Heading Overrepresentation Profiles (MeSHOPs), an entity of interest can be robustly summarized, quantitatively identifying associated biomedical terms and suggesting indirect associations.
Results: A procedure is introduced for quantitative representation of MeSH annotations assigned to any group of articles (e.g., articles for a specific gene). Similarity scores comparing MeSHOPs of genes and diseases successfully infer association of novel disease terms to genes, validated by future publications. Results indicate that the number of papers for a gene or disease has a strong influence on predicted associations. Up to 16% improvement in predictive performance over baselines was obtained using MeSHOP comparisons.
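As a rough illustration of what an over-representation profile looks like, the sketch below scores each MeSH term attached to an entity's articles by how unlikely that count would be in a random set of articles of the same size. It assumes a one-sided hypergeometric tail probability as the score; the exact statistic and any corrections used for MeSHOPs are described in the publications listed on this page. The function names (`hypergeom_tail`, `meshop`) and the toy counts are hypothetical and are not taken from the released source code.

```python
from math import comb


def hypergeom_tail(k, K, n, N):
    """P(X >= k): chance of seeing at least k articles with a term when
    n articles are drawn from N, of which K carry the term.
    Exact integer arithmetic; fine for toy sizes, slow for all of MEDLINE."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def meshop(entity_counts, n_entity, background_counts, n_background):
    """Map each MeSH term seen for the entity to its tail probability
    (smaller value = more strongly over-represented).
    All four arguments are illustrative assumptions, not the authors' data model."""
    return {
        term: hypergeom_tail(k, background_counts[term], n_entity, n_background)
        for term, k in entity_counts.items()
    }


# Toy example: 20 background articles, 4 of them about the gene of interest.
background = {"Neoplasms": 5, "Humans": 18}
gene_terms = {"Neoplasms": 3, "Humans": 4}
profile = meshop(gene_terms, 4, background, 20)

# List terms from most to least over-represented.
for term, p in sorted(profile.items(), key=lambda item: item[1]):
    print(term, p)
```

In this toy data, a term carried by nearly every article ("Humans") scores close to 1 and so contributes little to the profile, while a term concentrated in the gene's articles ("Neoplasms") receives a small probability.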
- Warren A Cheung, BF Francis Ouellette, and Wyeth W Wasserman. Compensating for Literature Annotation Bias when Predicting Novel Drug-Disease Relationships through Medical Subject Heading Over-representation Profile (MeSHOP) Similarity. BMC Medical Genomics. Vol. 6, Suppl. 2, S3. May 2013. doi:10.1186/1755-8794-6-S2-S3 PMID 23819887
- Warren A Cheung, BF Francis Ouellette, and Wyeth W Wasserman. Inferring novel gene-disease associations using medical subject heading over-representation profiles. Genome Medicine. Vol. 4, Issue 9, pp. 75. Sept 2012. doi:10.1186/gm376 PMID 23021552
- Warren A Cheung, BF Francis Ouellette, and Wyeth W Wasserman. Quantitative biomedical annotation using medical subject heading over-representation profiles (MeSHOPs). BMC Bioinformatics. Vol. 13, pp. 249. Sept 2012. doi:10.1186/1471-2105-13-249 PMID 23017167
- W. Cheung. Inferring novel relationships through over-representation analysis of medical subjects in biomedical bibliographies. Ph.D. thesis, Bioinformatics Program, University of British Columbia. August 2012. http://hdl.handle.net/2429/43073 doi:10.14288/1.0073074
Online Resources
Please bear with some instability as we migrate our database servers and update the datasets.
Examine Gene-Disease Profile MeSH Term Overlap
Browse Predicted Indirect Gene-Disease Predictions and Validation Sets
Browse Predicted Indirect Drug-Disease Predictions
Fetch all PubMed articles on a MeSH term topic associated with a list of genes
Download Results
All Human Gene-Disease Associations predicted via literature profiles: WARNING - Very Large File (8.4G, 2010-07-22)
Gene-Disease Co-Occurrence in MEDLINE Validation Set: New relationships established between genes and diseases via gene2pubmed after 2008 (8.3M, Relationships 2008-2010)
Drug-Disease Gold Standard: Validation set taken from PREDICT (Gottlieb, Stein, Ruppin, Sharan 2011)
- DrugBank Drugs mapped to MeSH
- OMIM diseases mapped to MeSH
Source code for computing direct associations and profile-based predictions
Source code for computing validation statistics
PubMed Baseline 2013, MeSH 2013, Entrez Gene 2013-02