controlled vocabulary


Summary: A specified list of terms with a fixed and unalterable meaning, and from which a selection is made when CATALOGING; ABSTRACTING AND INDEXING; or searching BOOKS; JOURNALS AS TOPIC; and other documents. The control is intended to avoid the scattering of related subjects under different headings (SUBJECT HEADINGS). The list may be altered or extended only by the publisher or issuing agency. (From Harrod's Librarians' Glossary, 7th ed, p163)
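
As a minimal sketch of how this control works in practice, the Python example below maps free-text variants to a single preferred heading so that related subjects are not scattered under different terms. The vocabulary contents, the names CONTROLLED_VOCABULARY, build_lookup, and index_term, and the case-insensitive matching rule are all illustrative assumptions, not taken from any published vocabulary or tool.

```python
# Hypothetical mini-vocabulary for illustration only: each preferred heading
# maps to the entry terms (synonyms) that should be indexed under it.
CONTROLLED_VOCABULARY = {
    "Neoplasms": {"cancer", "tumor", "tumour", "neoplasm"},
    "Myocardial Infarction": {"heart attack", "cardiac infarction"},
}


def build_lookup(vocabulary):
    """Build a case-insensitive map from every accepted term to its heading."""
    lookup = {}
    for heading, entry_terms in vocabulary.items():
        lookup[heading.lower()] = heading
        for term in entry_terms:
            lookup[term.lower()] = heading
    return lookup


def index_term(free_text, lookup):
    """Return the preferred heading for a term, or None if it is not in the list."""
    return lookup.get(free_text.strip().lower())


LOOKUP = build_lookup(CONTROLLED_VOCABULARY)
```

Calling index_term("Tumour", LOOKUP) and index_term("cancer", LOOKUP) both yield the same heading, while a term outside the list yields None, reflecting that only the issuing agency can extend the list.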

Top Publications

  1. Jimeno A, Jimenez Ruiz E, Lee V, Gaudan S, Berlanga R, Rebholz Schuhmann D. Assessment of disease named entity recognition on a corpus of annotated sentences. BMC Bioinformatics. 2008;9 Suppl 3:S3 pubmed publisher
  2. Bundschus M, Dejori M, Stetter M, Tresp V, Kriegel H. Extraction of semantic biomedical relations from text using conditional random fields. BMC Bioinformatics. 2008;9:207 pubmed publisher
    ..Current work is focused on improving the accuracy of detection of entities as well as entity boundaries, which will also greatly improve the relation extraction performance. ..
  3. Zweigenbaum P, Demner Fushman D, Yu H, Cohen K. Frontiers of biomedical text mining: current progress. Brief Bioinform. 2007;8:358-75 pubmed
    ..In this article we review the current state of the art in biomedical text mining or 'BioNLP' in general, focusing primarily on papers published within the past year. ..
  4. Degtyarenko K, de Matos P, Ennis M, Hastings J, Zbinden M, McNaught A, et al. ChEBI: a database and ontology for chemical entities of biological interest. Nucleic Acids Res. 2008;36:D344-50 pubmed
    ..ChEBI includes an ontological classification, whereby the relationships between molecular entities or classes of entities and their parents and/or children are specified. ChEBI is available online at ..
  5. Côté R, Jones P, Martens L, Apweiler R, Hermjakob H. The Ontology Lookup Service: more data and better tools for controlled vocabulary queries. Nucleic Acids Res. 2008;36:W372-6 pubmed publisher
    ..Improvements have been made to both OLS query interfaces, based on user feedback and requirements, to improve usability and service interoperability and provide novel ways to perform queries. ..
  6. Mistry M, Pavlidis P. Gene Ontology term overlap as a measure of gene functional similarity. BMC Bioinformatics. 2008;9:327 pubmed publisher
  7. Hsu C, Chang Y, Kuo C, Lin Y, Huang H, Chung I. Integrating high dimensional bi-directional parsing models for gene mention tagging. Bioinformatics. 2008;24:i286-94 pubmed publisher
    ..Data sets, programs and an on-line service of our gene mention tagger can be accessed at ..
  8. Tsuruoka Y, McNaught J, Ananiadou S. Normalizing biomedical terms by minimizing ambiguity and variability. BMC Bioinformatics. 2008;9 Suppl 3:S2 pubmed publisher
    ..This work will help improve the performance of term-concept mapping tasks in biomedical information extraction especially when good normalization heuristics for the target terminology are not fully known. ..
  9. Rubin D, Shah N, Noy N. Biomedical ontologies: a functional perspective. Brief Bioinform. 2008;9:75-90 pubmed

More Information


  1. Lin J, Wilbur W. PubMed related articles: a probabilistic topic-based model for content similarity. BMC Bioinformatics. 2007;8:423 pubmed
    ..Our experiments suggest that the pmra model provides an effective ranking algorithm for related article search. ..
  2. Belleau F, Nolin M, Tourigny N, Rigault P, Morissette J. Bio2RDF: towards a mashup to build bioinformatics knowledge systems. J Biomed Inform. 2008;41:706-16 pubmed publisher
    ..The Bio2RDF repository can be queried at ..
  3. Zhang S, Bodenreider O. Lessons learned from cross-validating alignments between large anatomical ontologies. Stud Health Technol Inform. 2007;129:822-6 pubmed
    ..With a stricter and domain-specific lexical similarity model, AOAS has a better precision, but is more sensitive to missing synonyms and misspellings. ..
  4. Supekar K, Rubin D, Noy N, Musen M. Knowledge Zone: a public repository of peer-reviewed biomedical ontologies. Stud Health Technol Inform. 2007;129:812-6 pubmed
  5. Vincze V, Szarvas G, Farkas R, Móra G, Csirik J. The BioScope corpus: biomedical texts annotated for uncertainty, negation and their scopes. BMC Bioinformatics. 2008;9 Suppl 11:S9 pubmed publisher
  6. de Matos P, Alcántara R, Dekker A, Ennis M, Hastings J, Haug K, et al. Chemical Entities of Biological Interest: an update. Nucleic Acids Res. 2010;38:D249-54 pubmed publisher
  7. Sheehan B, Quigley A, Gaudin B, Dobson S. A relation based measure of semantic similarity for Gene Ontology annotations. BMC Bioinformatics. 2008;9:468 pubmed publisher
    ..As a result our measure better describes the information contained in annotations associated with gene products and as a result is better suited to characterizing and classifying gene products through their annotations. ..
  8. Taylor C, Field D, Sansone S, Aerts J, Apweiler R, Ashburner M, et al. Promoting coherent minimum reporting guidelines for biological and biomedical investigations: the MIBBI project. Nat Biotechnol. 2008;26:889-96 pubmed publisher
  9. Shatkay H, Pan F, Rzhetsky A, Wilbur W. Multi-dimensional classification of biomedical text: toward automated, practical provision of high-utility text to diverse users. Bioinformatics. 2008;24:2086-93 pubmed publisher
    ..The latter strongly suggest that automatic annotation along most of the dimensions is highly feasible, and that this new framework for scientific sentence categorization is applicable in practice. ..
  10. Barrell D, Dimmer E, Huntley R, Binns D, O Donovan C, Apweiler R. The GOA database in 2009--an integrated Gene Ontology Annotation resource. Nucleic Acids Res. 2009;37:D396-403 pubmed publisher
    ..which allows users to precisely tailor their annotation set. ..
  11. Lu Y, Rosenfeld R, Simon I, Nau G, Bar Joseph Z. A probabilistic generative model for GO enrichment analysis. Nucleic Acids Res. 2008;36:e109 pubmed publisher
    ..When used with microarray expression data and ChIP-chip data from yeast and human our method was able to correctly identify both general and specific enriched categories which were overlooked by other methods. ..
  12. Bada M, Hunter L. Identification of OBO nonalignments and its implications for OBO enrichment. Bioinformatics. 2008;24:1448-55 pubmed publisher
    ..The nonalignments discussed in this article may be viewed at .. Code for the generation of these nonalignments is available upon request. ..
  13. Shah N, Jonquet C, Chiang A, Butte A, Chen R, Musen M. Ontology-driven indexing of public datasets for translational bioinformatics. BMC Bioinformatics. 2009;10 Suppl 2:S1 pubmed publisher
    ..The key functionality of this system is to enable users to locate biomedical data resources related to particular ontology concepts. ..
  14. Schober D, Smith B, Lewis S, Kusnierczyk W, Lomax J, Mungall C, et al. Survey-based naming conventions for use in OBO Foundry ontology development. BMC Bioinformatics. 2009;10:125 pubmed publisher
    ..Common naming conventions will also assist consumers of ontologies to more readily understand what meanings were intended by the authors of ontologies used in annotating bodies of data. ..
  15. Noy N, Shah N, Whetzel P, Dai B, Dorf M, Griffith N, et al. BioPortal: ontologies and integrated data resources at the click of a mouse. Nucleic Acids Res. 2009;37:W170-3 pubmed publisher
    ..Thus, BioPortal not only provides investigators, clinicians, and developers 'one-stop shopping' to programmatically access biomedical ontologies, but also provides support to integrate data from a variety of biomedical resources. ..
  16. Avraham S, Tung C, Ilic K, Jaiswal P, Kellogg E, McCouch S, et al. The Plant Ontology Database: a community resource for plant structure and developmental stages controlled vocabulary and annotations. Nucleic Acids Res. 2008;36:D449-54 pubmed publisher
    ..plant genome databases and plant researchers that aims to create, maintain and facilitate the use of a controlled vocabulary (ontology) for plants...
  17. Ji S, Sun L, Jin R, Kumar S, Ye J. Automated annotation of Drosophila gene expression patterns using a controlled vocabulary. Bioinformatics. 2008;24:1881-8 pubmed publisher
    ..The spatial and temporal patterns of gene expression are integrated by anatomical terms from a controlled vocabulary linking together intermediate tissues developed from one another...
  18. Barsnes H, Côté R, Eidhammer I, Martens L. OLS dialog: an open-source front end to the ontology lookup service. BMC Bioinformatics. 2010;11:34 pubmed publisher
    ..Annotating the data using controlled vocabulary terms and ontologies makes it much easier to compare and analyze data from different sources...
  19. de la Calle G, Garcia Remesal M, Maojo V. A method for indexing biomedical resources over the internet. Stud Health Technol Inform. 2008;136:163-8 pubmed
  20. Chatr aryamontri A, Kerrien S, Khadake J, Orchard S, Ceol A, Licata L, et al. MINT and IntAct contribute to the Second BioCreative challenge: serving the text-mining community with high quality molecular interaction data. Genome Biol. 2008;9 Suppl 2:S5 pubmed publisher
    ..of terms created by the BioCreative competitors could enrich the synonym list of the PSI-MI (Proteomics Standards Initiative-Molecular Interactions) controlled vocabulary, which is used by both databases to annotate their data content.
  21. Martens L, Palazzi L, Hermjakob H. Data standards and controlled vocabularies for proteomics. Methods Mol Biol. 2008;484:279-86 pubmed publisher
    ..We describe the origins and overall concepts behind these standards, as well as the individual efforts that are ongoing in the field of mass spectrometry proteomics and protein interactions. ..
  22. Lawson D, Arensburger P, Atkinson P, Besansky N, Bruggner R, Butler R, et al. VectorBase: a data resource for invertebrate vector genomics. Nucleic Acids Res. 2009;37:D583-7 pubmed publisher
    ..We have continued to develop both the software infrastructure and tools for interrogating the stored data...
  23. Aranguren M, Antezana E, Kuiper M, Stevens R. Ontology Design Patterns for bio-ontologies: a case study on the Cell Cycle Ontology. BMC Bioinformatics. 2008;9 Suppl 5:S1 pubmed publisher
    ..This representation will produce a more efficient knowledge management in the long term. ..
  24. Pyysalo S, Airola A, Heimonen J, Björne J, Ginter F, Salakoski T. Comparative analysis of five protein-protein interaction corpora. BMC Bioinformatics. 2008;9 Suppl 3:S6 pubmed publisher
    ..In the course of this study we have created a major practical contribution in converting the corpora into a shared format. The conversion software is freely available at ..
  25. Hong E, Balakrishnan R, Dong Q, Christie K, Park J, Binkley G, et al. Gene Ontology annotations at SGD: new data sources and annotation methods. Nucleic Acids Res. 2008;36:D577-81 pubmed
    ..In addition to providing information for genes that have not been experimentally characterized, GO annotations from independent sources can be compared to those made by SGD to help keep the literature-based GO annotations current. ..
  26. Schlicker A, Albrecht M. FunSimMat: a comprehensive functional similarity database. Nucleic Acids Res. 2008;36:D434-9 pubmed
    ..All results can be downloaded in tab-delimited files for use with other tools. An additional XML-RPC interface gives automatic online access to FunSimMat for programs and remote services. ..
  27. Hoehndorf R, Loebe F, Kelso J, Herre H. Representing default knowledge in biomedical ontologies: application to the integration of anatomy and phenotype ontologies. BMC Bioinformatics. 2007;8:377 pubmed
    ..The inclusion of default knowledge is necessary in order to ensure interoperability between ontologies. ..
  28. Sahoo S, Zeng K, Bodenreider O, Sheth A. From "glycosyltransferase" to "congenital muscular dystrophy": integrating knowledge from NCBI Entrez Gene and the Gene Ontology. Stud Health Technol Inform. 2007;129:1260-4 pubmed
    ..We illustrate the effectiveness of our approach by answering a real-world biomedical query linking a specific molecular function, glycosyltransferase, to the disorder congenital muscular dystrophy. ..
  29. Kim J, Ohta T, Tsujii J. Corpus annotation for mining biomedical events from literature. BMC Bioinformatics. 2008;9:10 pubmed publisher
    ..The resulting event-annotated corpus is the largest and one of the best in quality among similar annotation efforts. We expect it to become a valuable resource for NLP (Natural Language Processing)-based TM in the bio-medical domain. ..
  30. Rebholz Schuhmann D, Arregui M, Gaudan S, Kirsch H, Jimeno A. Text processing through Web services: calling Whatizit. Bioinformatics. 2008;24:296-8 pubmed
    ..For large quantities of the user's own text, the server can be operated in a streaming mode ( ..
  31. Smith B, Ashburner M, Rosse C, Bard J, Bug W, Ceusters W, et al. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol. 2007;25:1251-5 pubmed
    ..We describe this OBO Foundry initiative and provide guidelines for those who might wish to become involved. ..
  32. Swarbreck D, Wilks C, Lamesch P, Berardini T, Garcia Hernandez M, Foerster H, et al. The Arabidopsis Information Resource (TAIR): gene structure and function annotation. Nucleic Acids Res. 2008;36:D1009-14 pubmed
    ..A total of 681 new genes and 1002 new splice variants were added. Overall, 10,098 loci (one-third of all loci from the previous TAIR6 release) were updated for the TAIR7 release. ..
  33. Hill D, Smith B, McAndrews Hill M, Blake J. Gene Ontology annotations: what they mean and where they come from. BMC Bioinformatics. 2008;9 Suppl 5:S2 pubmed publisher
  34. Pesquita C, Faria D, Bastos H, Ferreira A, Falcão A, Couto F. Metrics for GO based protein semantic similarity: a systematic evaluation. BMC Bioinformatics. 2008;9 Suppl 5:S4 pubmed publisher
    ..We suspect that there may be a direct influence of data circularity in the behaviour of the results including electronic annotations, as a result of functional inference from sequence similarity. ..
  35. Karamanis N, Seal R, Lewin I, McQuilton P, Vlachos A, Gasperin C, et al. Natural language processing in aid of FlyBase curators. BMC Bioinformatics. 2008;9:193 pubmed publisher
    ..We show that state-of-the-art performance in certain NLP tasks such as Named Entity Recognition and Anaphora Resolution can be combined with the navigational functionalities of PaperBrowser to support curation quite successfully. ..
  36. McCrae J, Collier N. Synonym set extraction from the biomedical literature by lexical pattern discovery. BMC Bioinformatics. 2008;9:159 pubmed publisher
    ..We also concluded that the accuracy can be improved by grouping into synonym sets. ..
  37. Kerrien S, Alam Faruque Y, Aranda B, Bancarz I, Bridge A, Derow C, et al. IntAct--open source resource for molecular interaction data. Nucleic Acids Res. 2007;35:D561-5 pubmed
    ..IntAct supports and encourages local installations as well as direct data submission and curation collaborations. IntAct source code and data are freely available from ..
  38. Grumbling G, Strelets V. FlyBase: anatomical data, images and queries. Nucleic Acids Res. 2006;34:D484-8 pubmed
  39. Wu X, Zhu L, Guo J, Zhang D, Lin K. Prediction of yeast protein-protein interaction network: insights from the Gene Ontology and annotations. Nucleic Acids Res. 2006;34:2137-50 pubmed
    ..This analysis is expected to provide a new approach for predicting the protein-protein interaction maps from other completely sequenced genomes with high-quality GO-based annotations. ..
  40. Leary P, Remsen D, Norton C, Patterson D, Sarkar I. uBioRSS: tracking taxonomic literature using RSS. Bioinformatics. 2007;23:1434-6 pubmed
    ..Such value-added enhancements can provide biologists with accelerated and improved access to current biological content. ..
  41. Sevilla J, Segura V, Podhorski A, Guruceaga E, Mato J, Martínez Cruz L, et al. Correlation between gene expression and GO semantic similarity. IEEE/ACM Trans Comput Biol Bioinform. 2005;2:330-8 pubmed
    ..These results can be used to augment the knowledge provided by clustering algorithms and in the development of bioinformatic tools for finding and characterizing gene products. ..
  42. Taylor C, Hermjakob H, Julian R, Garavelli J, Aebersold R, Apweiler R. The work of the Human Proteome Organisation's Proteomics Standards Initiative (HUPO PSI). OMICS. 2006;10:145-51 pubmed
    ..Initiative (HUPO PSI), specifically, our work on reporting requirements, data exchange formats and controlled vocabulary terms...
  43. Friedberg I, Harder T, Godzik A. JAFA: a protein function annotation meta-server. Nucleic Acids Res. 2006;34:W379-81 pubmed
    ..JAFA also offers its own output, and the individual programs' predictions for further processing. JAFA is available for use from ..
  44. Rosenbloom S, Miller R, Johnson K, Elkin P, Brown S. Interface terminologies: facilitating direct entry of clinical data into electronic health record systems. J Am Med Inform Assoc. 2006;13:277-88 pubmed
    ..To place interface terminologies in context, this article reviews historical goals and challenges of clinical terminology development in general and then focuses on the unique features of interface terminologies. ..
  45. Le Novère N, Bornstein B, Broicher A, Courtot M, Donizelli M, Dharuri H, et al. BioModels Database: a free, centralized database of curated, published, quantitative kinetic models of biochemical and cellular systems. Nucleic Acids Res. 2006;34:D689-91 pubmed
    ..This allows the users to search accurately for the models they need. The models can currently be retrieved in the SBML format, and import/export facilities are being developed to extend the spectrum of formats supported by the resource. ..
  46. Chabalier J, Mosser J, Burgun A. A transversal approach to predict gene product networks from ontology-based similarity. BMC Bioinformatics. 2007;8:235 pubmed
    ..Complementary to the standard approach, the transversal approach offers new insight into the cellular mechanisms and reveals new research hypotheses by combining gene product networks based on semantic similarity, and data expression. ..
  47. Schulz S, Hahn U. Part-whole representation and reasoning in formal biomedical ontologies. Artif Intell Med. 2005;34:179-200 pubmed
    ..We provide a formal basis for ontological engineering in the domain of biomedicine, as far as part-whole relationships are concerned, by addressing typical reasoning patterns encountered in this domain. ..
  48. Eilbeck K, Lewis S, Mungall C, Yandell M, Stein L, Durbin R, et al. The Sequence Ontology: a tool for the unification of genome annotations. Genome Biol. 2005;6:R44 pubmed
    The Sequence Ontology (SO) is a structured controlled vocabulary for the parts of a genomic annotation. SO provides a common set of terms and definitions that will facilitate the exchange, analysis and management of genomic data...
  49. Héja G, Surjan G, Lukácsy G, Pallinger P, Gergely M. GALEN based formal representation of ICD10. Int J Med Inform. 2007;76:118-23 pubmed
    ..The classifier module is still under development. Due to the experiences gained during the modelling, in the future work FMA is going to be used as anatomical reference ontology. ..
  50. Zheng B, McLean D, Lu X. Identifying biological concepts from a protein-related corpus with a probabilistic topic model. BMC Bioinformatics. 2006;7:58 pubmed
    ..The identified topics/concepts were further mapped to the controlled vocabulary of the Gene Ontology (GO) terms based on mutual information...
  51. Jaiswal P, Ni J, Yap I, Ware D, Spooner W, Youens Clark K, et al. Gramene: a bird's eye view of cereal genomes. Nucleic Acids Res. 2006;34:D717-23 pubmed
    ..Jaiswal, J. Ni, I. V. Yap, X. Pan, K. Y. Clark, L. Teytelman, S. C. Schmidt, W. Zhao, K. Chang et al. [(2002), Plant Physiol., 130, 1606-1613], the database has undergone extensive changes that are described in this publication. ..
  52. Smith B. From concepts to clinical reality: an essay on the benchmarking of biomedical terminologies. J Biomed Inform. 2006;39:288-98 pubmed
    ..We conclude by outlining ways in which the framework thus defined might be exploited for purposes of diagnostic decision-support. ..
  53. Hao Y, Zhu X, Huang M, Li M. Discovering patterns to extract protein-protein interactions from the literature: Part II. Bioinformatics. 2005;21:3294-300 pubmed
    ..This has significantly increased generalization power, and hence the recall and precision rates, as confirmed by our experiments. ..