The Genomes-to-Life Roadmap lacks standardized semantics for accurately describing data objects and persistently expressing how knowledge changes over time. As research methods and biological concepts evolve, certainty about the correct interpretation of prior data and published results decreases, because both become overloaded with synonymous and polysemous terms. NamesforLife (N4L) is a novel technology designed to solve this problem. The core of the technology is an ontology, an XML schema, and an expertly managed vocabulary coupled with Digital Object Identifiers (DOIs), which together form a transparent semantic resolution service. The service disambiguates terminologies, makes them actionable, and presents them to end-users in the correct context. A working model of the N4L technology was built to validate concepts and gain new insights into the complexities of dynamic vocabularies. In this project, the working model will be reduced to a service that can automatically annotate occurrences of names in the scientific literature and databases. The approach will: (1) transfer the current model into a more suitable environment, to simplify updating and on-the-fly generation of N4L information objects; (2) develop tagging rules to embed links from N4L information objects into online content; (3) enable multiple resolution through the server; (4) develop mini-monographs as an improved human interface to N4L; and (5) develop additional infrastructure to support on-the-fly translation of N4L-tagged data in published content. The initial target will be the International Journal of Systematic and Evolutionary Microbiology, the publication of record for nomenclatural changes for bacteria and archaea.
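As a rough illustration of the annotation and resolution idea described above, the following Python sketch tags occurrences of managed names in a passage with DOI links and resolves an identifier to a currently preferred name. It is a minimal sketch under stated assumptions, not the N4L implementation: the organism names are invented, the DOI strings are placeholders, and the `annotate` and `resolve` functions and both lookup tables are hypothetical.

```python
import re

# Hypothetical managed vocabulary: each name maps to a persistent identifier.
# Names are invented and the DOI strings are placeholders, not real N4L records.
VOCABULARY = {
    "Oldgenus exemplum": "10.0000/example.name.1",
    "Newgenus exemplum": "10.0000/example.name.2",
}

# Hypothetical resolution table: identifier -> currently preferred name.
# Here the older name has been superseded by the newer one.
PREFERRED_NAME = {
    "10.0000/example.name.1": "Newgenus exemplum",
    "10.0000/example.name.2": "Newgenus exemplum",
}


def annotate(text, vocabulary=VOCABULARY):
    """Wrap each occurrence of a managed name in a link to its identifier."""
    if not vocabulary:
        return text
    # Try longer names first so a long name is not shadowed by a shorter one.
    names = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile("|".join(r"\b" + re.escape(n) + r"\b" for n in names))

    def link(match):
        name = match.group(0)
        return '<a href="https://doi.org/{}">{}</a>'.format(vocabulary[name], name)

    return pattern.sub(link, text)


def resolve(doi, preferred=PREFERRED_NAME):
    """Return the currently preferred name for an identifier."""
    return preferred[doi]


if __name__ == "__main__":
    passage = "Strain X was assigned to Oldgenus exemplum."
    print(annotate(passage))
    print(resolve("10.0000/example.name.1"))
```

Keying the published text to an identifier rather than to the name itself is the design point: the name in print stays as written, while the lookup behind the identifier can be updated as the nomenclature changes.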
Commercial Applications and Other Benefits as described by the awardee: The N4L technology would enable end-users to spend substantially less time dealing with the ambiguity of biological names and strain identifiers and more time focused on gaining knowledge. In addition to the DOE, the technology should be useful to providers of diagnostic instrumentation and identification kits, service laboratories, and managers of commercial or public databases used for microbial identification and classification. Furthermore, the N4L model should find use in applications where a terminology and the associated concepts or objects defined by that terminology diverge over time, including medical informatics (for resolving procedural codes used in medical insurance and for tracking the chemical and trade names of pharmaceutical products) and manufacturing or warehousing (such as managing complex inventories of equivalent electronic and mechanical parts sourced from different manufacturers).