Semantic Link Association Prediction for Phenotypic Drug Discovery
Award last edited on: 10/27/2017

Sponsored Program
Awarding Agency
Total Award Amount
Award Phase
Solicitation Topic Code
Principal Investigator
Randy Kerber

Company Information

Data2Discovery Inc

901 E 10th Street
Bloomington, IN 47408
(636) 448-2934
Location: Single
Congr. District: 09
County: Monroe

Phase I

Contract Number: 1549012
Start Date: 1/1/2016    Completed: 6/30/2016
Phase I year
Phase I Amount
The broader impact/commercial potential of this Small Business Innovation Research (SBIR) project is the development of a first-in-class Predictive Phenotypic Profiler (PPP) software tool that will improve the efficiency and effectiveness of the pre-clinical drug discovery process. A recent study of drugs approved by the FDA between 1998 and 2008 shows that a majority of first-in-class drugs are now derived from phenotypic screens rather than traditional target-based screens. However, there is currently a severe lack of computational and data tools that can bridge the vast amounts of traditional molecular-based data with the equally vast amounts of phenotypic data now being generated. The PPP tool integrates and interprets this complex, multi-faceted data to greatly enhance the ability of pharmaceutical companies to find new and effective drugs. The estimated cost per new prescription drug approval is $2.56 billion. Shortening the pre-clinical drug discovery process by just one week is estimated to save the pharmaceutical industry $108 million, creating a large financial opportunity. This tool aims to increase the number and quality of drugs that enter clinical trials, resulting in more economically priced medicines available to the population.

This SBIR Phase I project proposes to develop a proof-of-concept PPP software tool that brings together a variety of publicly available molecular and phenotypic data sources in a graphical user interface, allowing for the discovery of novel mechanisms of action and the identification of target(s) from phenotypic assays. The major hurdles of this project will be the integration of these highly heterogeneous datasets and the identification of evidence-based path patterns. Semantic technologies and domain expertise will be applied to surmount these data integration and prediction challenges. The plan to reach the goal of a prototype PPP tool includes: 1) creating a semantic graph for phenotypic data sources, 2) finding evidence-based path patterns in phenotypic data, 3) applying predictive algorithms for phenotypic data analysis, and 4) developing a graphical user interface for evaluation and verification. Phase I success will result in a tool that can be used by pharmaceutical companies for evaluation and product feedback.
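
As a rough illustration of what "path patterns" in a semantic graph can mean, the following minimal Python sketch builds a toy graph from subject-predicate-object triples, enumerates the simple paths linking a compound to a phenotype, and groups them by their sequence of node types and predicates. Everything here is an assumption made for illustration: the triples, the identifiers (`compound:A`, `gene:G1`, and so on), the predicate names, and the helper functions are hypothetical and are not taken from the award or from the PPP tool itself.

```python
from collections import Counter

# Hypothetical toy triples (subject, predicate, object); not real biomedical data.
TRIPLES = [
    ("compound:A", "binds", "gene:G1"),
    ("compound:A", "binds", "gene:G2"),
    ("gene:G1", "participates_in", "pathway:P1"),
    ("gene:G2", "participates_in", "pathway:P1"),
    ("pathway:P1", "associated_with", "phenotype:X"),
    ("gene:G1", "associated_with", "phenotype:X"),
]


def build_graph(triples):
    """Index triples as an adjacency map: node -> list of (predicate, neighbor)."""
    graph = {}
    for subject, predicate, obj in triples:
        graph.setdefault(subject, []).append((predicate, obj))
    return graph


def node_type(node):
    """Return the type prefix of an identifier such as 'gene:G1' -> 'gene'."""
    return node.split(":", 1)[0]


def find_paths(graph, source, target, max_edges=3):
    """Enumerate simple directed paths from source to target with at most max_edges edges."""
    paths = []

    def walk(node, path, visited):
        if node == target and path:
            paths.append(list(path))
            return
        if len(path) == max_edges:
            return
        for predicate, neighbor in graph.get(node, []):
            if neighbor not in visited:
                path.append((node, predicate, neighbor))
                visited.add(neighbor)
                walk(neighbor, path, visited)
                visited.discard(neighbor)
                path.pop()

    walk(source, [], {source})
    return paths


def path_pattern(path):
    """Abstract a concrete path into alternating node types and predicates."""
    parts = [node_type(path[0][0])]
    for _, predicate, obj in path:
        parts.append(predicate)
        parts.append(node_type(obj))
    return tuple(parts)


def pattern_counts(graph, source, target, max_edges=3):
    """Count how many concrete paths instantiate each abstract pattern."""
    return Counter(path_pattern(p) for p in find_paths(graph, source, target, max_edges))


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    for pattern, count in pattern_counts(graph, "compound:A", "phenotype:X").items():
        print(count, " -> ".join(pattern))
```

In a sketch like this, the per-pattern counts could serve as simple features for a downstream association score between a compound and a phenotype. How the actual tool weights or validates patterns is not described in the abstract.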

Phase II

Contract Number: 1660155
Start Date: 3/15/2017    Completed: 2/28/2019
Phase II year
(last award dollars: 2019)
Phase II Amount

The broader impact/commercial potential of this Small Business Innovation Research (SBIR) Phase II project is the development of an informatics-based software platform that will help pharmaceutical companies create more new, effective, and safe drugs earlier in the R&D pipeline. This software platform will address a need for data integration and analysis tools to aid pharmaceutical researchers in 1) phenotypic screening, 2) toxicology analysis, and 3) drug repurposing. It will help these researchers quickly gather and interpret complex molecular and phenotypic data, making the drug discovery process more efficient and creating value for pharmaceutical companies. The economic impact of reducing the preclinical drug discovery process by just two weeks is estimated to be a $252 million cost savings for the industry. By using data more effectively earlier in the R&D process, this software platform also promises to enhance the quality of drugs that enter clinical trials. Thus, it provides an opportunity to reduce overall R&D spending and increase the number of drugs that enter the market, resulting in more economically priced medicines available to the population.

This SBIR Phase II project proposes to build an informatics-based software platform that solves cross-domain data integration, analysis, and user application challenges in order to draw insights from data earlier in the R&D process and compress the development pipeline for new or repurposed drugs. Using highly scalable semantic graph technologies, a flexible three-layer architecture is being developed that includes the 1) Biomedical Data Layer, 2) Computational Layer, and 3) Application Layer. This architecture allows the system to be fully scalable and extensible to other datasets and biomedical applications. The system will be beta-tested by pharmaceutical researchers and evaluated through the creation of scientifically relevant use cases. This development will result in a commercial software system that makes important biomedical data and insights available to all researchers within a pharmaceutical organization by addressing high-need data integration, analysis, and application challenges.