SBIR-STTR Award

Computational Framework For Analysis Of Microarray Gene Expression Data
Award last edited on: 6/13/11

Sponsored Program
SBIR
Awarding Agency
NIH : NIGMS
Total Award Amount
$299,983
Award Phase
2
Solicitation Topic Code
-----

Principal Investigator
Dariusz Wroblewski

Company Information

BioFormatix Inc

12396 World Trade Drive Suite 315
San Diego, CA 92028
   (858) 248-2884
   dariusz@bioformatix.com
   www.bioformatix.com
Location: Single
Congr. District: 51
County: San Diego

Phase I

Contract Number: 1R43GM083346-01A1
Start Date: 00/00/00    Completed: 00/00/00
Phase I year
2009
Phase I Amount
$149,985
Identification of transcripts that are differentially regulated in response to the studied experimental conditions is one of the critical steps in the analysis of DNA microarray data. Currently employed statistical approaches become particularly ineffective for experiments with a small number of biological replicates, which are prevalent in differential expression studies. We propose to develop and validate a novel numerical framework for identification of differentially expressed transcripts, with emphasis on the analysis of experiments with a small number of replicates and genes with moderate levels of expression. The proposed approach is based on a novel, non-parametric method for assessment of noise distributions in microarray data, which are derived directly from the analyzed data set. Three distinct univariate and multivariate methods for identification of differentially expressed genes will be implemented, and their results will be compared to the results of leading advanced statistical methods. In the Phase I feasibility study we will analyze differential gene expression between at least nine normal tissues with varying levels of similarity, in rat and mouse. Publicly available data from the SymAtlas database (Genomics Institute of the Novartis Research Foundation), obtained with Affymetrix microarrays, will be employed. The utility of the newly developed numerical methods will be established through biological and/or experimental validation of identified genomic biomarkers using functional analysis (if functional annotation is available) and/or quantitative polymerase chain reaction analysis.
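
The abstract does not specify how the noise distributions are derived, so the following is only an illustrative sketch of the general idea of a data-derived, non-parametric noise estimate, not the method proposed in the award. It assumes log2 expression values, pools replicate-to-replicate differences within each condition as an empirical noise sample, and scores each gene's between-condition difference against that sample. All function names, variable names, and toy values are hypothetical.

```python
from bisect import bisect_left
from itertools import combinations


def within_condition_noise(condition):
    """Collect |log2 difference| per gene over every pair of replicates.

    `condition` is a list of replicates; each replicate is a list of
    log2 expression values, one per gene, in the same gene order.
    """
    diffs = []
    for rep_x, rep_y in combinations(condition, 2):
        diffs.extend(abs(x - y) for x, y in zip(rep_x, rep_y))
    return diffs


def empirical_p_values(cond_a, cond_b):
    """Score each gene against the pooled empirical noise sample.

    Assumption for illustration: a single noise distribution is shared
    by all genes. A real analysis would likely condition the noise on
    expression level, which this sketch does not attempt.
    """
    noise = sorted(within_condition_noise(cond_a) + within_condition_noise(cond_b))
    n_genes = len(cond_a[0])
    p_values = []
    for g in range(n_genes):
        mean_a = sum(rep[g] for rep in cond_a) / len(cond_a)
        mean_b = sum(rep[g] for rep in cond_b) / len(cond_b)
        observed = abs(mean_a - mean_b)
        # Count noise differences at least as large as the observed one.
        n_ge = len(noise) - bisect_left(noise, observed)
        # Add-one correction keeps the empirical p-value above zero.
        p_values.append((n_ge + 1) / (len(noise) + 1))
    return p_values


# Toy data (hypothetical): two tissues, two replicates each, four genes.
tissue_a = [[8.0, 5.1, 10.2, 7.0], [8.1, 5.0, 10.0, 7.2]]
tissue_b = [[8.2, 5.2, 13.1, 7.1], [8.0, 5.3, 13.0, 6.9]]
p = empirical_p_values(tissue_a, tissue_b)
```

In this toy set only the third gene differs between tissues by much more than the replicate-to-replicate scatter, so it receives the smallest empirical p-value. With two replicates per condition the noise sample is tiny, which is why the smallest attainable p-value is bounded by the add-one correction.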

Public Health Relevance:
DNA microarray technology enables simultaneous profiling of thousands of transcripts expressed in a particular organism, cells, or tissues. Its current applications include gene profiling, gene regulation studies, disease biomarker discovery, toxicogenomics, pharmacogenomics, and clinical diagnostics and prognosis. Despite recent impressive technological advances, major bottlenecks to realizing the full potential of microarray technology remain, including incomplete functional gene annotation and the lack of effective computational data analysis tools. The analysis methods developed in this project will improve the ability to reliably identify differentially expressed genes in experiments with a small number of biological replicates, which will improve the overall effectiveness of this technology and reduce the cost of microarray gene expression studies.

Project Terms:
Address; Analysis, Data; Benchmarking; Best Practice Analysis; Biological; Biological Neural Networks; Body Tissues; Cells; Clinical; Cognitive Discrimination; Common Rat Strains; DNA Chips; DNA Microarray; DNA Microarray Chip; DNA Microchips; Data; Data Analyses; Data Banks; Data Bases; Data Set; Databank, Electronic; Databanks; Database, Electronic; Databases; Dataset; Detection; Development; Diagnostic; Differential Gene Expression; Discrimination; Discrimination (Psychology); Disease; Disorder; Effectiveness; Elements; Expression Profiling; Expression Signature; Feasibility Studies; Forecast of outcome; Foundations; Gaussian Distribution; Gene Action Regulation; Gene Expression; Gene Expression Microarray Analysis; Gene Expression Regulation; Gene Regulation; Gene Regulation Process; Genes; Genomics; Gray; Gray unit of radiation dose; Institutes; Mammals, Mice; Mammals, Rats; Methods; Mice; Microarray Analysis; Microarray-Based Analysis; Modeling; Molecular Fingerprinting; Molecular Profiling; Murine; Mus; Noise; Normal Distribution; Normal Statistical Distribution; Normal Tissue; Normal tissue morphology; Organ; Organism; PCR; Pattern; Performance; Pharmacogenomics; Phase; Polymerase Chain Reaction; Principal Component Analyses; Principal Component Analysis; Prognosis; Rat; Rattus; Research; Sampling; Statistical Methods; Stress; Structure; Students; Technology; Testing; Tissue Differentiation; Tissue-Specific Differential Gene Expression; Tissue-Specific Gene Expression; Tissues; Toxicogenomics; Transcript; Validation; base; biomarker; clinical data repository; clinical data warehouse; computational framework; computer framework; cost; data repository; disease/disorder; experiment; experimental research; experimental study; improved; living system; microarray technology; molecular profile; molecular signature; neural network; new approaches; novel; novel approaches; novel strategies; novel strategy; outcome forecast; public health relevance; relational database; research study; response; tool

Phase II

Contract Number: 5R43GM083346-02
Start Date: 1/1/09    Completed: 12/31/10
Phase II year
2010
Phase II Amount
$149,998
Identification of transcripts that are differentially regulated in response to the studied experimental conditions is one of the critical steps in the analysis of DNA microarray data. Currently employed statistical approaches become particularly ineffective for experiments with a small number of biological replicates, which are prevalent in differential expression studies. We propose to develop and validate a novel numerical framework for identification of differentially expressed transcripts, with emphasis on the analysis of experiments with a small number of replicates and genes with moderate levels of expression. The proposed approach is based on a novel, non-parametric method for assessment of noise distributions in microarray data, which are derived directly from the analyzed data set. Three distinct univariate and multivariate methods for identification of differentially expressed genes will be implemented, and their results will be compared to the results of leading advanced statistical methods. In the Phase I feasibility study we will analyze differential gene expression between at least nine normal tissues with varying levels of similarity, in rat and mouse. Publicly available data from the SymAtlas database (Genomics Institute of the Novartis Research Foundation), obtained with Affymetrix microarrays, will be employed. The utility of the newly developed numerical methods will be established through biological and/or experimental validation of identified genomic biomarkers using functional analysis (if functional annotation is available) and/or quantitative polymerase chain reaction analysis.

Public Health Relevance:
DNA microarray technology enables simultaneous profiling of thousands of transcripts expressed in a particular organism, cells, or tissues. Its current applications include gene profiling, gene regulation studies, disease biomarker discovery, toxicogenomics, pharmacogenomics, and clinical diagnostics and prognosis. Despite recent impressive technological advances, major bottlenecks to realizing the full potential of microarray technology remain, including incomplete functional gene annotation and the lack of effective computational data analysis tools. The analysis methods developed in this project will improve the ability to reliably identify differentially expressed genes in experiments with a small number of biological replicates, which will improve the overall effectiveness of this technology and reduce the cost of microarray gene expression studies.

Thesaurus Terms:
Address; Analysis, Data; Benchmarking; Best Practice Analysis; Biological; Biological Neural Networks; Body Tissues; Cells; Clinical; Cognitive Discrimination; Common Rat Strains; DNA Chips; DNA Microarray; DNA Microarray Chip; DNA Microchips; Data; Data Analyses; Data Banks; Data Bases; Data Set; Databank, Electronic; Databanks; Database, Electronic; Databases; Dataset; Detection; Development; Diagnostic; Differential Gene Expression; Discrimination; Discrimination (Psychology); Disease; Disorder; Effectiveness; Elements; Expression Profiling; Expression Signature; Feasibility Studies; Forecast Of Outcome; Foundations; Gaussian Distribution; Gene Action Regulation; Gene Expression; Gene Expression Microarray Analysis; Gene Expression Regulation; Gene Regulation; Gene Regulation Process; Genes; Genomics; Gray; Gray Unit Of Radiation Dose; Institutes; Mammals, Mice; Mammals, Rats; Methods; Mice; Microarray Analysis; Microarray-Based Analysis; Modeling; Molecular Fingerprinting; Molecular Profiling; Murine; Mus; Noise; Normal Distribution; Normal Statistical Distribution; Normal Tissue; Normal Tissue Morphology; Organ; Organism; PCR; Pattern; Performance; Pharmacogenomics; Phase; Polymerase Chain Reaction; Principal Component Analyses; Principal Component Analysis; Prognosis; Rat; Rattus; Research; Sampling; Statistical Methods; Stress; Structure; Students; Technology; Testing; Tissue Differentiation; Tissue-Specific Differential Gene Expression; Tissue-Specific Gene Expression; Tissues; Toxicogenomics; Transcript; Validation; Base; Biomarker; Clinical Data Repository; Clinical Data Warehouse; Computational Framework; Computer Framework; Cost; Data Repository; Disease/Disorder; Experiment; Experimental Research; Experimental Study; Improved; Living System; Microarray Technology; Molecular Profile; Molecular Signature; Neural Network; New Approaches; Novel; Novel Approaches; Novel Strategies; Novel Strategy; Outcome Forecast; Public Health Relevance; Relational Database; Research Study; Response; Tool