This Small Business Innovation Research (SBIR) Phase I project proposes to develop a system for automated classification of biological samples and discovery of biomarkers. The goal is a system to perform comprehensive pattern analysis of state-of-the-art biochemical separations generated by comprehensive two-dimensional gas chromatography (GCxGC) with high-resolution mass spectrometry (HRMS). A critical challenge for effective utilization of GCxGC-HRMS for biochemical classification and biomarker discovery is the difficulty of analyzing and interpreting the massive, complex data for metabolomic and proteomic features. The quantity and complexity of the data, as well as the high dimensionality of the biochemistry, in which significant characteristics may be subtle and involve patterns of variations in multiple constituents, necessitate the investigation and development of new bioinformatics. The principal technical objective is an innovative framework for comprehensive feature matching and analysis across many samples. Feature matching is the basis for uniformly labeling structures so that similarities and differences can be documented. Specifically, the framework will incorporate advanced methods for multidimensional peak detection, peak pattern matching across large sample sets, data alignment, GCxGC-HRMS feature computations, and classification with large feature sets. The anticipated result is the technical foundation for a commercial system to classify biological samples and identify significant biomarkers. The broader impact/commercial potential of this project, if successful, will be a better understanding of biochemical processes and discovery of metabolomic and proteomic biomarkers, leading to improved methods for disease diagnoses and treatments. These innovative bioinformatics will contribute to economic competitiveness in the global market for analytical technologies and will foster utilization of advanced GCxGC-HRMS instrumentation.
The informatics developed in this project will also be relevant for other classification problems involving multidimensional, multispectral data, including other applications (such as biofuels), other types of chemical analyses (such as multidimensional spectroscopy), and other fields (such as remote-sensing multispectral geospatial imagers). The project will contribute to workforce development, by involving students in research experiences through internships and project sponsorships, and to education, by providing software and example data to allow students to more easily explore biochemical complexity.