Direct to Phase II

New pathogens, both naturally occurring and adversary-engineered, are increasingly likely to emerge and pose a significant and growing risk to global health and security. These threats often have limited genetic similarity to known pathogens and therefore cannot be identified through standard genetic tests. Applying machine learning to phenotypic tests to predict pathogenic potential faces two challenges: integrating heterogeneous data sources, and learning from sparse, inconsistent datasets. We propose to build an advanced computational platform, PathEngine, that will rapidly ingest and integrate phenotypic test measurements from conventional microtiter plate assays as well as single-cell-resolution microfluidics systems. It will use a tailored semi-supervised learning algorithm to predict the pathogenic potential of bacterial strains from limited, sparse, inconsistent training datasets. PathEngine will ingest, integrate, and analyze phenotypic tests in three categories (harming a host, niche finding, and self-preservation) and will be capable of identifying the pathogenic potential of bacteria with >90% accuracy.
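To make the semi-supervised idea concrete, the following is a minimal sketch of generic self-training on sparse phenotypic feature vectors. It is not PathEngine's tailored algorithm, which the abstract does not specify. Everything in it is an assumption made for illustration: the function names, the nearest-centroid classifier, the `margin` threshold, the three normalized feature columns (one per test category), the use of `None` for a missing measurement, and the toy values, which are not real assay data.

```python
from math import sqrt

def centroid(rows):
    """Per-feature mean over rows, ignoring missing (None) measurements."""
    center = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        center.append(sum(observed) / len(observed) if observed else None)
    return center

def distance(row, center):
    """RMS difference over features present in both; inf if none overlap."""
    sq = [(a - b) ** 2 for a, b in zip(row, center)
          if a is not None and b is not None]
    return sqrt(sum(sq) / len(sq)) if sq else float("inf")

def fit(labeled):
    """Build one centroid per label from (row, label) pairs."""
    groups = {}
    for row, label in labeled:
        groups.setdefault(label, []).append(row)
    return {label: centroid(rows) for label, rows in groups.items()}

def predict(row, centers):
    """Return the label whose centroid is nearest to the row."""
    return min(centers, key=lambda label: distance(row, centers[label]))

def self_train(labeled, unlabeled, margin=0.3, max_rounds=10):
    """Self-training: repeatedly pseudo-label the unlabeled rows whose
    nearest centroid beats the runner-up by at least `margin`, then refit."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        centers = fit(labeled)
        remaining = []
        for row in pool:
            dists = sorted(distance(row, c) for c in centers.values())
            if len(dists) > 1 and dists[1] - dists[0] >= margin:
                labeled.append((row, predict(row, centers)))
            else:
                remaining.append(row)
        if len(remaining) == len(pool):  # no new pseudo-labels: stop
            break
        pool = remaining
    return fit(labeled), labeled

# Hypothetical toy data: columns stand in for normalized scores in the three
# test categories (harming a host, niche finding, self-preservation).
# Label 1 = pathogenic, 0 = non-pathogenic (illustrative only).
seed = [([0.1, 0.2, None], 0), ([0.9, None, 0.8], 1)]
unlabeled = [[0.2, 0.1, 0.1], [0.8, 0.9, 0.9], [None, 0.85, 0.7]]
centers, final = self_train(seed, unlabeled)
```

After training, `predict([0.15, None, 0.2], centers)` classifies a new strain even though one of its measurements is missing. A production system would likely need calibrated confidence estimates rather than a fixed distance margin, since confident but wrong pseudo-labels reinforce themselves across rounds.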