There is growing interest in the therapeutic use of phage to treat antibiotic-resistant infections and gut microbiome-related disorders. Phage therapies can be extremely specific for their targets, which leads to far fewer off-target side effects than traditional antibiotic therapy. However, identifying phage that target an organism of interest and determining their host range remain technical challenges. Host assignment for a phage typically requires laboratory culture of the organism of interest. This is a significant barrier when targeting organisms that are difficult to culture, and it introduces significant biases into the existing phage-host knowledge base. As with antibiotics, organisms can also acquire resistance to phage infection, which limits the utility of a single phage for treating an infection over time. For these reasons, it would be highly beneficial to be able to efficiently identify phage that target therapeutically relevant organisms directly from an uncultured population of microbes. In this application, we propose to develop a machine-learning-based platform for the identification and assignment of phage and their hosts from metagenomic whole-genome sequencing (WGS) data. Our approach leverages the unique property of proximity ligation sequencing, or Hi-C, to efficiently gather direct physical evidence of phage-host associations from mixed microbial communities. We propose to use this technology to assemble a large-scale, high-quality phage-host interaction dataset from human fecal samples, to use that dataset to train a machine learning model that predicts phage-host relationships from existing WGS data, and to provide a convenient platform through which users can submit metagenomic reads and receive phage-host information.
This approach would enable the identification, from both existing and future WGS data sets, of individual phage and combinations of phage that simultaneously target organisms otherwise intractable by standard clinical methods.
Public Health Relevance Statement: Phage are viruses that infect and kill specific bacteria in virtually every environment studied, including the human body. Because of this property, phage have long been considered candidate agents for targeted therapy development, especially for otherwise difficult-to-treat infections. We propose to develop a platform that permits the identification of phage with therapeutic potential directly from DNA sequencing data.
Project Terms: Technology; Temperature; Time; Virus; Generations; Planets; Treatment outcome; Data Set; Dataset; DNA Sequence; Ecosystem; Ecologic Systems; Ecological Systems; base; crosslink; cross-link; improved; Clinical; Link; Training; Failure; insight; Data Bases; data base; Databases; Human Figure; Human body; Cell Components; Cell Structure; Cellular Structures; Engraftment; Therapeutic; tool; Antibiotic Treatment; bacterial disease treatment; bacterial infectious disease treatment; Antibiotic Therapy; machine learned; Machine Learning; Knowledge; Culture Procedure; Laboratory culture; Lytic; Clinic; Viral; interest; knowledge base; microbial; Structure; novel; Basic Research; Basic Science; Sampling; Property; preventing; prevent; genome sequencing; Antimicrobial resistant; Resistance to antimicrobial; anti-microbial resistance; anti-microbial resistant; resistance to anti-microbial; resistant to anti-microbial; resistant to antimicrobial; Antimicrobial Resistance; fitness; Data; dsDNA Virus; Double Stranded DNA Virus; Intake; Lytic Phase; Lytic Cycle; Lytic Infection; therapy outcome; therapeutic outcome; Shotgun Sequencing; Process; Development; developmental; Output; microbiome; Human Microbiome; human-associated microbiome; cost; virtual; Outcome; cost efficient; Population; prospective; innovation; innovate; innovative; Resistance; resistant; microbial community; community microbes; Microbe; metagenomic sequencing; metagenome sequencing; large-scale database; large-scale data base; virome; viral microbiome; Metagenomics; Functional Metagenomics; therapeutic target; therapy development; develop therapy; intervention development; treatment development; combat; multi-drug resistant pathogen; MDR organism; MDR pathogen; multi-drug resistant organism; multidrug resistant organism; multidrug resistant pathogen; multiple drug resistant organism; multiple drug resistant pathogen; gut microbiome; GI microbiome; digestive tract microbiome; enteric microbiome; 
gastrointestinal microbiome; gut-associated microbiome; intestinal biome; intestinal microbiome; fecal transplantation; fecal microbial transplantation; fecal microbiome transplantation; fecal microbiota transplant; fecal microbiota transplantation; fecal transplant; targeted agent; whole genome; entire genome; full genome; microbiota transplantation; microbiome transplant; microbiome transplantation; microbiota transplant; DNA sequencing; DNA seq; DNAseq; preservation; deep learning; Infrastructure; side effect; antibiotic resistant infections; machine learning method; machine learning based method; machine learning methodologies; Hi-C; deep learning model; deep learning based model; machine learning model; machine learning based model; Antibiotics; Antibiotic Agents; Antibiotic Drugs; Miscellaneous Antibiotic; Artificial Intelligence; AI system; Computer Reasoning; Machine Intelligence; Bacteria; Bacteriophages; Phages; bacterial virus; Cells; Cell Body; Communities; Computing Methodologies; computational methodology; computational methods; computer based method; computer methods; computing method; Disease; Disorder; Donor person; transplant donor; Environment; Family; Foundations; Future; Genome; Health; Human; Modern Man; Infection; Lead; Pb element; heavy metal Pb; heavy metal lead; Libraries; Ligation; Closure by Ligation; Lysogeny; Prophage Integration; Methods; living system; Organism; Reagent; Research; Investigators; Researchers; Research Personnel; shot gun; Shotguns; Specificity