SBIR-STTR Award

REPTOR: accelerating antibody discovery and improving hits with machine learning
Award last edited on: 3/5/2025

Sponsored Program
SBIR
Awarding Agency
NIH : NIGMS
Total Award Amount
$1,073,709
Award Phase
2
Solicitation Topic Code
859
Principal Investigator
Natalie Castellana

Company Information

Abterra Biosciences Inc (AKA: Digital Proteomics LLC; Abterra Biosciences Inc)

3030 Bunker Hill Street Suite 218
San Diego, CA 92109
   (888) 416-9305
   info@abterrabio.com
   www.abterrabio.com
Location: Single
Congr. District: 50
County: San Diego

Phase I

Contract Number: 1R43GM137688-01
Start Date: 4/1/2020    Completed: 9/30/2021
Phase I year
2020
Phase I Amount
$217,313
Antibodies are vital molecules produced by the adaptive immune system and are a critical component in identifying foreign agents for removal from an organism. Produced by B cells, antibodies have an enormously diverse set of possible compositions, created by somatic recombination and hypermutation processes that are specific to these immune cells. This diversity makes them difficult to study for antibody discovery or disease characterization. Recently, next-generation sequencing (NGS) technologies have been successfully applied to study the diverse repertoire of antibodies produced by B cells, revealing this component of the immune response at a new level of detail. Unfortunately, NGS-produced sequences contain errors introduced as part of the process. These errors can be mistaken for true sequence diversity and can confound downstream analysis and interpretation. Furthermore, structuring deeply sequenced antibody repertoire data to answer questions about the immune response is non-trivial and compute-intensive, and few labs are well equipped to address these problems. Our proposal seeks to lower the barriers to entry for repertoire sequencing assays by providing innovative informatics approaches to error correction and analysis, delivering results in an interactive cloud platform. Our service will be the first to support species other than human and mouse, as well as transgenic animals, which is critical for many drug discovery companies.
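
The error-correction problem described above can be illustrated with a minimal sketch. This is not the awardee's method, which the abstract does not describe. It assumes a simple abundance heuristic: a rare sequence that differs by at most one base from a much more abundant sequence of the same length is treated as a sequencing error and merged into it. The function names and the `max_dist` and `min_ratio` thresholds are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Count mismatching positions; assumes equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_reads(reads, max_dist=1, min_ratio=5):
    """Merge likely-erroneous reads into abundant near-identical reads.

    Illustrative heuristic only (thresholds are assumptions): a read is
    merged into an already accepted read when both have the same length,
    differ in at most `max_dist` positions, and the accepted read is at
    least `min_ratio` times more abundant.
    """
    counts = Counter(reads)
    # Visit sequences from most to least abundant so that likely true
    # sequences are accepted before their rare variants are examined.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    accepted = {}  # accepted sequence -> merged read count
    for seq in ordered:
        target = None
        for parent in accepted:
            if (len(parent) == len(seq)
                    and hamming(parent, seq) <= max_dist
                    and counts[parent] >= min_ratio * counts[seq]):
                target = parent
                break
        if target is None:
            accepted[seq] = counts[seq]
        else:
            accepted[target] += counts[seq]
    return accepted


reads = ["ACGT"] * 10 + ["ACGA"] * 1 + ["TTTT"] * 4
corrected = correct_reads(reads)
```

In this toy input the single `ACGA` read is absorbed into `ACGT`, while `TTTT` is kept as a distinct sequence. A variant observed often enough relative to its neighbor (here, more than one fifth of its count) would also be kept, which is how the heuristic tries to separate true hypermutation from noise.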

Public Health Relevance Statement:
Project Narrative: Antibody molecules are created by the immune system to target cancer, pathogens, and other foreign agents. Understanding the population of antibody molecules within a single individual requires sequencing their nucleotide composition. This process can generate millions of unique nucleotide sequences, a small proportion of which are erroneous. Improved informatics approaches are needed to correct, analyze, and visualize such intricate datasets and to inform decisions in the research and development of vaccines and drugs.

Project Terms:
Academia; Adaptive Immune System; Address; algorithm development; Algorithmic Analysis; Algorithms; Alpaca; analysis pipeline; Animal Model; Antibodies; Antibody Repertoire; assay development; Autoimmune Diseases; B-Lymphocytes; base; Base Sequence; Benchmarking; Biological Assay; Cells; Chickens; Client; Cloud Computing; cloud platform; computerized data processing; computing resources; cost; Data; Data Analyses; Data Set; Decision Making; digital; Disease; DNA Sequencing Facility; drug discovery; Excision; experimental study; Foundations; Gene Conversion; Gene Mutation; Genes; Genetic Recombination; Genomics; Goals; Graph; Human; Immune; Immune response; Immune system; Immunology; improved; Individual; Industry; Informatics; innovation; interest; Internet; Length; Light; Literature; Llama; Malignant Neoplasms; Metadata; Methods; Molecular; Mus; Mutation; Mutation Analysis; nanobodies; next generation sequencing; novel; Nucleotides; Online Systems; Organism; Oryctolagus cuniculus; Outcome; pathogen; Pharmaceutical Preparations; Phase; Population; Population Heterogeneity; Preparation; Process; Proteomics; Protocols documentation; Reporting; research and development; Research Personnel; response; Sampling; Secure; Serum; Services; Structure; Technology; tool; transcriptomics; Transgenic Animals; tumor microenvironment; vaccine development; vaccine response; Visualization

Phase II

Contract Number: 2R44GM137688-02
Start Date: 4/1/2020    Completed: 7/31/2026
Phase II year
2024
Phase II Amount
$856,396
Antibody therapeutics are becoming increasingly important across a broad range of indications, yet their development requires discovery from a variety of difficult sources. Traditional technologies are over four decades old, while newer single-cell approaches for mining survivors are gaining traction in the wake of the SARS-CoV-2 pandemic. However, all mainstream discovery approaches significantly limit the sampling of the in vivo antibody immune response, thereby potentially missing important therapeutic candidates. High-throughput sequencing approaches that better deconvolute the antibody response have begun to be applied for research uses. However, using these large-scale data to directly perform antibody discovery has remained elusive. We aim to develop software to streamline the incorporation of high-throughput sequencing into the three mainstream discovery approaches, thereby reducing time and increasing the discovery success rate. These software-enabled enhancements will cover high-throughput sequencing for hybridoma discovery, enhanced enrichment analysis for display methods, and simplified workflow analysis for popular single-cell methods. The same type of repertoire sequencing can then be used in a different context to improve candidate antibodies by leveraging the natural improvements the host individual's immune system has already discovered. This expansion of existing candidates is enabled by deep sequencing of antibody repertoires with next-generation sequencing technology, which provides a window into natural antibody evolution and optimization. These deep repertoires can be exploited by novel algorithms for analyzing the large antibody families produced, and by advances in deep learning that allow large amounts of unlabeled data to be synthesized and used for model training, enabling searches for similarities both across and within antibody families.
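
As a rough illustration of what enrichment analysis for display methods can involve, the sketch below scores each clone by the log2 ratio of its frequency after a selection round to its frequency before it. This is an assumed, generic formulation and not the software proposed in the award. The function name, the clone labels, and the `pseudocount` smoothing parameter are hypothetical.

```python
import math


def log2_enrichment(pre_counts, post_counts, pseudocount=1.0):
    """Score clones by log2(post-selection frequency / pre-selection frequency).

    Illustrative sketch only: a pseudocount is added to every clone so that
    clones absent from one round do not cause division by zero.
    """
    clones = set(pre_counts) | set(post_counts)
    n = len(clones)
    pre_total = sum(pre_counts.values()) + pseudocount * n
    post_total = sum(post_counts.values()) + pseudocount * n
    scores = {}
    for clone in clones:
        pre_freq = (pre_counts.get(clone, 0) + pseudocount) / pre_total
        post_freq = (post_counts.get(clone, 0) + pseudocount) / post_total
        scores[clone] = math.log2(post_freq / pre_freq)
    return scores


# Hypothetical read counts per clone before and after one panning round.
pre = {"cloneA": 9, "cloneB": 9}
post = {"cloneA": 39, "cloneB": 9}
scores = log2_enrichment(pre, post)
```

A positive score means the clone became relatively more frequent after selection (here `cloneA`), and a negative score means it was depleted (`cloneB`). Real pipelines would typically also account for sequencing depth, replicate variability, and error-corrected clone definitions.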

Public Health Relevance Statement:
Therapeutic antibody drugs are discovered using a small number of approaches that do not sample the full antibody population in an individual. Deep sequencing of the antibody repertoire can help improve and find new therapeutic candidates with the aid of novel algorithms and deep learning models.