Antibodies are vital molecules produced by the adaptive immune system and are a critical component in identifying foreign agents for removal from an organism. Produced by B cells, antibodies have an enormously diverse set of possible compositions, created by somatic recombination and hypermutation processes that are specific to these immune cells. This diversity makes antibodies difficult to study, whether for antibody discovery or for disease characterization. Recently, next-generation sequencing (NGS) technologies have been successfully applied to study the diverse repertoire of antibodies produced by B cells, revealing this component of the immune response at a new level of detail. Unfortunately, NGS-generated sequences contain errors introduced by the sequencing process. These errors can be mistaken for true sequence diversity and can confound downstream analysis and interpretation. Furthermore, structuring deeply sequenced antibody repertoire data to answer questions about the immune response is non-trivial and computationally intensive, problems that few labs are well equipped to address. Our proposal seeks to lower the barriers to entry for these repertoire sequencing assays by providing innovative informatics approaches to error correction and analysis, and by delivering results in an interactive cloud platform. Our service will be the first to support species other than human and mouse, as well as transgenic animals, which is critical for many drug discovery companies.
Public Health Relevance Statement: Project Narrative Antibody molecules are created by the immune system to target cancer, pathogens, and other foreign agents. Understanding the population of antibody molecules within a single individual requires sequencing their nucleotide composition. This process can generate millions of unique strings of nucleotides, of which a small proportion are erroneous. Correcting, analyzing, and visualizing such intricate datasets requires improved informatics approaches that support decision making in the research and development of vaccines and drugs.
Project Terms: Academia; Adaptive Immune System; Address; algorithm development; Algorithmic Analysis; Algorithms; Alpaca; analysis pipeline; Animal Model; Antibodies; Antibody Repertoire; assay development; Autoimmune Diseases; B-Lymphocytes; base; Base Sequence; Benchmarking; Biological Assay; Cells; Chickens; Client; Cloud Computing; cloud platform; computerized data processing; computing resources; cost; Data; Data Analyses; Data Set; Decision Making; digital; Disease; DNA Sequencing Facility; drug discovery; Excision; experimental study; Foundations; Gene Conversion; Gene Mutation; Genes; Genetic Recombination; Genomics; Goals; Graph; Human; Immune; Immune response; Immune system; Immunology; improved; Individual; Industry; Informatics; innovation; interest; Internet; Length; Light; Literature; Llama; Malignant Neoplasms; Metadata; Methods; Molecular; Mus; Mutation; Mutation Analysis; nanobodies; next generation sequencing; novel; Nucleotides; Online Systems; Organism; Oryctolagus cuniculus; Outcome; pathogen; Pharmaceutical Preparations; Phase; Population; Population Heterogeneity; Preparation; Process; Proteomics; Protocols documentation; Reporting; research and development; Research Personnel; response; Sampling; Secure; Serum; Services; Structure; Technology; tool; transcriptomics; Transgenic Animals; tumor microenvironment; vaccine development; vaccine response; Visualization