Subtle changes in protein expression are critical for proper growth and development, but irregularities can cause deleterious cellular effects or large-scale biological dysfunction. Sequencing samples with complex mixtures of proteins could greatly accelerate research into protein function and biology, but there is currently no efficient and cost-effective strategy for protein sequencing at single-amino-acid resolution. Two methods are commercially available for protein sequencing. In the first, "Edman degradation", bulk quantities of whole protein or purified fragments are sequenced by cleaving the first (N-terminal) amino acid and chemically identifying it. In the second method, based on mass spectrometry, a single protein or mixture of proteins is fragmented, and the molecular mass and charge of each fragment are analyzed. This information is compared with known protein sequences to infer the identity of the input proteins. Both of these methods require ~1 million molecules of each protein, and Edman degradation cannot currently be used on heterogeneous protein mixtures. Existing approaches for single-molecule protein sequencing are hindered by the number and diversity of amino acids, as well as the interactions between amino acids that interfere with chemical identification of their side chains. Harsh denaturation agents can mitigate some issues, but they can compromise the identification systems themselves. In addition, denaturation agents remove only some of the intramolecular interactions of proteins. Glyphic Biotechnologies is developing a novel strategy to sequence individual protein molecules in their entirety from a heterogeneous sample. This process is based on ligating the N-terminal amino acid to a cleavable chemical linker, which subsequently tethers it locally to the surface. 
Cleavage of the linker removes the N-terminal amino acid from the protein for highly sensitive identification with no interference from protein structure or adjacent amino acids. The process is repeated for each subsequent amino acid, yielding the protein sequence. The approach may simultaneously sequence millions to billions of individual protein molecules in hours, which will revolutionize protein analysis by making large-scale protein sequencing feasible, inexpensive, and routine. The current proposal focuses on developing reagents specifically to detect the N-terminal amino acid of proteins, allowing amino acids to be digitally identified via this N-terminal isolation strategy. In Aim 1 we will generate antibodies to recognize at least 10 different isolated amino acids, enough to identify ~90% of the proteome after 10 sequencing rounds. In Aim 2 we will further optimize the antibodies and demonstrate the feasibility of using them to sequence individual proteins among a background of non-modified proteins. Success in these Aims will enable the Glyphic protein sequencing platform to detect, quantify, and sequence single proteins in complex protein mixtures in an unbiased fashion, without any prior knowledge of their identity or even their existence. When commercialized, it will enable clinical diagnosis of disease based on the proteins present in a patient sample and allow identification of unique proteins to serve as as-yet-unknown biomarkers.
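The idea that a partial read can still identify a protein can be illustrated with a minimal sketch. It assumes a hypothetical set of 10 detectable residues (the abstract does not say which 10 are targeted), marks every undetected residue as "X", and uses made-up toy sequences, not real proteins. It shows only the matching logic and does not reproduce the ~90% figure.

```python
from collections import defaultdict

# Hypothetical choice of 10 detectable residues; the abstract does not list them.
DETECTABLE = set("LSEAGKVTDR")


def read_signature(sequence, rounds=10, detectable=DETECTABLE):
    """Simulate `rounds` N-terminal cycles; undetected residues become 'X'."""
    prefix = sequence[:rounds]
    return "".join(aa if aa in detectable else "X" for aa in prefix)


def unique_fraction(proteins, rounds=10, detectable=DETECTABLE):
    """Return the fraction of proteins whose signature no other protein shares."""
    groups = defaultdict(list)
    for name, seq in proteins.items():
        groups[read_signature(seq, rounds, detectable)].append(name)
    unique = sum(1 for names in groups.values() if len(names) == 1)
    return unique / len(proteins)


# Toy sequences invented for illustration only.
toy = {
    "toy_a": "MKTLLVAGSDERW",
    "toy_b": "MKTLLVAGSDQRW",
    "toy_c": "MSEAGKVTDRLLL",
    "toy_d": "MKTLLVAGSDERY",
}

if __name__ == "__main__":
    for n in (10, 13):
        print(n, "rounds:", unique_fraction(toy, rounds=n))
```

Running the same calculation over a real proteome database, with the actual set of recognized amino acids, would be one way to estimate how many proteins a given number of rounds can distinguish.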
Public Health Relevance Statement: NARRATIVE No current technology is capable of unbiased sequencing of individual proteins in a complex sample. Glyphic Biotechnologies is developing a novel method of single-molecule protein sequencing, analogous to "Next-Generation" Sequencing of DNA. In conjunction with this technology, Glyphic proposes here to develop reagents to specifically detect N-terminal amino acids for applications in clinical diagnostics and basic research.
Project Terms: Acceleration; Primary Protein Structure; protein sequence; Amino Acid Sequence; aminoacid; Amino Acids; Antibodies; Clinical Treatment Moab; mAbs; monoclonal Abs; Monoclonal Antibodies; antigen antibody affinity; Antibody Affinity; Biological Assay; Assay; Bioassay; Biologic Assays; Biology; Biotechnology; Biotech; Blood; Blood Reticuloendothelial System; Malignant Neoplasms; Cancers; Malignant Tumor; malignancy; neoplasm/cancer; Charge; Chemistry; Disease; Disorder; Dyes; Coloring Agents; Enzyme-Linked Immunosorbent Assay; ELISA; enzyme linked immunoassay; Face; faces; facial; Future; Growth and Development function; Growth and Development; Immune Sera; Antisera; immune serum; Immunity; Infection; Libraries; Ligation; Closure by Ligation; Llama; Marketing; Methods; Fluorescence Microscopy; Fluorescence Light Microscopy; Mus; Mice; Mice Mammals; Murine; Noise; Parents; parent; Patients; Peptide Mapping; Peptide Fingerprinting; Peptides; Proteins; Oryctolagus cuniculus; Domestic Rabbit; Rabbits; Rabbits Mammals; Reagent; Research; Sensitivity and Specificity; Signal Transduction; Cell Communication and Signaling; Cell Signaling; Intracellular Communication and Signaling; Signal Transduction Systems; Signaling; biological signal transduction; Specificity; Mass Spectrum Analysis; Mass Photometry/Spectrum Analysis; Mass Spectrometry; Mass Spectroscopy; Mass Spectrum; Mass Spectrum Analyses; Technology; Testing; Tumor Antigens; Tumor-Associated Antigen; cancer antigens; tumor-specific antigen; Yeasts; Measures; Titrations; Peptide Domain; Protein Domains; Tertiary Protein Structure; Label; improved; Area; Surface; Solid; Phase; biologic; Biological; Chemicals; Individual; Dysfunction; Physiopathology; pathophysiology; Functional disorder; polyclonal antibody; clinical diagnosis; Knowledge; Hour; Complex; Side; Reaction; System; success; Fluorescence Resonance Energy Transfer; FRET; Förster Resonance Energy Transfer; Surface Plasmon Resonance; molecular 
mass; immunological diversity; novel; Basic Science; Basic Research; Peptide Sequence Determination; Amino Acid Sequence Determinations; Protein Sequence Determinations; Protein Sequencing; Protein Sequencing Molecular Biology; Proteome; Sampling; cross reactivity; Proteomics; single molecule; Genomics; Molecular Interaction; Binding; protein expression; protein complex; Complex Mixtures; protein structure; protein structures; proteins structure; Affinity; Detection; Diagnostics Research; Protein Analysis; Resolution; resolutions; Process; protein function; Development; developmental; Image; imaging; digital; next generation; new approaches; novel approaches; novel strategy; novel strategies; nano pore; nanopore; cost effective; pathogen; NH2-terminal; N-terminal; innovate; innovative; innovation; antibody engineering; commercialization; tumor; bio-markers; biologic marker; biomarker; Biological Markers; disease diagnosis; NGS Method; NGS system; next gen sequencing; nextgen sequencing; next generation sequencing; sequencing platform; medical diagnostic; clinical diagnostics; Immunize; DNA seq; DNAseq; DNA sequencing; Visualization; m