Proteins are responsible for much of the structure and function of all cells. Subtle changes in expression of various protein forms are critical for proper growth and development, but irregularities can cause deleterious cellular effects or large-scale biological dysfunction. Proteins consist of chains of amino acids, which ultimately determine the three-dimensional structure and functionality of the protein. As such, the ability to gather the entire amino acid sequence of low-abundance proteins can greatly accelerate research into protein function and biology. However, in stark contrast to the relative success of DNA sequencing technologies, there is currently no efficient and cost-effective strategy to sequence single protein molecules at single-amino-acid resolution.

Two methods are commercially available for protein sequencing. The first method, "Edman degradation", requires purification of the target protein. Bulk quantities of whole protein or purified fragments are sequenced by repeatedly cleaving off the first (N-terminal) amino acid and chemically identifying it. The second method, based on mass spectrometry, requires enzymatically degrading a single protein or mixture of proteins into small fragments, then analyzing the molecular mass and charge of each fragment. This information is compared to that of known protein sequences to infer the identity of the input proteins. Both of these commercially available methods suffer from low sensitivity, requiring ~1 million molecules of each protein for detection. Edman degradation cannot currently be used in heterogeneous protein mixtures, further limiting its utility.

Critical hurdles in single-molecule protein sequencing are the number and diversity of amino acids, as well as the interactions between amino acids that interfere with reagents that can identify amino acids by their chemical side chains.
Current approaches being developed for single-molecule protein sequencing could avoid some of these issues by employing harsh denaturation agents, but these can compromise the identification systems themselves. In addition, denaturation agents only remove some of the intramolecular interactions of proteins.

Glyphic Biotechnologies has developed a novel strategy to iteratively identify the first (N-terminal) amino acid by isolating it from the remainder of the protein, using a linker molecule called ClickP. After the protein is bound to a solid surface, ClickP enables single-molecule protein sequencing through an iterative process of physically isolating the terminal amino acid so that it can be identified with high specificity and single-molecule sensitivity. The approach has the potential to be scaled to sequence millions to billions of single molecules simultaneously in hours. Developing this technology will revolutionize protein analysis by making large-scale protein sequencing feasible, inexpensive, and routine.
Public Health Relevance Statement: NARRATIVE
There is currently no technology capable of sequencing individual proteins from beginning to end. Glyphic
Biotechnologies plans to develop a novel method of single-molecule protein sequencing, which will bring
improvements analogous to those of Next-Generation Sequencing of DNA. The Glyphic protein sequencing technology
will allow high-throughput, simultaneous sequencing of millions of proteins from samples as small as single cells.