SBIR-STTR Award

Secure Homomorphically Encrypted National Registry of COVID-19 Recovered Plasma Donors
Award last edited on: 5/16/2023

Sponsored Program
STTR
Awarding Agency
NIH : NHGRI
Total Award Amount
$565,671
Award Phase
1
Solicitation Topic Code
172
Principal Investigator
Seemeen Karimi

Company Information

Elimu Informatics Inc

1160 Brickyard CV Road Suite 200f
Richmond, CA 94801
(510) 439-4116
info@elimu.io
www.elimu.io

Research Institution

University of Texas - Houston

Phase I

Contract Number: 1R41HG010978-01
Start Date: 9/9/2019    Completed: 8/31/2020
Phase I year
2019
Phase I Amount
$344,948
In the age of precision medicine, genomic data are being integrated with other health care data to support personalized and calibrated clinical decision-making. Genomic sequence data are too large to be stored in electronic health record (EHR) systems and must be stored separately. While cloud computing offers a cost-efficient and scalable platform, outsourcing genomic data raises difficult privacy and security concerns. The common perception is that ease of access to remote data and protection of privacy are at odds with each other. We propose a new genomics archiving and communications system (GACS) that meets both requirements by using state-of-the-art homomorphic encryption algorithms and a matrix representation of data and queries. In this system, variants are represented as vectors, which are homomorphically encrypted by a client and stored on the GACS server. When analysis is required, a query is generated in the form of a matrix. This matrix is encrypted (or can remain in plaintext, depending on the task) and sent to the GACS server. The server computes on the encrypted data, produces an encrypted result, and returns it to the client, who holds the secret key needed to decrypt it. The GACS cannot decrypt the data or the encrypted queries, so privacy and security are maintained on the GACS. Preliminary results show that, after decryption, the results are the same as those obtained by computing on plaintext. In this project, we will implement the GACS software modules and demonstrate the use of the system with examples from three use cases: pharmacogenomics, clinical trial eligibility, and analysis for disease risk. We will measure speed and memory consumption in all three use cases. A GACS offered as a cloud-hosted service can reduce the computational burden on healthcare facilities. It can provide small healthcare facilities with the same genomic analysis capability available to larger hospitals. In addition, clinical decision support (CDS) can be deployed on the GACS. As clinical guidelines evolve in response to new discoveries linking genetic variants to disease and medicines, healthcare facilities can stay in compliance with the guidelines.
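
The encrypt, compute, decrypt flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the homomorphic encryption scheme, so this sketch substitutes a toy Paillier cryptosystem (additively homomorphic) as an assumption, and it covers only the case where the query matrix stays in plaintext. The function names, the example variant vector, and the query matrix are all hypothetical, and the key size is far too small for real use.

```python
import math
import random

# Toy Paillier cryptosystem, used here only as a stand-in (assumption):
# the award abstract does not say which scheme the GACS uses.

def keygen(p=2**31 - 1, q=2**61 - 1):
    """Return (public modulus n, secret key). Demo-sized primes, not secure."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Client side: encrypt a non-negative integer m < n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 equals 1 + m*n.
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2

def decrypt(n, secret, c):
    """Client side: recover the plaintext with the secret key."""
    lam, mu = secret
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n

def server_apply(n, query, enc_vector):
    """Server side: plaintext query matrix times encrypted vector.

    Multiplying ciphertexts adds plaintexts, and raising a ciphertext to a
    non-negative integer k multiplies its plaintext by k, so each output is
    an encryption of one row's dot product. No secret key is used here.
    """
    n2 = n * n
    out = []
    for row in query:
        acc = 1  # a trivial encryption of 0
        for k, c in zip(row, enc_vector):
            acc = acc * pow(c, k, n2) % n2
        out.append(acc)
    return out

# Hypothetical data: allele counts at four variant sites for one patient.
variants = [0, 1, 2, 1]
# Hypothetical query: each row selects the sites one rule cares about.
query = [[1, 0, 0, 0],
         [0, 1, 1, 0],
         [1, 1, 1, 1]]

n, secret = keygen()
stored = [encrypt(n, v) for v in variants]        # client encrypts, server stores
enc_result = server_apply(n, query, stored)       # server computes on ciphertexts
result = [decrypt(n, secret, c) for c in enc_result]  # client decrypts

plain = [sum(k * v for k, v in zip(row, variants)) for row in query]
print(result, plain)
```

The decrypted result matches the plaintext matrix-vector product, which mirrors the preliminary finding reported in the abstract. Encrypted queries, as the abstract also mentions, would need a scheme that supports multiplying two ciphertexts, which this additive-only sketch does not provide.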

Public Health Relevance Statement:
Project Narrative: The use of genomic data in clinical decision-making is rapidly increasing. Because genomic sequence data are large, they cannot be stored easily in electronic health record systems. Furthermore, because genomic data are highly sensitive, they must be protected both in storage and during analysis. We propose a new genomics archiving and communications system (GACS) that gives clinical systems easy access to the data while providing strong privacy protection. This system is based on state-of-the-art encryption algorithms. Genome data are encrypted and stored in the GACS, and they are analyzed while remaining encrypted. The GACS learns neither the data nor the analysis questions, so privacy is maintained on the GACS server. We will test the new system on three use cases: pharmacogenomics, clinical trial eligibility, and gene analysis for disease risk.

Project Terms:
Age; Algorithms; Alleles; Archives; Awareness; base; Caring; Client; Clinical; clinical application; clinical care; Clinical Data; clinical decision support; clinical decision-making; clinical practice; Clinical Trials; cloud based; Cloud Computing; Communication; Computer software; Consumption; cost efficient; Data; Data Analyses; data format; Data Protection; Data Reporting; Data Security; Disclosure; Disease; disorder risk; Electronic Health Record; Eligibility Determination; empowered; encryption; Evolution; Genes; Genetic; genetic variant; Genome; genomic data; Genomics; Genotype; gigabyte; Goals; Guidelines; Health care facility; health care service organization; Health Sciences; Healthcare; Hospitals; Human; Individual; Investments; Learning; Link; Longevity; Measurement; Measures; Medicine; Memory; Nature; novel; Outsourcing; Patient Data Privacy; patient privacy; Patients; Perception; Performance; Persons; Pharmaceutical Preparations; Pharmacogenomics; Phase; precision medicine; Predisposition; preservation; Privacy; privacy protection; Protocols documentation; Recommendation; Research; response; Rest; Risk; rural area; Secure; Security; Services; Small Business Technology Transfer Research; software systems; Speed; System; Techniques; Technology; Testing; Texas; Time; Universities; Variant; vector

Phase II

Contract Number: ----------
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
----
Phase II Amount
$220,723