SBIR-STTR Award

CLARIFIER - Data Labeling and Curation at Scale (DLCS) for Machine Learning Algorithms
Award last edited on: 10/8/2024

Sponsored Program
SBIR
Awarding Agency
DHS
Total Award Amount
$174,477
Award Phase
1
Solicitation Topic Code
DHS241-002
Principal Investigator
Rajini Anachi

Company Information

AvaWatz Company

14681 Midway Road Suite 200
Addison, TX 75001
Phone: N/A
Email: sales@avawatz.com
Website: www.avawatz.com
Location: Single
Congr. District: 24
County: Dallas

Phase I

Contract Number: 70RSAT24C00000026
Start Date: 5/7/2024    Completed: 10/6/2024
Phase I year
2024
Phase I Amount
$174,477
The Data Labeling and Curation at Scale (DLCS) project will create a system called CLARIFIER to improve how large volumes of complex data are processed and used for machine learning (ML) applications within the Department of Homeland Security (DHS). The primary purpose of this work is to develop a system that ingests, labels, stores, and curates diverse data types, with a focus on making ML algorithm development more efficient and accurate. The DLCS system will build on recent research by the PIs, which uses ML techniques for auto-labeling supplemented by human verification to ensure high accuracy, and will adapt that research to DHS-specific use cases such as millimeter-wave radar and x-ray imagery. This adaptation involves creating a robust data ingestion module that can process various file formats, including Hierarchical Data Format (HDF) and Digital Imaging and Communications in Security (DICOS). The system will also integrate into the existing DHS ecosystem, providing a streamlined workflow from data ingestion to storage. The anticipated outcome is a scalable, efficient, and accurate system for data labeling and curation that reduces the time and effort required for data processing and accelerates the development of critical ML algorithms for security applications. The DLCS system also has commercial potential beyond DHS: it can be adapted for sectors that handle large-scale data, such as healthcare, aviation security, and defense, making it useful to both government and commercial entities.

Phase II

Contract Number: ----------
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
----
Phase II Amount
----