SBIR-STTR Award

Exploitation and Dissemination Systems with Better Training Data: Improving the "E" in DoD Processing
Award last edited on: 7/16/2023

Sponsored Program
SBIR
Awarding Agency
DOD : AF
Total Award Amount
$743,236
Award Phase
2
Solicitation Topic Code
AF192-D001
Principal Investigator
Shannon Hynds

Company Information

Thresher Ventures LLC

841 Elm Street Suite 333
McLean, VA 22101
   (703) 623-5590
   info@thresher.io
   www.thresher.io
Location: Single
Congr. District: 08
County: Fairfax

Phase I

Contract Number: 2019
Start Date: ----    Completed: 8/6/2019
Phase I year
2019
Phase I Amount
$1
Direct to Phase II

Phase II

Contract Number: N/A
Start Date: 8/6/2021    Completed: 8/6/2019
Phase II year
2019
Phase II Amount
$743,235

Data scientists and analysts across the military, intelligence, and law enforcement communities are building machine learning models for classification and prediction, but they often struggle to create relevant, labeled training data, particularly for human-generated, domain-specific text. Several commercial services offer outsourced data labeling, but outsourcing is not an option for national security data, which often cannot be shared and requires specialized knowledge to interpret. Thresher's QuickCode creates labeled training data for machine learning algorithms from unstructured text of any size, in any language. Thresher's patented technology allows experts to generate training data in a fraction of the time required for hand labeling, with comparable accuracy. The 480th USAF ISR Wing generates intelligence reports for a wide group of customers. Analysts generating these reports are tasked with providing historical context; however, the data in their historical archives is not tagged, so they cannot collect, review, and analyze it within the time available. Thresher's proposal identifies a method for using QuickCode to tag the 480th's historical data and to create a predictive model that tags future reports, giving the 480th's analysts a research tool that dramatically reduces the time required to create their reports.