SBIR-STTR Award

Real-Time Data Analytics Over The Deep Web
Award last edited on: 4/16/2013

Sponsored Program
SBIR
Awarding Agency
NSF
Total Award Amount
$148,625
Award Phase
1
Solicitation Topic Code
-----

Principal Investigator
Nan Zhang

Company Information

WiseAgg LLC

4027 Fairfax Center Hunt Trail
Fairfax, VA 22030
   (817) 903-9629
   N/A
   www.deepwebwatch.com
Location: Single
Congr. District: 11
County: Fairfax

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2012
Phase I Amount
$148,625
This Small Business Innovation Research (SBIR) Phase I project addresses the problem of real-time data analytics over deep web repositories. A major part of the deep web consists of online data repositories that are hidden behind restrictive web search interfaces and therefore cannot be effectively crawled by existing search engines such as Google. The proposed technology uses a sampling-based framework to quickly generate visible depictions of deep web analytics by issuing a small number of search requests through the existing web interfaces of deep web repositories. The specific technical objectives include the ability to 'drill into' a small subarea of interest and download the desired data with minimal query cost, as well as the ability to extract metadata automatically from a deep web repository. The anticipated technical results are algorithms that discover the data of interest and/or metadata after issuing a predetermined number of requests through the search interfaces of deep web repositories.

The broader impact/commercial potential of this project lies in democratizing financial, political, and market analysis, which would otherwise require heavy human effort and/or computing resources, such as highly paid subject matter experts or servers and storage for web crawling and indexing. Instead of incurring such high costs, customers get an affordable solution for the real-time aggregate analysis of multiple deep web data repositories. More broadly, real-time data analytics over the deep web is needed by knowledge workers in a wide variety of corporations, governments, and intelligence agencies. The prospect of empowering the general public to pose high-level analytical queries over the deep web, using the enhanced scientific and technological understanding achieved through this project, is tantalizing and beneficial to society at large.
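The abstract does not spell out the algorithms, so the following is only a minimal sketch of what sampling-based aggregate estimation under a fixed request budget can look like. It uses a random drill-down over Boolean attributes to estimate the number of records behind a search form that shows at most k matches per query. Everything here is an assumption made for illustration: the names TopKInterface, drill_down_estimate, and estimate_count are hypothetical, the repository is simulated in memory, and this is not presented as the project's actual method.

```python
import random


class TopKInterface:
    """Hypothetical stand-in for a restrictive search form over a hidden table."""

    def __init__(self, rows, k):
        self._rows = list(rows)   # hidden data; callers only see search() results
        self.k = k                # the form shows at most k matches per query
        self.queries_issued = 0

    def search(self, prefix):
        """Return (up to k rows whose leading attributes equal prefix, overflow flag)."""
        self.queries_issued += 1
        matches = [r for r in self._rows if r[:len(prefix)] == prefix]
        return matches[:self.k], len(matches) > self.k


def drill_down_estimate(interface, num_attrs, rng):
    """One random drill-down over Boolean attributes; returns one size estimate."""
    prefix, prob = (), 1.0
    while True:
        rows, overflow = interface.search(prefix)
        if not overflow:
            # This query is reached with probability `prob`, so weight by 1/prob.
            return len(rows) / prob
        if len(prefix) == num_attrs:
            raise ValueError("more than k identical rows; cannot narrow further")
        prefix += (rng.randint(0, 1),)   # fix the next attribute at random
        prob *= 0.5


def estimate_count(interface, num_attrs, budget, seed=0):
    """Average as many drill-downs as are guaranteed to fit in `budget` queries."""
    rng = random.Random(seed)
    walks = budget // (num_attrs + 1)    # each walk issues at most num_attrs + 1 queries
    if walks == 0:
        raise ValueError("budget too small for a single drill-down")
    total = sum(drill_down_estimate(interface, num_attrs, rng) for _ in range(walks))
    return total / walks


if __name__ == "__main__":
    data_rng = random.Random(42)
    num_attrs = 12
    # Simulated repository: 1,000 distinct rows of 12 Boolean attributes.
    codes = data_rng.sample(range(2 ** num_attrs), 1000)
    rows = [tuple((c >> i) & 1 for i in range(num_attrs)) for c in codes]
    form = TopKInterface(rows, k=10)
    print(estimate_count(form, num_attrs, budget=650), form.queries_issued)
```

In this sketch the number of drill-downs is fixed up front from the budget, so the total number of requests never exceeds it. Each drill-down stops at the first query that does not overflow, and dividing the number of returned rows by the probability of reaching that query keeps each individual estimate unbiased when rows are distinct.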

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
----
Phase II Amount
----