The objective of this proposal is to demonstrate a set of methods for automatically extracting metadata from diverse data sets to serve as a common vocabulary by which data can easily be queried, retrieved and combined for visualization in a geobrowser. We propose extracting keyword tags from both structured and unstructured data sets by applying natural language processing (NLP) to metadata and unstructured content. The extracted tags will be associated with each data set as supplementary metadata to assist with data discovery, categorization and spatial-temporal location. We will combine manually generated tags, based on domains of interest or specific decision support activities, with automatically generated tags from NLP, and develop hierarchical clusters of the combined tags to serve as a common set of descriptors by which different data sets can be discovered and combined. If proven successful, our approach will be useful for the management and fusion of very large and diverse data sets, not only for applied science and decision support, but also for emergency management and related security operations, for business intelligence, and for other applications involving large quantities of diverse data, both structured and unstructured.
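The tagging and clustering steps described above can be illustrated with a minimal sketch. It is not the proposed system: it assumes simple term-frequency keyword extraction in place of a full NLP pipeline, and a greedy agglomerative merge based on the Jaccard distance between the sets of data sets that share each tag. The names `extract_tags`, `build_tag_index`, `cluster_tags`, the stopword list and the sample records are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "and", "for", "with", "from", "show", "along", "over"}


def extract_tags(text, top_n=5):
    """Return the top_n most frequent non-stopword terms as automatic tags."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return {word for word, _ in counts.most_common(top_n)}


def build_tag_index(datasets, top_n=5):
    """Map each tag (manual or automatic) to the names of the data sets carrying it."""
    index = {}
    for name, record in datasets.items():
        tags = set(record["manual"]) | extract_tags(record["text"], top_n)
        for tag in tags:
            index.setdefault(tag, set()).add(name)
    return index


def jaccard_distance(a, b):
    """1 minus the overlap ratio of two non-empty sets."""
    return 1.0 - len(a & b) / len(a | b)


def cluster_tags(tag_index):
    """Greedily merge the two closest tag clusters until one remains.

    Each cluster is represented by the pooled set of data sets its tags
    appear in. Returns the merge history as (left, right, distance) tuples,
    which together describe a tag hierarchy.
    """
    clusters = {(tag,): set(names) for tag, names in sorted(tag_index.items())}
    merges = []
    while len(clusters) > 1:
        keys = list(clusters)
        dist, left, right = min(
            (jaccard_distance(clusters[a], clusters[b]), a, b)
            for i, a in enumerate(keys)
            for b in keys[i + 1:]
        )
        clusters[left + right] = clusters.pop(left) | clusters.pop(right)
        merges.append((left, right, dist))
    return merges


if __name__ == "__main__":
    # Hypothetical sample records: free-text content plus manually assigned tags.
    sample = {
        "gauge_readings": {"text": "River flood gauge readings", "manual": {"water"}},
        "flood_maps": {"text": "Flood extent maps for the river basin", "manual": {"water"}},
        "fire_reports": {"text": "Wildfire perimeter reports", "manual": {"fire"}},
    }
    for left, right, dist in cluster_tags(build_tag_index(sample)):
        print(left, "+", right, f"(distance {dist:.2f})")
```

Tags that always occur on the same data sets merge first at distance 0, so the early merges group near-synonymous descriptors and the later merges link broader themes.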
Potential NASA Commercial Applications: A production system with the proposed functionality would be useful to many current NASA-sponsored applied science and decision support programs, because incompatible data sets that are hard or impossible to discover may nevertheless be relevant to research, analysis and decision making. The following are examples of areas of interest and specific NASA projects:
* Water management: BASINS, AWARDS, NWS RFS
* Natural disasters: Global Flood and Landslide Monitoring and Prediction, WRAP
* Ecological forecasting applications: NPS Resource Management, Fire Information for Resource Management Systems
* US natural disaster monitoring and mitigation
* Worldwide relief planning for natural disasters: tsunamis, volcanoes, earthquakes, fires, floods
* US land resources: drought management and relief; wildfire management and mitigation; forest service land use management and natural disaster planning; hurricane and coastal flood planning
Potential NON-NASA Commercial Applications: All of the following are candidates for use of a production version of our system for situational intelligence and emergency/disaster preparedness, response and recovery:
* Department of Defense, NSA, CIA, NRO, NGA, DIA, FBI, DEA, etc., for international and domestic operations
* Homeland Security for counterterrorism, plus FEMA, Coast Guard, border security and cybersecurity
* US Forest Service, Dept. of Agriculture and Federal Aviation Administration for land and air management
* State and city governments in the fifty states for predictive policing and counterterrorism. The Fusion Centers for each state or city, which coordinate with Homeland Security nationwide, are also candidate customers.
* Exploration and production for mining, oil and gas, and water resources
* Police at state and local levels: one specific area of application is human and child trafficking. A key element is to use social media to help find and arrest child traffickers who advertise, recruit and sell using Internet social media.
* Satellite and aerial image providers: enhancing the situational intelligence capability of current operational systems by making new relevant data sources available to analysts
Technology Taxonomy Mapping: (NASA's technology taxonomy has been developed by the SBIR-STTR program to disseminate awareness of proposed and awarded R/R&D in the agency. It is a listing of over 100 technologies, sorted into broad categories, of interest to NASA.)
* Data Fusion
* Knowledge Management