SBIR-STTR Award

A Dynamic Semantic Data Fabric for Integrating Master Data from Disparate Sources
Award last edited on: 3/24/2023

Sponsored Program
SBIR
Awarding Agency
DOD : Navy
Total Award Amount
$139,765
Award Phase
1
Solicitation Topic Code
N221-077
Principal Investigator
Sheng-Chuan Wu

Company Information

Franz Inc

555 12th Street Suite 1450
Oakland, CA 94607
   (510) 452-2000
   info@franz.com
   www.franz.com
Location: Single
Congr. District: 12
County: Alameda

Phase I

Contract Number: N68335-22-C-0511
Start Date: 7/11/2022    Completed: 1/11/2023
Phase I year
2022
Phase I Amount
$139,765
The Navy needs a software solution to integrate disparate electronic sources of technical information adhering to a unified taxonomy and ontology. This solution must include tool-specific access methods that enforce conformity to the predefined ontology, either automatically or semi-automatically. Franz will apply its AI/ML techniques to develop a software solution to facilitate the data integration discussed above. The system will retrieve data from the various sources, link the data and convert it to one tool-specific language, infer duplicate entities and merge them into single representations, allow for two-way data flow, obfuscate data based on user/organization roles and access restrictions when required, and provide a means to traverse data across sources to understand and analyze the entire dataset.
The technical objectives for this DoD Navy Phase I are to develop a concept for a Semantically Driven Data Integration software platform that: (1) extracts key data (master data) from the various data sources (e.g., model files, applications, and databases), including product structures and associated contents (e.g., CAD and CAM files, documentation, requirements, manufacturing information, service information, part/supplier data); (2) maps the extracted master data to RDF semantic data according to a predefined taxonomy (and ontology), which enables associations between related entities across various data sources; (3) disambiguates data entities from different data sources to eliminate duplicate entities and merge them into single semantic representations (URIs); (4) pushes round-trip changes between data sources through certain synchronization triggers in the (semantic) master database; (5) provides access control based on user and organization roles; and (6) provides analyses on data across the entire dataset from different sources.
Phase I efforts will articulate the functional design, algorithms, and framework required for interfacing with CSM, DOORS, and Windchill. Phase I deliverables will include detailed information regarding the software architecture and identification of a robust set of test cases that will be used to verify functionality. We will demonstrate the concept feasibility through analysis, modelling, and simulation. The Phase I Option will include the initial design specifications and a capabilities description to build a prototype solution in Phase II. The Franz team for the Navy Phase I project has over 40 years of combined experience in technology, specifically data science and programming. While Dr. Wu has shifted to business and partnership development for the company, he maintains up-to-date knowledge of state-of-the-art technology.
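The extract, map, and disambiguate objectives above can be illustrated with a minimal sketch. It is not Franz's implementation: the namespace `http://example.org/master/`, the field names, the sample records, and the exact-key matching rule are all assumptions made for illustration, and the key normalization stands in for the AI/ML-based duplicate inference the abstract describes.

```python
import re
from collections import defaultdict

# Hypothetical namespace for illustration only; not taken from the award.
BASE = "http://example.org/master/"


def normalize_key(part_number: str) -> str:
    """Collapse case, spaces, and punctuation so 'ab-100' and 'AB 100' match."""
    return re.sub(r"[^A-Z0-9]", "", part_number.upper())


def canonical_uri(part_number: str) -> str:
    """Build one URI per real-world part, whichever source reported it."""
    return BASE + "part/" + normalize_key(part_number)


def to_triples(source: str, record: dict) -> list[tuple[str, str, str]]:
    """Map one source record to (subject, predicate, object) triples."""
    subject = canonical_uri(record["part_number"])
    triples = [(subject, BASE + "seenIn", source)]  # keep provenance
    for field, value in record.items():
        if field != "part_number":
            triples.append((subject, BASE + field, str(value)))
    return triples


def merge(sources: dict[str, list[dict]]) -> dict[str, set[tuple[str, str]]]:
    """Merge all sources; duplicates collapse because they share a subject URI."""
    graph = defaultdict(set)
    for source, records in sources.items():
        for record in records:
            for subject, predicate, obj in to_triples(source, record):
                graph[subject].add((predicate, obj))
    return dict(graph)


# Assumed sample records: the same part appears under two spellings.
sources = {
    "DOORS": [{"part_number": "ab-100", "requirement": "REQ-7"}],
    "Windchill": [
        {"part_number": "AB 100", "supplier": "Acme"},
        {"part_number": "CD-200", "supplier": "Bolt Co"},
    ],
}
graph = merge(sources)
```

Here the DOORS and Windchill records for the same part resolve to one URI, so the requirement and the supplier end up attached to a single entity. A production system would need fuzzier matching and a real RDF store rather than in-memory tuples.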

Benefit:
AllegroGraph supports a distributed database sharding architecture, where data is partitioned and stored across as many database (computing) nodes as demanded by the volume of the data while maintaining a single access point for queries by applications. This allows the proper balancing of database size with system resources, resulting in significant improvements in query performance and size scalability. This distributed database architecture is especially suitable for cases where the data is conducive to partitioning (such as partitioning by products or major components). As the data size increases, we simply add more server nodes (more database shards) without changing the applications. AllegroGraph is integrated with Apache Kafka, a data streaming system, to handle any volume and any speed of incoming source data. Any update to the data source will be automatically reflected in the semantic graph database in a Multi-Master Replication (MMR) cluster, where individual databases are automatically synchronized with all the others within the cluster. Because a semantic graph database is schema-less, it can accommodate any new data source or changes to an existing data schema.
This technology is especially applicable to supply chains, where it can identify potentially risky suppliers or the weakest product components to minimize potential product delays and cost overruns. Based on sensor data and written knowledge, it can also predict faults in factory machinery and suggest repair methods. The same predictive approach can also be applied to clinical prediction for hospitals based on Electronic Medical Records (EMR) data.
Franz plans to commercialize this technology to the DoD, other federal agencies, and the private sector via several parallel channels. First and foremost, Franz will offer the new technology and abilities to existing customers. Franz's main marketing is done by presenting at relevant conferences. As such, to push forward with this new technology, we will attend targeted conferences run by the industry or various federal agencies to make critical contacts within the organizations of potential customers.
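The partition-by-product idea described earlier in this section can be sketched as follows. This is a toy illustration, not AllegroGraph's API: the class `ShardedStore`, its methods, and the hash-modulo routing rule are all hypothetical, and the sketch does not rebalance existing data when shards are added.

```python
import hashlib


class ShardedStore:
    """Toy store that partitions triples by product ID across shards."""

    def __init__(self, shard_count: int):
        # Each shard is a plain list standing in for a database node.
        self.shards = [[] for _ in range(shard_count)]

    def _shard_for(self, product_id: str) -> int:
        # Stable hash so the same product always routes to the same shard.
        digest = hashlib.sha256(product_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def add(self, product_id: str, predicate: str, value: str) -> None:
        shard = self.shards[self._shard_for(product_id)]
        shard.append((product_id, predicate, value))

    def query_product(self, product_id: str) -> list:
        """Partition-aware lookup: only one shard needs to be read."""
        shard = self.shards[self._shard_for(product_id)]
        return [t for t in shard if t[0] == product_id]

    def query(self, predicate: str) -> list:
        """Single access point: fan out to every shard and combine results."""
        return sorted(t for shard in self.shards for t in shard if t[1] == predicate)


store = ShardedStore(shard_count=3)
store.add("pump-1", "supplier", "Acme")
store.add("pump-1", "status", "active")
store.add("valve-9", "supplier", "Bolt Co")
suppliers = store.query("supplier")
```

The caller uses the same `query` call regardless of how many shards sit behind it, which is the property the single access point provides. Lookups for one product touch a single shard because all of that product's data shares a partition key.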

Keywords:
Data Integration, AI/ML, data analysis, graph databases, Data Processing, data flow, Data Mapping

Phase II

Contract Number: ----------
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
----
Phase II Amount
----