The Associative Model of Data offers a fundamentally different meta-model for organizing data than the well-established relational data model. The associative model focuses on Items and Links among Items rather than sets of records. We propose to compare and contrast the associative model with two closely related models: the Resource Description Framework (RDF) triple model and the Property Graph model popularized by modern open-source graph databases. By reviewing existing documentation, technical papers, and implementations, we seek to identify a feature set appropriate for scaling out to petabyte-scale data subject to Multi-Level Security constraints. To compare alternative implementations effectively, we propose to establish a benchmark consisting of both generated data and a collection of representative queries. The primary outcome of our Phase I effort will be an architectural design for a scalable, secure database embracing the associative/graph model of data. This database will be a critical enabling component of a larger data exploitation and analysis framework, which will ultimately include natural language processing, information extraction, and large-scale data analysis capabilities.
Benefit: The associative/graph data model facilitates data integration across independent silos by encouraging the adoption of common vocabularies and identifiers for entities and relationships. A scalable and secure implementation of such a model would dramatically improve agencies' ability to collaborate and share timely and accurate data. At present, there are no non-proprietary, scalable associative/graph databases suitable for use in missions where multiple security levels may be present. These research and development efforts will result in a design leading to a large-scale prototype capable of addressing these needs. This technology is also appropriate for other markets in which not all clients have access to the same levels of information yet still need to integrate and navigate diverse data sets. Potential commercial applications include medical record-keeping, social network analysis, law enforcement, and recommendation systems for intelligence analysts, among others.
Keywords: Open Source, Multi-Level Security, Scalable, Linked Data, Associative Database, Distributed Architecture, Graph Database