Phase II Amount
$1,600,000
Recent years have seen an unprecedented need for new, inexpensive, and scalable data management solutions as a result of the accessibility of high-resolution sensing devices, both at large DOE experimental facilities, such as the Advanced Light Source (ALS) and the National Synchrotron Light Source II (NSLS-II), and, even more, among lower-tier devices such as computed tomography scanners, high-resolution cameras, and microscopes of different types. This data deluge can lead to advances in science and engineering only if it is matched by data management solutions that handle massive repositories while reducing the time spent moving data, reducing costs, and enabling discoveries and decisions. This project aims to generalize and commercialize the VisStore solution for integrating commodity storage with existing cloud and HPC cyberinfrastructure. In particular, the VisStore layer adds value to existing storage solutions with a cache-oblivious approach that integrates external, distributed memories in an environment that is not only highly scalable but also easy to use, providing direct integration with existing working environments through Python, C, R, and Java. Overall, the proposed approach will reduce data bottlenecks while working within user workflows for data analysis, authentication requirements, and device diagnostic requirements, enabling us to develop software that connects a broad range of devices to our solution for a vast number of smaller engineering and science labs. Phase I accomplishments demonstrate the ability to support the development and deployment of accessible storage directly connected to the data source, allowing a seamless blend of cloud and local data access for devices such as those at large-scale experimental facilities, and to provide preliminary integration with existing workflows through Python.
Phase I tested VisStore integration for high-energy x-ray tomography (CT) and high-energy x-ray diffraction microscopy at Argonne, integrated our framework with the LLNL Tomography Toolkit (LTT) of our technology partners at Lawrence Livermore National Laboratory to handle massive dataset sizes, and used VisStore hardware and software to monitor, acquire, and live-stream neuron data at the University of Utah. Phase II will further address the remaining risks, including generalizing the on-the-fly ingestion of incoming data and providing easy and secure deployment, data management, and data sharing. VisStore will impact the economics of scientific data processing by reducing the time and cost of handling massive imagery, which can, in turn, accelerate scientific discovery and the development of new products and services. For instance, contributing to our understanding of the human brain will have a long-lasting impact on society and improve our quality of life. Several extensions of the technology for use in materials science and climate data exploration will also have a broader societal impact. Engineers and analysts can save hours of work through streamlined workflows that remove the arduous step of manually managing and moving data. Similarly, in medicine, this technology will directly impact the availability of imaging services, especially for the large number of rural hospitals that lack the resources necessary to access traditional, expensive solutions.