SBIR-STTR Award

A Renewable Energy Decision Support Data Platform for Biomass Pathways on Hadoop
Award last edited on: 4/4/2017

Sponsored Program
SBIR
Awarding Agency
DOE
Total Award Amount
$149,872
Award Phase
1
Solicitation Topic Code
01c
Principal Investigator
Korin Reid

Company Information

Ellison Laboratories

390 17th Street NW Unit 5053
Atlanta, GA 30363
   (317) 332-5160
   john.reid@ellisonlabs.com
   ellisonlabs.com
Location: Single
Congr. District: 05
County: Fulton

Phase I

Contract Number: DE-SC0015087
Start Date: 2/22/2016    Completed: 9/21/2016
Phase I year
2016
Phase I Amount
$149,872
The Energy Independence and Security Act (EISA 2007) states that by 2020, 21 billion gallons of cellulosic biofuel, biomass-based diesel, or other advanced biofuels should be consumed as part of US fuel demand. Many questions arise as to how this can be accomplished. How will the biomass required to meet these targets be produced? What biomass yield can one expect at a particular location? How much will it cost to transport biomass from its harvest location to the location where it will be processed? Answering these questions requires a variety of key data elements, such as land use data, weather and climate data, and transportation network data, at a fine spatial resolution. Unfortunately, the volume (a large repository of historical weather and climate observations), variety (streaming and historical weather and climate data, transportation data), and velocity (continuously streaming time-series weather data) of the required data make it particularly difficult to store and analyze efficiently using traditional database systems. To address these complexities in the storage and analysis of data crucial to renewable energy project planning, Ellison Laboratories will explore the feasibility of leveraging the Hadoop ecosystem to create a decision support tool that streamlines biofuel feedstock estimation (i.e., how much biomass can be produced at a particular location) by providing key data elements at a fine spatial resolution. The resulting product will include an intuitive user interface with clickable maps.
Users will be able to perform scalable parallel operations on large data sets in a simple point-and-click environment. Such operations may include interpolation, creating data subsets, converting between scientific formats relevant to weather, climate, and land use data (GRIB, NetCDF, SHP, etc.), executing predictive models that estimate energy crop yields at particular locations, and performing Geographic Information Systems (GIS) tasks such as calculating transportation costs between biomass harvest and biofuel processing locations. Users will also have access to streaming data sources, and the tool will provide fast search over stored data. To accomplish this, the resulting software will leverage various open source NoSQL and streaming technologies within the Hadoop ecosystem (Spark/Spark Streaming, Kafka, HBase, Kudu, Solr, and Parquet) and will be licensed under a software-as-a-service (SaaS) model. The Phase I effort will focus on exploring the feasibility and scalability of the backend storage and analysis components and will not address user interface components. Ellison Laboratories' system will need to handle high-throughput, fast inserts (streaming data) while also performing well on batch-oriented workloads (GIS calculations, building predictive crop yield models via machine learning libraries, data format conversions, etc.). Various architectures will be evaluated for batch analysis speed and insert throughput in order to establish benchmarks and determine the optimal storage mechanism. Ellison Laboratories will explore the feasibility of leveraging the Hadoop ecosystem to provide a comprehensive data platform for decision support in biomass-based renewable energy planning.

Commercial Applications and Benefits:
Stakeholders such as renewable energy startups, utility companies hoping to meet renewable energy mix targets, and academic researchers building biorefinery network optimization models will be able to use this tool to design and implement cost-competitive renewable energy projects. In addition, much of the provided functionality, such as scalable distributed computation on scientific datasets (GRIB/NetCDF) and scalable parallel GIS computation, is valuable to an audience much larger than the renewable energy sector.

Phase II

Contract Number: ----------
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
----
Phase II Amount
----