SBIR-STTR Award

A Multi-Task Learning Framework for Automating the Classification of Building Data
Award last edited on: 1/13/2020

Sponsored Program
SBIR
Awarding Agency
DOE
Total Award Amount
$1,349,959
Award Phase
2
Solicitation Topic Code
09c
Principal Investigator
Brian Simmons

Company Information

Onboard Data Inc

2326 Massachusetts Avenue 2nd Floor
Cambridge, MA 02140
   (857) 529-7007
   N/A
   www.onboarddata.io
Location: Single
Congr. District: 07
County: Middlesex

Phase I

Contract Number: DE-SC0019958
Start Date: 7/1/2019    Completed: 3/31/2020
Phase I year
2019
Phase I Amount
$200,000
Buildings account for 30% of final global energy consumption and 28% of global energy-related CO2 emissions. Advanced analytics and controls software has been shown to curb unnecessary energy use and to generate individual building and portfolio energy savings of up to 47% and 33%, respectively. Unfortunately, the status quo for deploying these powerful technologies is slow, expensive and often inaccurate, which leads to poor return on investment and weak market adoption. Today, trained personnel are required to translate, or manually map, existing metadata from building automation systems before advanced software can be deployed. It may take up to 833.3 hours for a building expert to map metadata for a single software application on an average commercial building. This process is insufficient to support the building energy management goals of real estate owners, operators, utilities and software vendors. This Small Business Innovation Research Phase I project will demonstrate the feasibility of a novel automated classification and validation framework that reduces the human effort, time, expense and inaccuracies involved in deploying advanced software. The framework includes four machine learning classification modules and an optical character recognition (OCR) based validation model. This approach will predict probabilistic labels for four label types that are critical for deploying advanced software: equipment type, point type, equipment instance and equipment relationship. The average prediction accuracy for each label is anticipated to reach at least 80%. This represents a significant reduction in the time, expense and human effort required to prepare accurate data for use in advanced analytics and controls software. Validation tests and results in Phase I will inform ongoing development, field testing and integrations with existing analytics tools in Phase II.
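As an illustration of what "probabilistic labels" and a per-label accuracy target can mean in practice, the following is a minimal Python sketch, not Onboard's implementation. The function names, the task keys, and the example class names are hypothetical; the abstract names the four label types but does not describe how scores are produced or evaluated.

```python
import math

# The four label types named in the abstract (key spellings are assumed).
TASKS = ("equipment_type", "point_type",
         "equipment_instance", "equipment_relationship")


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def top_label(probs):
    """Return the most probable label."""
    return max(probs, key=probs.get)


def per_task_accuracy(predicted, truth):
    """Fraction of points labeled correctly, computed separately per task.

    `predicted` and `truth` are equal-length lists of dicts keyed by task.
    """
    return {
        task: sum(p[task] == t[task] for p, t in zip(predicted, truth)) / len(truth)
        for task in TASKS
    }


# Hypothetical raw scores from one classifier for one building point.
probs = softmax({"AHU": 2.0, "VAV": 0.5, "Chiller": -1.0})
best = top_label(probs)
```

Under this reading, the 80% target would be checked by comparing each value returned by `per_task_accuracy` against 0.8 on a held-out set of buildings.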
Real estate is the world's largest asset class, yet the industry has been relatively underserved by technology when compared to consumer, medical, and financial verticals. Today, the median age of commercial buildings in the U.S. is more than thirty years, and energy and contemporary tenant considerations are largely absent from their design and operations. New technologies and software show great potential to transform our built environment, and the rise of the millennial workforce drives the expectation of a high-tech, eco-friendly workplace. The development and commercialization of the proposed automated classification technology spurs the digital transformation of our most familiar spaces (offices, schools and hospitals), thereby increasing opportunities to improve energy and operational efficiencies and, ultimately, improving the health of tenants and our climate.

Phase II

Contract Number: DE-SC0019958
Start Date: 8/24/2020    Completed: 8/23/2022
Phase II year
2020
Phase II Amount
$1,149,959
Buildings account for 30% of global energy consumption and 28% of global energy-related carbon emissions [27]. Advanced software has demonstrated an ability to unlock individual building and portfolio energy savings of up to 47% and 33%, respectively [1]. Despite the compelling environmental and energy-saving value proposition, the majority of commercial buildings do not employ such software because of the significant time, cost and effort of data integration. Today, it is necessary to employ expert personnel to manually gather, organize and ‘map’ disparate operational data before that information can be used in various software applications (e.g. tenant comfort, energy analytics, etc.). This manual effort can take one individual more than 800 hours (100 work days) for a single building [6]. Furthermore, this manual effort is not transferable between different software applications; it merely moves data from one silo to another. In addition to missed energy saving opportunities, the industry’s lack of seamless data management across building systems generates an annual loss of $22 billion in the U.S. alone [2]. Building owners and operators bear two-thirds of this cost, or $14 billion, throughout a building’s ongoing operation and maintenance [2]. This Small Business Innovation Research Phase II project will result in a marketable ‘end-to-end platform’ for acquiring, storing, normalizing, quality-assuring and distributing actionable building data. This work will be the basis for Onboard’s ‘Opera API’ product, the most intuitive and cost-effective interoperable data source for a variety of building data use cases. Onboard’s ‘Opera API’ software will eliminate data compatibility issues within our nation’s commercial buildings and spur the digital transformation of our most familiar spaces: offices, schools and hospitals.
During Phase I, Onboard developed four machine learning modules to predict point type, equipment type, equipment instance, and equipment relationship. Onboard also developed an OCR pipeline that can extract information from mechanical drawings and validate the results from the other classifiers. The Phase I research was successful in reducing the time, expense and human effort required to label building metadata. Remaining technical challenges include a lack of methods to utilize the less-structured engineering drawings used by human experts. Onboard also faces the challenge of maintaining many different software pipelines, each with specialized functions for classifying a single metadata label or task. In addition, it is still difficult for personnel to manually validate and correct the predictions from the framework. In the Phase II project, Onboard will develop novel computer vision and OCR methods to extract data from three new types of engineering drawings. Onboard will develop and test a Multi-Task Learning (MTL) framework that can take advantage of the information contained in multiple related tasks and improve generalization performance across all tasks. Onboard will productize the MTL framework, develop an easy-to-use interface to quickly validate or correct metadata labels, and test the ‘end-to-end platform’ on unseen test buildings.
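The multi-task idea described above, one model serving several related labeling tasks instead of a separate pipeline per label, can be sketched as a shared encoder with one output head per task (often called hard parameter sharing). The Python below is a minimal illustration under stated assumptions, not Onboard's architecture, which the abstract does not describe. The class lists, the trigram hashing features, the layer sizes, and the point name `AHU-1.SAT` are all hypothetical, and the weights are random rather than trained.

```python
import math
import random
import zlib

# Hypothetical label sets; the source names the tasks but not their classes.
TASK_CLASSES = {
    "point_type": ["temperature", "setpoint", "status"],
    "equipment_type": ["ahu", "vav", "chiller"],
}


def trigram_features(name, dim=32):
    """Hash character trigrams of a raw point name into a count vector."""
    vec = [0.0] * dim
    padded = f" {name.lower()} "
    for i in range(len(padded) - 2):
        vec[zlib.crc32(padded[i:i + 3].encode("utf-8")) % dim] += 1.0
    return vec


def softmax(logits):
    """Convert a list of scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense(rows, x):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in rows]


class MultiTaskModel:
    """One shared encoder feeding one small output head per task."""

    def __init__(self, in_dim, hidden_dim, task_classes, seed=0):
        rng = random.Random(seed)

        def matrix(n_rows, n_cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(n_cols)]
                    for _ in range(n_rows)]

        self.task_classes = task_classes
        self.shared = matrix(hidden_dim, in_dim)      # used by every task
        self.heads = {task: matrix(len(classes), hidden_dim)
                      for task, classes in task_classes.items()}

    def predict_proba(self, features):
        # The hidden representation is computed once and reused by all heads.
        hidden = [math.tanh(v) for v in dense(self.shared, features)]
        return {
            task: dict(zip(self.task_classes[task], softmax(dense(rows, hidden))))
            for task, rows in self.heads.items()
        }


model = MultiTaskModel(in_dim=32, hidden_dim=8, task_classes=TASK_CLASSES)
probs = model.predict_proba(trigram_features("AHU-1.SAT"))
```

Training is omitted here. In a typical setup of this kind, the per-task losses are summed and the gradient flows through the shared encoder, which is how information from one task can help generalization on the others.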