SBIR-STTR Award

Real-time, accurate OCR from Video using Intra- and Inter-Frame Machine Learning
Award last edited on: 9/1/2009

Sponsored Program
SBIR
Awarding Agency
NSF
Total Award Amount
$987,960
Award Phase
2
Solicitation Topic Code
EO
Principal Investigator
Ari Gross

Company Information

CVISION Technologies Inc

118-35 Queens Boulevard 14th Floor
Forest Hills, NY 11375
   (718) 793-5572
   info@cvisiontech.com
   www.cvisiontech.com
Location: Single
Congr. District: 06
County: Queens

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2008
Phase I Amount
$100,000
This Small Business Innovation Research (SBIR) Phase I research project focuses on the development of ground-breaking real-time algorithms for automatically finding and recognizing text in digital video of complex 3-D environments using machine learning of fonts and text strings. Essentially, the project takes optical character recognition (OCR) from being a technology for 2-D documents and brings it to the 3-D world. The project builds on algorithms for OCR of documents where conventional OCR fails: colorful brochures, magazine covers, and other sources where photographs, line art, and arbitrarily rotated text greatly complicate the OCR process. The project aims to build on this technology to solve the problem of finding and recognizing text in complex 3-D real-world scenes, such as street signs and storefronts, where the text may be at any arbitrary 3-D angle to the camera. Critical to the success of this project is the algorithm's capability for machine learning of fonts. A number of exciting applications would benefit from accurate OCR from video sources. While OCR of text in video sources can be done, it usually must be on plainly obvious text, such as subtitles, and it cannot be done in real time. Real-time and accurate video OCR would enable applications that include 1) unaided indexing of digital video footage by the text contained therein, 2) helping the blind navigate independently, both indoors and outdoors, 3) automated continuous roadside or vehicle-based license plate scanning, and 4) providing ground truth for improved GPS accuracy. Markets for the technology therefore include individuals, corporations, and government agencies.
The societal impacts include 1) rendering digitized video libraries searchable by more metadata tags at low cost, 2) greater independence and safety for the blind, 3) improving road safety by automatically identifying cars reported stolen or cars owned by people with suspended licenses, and 4) improved GPS navigation accuracy. Technological impacts will be in the areas of machine learning applied to video OCR, real-time OCR, and low-resolution OCR.

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
2009
Phase II Amount
$887,960
This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5). This Small Business Innovation Research (SBIR) Phase II project involves development of real-time algorithms for Optical Character Recognition (OCR) from documents. This real-time recognition (RT/OCR) system, to be fully developed under this SBIR award, performs recognition an order of magnitude faster than current commercial systems and will allow for real-time recognition that can be embedded on a device and done at the time of capture. The RT/OCR system will also have no loss in recognition accuracy, and will, in fact, be more accurate for complex documents that include color, graphics, and multiple fonts. This technology, when successfully commercialized within Phase II of the SBIR award, could be deployed on every corporate MFP and digital copier device, converting corporate paper to searchable electronic files and bringing us one step closer to the paperless office. The technology we intend to use in developing this real-time OCR system is based on methods using Intra- and Inter-Frame Machine Learning. The algorithms to be developed are not, in any way, language specific and can run on virtually any platform (e.g., server or handheld device). The basic technology is completely different from the recognition kernels of current commercial OCR systems. This project is focused on developing revolutionary technology that will take OCR to a new level. This technology is designed to bridge the gap between paper and digital media, and to serve as a much-needed engine for the Bill Payment Machine (BPM), document capture, and document processing industries. The capture industry will grow to $2.42 billion in 2010, a compound annual growth rate (CAGR) of 16.4%. Real-time OCR for automated and semi-automated field coding addresses the needs of an industry that uses $14.5 billion/year of manual labor in the US alone.
RT/OCR will be part of a solution that addresses manual paper-based indexing for complex documents, potentially saving the industry and the government billions of dollars every year. This recognition technology, after being successfully developed and commercialized within the context of the Phase II research and development, can be generalized and extended to handle real-time video recognition, with application to autonomous vehicle navigation, aids for the visually impaired, and robotic factory automation.