For this SBIR, Applied Media Analysis, Inc. is teamed with researchers at the University at Buffalo (SUNY Buffalo) to address the challenges of Arabic handwriting Optical Character Recognition (AHOCR). The proposed approach leverages our previous experience developing MATES, a Multilingual Automatic Translation Engine for Signs (and documents), supported in part by the Army Research Laboratory (ARL). It will significantly extend our Mobile Arabic OCR capability to handle handwriting. The system will comprise software modules for handwritten text extraction, preprocessing, segmentation, classification, post-processing, and evaluation. In this proposal we will focus on the underlying algorithms rather than retargetability, and our strategy will explore probabilistic methods that are independent of writing style. These probabilistic methods have not previously been applied to Arabic handwriting recognition. They advance the frontiers of document analysis in general and are well suited to domains where document quality is often poor. We will demonstrate technical feasibility by testing the system on several Arabic handwriting collections previously used in the community.
Keywords: Arabic Handwriting OCR, Document Processing, Neural Network, HMM, Segmentation, Image Enhancement