The overall technical objective of this Phase I SBIR effort is to develop a system design for the proposed medical record digital archive system (MRDAS) so that a prototype can be built in Phase II. During Phase I research and development, NovoDynamics will evaluate its document exploitation system, ArborScript™, as a potential component of the final system. ArborScript was created for the US intelligence community to process the flood of paper documents captured during US efforts in the Middle East. For reasons detailed in the Work Plan section of this document, ArborScript is expected to address many of the technical challenges anticipated in document optical character recognition (OCR), information storage, and information retrieval. Research and development will be conducted to evaluate document feature extraction methods and machine learning techniques for automatic document classification. Research and development will also be conducted to evaluate lexicon-based phrase-spotting approaches for extracting medical interventions and conditions.
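To make the phrase-spotting objective concrete, the following is a minimal Python sketch of lexicon-based spotting over OCR text. It is an illustration only and does not describe how ArborScript works: the lexicon entries, the category labels, and the function names `phrase_to_regex`, `build_pattern`, and `spot_phrases` are all hypothetical. It also assumes exact, case-insensitive matching, which real OCR output with recognition errors would likely not satisfy without some form of approximate matching.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical lexicon mapping lowercase phrases to a category.
# The entries are illustrative and not drawn from any real vocabulary.
LEXICON: Dict[str, str] = {
    "appendectomy": "intervention",
    "blood transfusion": "intervention",
    "type 2 diabetes": "condition",
    "hypertension": "condition",
}


def phrase_to_regex(phrase: str) -> str:
    """Escape each word and allow any whitespace (including line breaks) between words."""
    return r"\s+".join(re.escape(word) for word in phrase.split())


def build_pattern(lexicon: Dict[str, str]) -> "re.Pattern[str]":
    """Compile one alternation over all phrases, longest first so longer phrases win."""
    phrases = sorted(lexicon, key=len, reverse=True)
    alternation = "|".join(phrase_to_regex(p) for p in phrases)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def spot_phrases(text: str, lexicon: Dict[str, str] = LEXICON) -> List[Tuple[str, str, int]]:
    """Return (normalized phrase, category, character offset) for each lexicon hit."""
    pattern = build_pattern(lexicon)
    hits = []
    for match in pattern.finditer(text):
        # Normalize case and internal whitespace so the hit maps back to a lexicon key.
        key = " ".join(match.group(0).lower().split())
        hits.append((key, lexicon[key], match.start()))
    return hits


if __name__ == "__main__":
    sample = "Patient with Hypertension received a blood\ntransfusion after appendectomy."
    for phrase, category, offset in spot_phrases(sample):
        print(f"{offset:4d}  {category:12s}  {phrase}")
```

Allowing arbitrary whitespace between words lets a phrase match even when OCR splits it across a line break. Evaluating how well such matching holds up under recognition errors is the kind of question the proposed research would need to address.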