Developing artificial intelligence technology for medical imaging applications requires training models on large and diverse datasets. Currently, aggregation of large data repositories, including radiology and pathology images, is limited by concerns around patient privacy. To share medical images successfully, an institution must be able to quickly and accurately de-identify large numbers of images in batches. This process is currently manual and time-consuming. We propose a pipeline to remove protected health information (PHI) from both radiology DICOM images and pathology whole slide images by leveraging machine learning, natural language processing, and compartmentalized workflow techniques to significantly reduce the human intervention needed to anonymize medical images. In addition to examining header data in the images, we will use optical character recognition and computer vision algorithms to detect text in any location or orientation in the image, then automatically record and purge those regions. These techniques will be configured to work on a variety of image types (CT, MRI, radiograph, etc.) and to support multiple OEM vendors for both radiology and pathology images. This Phase I statement of work will construct the software tools, methods, and datasets necessary to facilitate a Phase II in which the complex algorithms needed for autonomous de-identification will be developed. This Phase II processing will be referred to throughout this document as "the workflow."
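
As a rough illustration of the two steps described above (examining header data, then recording and purging detected text regions), the following minimal sketch uses only the Python standard library. It is not the proposed pipeline. The header is modeled as a plain dictionary, the image as a list of pixel rows, and the text bounding boxes are assumed to come from an upstream OCR or computer vision detector that is not shown. The names `PHI_HEADER_KEYS`, `RedactionRecord`, `scrub_header`, `purge_regions`, and `deidentify` are hypothetical, and the three header keys are only an illustrative subset of the attributes a real de-identification profile would cover.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Illustrative subset only (assumption): a real pipeline would follow a full
# de-identification profile rather than this short list of header keys.
PHI_HEADER_KEYS = {"PatientName", "PatientID", "PatientBirthDate"}

# A detected text region as (row, col, height, width) in pixel coordinates.
Box = Tuple[int, int, int, int]


@dataclass
class RedactionRecord:
    """Audit record of what was removed from one image."""
    removed_header_keys: List[str]
    purged_regions: List[Box]


def scrub_header(header: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Return a copy of the header without PHI keys, plus the removed key names."""
    removed = sorted(k for k in header if k in PHI_HEADER_KEYS)
    cleaned = {k: v for k, v in header.items() if k not in PHI_HEADER_KEYS}
    return cleaned, removed


def purge_regions(pixels: List[List[int]], boxes: List[Box],
                  fill: int = 0) -> List[List[int]]:
    """Return a copy of the image with every box overwritten by `fill`."""
    out = [row[:] for row in pixels]  # copy so the input image is untouched
    n_rows = len(out)
    n_cols = len(out[0]) if out else 0
    for r, c, h, w in boxes:
        # Clip each box to the image bounds before overwriting.
        for i in range(max(r, 0), min(r + h, n_rows)):
            for j in range(max(c, 0), min(c + w, n_cols)):
                out[i][j] = fill
    return out


def deidentify(header: Dict[str, str], pixels: List[List[int]],
               text_boxes: List[Box]):
    """Scrub the header, purge the text regions, and record both actions."""
    cleaned, removed = scrub_header(header)
    redacted = purge_regions(pixels, text_boxes)
    return cleaned, redacted, RedactionRecord(removed, list(text_boxes))
```

Keeping a `RedactionRecord` per image reflects the "record, then purge" ordering in the text: a reviewer can audit which regions and header fields were removed without needing access to the original PHI-bearing pixels.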