The broader impact/commercial potential of this Small Business Innovation Research (SBIR) Phase I project will result from enabling businesses to process and extract insights from large unlabeled datasets, using machine learning with minimal human supervision, in application areas such as cyber security, precision medicine, and predictive maintenance. Current deep learning approaches require large amounts of labeled data, and creating labeled data is expensive, error-prone, and time-consuming. The proposed software will provide fully automated capabilities for semi-supervised learning for anomaly detection in cyber security applications. Businesses ranging from large-scale enterprises to boutique data science consulting firms stand to benefit from this project. The expected impact could reach billions of dollars in areas such as cyber security and predictive maintenance, to name just two. More broadly, the proposed technologies will enable both corporations and public institutions to harvest large datasets at minimal cost.

This Small Business Innovation Research (SBIR) Phase I project will design, develop, and deploy high-performance computing (HPC) software for unsupervised learning and anomaly detection. In recent decades, machine learning has achieved tremendous success in supervised learning, which requires compiling large labeled datasets (for example, grouping pictures by the individual depicted in each image). In contrast, unsupervised learning algorithms do not require labels and thus require minimal human participation. However, due to significant technical difficulties, they have not been as successful as supervised learning algorithms. This software package circumvents these difficulties and opens the way to scaling unsupervised learning algorithms to large and complex datasets.
The main research and development challenges addressed in this project concern integrating this new technology with complex real-world datasets: choosing the correct comparison function between the objects of a dataset, and fully automating the selection of the algorithm and its parameters.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.