Incorporating features from multimodal representations has been shown to produce more robust imaging capabilities than traditional single-mode approaches. To improve autonomous tasks on Unmanned Aircraft Systems (UAS), we plan to develop a multimodal image processing Machine Learning (ML) framework built on a Deep Neural Network (DNN) paradigm that provides an alternative prediction system with confidence and credibility metrics. The Deep k-Nearest Neighbor (DkNN) technique assesses the outputs of intermediate layers in a DNN to perform classification and evaluate the credibility of its decisions, producing a confidence metric that is both more reliable and more robust against adversarial attacks than that of traditional DNNs. The DkNN approach is supplemented with data fusion techniques to further improve robustness and to mitigate the disadvantages of single-mode approaches.
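For illustration only, the following minimal Python sketch shows the general DkNN decision rule (nearest-neighbor lookups on intermediate-layer activations, with conformal p-values yielding confidence and credibility); the layer feature extractors (layer_fns), class count, and calibration split are placeholders assumed for the example and do not describe the proposed framework's implementation.

import numpy as np
from sklearn.neighbors import NearestNeighbors

class DkNNSketch:
    def __init__(self, layer_fns, n_classes, k=5):
        # layer_fns: callables mapping a batch of inputs to flattened
        # intermediate-layer activations (assumed to exist for this sketch).
        self.layer_fns = layer_fns
        self.n_classes = n_classes
        self.k = k

    def fit(self, x_train, y_train, x_calib, y_calib):
        # Index training activations at every layer for k-NN lookup.
        self.train_labels = np.asarray(y_train)
        self.indices = [NearestNeighbors(n_neighbors=self.k).fit(fn(x_train))
                        for fn in self.layer_fns]
        # Calibration nonconformity scores: neighbors disagreeing with the true label.
        self.calib_scores = np.array(
            [self._nonconformity(x)[y] for x, y in zip(x_calib, y_calib)])

    def _nonconformity(self, x):
        # For each candidate label, count neighbors across all layers whose label differs.
        counts = np.zeros(self.n_classes)
        for fn, nn in zip(self.layer_fns, self.indices):
            _, idx = nn.kneighbors(fn(x[None, ...]))
            labels = self.train_labels[idx[0]]
            for c in range(self.n_classes):
                counts[c] += np.sum(labels != c)
        return counts

    def predict(self, x):
        # Empirical p-value per label: fraction of calibration scores at least as large.
        scores = self._nonconformity(x)
        p = np.array([(self.calib_scores >= s).mean() for s in scores])
        pred = int(np.argmax(p))
        credibility = p[pred]              # support for the predicted label
        confidence = 1.0 - np.sort(p)[-2]  # one minus the runner-up p-value
        return pred, confidence, credibility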
Benefit: Current platforms with low size, weight, power, and cost (SWaP-C) constraints are often limited in their ability to run Machine Learning during deployments or to use more than one sensor on the platform. Many applications assume that downstream processing will be handled in a backend using cloud services and do not offer on-platform processing solutions. The proposed ML capability can process multimodal information such as RGB (EO), IR, and near-IR imagery on a single platform. The developed framework is also extensible, allowing it to ingest and process information from other sensors, such as those on CubeSats or mobile emergency and rescue response platforms.
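As a further illustration, the sketch below shows one simple way per-modality features (EO, IR, near-IR) could be fused at the feature level before classification; the per-modality encoders and the concatenation strategy are assumptions made for the example and are not the specific fusion techniques proposed.

import numpy as np

def fuse_modalities(eo_img, ir_img, nir_img, encoders):
    # encoders: one callable per modality returning a feature vector (assumed).
    # Early fusion: concatenate per-modality features into a single vector
    # that a downstream classifier (e.g., the DkNN sketch above) can consume.
    feats = [np.ravel(enc(img))
             for img, enc in zip((eo_img, ir_img, nir_img), encoders)]
    return np.concatenate(feats)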
Keywords: Adversarial robustness, Deep k-Nearest Neighbor, Multi-input Data, Machine Learning, Multimodal Images