In this Phase I effort, Tiami LLC aims to develop and demonstrate a hardware proof of concept for a collaborative multi-robot system (MRS) that leverages imitative augmented deep reinforcement learning (IADRL) among heterogeneous uncrewed systems (robots) to achieve a common task. Collaboration is based on low-latency machine-to-machine wireless links between robots that use both RF and optical wireless communications (OWC) for multimode resiliency. To accomplish this task, the Phase I personnel include a principal engineer from Tiami LLC with 12 years of experience in RF systems design and 124+ patents in low-latency 4G/5G wireless systems, and a subaward to the University of Massachusetts Boston to leverage Dr. Annavajjala's and Prof. Michael Rahaim's expertise in optical wireless communications research and their lab facilities for rapid software-defined radio (SDR) prototyping. A letter of support is provided from Lockheed Martin Missiles and Fire Control and Lockheed Martin Space, an anticipated Phase II/Phase III subcontractor to transition our technology to military and commercial users. The system objective is to track the desired target of interest while the adversary MRS attempts to disrupt the ally MRS's cyber topology (i.e., communications and intelligence). Strategic motion planning will be implemented with the IADRL methodology, while real-time mission updates are intelligently distributed among nodes using a combination of RF and OWC. For strategic motion planning, our MRS will use a single node as a mission leader to aggregate information, run the IADRL model, and distribute motion plans. The remaining agents continuously transmit their state information (longitude, latitude, altitude/height, velocity, roll, pitch, yaw/heading, angular rates, acceleration) and status to the mission leader node.
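As an illustration of the per-agent report described above, the following is a minimal Python sketch of a state message that an agent might send to the mission leader. The class name `StateReport`, the field names, the units, and the JSON encoding are all assumptions made for illustration; the actual message format and serialization are not specified here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class StateReport:
    """Hypothetical per-agent report; field names and units are assumed."""
    node_id: str
    longitude_deg: float
    latitude_deg: float
    altitude_m: float
    velocity_mps: tuple        # (north, east, down), assumed frame
    roll_deg: float
    pitch_deg: float
    yaw_deg: float
    angular_rates_dps: tuple   # (roll, pitch, yaw) rates
    acceleration_mps2: tuple   # (x, y, z) body-frame acceleration, assumed
    status: str


def encode_report(report: StateReport) -> bytes:
    """Serialize a report to UTF-8 JSON (encoding chosen only for clarity)."""
    return json.dumps(asdict(report), sort_keys=True).encode("utf-8")


def decode_report(payload: bytes) -> StateReport:
    """Rebuild a report on the mission leader side."""
    fields = json.loads(payload.decode("utf-8"))
    # JSON has no tuple type, so vector fields come back as lists.
    for key in ("velocity_mps", "angular_rates_dps", "acceleration_mps2"):
        fields[key] = tuple(fields[key])
    return StateReport(**fields)
```

A compact binary encoding would likely be preferred on a low-latency link; JSON is used here only to keep the sketch readable.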
If GPS is denied, each node acquires absolute positioning, navigation, and timing (PNT) information from an alternative signal of opportunity, such as a commercial LEO satellite system (e.g., Starlink). The system will adapt to network segmentation (or any loss of a mission leader) by dynamic reselection of the mission leader(s). For the communications component of our solution, individual nodes will actively decide to provide information on the RF or OWC network (or some combination thereof) for each round of sensor data aggregation. The M3P network utilizes post-quantum cryptography (e.g., CRYSTALS-Dilithium digital signatures) for security. Traffic routing decisions will be based on channel state information for RF/OWC links and statistical characterization of each network's reliability (i.e., likelihood of connecting to the mission leader in a desired timeframe).
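The per-round link decision and the leader reselection described above can be sketched in Python as follows. This is a simplified illustration, not the proposed algorithm: the SNR threshold, the 0.95 delivery target, the smoothed success-rate estimate, the assumption that RF and OWC failures are independent, and the lowest-identifier election rule are all assumptions introduced here, as are the names `LinkEstimate`, `choose_links`, and `reselect_leader`.

```python
from dataclasses import dataclass


@dataclass
class LinkEstimate:
    """Hypothetical per-link summary kept by each node."""
    name: str        # "RF" or "OWC"
    snr_db: float    # stand-in for channel state information
    successes: int   # past rounds that reached the leader in time
    attempts: int    # past rounds attempted on this link

    def reliability(self) -> float:
        # Laplace-smoothed estimate of on-time delivery probability.
        return (self.successes + 1) / (self.attempts + 2)


def choose_links(links, min_snr_db=5.0, target=0.95):
    """Pick the fewest links whose combined delivery estimate meets target.

    Assumes link failures are independent, which is a simplification.
    """
    usable = [link for link in links if link.snr_db >= min_snr_db]
    if not usable:
        # No link looks healthy: fall back to sending on every link.
        return [link.name for link in links]
    usable.sort(key=lambda link: link.reliability(), reverse=True)
    chosen, p_fail = [], 1.0
    for link in usable:
        chosen.append(link.name)
        p_fail *= 1.0 - link.reliability()
        if 1.0 - p_fail >= target:
            break
    return chosen


def reselect_leader(reachable_ids):
    """Elect the lowest node identifier within a network segment (assumed rule)."""
    return min(reachable_ids) if reachable_ids else None
```

For example, with a healthy RF link a node would send on RF alone, while a degraded RF link would cause it to send on both RF and OWC for that round. Each segment of a partitioned network would run `reselect_leader` over the nodes it can still reach, yielding one leader per segment.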