Radio frequency integrated circuits (RFICs) are key components of wireless communication systems. Because RFICs suffer from hard-to-model parasitic effects and poor simulation accuracy, they require multiple trial-and-error tape-outs (fabrications) to meet product specs. A tape-out is very slow (2 months) and very costly (as high as $2M). In the coming 5G era, this problem gets worse: parasitic effects at mmWave frequencies are even harder to model, so more tape-out rounds are needed. HelloMaxwell is a spin-off of MIT research. It combines physics-based models and customized machine learning (ML) to predict RFIC performance. This prediction method can improve RFIC prediction accuracy by 97%, which means engineers can reduce RFIC development time by at least one tape-out round. To achieve this improvement, the ML training needs sufficient compatible data across design, simulation and test. Today's RF semiconductor companies do have enough RFIC data points; however, their existing data systems are fragmented and inconsistent, and therefore impractical for ML training. To overcome this problem, we propose to design an integrated RFIC data system during this Phase 1 project. The proposed work sets the foundation for the ML algorithm research planned for NIST Phase 2, paving the way for commercializing our novel technology for modern communications systems.