The broader impact/commercial potential of this Small Business Innovation Research (SBIR) Phase I project is to accelerate the adoption of AI features in Internet of Things and mobile devices. The proposed multi-bit non-volatile memory (NVM) based deep neural network (DNN) IP is built on standard CMOS logic processes. Existing DNN hardware solutions typically require off-chip access to retrieve neural network parameters from external memories, incurring additional communication latency and power consumption. Additionally, when critical neural network parameters are transmitted off-chip, security or privacy concerns may arise, which is unacceptable, especially for applications such as personalized AI devices. An alternative approach, integrating the DNN engine in a special NVM process, requires as many as 10 additional masks beyond the conventional logic CMOS process, which is not cost-effective for medium-density DNN engines in cost-sensitive edge devices. With the proposed IP, any existing or new system on chip requiring persistent AI functionality can be built quickly and cost-effectively.

This Small Business Innovation Research (SBIR) Phase I project seeks to develop a cost-effective non-volatile neural network accelerator IP for edge devices. To solve the security, latency, power consumption, and cost issues associated with traditional approaches, a single-poly-based, low-cost, non-volatile, multi-bit eFlash cell is proposed. Multi-bit cell operation, however, presents significant challenges due to an inherent reduction in signal-to-noise ratio. Key technical hurdles include solving disturbances of unselected cells, improving sensing margin, overcoming reliability issues associated with high-voltage operation in readout circuits, and developing robust neural network cell arrays. To address these challenges, several new ideas related to the multi-bit cell, high-voltage circuits, and cell programming methods have been proposed.
Once verified successfully in this project, the multi-bit cell IP can be integrated as logic-compatible non-volatile memory to store neural network parameters on-chip, or as logic-compatible non-volatile neural network IP to execute the entire neural network operation.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.