The number of distinct malware samples released into the wild is growing at an alarming rate. Some IT security companies see more than 5,000 new malware instances each day. These companies can no longer keep pace with this deluge using manual, labor-intensive analysis techniques to generate the specifications that detect malware. There is a need for proven, deployable automated malware analysis techniques that can analyze large volumes of malware quickly and accurately. Researchers in behavior-based malware analysis are exploring new techniques to address this problem: automated dependence graph construction; graph mining tools that identify specific behaviors in a dependence graph; semi-automated specification generation; and malware classification using clustering techniques. In this Phase I STTR proposal, NovaShield, Inc. will focus on malware understanding and on aspects of malware classification. More specifically, NovaShield will concentrate on dependence graph construction algorithms that build rich dependence graphs efficiently, and on clustering techniques that organize malware into families based on their behavior profiles. This work will lay the groundwork for techniques that perform behavior mining and automated generation of behavior specifications for detecting malware, which will be pursued in Phase II.
Keywords: Malware Classification, Clustering, Dependence Graphs, Behavior Mining, Malware Analysis, Automated Specification Generation