We propose a multi-part system to process, understand, and summarize video streams. Our approach processes video and audio jointly; uses the latest advances in captioning and automatic speech recognition to construct text narratives for each video; and applies NLP techniques to summarize these videos and produce a reliable ranking of their relevance. We address all six technical challenges identified by the solicitation to provide an end-to-end solution. Our team is uniquely qualified for this initiative, given our experience in high-performance algorithms (including successful DARPA contracts), video analytics, natural language processing, and deep learning.
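As a rough illustration of the narrative-then-rank flow described above, the following minimal sketch merges timestamped caption and speech-recognition segments into one text narrative per video and ranks the videos against an analyst's query. It rests on several assumptions: the `Segment` type, the function names, and the sample data are hypothetical, and TF-IDF cosine similarity stands in for the relevance model, which the proposal does not specify.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped piece of text (hypothetical structure, for illustration)."""
    start: float  # seconds from the start of the video
    source: str   # "caption" or "asr" (assumed labels)
    text: str


def build_narrative(segments):
    """Interleave caption and ASR segments by timestamp into one narrative."""
    ordered = sorted(segments, key=lambda s: s.start)
    return " ".join(s.text for s in ordered)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_by_relevance(narratives, query):
    """Rank video ids by TF-IDF cosine similarity to the query.

    TF-IDF is an assumed stand-in for the relevance model.
    """
    docs = {vid: Counter(tokenize(text)) for vid, text in narratives.items()}
    n_docs = len(docs)
    doc_freq = Counter(term for counts in docs.values() for term in counts)

    def weights(counts):
        # Smoothed IDF: terms found in every narrative keep a nonzero weight.
        return {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }

    def cosine(a, b):
        dot = sum(w * b.get(term, 0.0) for term, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query_vec = weights(Counter(tokenize(query)))
    scores = {vid: cosine(weights(c), query_vec) for vid, c in docs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical sample data: two short videos, each with one caption and one ASR line.
narratives = {
    "video_a": build_narrative([
        Segment(2.5, "asr", "the match starts in ten minutes"),
        Segment(0.0, "caption", "a crowd gathers outside a stadium"),
    ]),
    "video_b": build_narrative([
        Segment(0.0, "caption", "a plane taxis on a runway"),
        Segment(1.0, "asr", "tower, requesting clearance for takeoff"),
    ]),
}
ranking = rank_by_relevance(narratives, "runway takeoff clearance")
```

Here `ranking` lists `video_b` first, since only its narrative shares terms with the query. Keeping the narrative as plain text means the ranking step can be swapped for a stronger model without changing the captioning or speech-recognition stages.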
Benefit: Videos posted online provide valuable intelligence, but there are too many for human analysts to watch, let alone monitor in real time. Automated summarization is critical, as is automated ranking of a video's relevance to a given consumer (for example, an intelligence analyst). Beyond intelligence applications, there are numerous private-sector applications. For example, market research or public relations firms could use such an offering to monitor for defamatory videos posted about their clients. Our solution can also be used to analyze sports footage; to aid consumer search of video collections; to provide sophisticated summaries to viewers as they watch movies; or to assist operators in sensitive settings such as air traffic control. Furthermore, with the explosion of user-generated video on social media, there is a need for better indexing techniques, which our solution enables.
Keywords: Neural networks, Deep learning, Video, Summarization, Relevance, Natural language processing