Advances in wearable technology and the availability of low-cost video have tremendous potential to provide new insight into how physical behavior is associated with health, to define clinical trial outcomes, and to assess the functional status and activities of daily living of patients within their home or a rehabilitation setting. (1-4) Still and video cameras can record continuously in a passive and unobtrusive manner, enabling participants to provide a detailed record of daily activity that has applications in health research, memory retention and ethnography. (5-11) However, in health research the use of image processing remains burdensome and cost-prohibitive, often requiring manual annotation by trained staff. To automate the annotation of images and video, scientists have in recent years applied emerging machine learning technology to computer vision. With the help of multi-layered, special-purpose neural networks (Convolutional Neural Networks, Recurrent Neural Networks), researchers have been able to accurately classify still images and video frames based on what is depicted in them, locate objects of interest in an image, recognize humans in an image, and track objects (vehicles, humans) across multiple consecutive frames of a video. (12-18) To date, this technology has been applied in commercial products and sport performance, but it has not been used to quantify levels of physical activity, performance or behavior for health research. The long-term goal of this project is to develop a Commercial Off-The-Shelf (COTS) software program that can accurately classify physical activities (e.g., walking, sitting or standing up) and provide information about behavior (e.g., location and purpose of the activity) and performance (e.g., walking speed and sit-to-stand transition times).