Using a combination of video cameras and event-based cameras to extract meaningful motion information on complex animal behavior
Research Unit 3, SCIoI Project 36
Complex animal behavior is often analyzed from video recordings, because cameras provide an economical and non-invasive way to acquire abundant data during experiments. Developing computer vision tools to extract relevant information from such a rich yet raw data source is essential to make it interpretable and thus support behavioral analysis. Methods analyzing shape, texture, motion, or body poses of animals have therefore been investigated in various contexts [Dell, 2014]. We propose to develop a tracking algorithm that combines video (i.e., frame-based) cameras and event-based cameras to extract meaningful motion information about individuals (in isolation or as part of groups). The two sensor types are complementary [Scheerlinck, 2018]: event-based cameras excel at capturing high-frequency temporal content, while traditional cameras are better at acquiring slowly-varying content.
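To illustrate the complementarity of the two sensors, the sketch below fuses a low-rate intensity frame with a batch of events using a simple complementary-filter-style update, in the spirit of [Scheerlinck, 2018]. All names, the contrast step `C`, and the gain `alpha` are illustrative assumptions, not the project's implementation.

```python
import numpy as np

def fuse_frame_and_events(log_frame, events, alpha=0.1):
    """Update a log-intensity estimate from one frame and a batch of events.

    log_frame: 2D array, log intensity from the frame-based camera
    events:    iterable of (x, y, polarity) with polarity in {-1, +1}
    alpha:     blending gain pulling the estimate toward the frame
    """
    C = 0.2  # assumed contrast sensitivity: log-intensity step per event
    estimate = log_frame.copy()
    for x, y, pol in events:
        estimate[y, x] += pol * C  # high-frequency update from events
    # low-frequency correction toward the latest frame
    estimate = (1 - alpha) * estimate + alpha * log_frame
    return estimate
```

Events supply the fast changes between frames, while each incoming frame anchors the estimate against the drift that pure event integration accumulates.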
Event-based cameras are novel visual sensors (the first prototypes were commercialized in 2008) that offer plenty of room for research on a new breed of biologically-inspired algorithms. Mimicking the transient pathway of the human visual system, these sensors capture motion in the form of “events”, which represent brightness changes at individual pixels over time. These cameras capture the dynamics of a scene with high temporal resolution, thus they can accurately capture fast motions without suffering from motion blur. Traditional cameras have difficulties in such scenarios, e.g., rapidly moving individuals or groups of individuals. Additionally, these cameras allow us to record only motion information, which we will exploit for long-term tracking and better segmentation of behaviors. Recent event-based tracking methods have been tested only on short sequences (a few seconds long) [Valmadre, 2018].
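As a concrete sketch of this motion-only output (an illustration, not the project's code): each event can be modeled as a tuple `(t, x, y, polarity)` marking a brightness change at pixel `(x, y)` at timestamp `t` (microseconds). Accumulating recent events into a per-pixel histogram yields a motion map in which only moving individuals appear; the function name and window length are assumptions.

```python
import numpy as np

def events_to_motion_map(events, width, height, t_now, window_us=10_000):
    """Count events per pixel within the last `window_us` microseconds."""
    motion_map = np.zeros((height, width), dtype=np.int32)
    for t, x, y, pol in events:
        if t_now - t <= window_us:   # keep only recent events
            motion_map[y, x] += 1    # static background stays at zero
    return motion_map
```

Pixels on a static background receive no events and stay at zero, which is what makes such maps attractive for segmenting moving individuals from clutter.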
We consider an active vision approach, in which the camera viewpoint is continuously adjusted to obtain a more informative representation of individuals for tracking. This active tracking system will enable robust detection of individuals regardless of their 3D location and prevent target disappearance during long-term tracking. In a second stage, motion tracks will help categorize relevant behaviors in the interactions among individuals.
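One minimal way to sketch the active-tracking idea is a proportional pan/tilt controller that re-centers the camera on the tracked individual so the target stays in view. The gains, units, and function name below are illustrative assumptions, not the project's controller.

```python
def pan_tilt_command(target_px, image_size, gain=0.002):
    """Map the target's pixel offset from the image center to pan/tilt rates.

    target_px:  (u, v) pixel position of the tracked individual
    image_size: (width, height) of the image in pixels
    returns:    (pan_rate, tilt_rate), assumed angular rates in rad/s
    """
    cx, cy = image_size[0] / 2, image_size[1] / 2
    ex, ey = target_px[0] - cx, target_px[1] - cy  # pixel error from center
    return gain * ex, gain * ey  # drive the camera to cancel the error
```

A real system would add saturation limits and predictive terms, but even this simple loop conveys how continuous viewpoint adjustment can keep a fast-moving target from leaving the field of view.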
In summary, we aim to develop an active vision system to track multiple individuals using the proposed combination of event-based cameras and traditional frame-based cameras (Fig. 1).