Science of Intelligence

New paper! Developing a new model to align event camera data

In their new ICCV paper, SCIoI members Pia Bideau and Guillermo Gallego, together with Cheng Gu, Erik Learned-Miller, and Daniel Sheldon, describe a new model that aligns the data recorded by an event camera to create a stable panorama that is easier to interpret.

Compared to traditional image-based cameras, event cameras work at high speed, with a very high dynamic range and low power consumption, and are commonly used in computer vision and robotics applications. Rather than capturing full images, these cameras sense brightness changes (events) at every pixel as they occur, with microsecond resolution. Motion is therefore needed to record visual information in the form of events. This motion, however, makes it difficult to reason about the cause of the triggered events, because events triggered by the same cause are scattered over the camera sensor.
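To make the idea of an "event" more concrete, here is a minimal sketch of a single pixel under a commonly used idealized model: the pixel fires an event whenever its log-brightness has changed by a fixed contrast threshold since the last event. The function name `generate_events` and the threshold value of 0.2 are assumptions chosen for illustration and are not taken from the paper.

```python
import math

def generate_events(timestamps, brightness, contrast_threshold=0.2):
    """Return (timestamp, polarity) events for one pixel (idealized model)."""
    events = []
    reference = math.log(brightness[0])  # log-brightness at the last event
    for t, value in zip(timestamps[1:], brightness[1:]):
        change = math.log(value) - reference
        # Emit one event per threshold crossing; polarity is the sign of the change.
        while abs(change) >= contrast_threshold:
            polarity = 1 if change > 0 else -1
            events.append((t, polarity))
            reference += polarity * contrast_threshold
            change = math.log(value) - reference
    return events

# A pixel that sees constant brightness records nothing ...
static = generate_events([0, 1, 2, 3], [1.0, 1.0, 1.0, 1.0])
# ... while a pixel whose brightness changes records signed events.
moving = generate_events([0, 1, 2, 3], [1.0, 1.5, 1.5, 0.8])
print(static)
print(moving)
```

In this toy model the static pixel produces no events at all, which illustrates why motion (or some other source of brightness change) is needed before an event camera records anything.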

To address the challenge of scattered events, Pia Bideau, Guillermo Gallego, and their co-authors have developed a model, the spatio-temporal Poisson point process, that aligns the data to create a representation that facilitates the interpretation of visual information. The model can serve as a pre-processing step for subsequent computer vision tasks that aim at high-level scene understanding. Read the abstract below for a more in-depth description of the approach.



Event cameras, inspired by biological vision systems, provide a natural and data efficient representation of visual information. Visual information is acquired in the form of events that are triggered by local brightness changes. Each pixel location of the camera’s sensor records events asynchronously and independently with very high temporal resolution. However, because most brightness changes are triggered by relative motion of the camera and the scene, the events recorded at a single sensor location seldom correspond to the same world point. To extract meaningful information from event cameras, it is helpful to register events that were triggered by the same underlying world point. In this work we propose a new model of event data that captures its natural spatio-temporal structure. We start by developing a model for aligned event data. That is, we develop a model for the data as though it has been perfectly registered already. In particular, we model the aligned data as a spatio-temporal Poisson point process. Based on this model, we develop a maximum likelihood approach to registering events that are not yet aligned. That is, we find transformations of the observed events that make them as likely as possible under our model. In particular we extract the camera rotation that leads to the best event alignment. We show new state of the art accuracy for rotational velocity estimation on the DAVIS 240C dataset. In addition, our method is also faster and has lower computational complexity than several competing methods.
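The maximum likelihood idea in the abstract can be illustrated with a deliberately simplified toy problem. The sketch below is not the authors' implementation: it assumes a 1D sensor, a constant translational velocity instead of a camera rotation, a piecewise-constant Poisson rate per pixel bin fitted by its maximum likelihood value, and a plain grid search over candidate velocities. All function names, scene edges, and parameter values are hypothetical.

```python
import math
import random
from collections import Counter

def poisson_profile_loglik(events, velocity, bin_width=1.0):
    """Log-likelihood (up to warp-independent terms) of the warped events
    under a Poisson point process with one rate per pixel bin."""
    counts = Counter()
    for x, t in events:
        warped_x = x - velocity * t  # undo the assumed motion
        counts[math.floor(warped_x / bin_width)] += 1
    # With each bin's rate set to its maximum likelihood value (the bin count),
    # the log-likelihood reduces to sum_k n_k * log(n_k) plus constant terms.
    return sum(n * math.log(n) for n in counts.values())

def estimate_velocity(events, candidates):
    """Pick the candidate velocity whose warp makes the events most likely."""
    return max(candidates, key=lambda v: poisson_profile_loglik(events, v))

def simulate_events(true_velocity, n_events=2000, seed=0):
    """Toy data: three scene edges sliding across a 1D sensor."""
    rng = random.Random(seed)
    edges = [10.5, 25.5, 40.5]  # hypothetical edge positions at t = 0
    events = []
    for _ in range(n_events):
        t = rng.uniform(0.0, 1.0)
        x = rng.choice(edges) + true_velocity * t + rng.gauss(0.0, 0.1)
        events.append((x, t))
    return events

events = simulate_events(true_velocity=30.0)
candidates = [float(v) for v in range(0, 61, 5)]
best = estimate_velocity(events, candidates)
print(best)
```

With the correct velocity, the warped events pile up in a few bins, and the score is highest; with a wrong velocity they smear across many bins and the score drops. The paper applies the same principle to camera rotations on real event data, with a formulation that may differ in its details from this toy version.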

Read full paper PDF here.