Event model learning and recognition in videos

My work falls broadly under Cognitive Vision, Machine Learning, Statistical Relational Learning and Qualitative Spatial Reasoning, and aims to develop algorithms that learn relational event models from videos. I am funded by the European Commission's Co-Friend project. I am also involved in the Mind's Eye project, which aims to learn models for verbs from example videos (Partners: Stanford Research Institute and University of Maryland).

Event model learning from complex videos using Inductive Logic Programming [ECAI - 2010]

In this paper we present a novel supervised learning framework that uses Inductive Logic Programming (ILP) to learn event models from large video datasets (~2.5 million frames). Efficiency is achieved by using the learning-from-interpretations setting together with a typing system. This allows learning to take place in a reasonable time frame and with fewer false positives. The experimental results on video data from an airport apron suggest that the techniques are suitable for real-world scenarios.
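As a rough illustration of why a typing system helps, the sketch below counts the object pairs that could be bound to the two arguments of a candidate rule literal, with and without a type restriction. The object names, the type labels and the `candidate_bindings` helper are all hypothetical and are not taken from the paper; the framework's actual typing system may work differently.

```python
from itertools import permutations

# Hypothetical objects in one video interpretation, each with an assumed type.
OBJECT_TYPES = {
    "plane1": "aircraft",
    "loader1": "vehicle",
    "truck1": "vehicle",
    "person1": "person",
}


def candidate_bindings(arg_types=None):
    """List ordered object pairs that may fill a two-argument literal.

    With arg_types=None every ordered pair of distinct objects is allowed.
    With arg_types=("vehicle", "aircraft") only pairs whose types match
    the given argument types are kept.
    """
    pairs = permutations(sorted(OBJECT_TYPES), 2)
    if arg_types is None:
        return list(pairs)
    return [
        pair
        for pair in pairs
        if tuple(OBJECT_TYPES[name] for name in pair) == arg_types
    ]


untyped = candidate_bindings()
typed = candidate_bindings(("vehicle", "aircraft"))
print(len(untyped), "untyped bindings;", len(typed), "typed bindings")
```

Under these assumptions the type restriction cuts the bindings that have to be tested from every ordered pair of objects down to the vehicle and aircraft pairs only, which is the kind of reduction that keeps both search time and spurious matches down.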

Experimental setup. Only 2 of the 8 views are shown. Download high quality video here. Objects are tracked in each of the 8 videos and the tracks are fused to obtain 3D tracking data on the ground plane. The interactions between the objects' bounding boxes in this data are then converted to relational data with spatio-temporal relations.
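The conversion from bounding boxes to relational data can be sketched roughly as follows. This is a minimal sketch, not the pipeline used in the paper: the `Box` layout, the three relation names and the `to_intervals` helper are assumptions made for illustration, and the actual spatial and temporal relations used are described in the paper.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box on the ground plane (assumed layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def spatial_relation(a: Box, b: Box) -> str:
    """Return a coarse qualitative relation between two boxes."""
    # A positive gap means the boxes are separated along that axis.
    gap_x = max(a.x_min, b.x_min) - min(a.x_max, b.x_max)
    gap_y = max(a.y_min, b.y_min) - min(a.y_max, b.y_max)
    if gap_x > 0 or gap_y > 0:
        return "disconnected"
    if gap_x == 0 or gap_y == 0:
        return "touching"
    return "overlapping"


def to_intervals(track_a, track_b):
    """Collapse per-frame relations into (relation, start, end) facts."""
    per_frame = [spatial_relation(a, b) for a, b in zip(track_a, track_b)]
    facts = []
    frame = 0
    for relation, run in groupby(per_frame):
        length = len(list(run))
        facts.append((relation, frame, frame + length - 1))
        frame += length
    return facts


# Hypothetical tracks: a vehicle approaching a stationary aircraft.
aircraft = [Box(0, 0, 10, 10)] * 4
vehicle = [Box(20, 0, 25, 5), Box(12, 0, 17, 5),
           Box(10, 0, 15, 5), Box(8, 0, 13, 5)]
facts = to_intervals(aircraft, vehicle)
print(facts)
```

Each resulting fact says that a spatial relation held between a pair of objects over an interval of frames. Temporal relations between such intervals can then be computed to give the spatio-temporal relational description.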

Sample positive training examples for the event Rear Loading of an aircraft. In each training example, each row represents the interactions between a pair of objects involved in the event. Examples are obtained by "Deictic Supervision".

Rule learned for the Aircraft Arrival event. Refer to the paper for its interpretation, other rules and evaluation results.

Event recognition using the learned rules in a test video. Download high quality video here.