CVPR11–Tutorial on Activity Analysis; Dataset Biases

There’s a review of the frontiers of human activity analysis in this tutorial at the ongoing CVPR conference. Though it’s obvious the presenters have purposefully selected the highlights to their own taste, I do like their smart way of dividing existing research efforts into single-layered and hierarchical approaches (as shown in the figure below, borrowed from their slides), in accordance with the inherent hierarchy of human activities (postures → actions → interactions → activities, according to them).

image

Image credit: Aggarwal and Ryoo, ACM CSUR 2011.

Another impression is that hierarchical approaches to date can only deal with very constrained and perhaps well-defined activity cases. This may be due to the statistical modeling or grammatical reasoning they’re using. The open question is: beyond this explicit modeling of structures and rules, are there ways of dealing with this implicitly? Or, to think of it less radically, can we have reliable ways to learn these structures?

On another note, Prof. Torralba and Prof. Efros are scrutinizing the use of datasets in vision today and their possible biases in this interesting paper. Though it sounds like they are saying the right words at the right time, I hope this is not the first time they have realized this: both have been emerging heroes in vision for a while and are at leading AI institutes. Anyway, cross-dataset generalization and negative sample bias are indeed interesting to note (and in fact have been more or less touched on by many authors already, though maybe not as systematically as here). I would like to acknowledge Prof. Torralba’s contribution of the new object recognition dataset (I’m not to be credited for the name of the dataset though :) meanwhile I wonder whether part of the motivation of the current paper is to raise the community’s awareness of the dataset)

sun_mosaic_logo

Image credit: Jianxiong Xiao et al working on the SUN dataset.

SUN Database: Large-scale scene recognition from abbey to zoo.

and I also love the way they view the different roles datasets play in computer vision and machine learning:

… Unlike datasets in machine learning, where the dataset is the world, computer vision datasets are supposed to be a representation of the world.
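To make the cross-dataset generalization idea mentioned above a bit more concrete, here is a toy sketch of how I picture such a test: train on one dataset, test on every dataset, and compare the "self" score with the "cross" scores. Everything in it is my own assumption for illustration, not the paper's actual protocol: the two synthetic datasets, the `offset` parameter standing in for a dataset-specific bias, the nearest-centroid classifier, and all function names are hypothetical.

```python
import random


def make_dataset(offset, n_per_class, seed):
    """Toy 2-class, 2-D data. `offset` shifts every sample along x to
    mimic a dataset-specific bias (a made-up stand-in, not real images)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            x = center + offset + rng.gauss(0.0, 0.5)
            y = rng.gauss(0.0, 0.5)
            data.append(((x, y), label))
    rng.shuffle(data)
    return data


def fit_centroids(train):
    """Nearest-centroid 'training': average the points of each class."""
    sums = {}
    for (x, y), label in train:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def predict(centroids, point):
    """Return the label whose centroid is closest to `point`."""
    px, py = point
    return min(
        centroids,
        key=lambda c: (centroids[c][0] - px) ** 2 + (centroids[c][1] - py) ** 2,
    )


def accuracy(centroids, test):
    hits = sum(predict(centroids, point) == label for point, label in test)
    return hits / len(test)


def cross_dataset_table(datasets, train_fraction=0.5):
    """Train on each dataset's training split, test on every dataset's
    held-out split. Keys of the result are (train_name, test_name)."""
    splits = {}
    for name, data in datasets.items():
        cut = int(len(data) * train_fraction)
        splits[name] = (data[:cut], data[cut:])
    table = {}
    for train_name, (train, _) in splits.items():
        model = fit_centroids(train)
        for test_name, (_, test) in splits.items():
            table[(train_name, test_name)] = accuracy(model, test)
    return table


# Two hypothetical datasets that differ only in their bias offset.
datasets = {
    "A": make_dataset(offset=0.0, n_per_class=200, seed=1),
    "B": make_dataset(offset=1.5, n_per_class=200, seed=2),
}
table = cross_dataset_table(datasets)
for (train_name, test_name), acc in sorted(table.items()):
    print(f"train on {train_name}, test on {test_name}: accuracy {acc:.2f}")
```

In this toy setup the classifier does well on held-out data from its own dataset and clearly worse on the other one, even though the underlying two classes are the same. That gap between self and cross performance is the kind of signal I understand the cross-dataset test to be looking for.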
