Tag Archives: action recognition

CVPR11–Tutorial on Activity Analysis; Dataset Biases

There’s a review of the frontiers of human activity analysis in this tutorial at the ongoing CVPR conference. Though it’s obvious the presenters have purposefully selected the highlights to their own taste, I do like their smart way of dividing existing research efforts into Single-layered and Hierarchical approaches (as shown in the figure below, borrowed from their slides), in accordance with the inherent hierarchy of human activities (postures –> actions –> interactions –> activities, according to them).


Image credit: Aggarwal and Ryoo, ACM CSUR 2011.

Another impression is that hierarchical approaches to date can only deal with very constrained and perhaps well-defined activity cases. This may be due to the statistical modeling or grammatical reasoning they’re using. The open question is: beyond this explicit modeling of structures and rules, are there ways of dealing with this implicitly? Or, to think of it less radically, can we have reliable ways to learn these structures?

On another note, Prof. Torralba and Prof. Efros are scrutinizing the use of datasets in vision today, and their possible biases, in this interesting paper. Though it sounds like they are saying the right words at the right time, I hope this is not the first time they have realized this: both have been emerging heroes in vision for a while and are at leading AI institutes. Anyway, the cross-dataset generalization and negative sample bias are indeed interesting to note (and in fact have been more or less touched on by many authors already, though maybe not as systematically as here). I would like to acknowledge Prof. Torralba’s contribution of the new object recognition dataset (I’m not to be credited for the name of the dataset, though; meanwhile I doubt part of the motivation of the current paper is to raise the community’s awareness of the dataset).


Image credit: Jianxiong Xiao et al working on the SUN dataset.

SUN Database: Large-scale scene recognition from abbey to zoo.

I also love the way they view the different roles of datasets in computer vision and machine learning:

… Unlike datasets in machine learning, where the dataset is the world, computer vision datasets are supposed to be a representation of the world.


Action Analysis/Subspace Segmentation Updated; Event Video Dataset

I have just added the accepted papers from CVPR 2011 on action recognition and subspace segmentation. There’s a noticeable bloom of papers on various aspects of action recognition, almost double the number accepted at CVPR 2010. While it’s great to see people shifting their attention to this topic, I regret to say many papers are worth only a 30-second glance, period. And still, many authors have not made their papers publicly available (so I cannot add links to them, and I’m never willing to link to anything behind a paywall). My most reluctant response to this is to direct them to my blog article Nobody Cares about You and Your Paper.

New challenges and opportunities always come with a new dataset in vision, especially when it’s gigantic in size. In this regard, the VIRAT Video Dataset could be described as large-scale, for now.

