Vision Projects

Robotic Vision

To perform useful tasks in everyday human environments, robots must be able to both understand and communicate the sensations they experience during haptic interactions with objects. Toward this goal, vision researchers at ICSI augmented the Willow Garage PR2 robot with a pair of SynTouch BioTac sensors to capture rich tactile signals during the execution of four exploratory procedures on 60 household objects. In a parallel experiment, human subjects blindly touched the same objects and selected binary haptic adjectives from a predetermined set of 25 labels.

Domain Adaptation

ICSI researchers are investigating the fundamental problem of visual domain adaptation: how to handle the common scenario in which "what you see is not what you get." When test data and training data come from differing distributions (or when unsupervised methods are applied to non-stationary distributions), conventional machine learning approaches often perform very poorly. The researchers are exploring several approaches to this problem, including methods that learn a transformation of conventional feature spaces to overcome the domain shift.

Fine-grained Recognition

Recognizing objects in fine-grained domains can be extremely challenging due to the subtle differences between subcategories. Discriminative markings are not only subtle but often highly localized, and traditional object recognition approaches struggle to detect them under the large pose variation common in these domains. Normalizing pose based on super-category landmarks can significantly improve models of individual categories when training data is limited.

Representation Learning

Researchers at ICSI and UC Berkeley are developing new representation learning models for visual detection, leveraging advances in discriminatively trained convolutional neural networks. In 2013, they established important results with these models, including their ability to generalize to new tasks and domains and, importantly, their applicability to detection and segmentation tasks. They developed a new "Region-CNN" model (R-CNN), which outperformed all competing methods on the most important visual detection benchmark, the PASCAL VOC challenge.