Sound and Vision Integration

Principal Investigator(s): 
Stella Yu

Sound carries information complementary to vision and can aid scene understanding and navigation. We train a model to tell individual sounds apart without using labels. The resulting model can be used to accelerate subsequent supervised training for sound event classification, and to explain how songbirds such as the zebra finch can develop communication without any external supervision. We also demonstrate a low-cost real-world system that learns echolocation and generates depth images from sound alone.