Audio and Multimedia

Audio and Multimedia group

ICSI in MIT Review

"How People Broadcast Their Locations Without Meaning To"
April 22, 2011  |  Erica Naone, MIT Technology Review

People were up in arms this week about the privacy implications of news that the iPhone gathers location information and stores it in a file on the user’s computer. But experts say that smart-phone owners are unknowingly taking a much bigger risk with information about where they go all day.

ICSI in Northwestern University Medill Reports

"Online Privacy Could Keep You and Your Home Safe from Robberies"
January 11, 2012  |  Kristen Kella, Northwestern University Medill Reports

Social media and mapping web sites are tools of the trade for criminals looking for the perfect house to rob, ex-burglars told researchers in the United Kingdom. Nearly 80 percent of the burglars used social media to case homes, they told the Crimestoppers Trust, and the wealth of online information helps make an efficient operation of a burglary.

ICSI on CIO Blogs

"Researchers Find Way to Pinpoint Where Online Video Was Shot"
February 19, 2013  |  Bill Snyder, Consumer Tech Radar, CIO Blogs

Suppose a terrorist holding hostages at a secret location makes a video demanding ransom. Now imagine that law enforcement officials can take that video, process it and run it through a database that pinpoints the precise location where it was shot based on images and sounds in the video.

ICSI in InfoWorld

"Nowhere to Hide: Video Location Tech Has Arrived"
February 21, 2013  |  Bill Snyder, Tech's Bottom Line, InfoWorld

Researchers at ICSI are currently building a video database by analyzing videos downloaded from Flickr, says Gerald Friedland, who leads ICSI’s multimedia efforts. Data from videos taken at known locations is used to develop profiles of the respective locations.

Multimodal Perceptual Grounding for Robots (DARPA BOLT Activity E)

Capabilities for perceptually grounded deep semantic language acquisition would provide a fundamental advance in language technologies. Practical applications include methods to ground in-the-field dialog for translation or command, so that soldiers commanding robots could refer to actual objects or qualities of the environment when specifying instructions. They also include systems for grounded translation of human-to-human dialog, so that discourse involving physical properties could be accurately understood and conveyed in another language.

Multimodal Location Estimation

Location estimation is the task of estimating the geo-coordinates of the content recorded in digital media. The Berkeley Multimodal Location Estimation project aims to leverage the GPS-tagged media available on the web as a training set for an automatic location estimator. The idea is that visual and acoustic cues can narrow down the possible recording location for a given image, video, or audio track. We also investigate the human baseline of location estimation, i.e., how well a human does in comparison to a computer.
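To make the idea concrete, here is a minimal sketch of one way such an estimator could work. It is not the project's actual method: it assumes each media item has already been reduced to a numeric feature vector (a stand-in for visual and acoustic cues), uses a plain nearest-neighbor lookup against GPS-tagged training items, and scores a guess by great-circle distance. The function names, feature values, and training entries are all hypothetical.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km: mean Earth radius

def estimate_location(query_features, training_set):
    """Return the (lat, lon) of the training item whose features are closest to the query."""
    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))
    best = min(training_set, key=lambda item: sq_dist(item[0], query_features))
    return best[1]

# Made-up training data: (feature vector, (lat, lon)) pairs standing in for
# GPS-tagged media. Real features would come from image and audio analysis.
TRAINING = [
    ((0.9, 0.1, 0.2), (37.8716, -122.2727)),  # Berkeley
    ((0.1, 0.8, 0.3), (48.8566, 2.3522)),     # Paris
    ((0.2, 0.2, 0.9), (35.6762, 139.6503)),   # Tokyo
]

# Estimate the location of an untagged item and measure the error
# against its (here, assumed known) true recording location.
guess = estimate_location((0.85, 0.15, 0.25), TRAINING)
error_km = haversine_km(guess, (37.8716, -122.2727))
```

Reporting the error as a distance in kilometres also gives a natural way to compare an automatic estimator with a human guessing the same item's location.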

GeoTube

Researchers are exposing the ways in which it is possible to aggregate public and seemingly innocuous information from different media and Web sites to attack the privacy of users. The project seeks to help users, particularly younger ones, understand the privacy implications of the information they share publicly on the Internet and to help them understand what control they can exercise over it.

Automated Low-Level Analysis and Description of Diverse Intelligence Video (ALADDIN)

Massive numbers of video clips are generated daily on many types of consumer electronics and uploaded to the Internet. In contrast to videos that are produced for broadcast or from planned surveillance, the "unconstrained" video clips produced by anyone who has a digital camera present a significant challenge for manual as well as automated analysis. Such clips can include any possible scenes and events, and generally have limited quality control.
