Projects

ICSI hosts basic, pre-competitive research of fundamental importance to computer science and engineering. Projects are chosen based on the interests of the Institute’s principal investigators and the strengths of its researchers and affiliated UC Berkeley faculty.

Recent projects are listed below; the full list of each group's projects is accessible via the links listed in the sidebar.

Beyond Technical Security: Developing an Empirical Basis for Socio-Economic Perspectives

This project investigates the roles played by economics and social interactions in Internet security. Security research has tended to focus on the technologies that enable and defend against attacks. This project emphasizes the human element of cybercrime, including the profits that motivate the majority of Internet attacks, the elaborate marketplaces that support them, and the relationships among cybercriminals, who rely upon each other for services and expertise. It will also study how social media such as Facebook and Twitter provide new opportunities for attacks and manipulation.

Networking and Security
Censorship Counterstrike via Measurement, Filtering, Evasion, and Protocol Enhancement

This project studies Internet censorship as practiced by some of today's nation-states. The effort emphasizes analyzing the technical measures used by censors and the extent to which their operations inflict collateral damage (unintended blocking or blocking of activity wholly outside the censoring nation). Researchers also study the vulnerabilities that arise from how censorship operates, analyzing flaws either in how the censorship monitor detects the particular network traffic to suppress or in how the monitor then attempts to block or disrupt the target traffic.

Networking and Security
Understanding and Exploiting Parallelism in Deep Packet Inspection on Concurrent Architectures

Researchers are developing a comprehensive approach to introducing parallelism across all stages of the complex deep packet inspection (DPI) pipeline. DPI is a crucial tool for protecting networks from emerging and sophisticated attacks. However, it is becoming increasingly difficult to implement DPI effectively due to the rising need for more complex analysis, combined with the relentless growth in the volume of network traffic that these systems must inspect.

Networking and Security
The Design and Implementation of a Consolidated MiddleBox Architecture

Researchers are designing infrastructures for specialized network appliances, called middleboxes, that consolidate their management, reducing the cost of deploying new middleboxes and simplifying network management. Middleboxes fill a number of needs and include network intrusion detection systems and WAN optimizers. They are typically added to a network as a need arises, and each has its own management interface. In this project, researchers will explore architectures that provide centralized control.

Networking and Security
Towards Modeling Human Speech Confusions in Noise

Researchers are studying how background noise and speaking rate affect the ability of humans to recognize speech. In this project, they evaluate components of a model of human speech perception. Researchers look at the effect of incorporating spectro-temporal filters, which operate in the human auditory cortex and are sensitive to particular modulations in auditory frequency. The results from this project will improve our understanding of how humans perceive sound, and they could be used to improve artificial systems for speech processing, such as hearing aids.

Speech
Limiting Manipulation in Data Centers and the Cloud

Researchers are designing algorithms to allocate resources in datacenters and clouds that can't be manipulated by users. In datacenters and clouds, computing resources or individual machines are allocated to users based on the requirements of the jobs they want to run. Users can manipulate allocations by misreporting their requirements. In this project, researchers design algorithms that are less susceptible to such manipulation. They will also use algorithmic mechanism design and game theory to develop general procedures for converting protocols so that they can't be manipulated.

Research Initiatives, Algorithms
Enhancing Bro for Operational Network Security Monitoring in Scientific Environments

In collaboration with the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, researchers are improving the Bro Intrusion Detection System, an open-source network monitoring framework that helps defend networks against attacks. The system monitors networks at major universities, large research labs, supercomputing centers, and open-science communities around the country. Many of these networks have tens of thousands of systems each, and some have as many as 100,000. In this project, researchers are working to unify and modernize the Bro code base, to improve its performance capabilities to deal with large-scale networks, and to improve its integration into operational deployments.

Networking and Security
Characterizing Enterprise Networks

While the global Internet has been extensively studied, the behavior of enterprise networks at the Internet's edge remains under-studied. One of the crucial reasons for this is a lack of suitable tools that focus on protocols and technologies used within an enterprise, but not used across the global Internet (e.g., protocols that drive distributed file systems). As part of this project, researchers are developing tools to better analyze the traffic specific to these enterprise networks.

Networking and Security
Evaluating Price Mechanisms for Clouds

Researchers are studying the problems that arise in cloud computing centers that use economic models to allocate resources. In these clouds, resources, such as storage, processing, and data transfer, must be allocated to different users. In economics-based clouds, artificial economies are set up; each resource is assigned a "price" and each user is given a "budget," which they spend on the resources they need.

Networking and Security, Research Initiatives, Algorithms
MetaNet: A Multilingual Metaphor Repository

Researchers from ICSI, UC San Diego, University of Southern California, Stanford, and UC Merced are building a system capable of understanding metaphors used in American English, Iranian Persian, Russian as spoken in Russia, and Mexican Spanish. The team includes computer scientists, linguists, psychologists, and cognitive scientists.

AI
California Connects

California Connects is a state-level program administered by the Foundation for California Community Colleges that seeks to advance digital opportunity for underserved communities by promoting and enabling digital competency. Among other services, the program provides laptops to community college students, who in return teach people in their communities how to use computers and the Internet. The program also provides free classes in low-income Central Valley communities. The California Connects team at ICSI provides research support for the initiative, evaluating the program's structure and effectiveness in the context of its target population and making recommendations for its future.

AI
BFOIT

BFOIT (the Berkeley Foundation for Opportunities in Information Technology) supports historically underrepresented ethnic minorities and women in their desire to become leaders in the fields of computer science, engineering, and information technology. The intent is to provide youth with knowledge, resources, practical programming skills, and guidance in their pursuit of higher education and production of technology. For more information, visit the BFOIT Web site.

AI
SWORDFISH

Researchers are developing ways to find spoken phrases in audio from multiple languages. A working group, called SWORDFISH, includes scientists from ICSI, the University of Washington, Northwestern University, Ohio State University, and Columbia University. The acronym expands to a rough description of the effort: Spoken WOrdsearch with Rapid Development and Frugal Invariant Subword Hierarchies.

Speech
Project Ouch - Outing Unfortunate Characteristics of HMMs (Used for Speech Recognition)

The central idea behind this project is that if we want to improve recognition performance through acoustic modeling, then we should first quantify how the current best model, the hidden Markov model (HMM), fails to adequately model speech data and how these failures impact recognition accuracy. We are undertaking a diagnostic analysis that is an essential component of statistical modeling but, for various reasons, has been largely ignored in the field of speech recognition. In particular, we believe that previous attempts to improve upon the HMM have largely failed because this diagnostic information was not readily available. In our initial research, we are using simulation and a novel sampling process to generate pseudo test data that deviate from the HMM in a controlled fashion. These processes allow us to generate pseudo data that, at one extreme, agree with all of the model's assumptions and, at the other extreme, deviate from the model in exactly the way real data do. In between, we precisely control the degree of data/model mismatch. By measuring recognition performance on these pseudo test data, we are able to quantify the effect of this controlled data/model mismatch on recognition accuracy.

Speech
Multimodal Perceptual Grounding for Robots (DARPA BOLT Activity E)

Capabilities for perceptually grounded deep semantic language acquisition would provide a fundamental advance in language technologies. Practical applications include methods to ground in-the-field dialog for translation or command, so that soldiers commanding robots could refer to actual objects or qualities of the environment when specifying instructions, and systems for grounded translation of human-to-human dialog, such that discourse involving physical properties could be accurately understood and conveyed in another language.

Audio and Multimedia
Speaker Diarization

Speaker diarization consists of segmenting and clustering a speech recording into speaker-homogeneous regions, so that, given an audio track of a meeting, the system will discriminate and label the different speakers automatically ("who spoke when?"). This entails speech/non-speech detection ("when is there speech?") and overlap detection and resolution ("who is overlapping with whom?"). ICSI has a long history of research in this area and has contributed repeatedly to the state of the art. Current research aims to improve the robustness and efficiency of existing approaches.

Speech
Video Deduplication (Copyright Detection)

A duplicate video is a video that has the same content as another video but whose file does not have an identical binary encoding (due to editing and/or transcoding). From the social networking perspective, there is growing awareness that finding others who have made mashups or performed simple multimedia modifications on the same data could be a highly useful tool for connecting individuals or identifying piracy. We therefore develop acoustic algorithms that detect video duplicates under various conditions and complement state-of-the-art visual approaches.

Speech
Multimodal Location Estimation

Location estimation is the task of estimating the geo-coordinates of the content recorded in digital media. The Berkeley Multimodal Location Estimation project aims to leverage the GPS-tagged media available on the web as a training set for an automatic location estimator. The idea is that visual and acoustic cues can narrow down the possible recording location for a given image, video, or audio track. We also investigate the human baseline of location estimation, i.e., how well a human does in comparison to a computer.

Audio and Multimedia
GeoTube

Researchers are exposing the ways in which it is possible to aggregate public and seemingly innocuous information from different media and Web sites to attack the privacy of users. The project seeks to help users, particularly younger ones, understand the privacy implications of the information they share publicly on the Internet and to help them understand what control they can exercise over it.

Audio and Multimedia, Networking and Security
User-Centric Networking

In collaboration with Case Western Reserve University, we are investigating foundational architectural constructs that bring users into networked systems in a way that has not been possible to this point. Rather than relegating users to an artifact of the application layer, we seek to accommodate users and their relationships at all layers of the system and to give users new controls over how their traffic is handled by the system.

Networking and Security
