ICSI hosts basic, pre-competitive research of fundamental importance to computer science and engineering. Projects are chosen based on the interests of the Institute’s principal investigators and the strengths of its researchers and affiliated UC Berkeley faculty.

Recent projects are listed below; the full list of each group's projects is accessible via the links listed in the sidebar.

Undermining Political “Scratch” Effects with Technology (UPSET)

This project leverages technology to significantly reduce the cost of political campaigns via the sharing economy, and thus undercut the impact of money on politics. While the technology is important, the resulting structural change is more so: designing an efficient “supply chain” of services between a skilled citizen force and the campaigns. A separate spinoff organization is being started to implement these ideas in real campaigns.

Research Initiatives
Previous Work: Word Bug

ICSI Speech researchers are working with Versame to develop methods for the analysis of speech being directed at infants and toddlers, in order to provide better measures of the lexical stimulation they are getting. The initial project is focused on the counting of speech units from unrestricted audio, where the likely speech units are syllables or words.

Previous Work: Deep and Wide Learning for Automatic Speech Recognition

In this project, speech researchers are looking at trade-offs between two approaches to automatic speech recognition (ASR): signal processing of multiple acoustic features vs. using simpler features and relying on machine learning algorithms to replace feature engineering. The goal is not only to improve accuracy for difficult examples, but also to learn about the computational consequences for high performance computing.

Science of Security

In this collaborative project, researchers at ICSI are utilizing Carnegie Mellon University's Security Behavior Observatory (SBO) infrastructure to conduct quantitative experiments about how end-users make security decisions. The results of these experiments are used to design new security mitigations and interventions, which are then iteratively evaluated in the laboratory and the field. This collaboration is designed to provide keen insights into how users make security decisions in situ.

Networking and Security, Usable Security and Privacy
Previous Work: Extracting Event Attributes from Unstructured Textual Data for Persistent Situational Awareness

In this collaborative project with Decisive Analytics Corporation (DAC), FrameNet researchers are developing semantic frames for representing the attributes of complex events, which permit more fine-grained analysis than other event recognition frameworks. The researchers are developing event recognition methods focused on organizations and how they plan and carry out actions. These methods are broadly applicable to actions planned and carried out by all types of organizations, such as corporations, government agencies, military units, and insurgent groups. 

A Software-Defined Internet Exchange

In this collaborative project with researchers from Georgia Tech and Princeton, ICSI researchers are finding incrementally deployable ways to leverage the power of Software-Defined Networking (SDN) to improve interdomain routing. SDN has had a profound influence on how people think about managing networks. To date, however, it has had little impact on how separately administered networks are interconnected through BGP. Since many of the current failings of the Internet are due to BGP's poor performance and limited functionality, it is imperative that these methods be developed.

Networking and Security
Previous Work: Machine Learning Methods and Large Informatics Graphs

In this project, researchers are tackling several problems with machine learning methods and large informatics graphs. First, they are looking at local algorithms and locally-biased algorithms, specifically extending local algorithms to other objective functions and the characterization of statistical properties of local algorithms. Second, they are scaling the algorithms up to larger networks, focusing on scaling up strongly-local and locally-biased methods and implementations on graphs that do not fit into RAM.
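
To make "strongly local" concrete, here is a minimal sketch of one well-known local method: the "push" procedure for approximate personalized PageRank. The project description does not say which algorithms are used, so the choice of method, the function name `approximate_ppr`, and the parameters `alpha` and `eps` are all assumptions made for illustration. The point is that the work depends on `alpha` and `eps`, not on the size of the graph, so most of a large graph is never touched.

```python
from collections import defaultdict, deque


def approximate_ppr(adj, seed, alpha=0.15, eps=1e-4):
    """Approximate personalized PageRank around `seed` using local pushes.

    adj   : dict mapping each node to a list of its neighbours
            (assumed undirected, no self-loops, no isolated nodes)
    alpha : teleport probability (illustrative default)
    eps   : a node is processed only while residual >= eps * degree

    Returns (p, r): the approximate PageRank vector and the leftover
    residual, both as plain dicts holding only the nodes that were touched.
    """
    p = defaultdict(float)  # settled PageRank mass
    r = defaultdict(float)  # residual mass still waiting to be pushed
    r[seed] = 1.0
    queue = deque([seed])

    while queue:
        u = queue.popleft()
        deg_u = len(adj[u])
        if r[u] < eps * deg_u:
            continue  # residual too small to be worth pushing

        mass = r[u]
        p[u] += alpha * mass  # keep a fraction alpha at u
        r[u] = 0.0
        share = (1.0 - alpha) * mass / deg_u  # spread the rest evenly

        for v in adj[u]:
            threshold = eps * len(adj[v])
            before = r[v]
            r[v] += share
            # Enqueue v only at the moment it crosses its threshold,
            # so each node sits in the queue at most once at a time.
            if before < threshold <= r[v]:
                queue.append(v)

    return dict(p), dict(r)
```

Called as `approximate_ppr(adj, seed)` on an adjacency dictionary, the function only creates entries for nodes near the seed. A graph too large for RAM could in principle be served by fetching `adj[u]` on demand, though that storage layer is not shown here.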

Big Data
Previous Work: Characterizing and Exploiting Tree-Like Structure in Large Social and Information Networks

In this project, researchers are developing methods to characterize and exploit "tree-like" structure in realistic social and information networks. In particular, they are focused on two related but complementary notions of tree-like-ness, as well as related heuristic variants, for graphs. These notions will be used to develop tools to characterize the manner in which realistic complex networks are coarsely tree-like, and this characterization will be used to develop tools for improved analytics on realistic networks.
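
The description does not name the two notions of tree-like-ness. As an illustration only, the sketch below computes one standard measure from the literature, Gromov's four-point hyperbolicity, for a small unweighted graph. The function names are hypothetical, and the brute-force loop over all quadruples is only practical for tiny graphs. Trees give a value of 0, and larger values indicate a less tree-like metric.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path (hop) distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def four_point_hyperbolicity(adj):
    """Brute-force four-point hyperbolicity of a connected graph.

    For every set of four nodes, form the three pairwise distance sums;
    delta is half the largest gap between the two biggest sums.
    Runs in O(n^4) time, so it is a teaching sketch, not a scalable tool.
    """
    d = {u: bfs_distances(adj, u) for u in adj}
    delta = 0.0
    for a, b, c, e in combinations(adj, 4):
        sums = sorted((
            d[a][b] + d[c][e],
            d[a][c] + d[b][e],
            d[a][e] + d[b][c],
        ))
        delta = max(delta, (sums[2] - sums[1]) / 2.0)
    return delta
```

For example, a star or a path yields 0.0, while a four-node cycle yields 1.0. Measuring this on realistic networks requires sampling or smarter algorithms, which this sketch does not attempt.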

Big Data
Previous Work: How Does Deep Learning Improve Speech Recognition Accuracy?

The short-term goal of this project is to understand in a deep, quantitative way why the methodology used in nearly all speech recognizers is so brittle. The long-term goal is to leverage this understanding by developing less brittle methodology that will enable more accurate speech recognition with a wider scope of applicability.

Randomized Numerical Linear Algebra (RandNLA) for Multi-Linear and Non-Linear Data

This project investigates two important, non-linear, structural settings in order to start making progress toward using RandNLA (Randomized Numerical Linear Algebra) approaches to big data analysis in situations where the underlying data exhibit non-linear structure. First, researchers investigate how to design the next generation of RandNLA algorithms that can handle data that exhibit multi-linear structures captured by tensors.

Big Data
Previous Work: Teaching Resources for Online Privacy Education (TROPE)

Researchers are developing classroom-ready teaching modules to educate young people about why and how to protect their privacy online, as well as a Teachers' Guide with background information, suggested lesson plans, and guidance on how to employ the modules in the classroom.

Audio and Multimedia, Networking and Security, Usable Security and Privacy
Previous Work: Preserving Unwritten Languages

In this project, researchers at ICSI are collaborating with Notre Dame to preserve unwritten languages in danger of disappearing. They are recording speech in a variety of genres and styles using mobile technologies. To enable productive linguistic and language-technology research in the future, they are adding respeaking, in which native speakers listen and repeat each phrase slowly and carefully, as well as oral translation, in which bilingual speakers of the language translate the recordings phrase by phrase into a widely used language such as English.

Knowledge-Aided Interface for Big Data Streams

In this collaborative project with Mod9 Technologies, researchers from ICSI's Audio and Multimedia group and ICSI's FrameNet project seek to demonstrate real-time monitoring of broadcast news streams to support a tactical operations center (TOC). A primary focus of this effort is to exploit multimedia – audio-visual data containing speech, images, and metadata such as geo-location and personal identification – and integrate it into an intuitive and informative visualization for a TOC’s use.

Audio and Multimedia
Developing Security Science from Measurement

This project aims to define foundational data-driven methodologies and the related science to create a basis for continuous and dynamic monitoring that enables adaptive approaches to mitigate and contain the spread of attacks. The basis of the approach is data on security incidents from a real large-scale production environment at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign (UIUC).

Networking and Security
Previous Work: The Berkeley Data Analysis System

In this project, researchers at ICSI are extending and applying recent work on randomized algorithms for matrix-based machine learning problems to the computational infrastructure recently developed at the AMPLab, UC Berkeley. One of the challenges in large-scale machine learning is that MapReduce/Hadoop does not perform well for iterative algorithms that are common in matrix-based machine learning. Examples of such iterative algorithms include common algorithms for least-squares approximation, least absolute deviations approximation, low-rank matrix approximation, etc.
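
To illustrate why iteration matters here, the sketch below solves a least-squares problem by plain gradient descent. It is not the project's randomized algorithm; the function name `least_squares_gd` and its parameters are assumptions chosen for illustration. Each iteration makes one full pass over the rows (a "map" over rows followed by a "reduce" that sums their contributions), so on a system that launches a separate job per pass, hundreds of iterations mean hundreds of jobs.

```python
def least_squares_gd(rows, targets, step, iterations):
    """Minimise sum_i (x_i . w - y_i)^2 by gradient descent.

    rows       : list of feature vectors x_i (lists of floats)
    targets    : list of target values y_i
    step       : fixed step size (must be small enough to converge)
    iterations : number of full passes over the data
    """
    dim = len(rows[0])
    w = [0.0] * dim

    for _ in range(iterations):
        grad = [0.0] * dim
        # One full pass over the data per iteration:
        # each row contributes independently ("map"),
        # and the contributions are summed ("reduce").
        for x, y in zip(rows, targets):
            residual = sum(xj * wj for xj, wj in zip(x, w)) - y
            for j in range(dim):
                grad[j] += 2.0 * residual * x[j]
        # Move against the gradient.
        w = [wj - step * gj for wj, gj in zip(w, grad)]

    return w
```

A call such as `least_squares_gd(rows, targets, step=0.1, iterations=200)` on a small, well-conditioned problem converges to the least-squares solution. The step size and iteration count are problem-dependent and would need tuning on real data.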

Big Data
Robotic Vision

To perform useful tasks in everyday human environments, robots must be able to both understand and communicate the sensations they experience during haptic interactions with objects. Toward this goal, vision researchers at ICSI augmented the Willow Garage PR2 robot with a pair of SynTouch BioTac sensors to capture rich tactile signals during the execution of four exploratory procedures on 60 household objects. In a parallel experiment, human subjects blindly touched the same objects and selected binary haptic adjectives from a predetermined set of 25 labels.

Domain Adaptation

ICSI researchers are investigating the fundamental problem of visual domain adaptation, or how to deal with the most common scenario “What you see is not what you get.” When test data and training data come from differing distributions (or unsupervised methods are employed with non-stationary distributions), conventional approaches to machine learning often perform very poorly. The researchers have been exploring several approaches to this problem, including those based on conventional feature spaces that are transformed based on a learned adaptation to overcome a domain shift.

Fine-grained Recognition

Recognizing objects in fine-grained domains can be extremely challenging due to the subtle differences between subcategories. Discriminative markings are not only subtle but often highly localized, and traditional object recognition approaches struggle to find them under the large pose variation often present in these domains. The ability to normalize pose based on super-category landmarks can significantly improve models of individual categories when training data is limited.

Representation Learning

Researchers at ICSI and UC Berkeley are developing new representation learning models for visual detection, leveraging advances in discriminatively trained convolutional neural networks. In 2013, they established important results related to these models, including observations of their ability to generalize to new tasks and domains and, importantly, their applicability to detection and segmentation tasks. They developed a new “Region-CNN” model (R-CNN), which outperformed all competing methods on the most important visual detection benchmark, the PASCAL challenge.

Bro Center of Expertise for the NSF Community

Researchers at ICSI and NCSA are operating a center to provide support and guidance to the NSF community on customized Bro installations that meet the specific needs of research environments. They are simultaneously making improvements to Bro that benefit the community, and leveraging Bro as a deployment platform for networking research results.

Networking and Security