ICSI hosts basic, pre-competitive research of fundamental importance to computer science and engineering. Projects are chosen based on the interests of the Institute’s principal investigators and the strengths of its researchers and affiliated UC Berkeley faculty.

Recent projects are listed below; the full list of each group's projects is accessible via the links listed in the sidebar.

Previous Work: Security and Privacy for Wearable and Continuous Sensing Platforms

In this collaborative project, researchers at ICSI, UC Berkeley, and University of Washington are systematically exploring the security and privacy issues brought up by the increasing popularity of wearable computers. The recent demand for devices like Google Glass, smart watches, and wearable fitness monitors suggests that wearable computers may become as ubiquitous as cellphones.

Networking and Security, Usable Security and Privacy
Internet-Wide Vulnerability Measurement, Assessment, and Notification

Vulnerable software costs the U.S. economy more than $180 billion a year, and large-scale, remotely exploitable vulnerabilities affecting millions of Internet hosts have become a regular occurrence. This project seeks to reduce the impact of software vulnerabilities in Internet-connected systems by developing measurement-driven techniques for global vulnerability detection, assessment, and mitigation.

Networking and Security
Undermining Political “Scratch” Effects with Technology (UPSET)

This project leverages technology to significantly reduce the cost of political campaigns via the sharing economy, and thus undercut the impact of money on politics. While the technology is important, the resulting structural change is more so: designing an efficient “supply chain” of services between a skilled citizen force and the campaigns. A separate spinoff organization is beginning to implement these ideas in real campaigns.

Research Initiatives
Previous Work: Word Bug

ICSI Speech researchers are working with Versame to develop methods for the analysis of speech being directed at infants and toddlers, in order to provide better measures of the lexical stimulation they are getting. The initial project is focused on the counting of speech units from unrestricted audio, where the likely speech units are syllables or words.

Previous Work: Deep and Wide Learning for Automatic Speech Recognition

In this project, speech researchers are looking at trade-offs between two approaches to automatic speech recognition (ASR): signal processing of multiple acoustic features vs. using simpler features and relying on machine learning algorithms to replace feature engineering. The goal is not only to improve accuracy for difficult examples, but also to learn about the computational consequences for high performance computing.

Science of Security

In this collaborative project, researchers at ICSI are utilizing Carnegie Mellon University's Security Behavior Observatory (SBO) infrastructure to conduct quantitative experiments about how end-users make security decisions. The results of these experiments are used to design new security mitigations and interventions, which are then iteratively evaluated in the laboratory and the field. This collaboration is designed to provide keen insights into how users make security decisions in situ.

Networking and Security, Usable Security and Privacy
Previous Work: Extracting Event Attributes from Unstructured Textual Data for Persistent Situational Awareness

In this collaborative project with Decisive Analytics Corporation (DAC), FrameNet researchers are developing semantic frames for representing the attributes of complex events, which permit more fine-grained analysis than other event recognition frameworks. The researchers are developing event recognition methods focused on organizations and how they plan and carry out actions. These methods are broadly applicable to actions planned and carried out by all types of organizations, such as corporations, government agencies, military units, and insurgent groups. 

A Software-Defined Internet Exchange

In this collaborative project with researchers from Georgia Tech and Princeton, ICSI researchers are finding incrementally deployable ways to leverage the power of Software-Defined Networking (SDN) to improve interdomain routing. SDN has had a profound influence on how people think about managing networks. To date, however, it has had little impact on how separately administered networks are interconnected through BGP. Since many of the current failings of the Internet are due to BGP's poor performance and limited functionality, it is imperative that these methods be developed.

Networking and Security
Previous Work: Machine Learning Methods and Large Informatics Graphs

In this project, researchers are tackling several problems with machine learning methods and large informatics graphs. First, they are looking at local algorithms and locally-biased algorithms, specifically extending local algorithms to other objective functions and the characterization of statistical properties of local algorithms. Second, they are scaling the algorithms up to larger networks, focusing on scaling up strongly-local and locally-biased methods and implementations on graphs that do not fit into RAM.

Big Data
Previous Work: Characterizing and Exploiting Tree-Like Structure in Large Social and Information Networks

In this project, researchers are developing methods to characterize and exploit "tree-like" structure in realistic social and information networks. In particular, they are focused on two related but complementary notions of tree-like-ness, as well as related heuristic variants, for graphs. These notions will be used to develop tools to characterize the manner in which realistic complex networks are coarsely tree-like, and this characterization will be used to develop tools for improved analytics on realistic networks.

Big Data
Previous Work: How Does Deep Learning Improve Speech Recognition Accuracy?

The short-term goal of this project is to understand in a deep, quantitative way why methodology used in nearly all speech recognizers is so brittle. The long-term goal is to leverage this understanding by developing less brittle methodology that will enable more accurate speech recognition with a wider scope of applicability.

Randomized Numerical Linear Algebra (RandNLA) for Multi-Linear and Non-Linear Data

This project investigates two important, non-linear, structural settings in order to start making progress toward using RandNLA (Randomized Numerical Linear Algebra) approaches to big data analysis in situations where the underlying data exhibit non-linear structure. First, researchers investigate how to design the next generation of RandNLA algorithms that can handle data that exhibit multi-linear structures captured by tensors.

Big Data
Previous Work: Teaching Resources for Online Privacy Education (TROPE)

Researchers are developing classroom-ready teaching modules to educate young people about why and how to protect their privacy online, as well as a Teachers' Guide with background information, suggested lesson plans, and guidance on how to employ the modules in the classroom.

Audio and Multimedia, Networking and Security, Usable Security and Privacy
Previous Work: Preserving Unwritten Languages

In this project, researchers at ICSI are collaborating with Notre Dame to preserve unwritten languages in danger of disappearing. They are recording speech in a variety of genres and styles using mobile technologies. To enable productive linguistic and language-technology research in the future, they are adding respeaking, in which native speakers listen to each phrase and repeat it slowly and carefully, as well as oral translation, in which bilingual speakers of the language translate the recordings phrase by phrase into a widely used language such as English.

Knowledge-Aided Interface for Big Data Streams

In this collaborative project with Mod9 Technologies, researchers from ICSI's Audio and Multimedia group and ICSI's FrameNet project seek to demonstrate real-time monitoring of broadcast news streams to support a tactical operations center (TOC). A primary focus of this effort is to exploit multimedia – audio-visual data containing speech, images, and metadata such as geo-location and personal identification – and integrate it into an intuitive and informative visualization for a TOC’s use.

Audio and Multimedia
Developing Security Science from Measurement

This project aims to define foundational data-driven methodologies and the related science to create a basis for continuous and dynamic monitoring that enables adaptive approaches to mitigate and contain the spread of attacks. The basis of the approach is data on security incidents from a real large-scale production environment at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign (UIUC).

Networking and Security
Previous Work: The Berkeley Data Analysis System

In this project, researchers at ICSI are extending and applying recent work on randomized algorithms for matrix-based machine learning problems to the computational infrastructure recently developed at the AMPLab, UC Berkeley. One of the challenges in large-scale machine learning is that MapReduce/Hadoop does not perform well for iterative algorithms that are common in matrix-based machine learning. Examples of such iterative algorithms include common algorithms for least-squares approximation, least absolute deviations approximation, low-rank matrix approximation, etc.

Big Data
Bro Center of Expertise for the NSF Community

Researchers at ICSI and NCSA are operating a center to provide support and guidance to the NSF community on customized Bro installations that meet the specific needs of research environments. They are simultaneously making improvements to Bro that benefit the community, and leveraging Bro as a deployment platform for networking research results.

Networking and Security
Service Composition in Distributed Application Design and Execution

In a collaboration with the Computer Platform Research Center - CIPI (jointly established by the Universities of Genoa and Padua, Italy), researchers are investigating the service composition paradigm for distributed applications. This paradigm can be taken as a reference when a distributed application is treated as a composite service made up of atomic services. In these cases, the application designers do not need to be programmers because they can specify the distributed applications using visual Service Creation Platforms.

Research Initiatives
Previous Work: Leverage Subsampling for Regression and Dimension Reduction

In this collaborative project among UC Berkeley, the University of Illinois at Urbana-Champaign, and ICSI, scientists are working toward an integrated treatment of statistical and computational issues. The first research thrust focuses on studying the statistical properties of subsampling estimators that use statistical leverage scores in linear regression. The second research thrust generalizes the theory and methods to nonlinear regression and dimension reduction models.
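
As a rough illustration of leverage-based subsampling in linear regression, the sketch below fits a simple line y = a + b*x from a small random subsample in which rows are drawn with probability proportional to their leverage scores and then reweighted. It is a minimal textbook-style example, not the project's own method or code; the function names (`leverage_scores`, `weighted_fit`, `leverage_subsample_fit`), the subsample size `r`, and the inverse-probability reweighting are all assumptions made for illustration.

```python
import random


def leverage_scores(xs):
    """Leverage scores for the simple model y = a + b*x.

    Uses the closed form of the hat-matrix diagonal:
    h_i = 1/n + (x_i - mean)^2 / Sxx.
    """
    n = len(xs)
    mean = sum(xs) / n
    sxx = sum((x - mean) ** 2 for x in xs)
    return [1.0 / n + (x - mean) ** 2 / sxx for x in xs]


def weighted_fit(xs, ys, ws):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    b = sxy / sxx
    return my - b * mx, b


def leverage_subsample_fit(xs, ys, r, seed=0):
    """Fit on r rows sampled (with replacement) proportionally to leverage.

    Hypothetical helper: each sampled row gets weight 1 / (r * p_i),
    where p_i is its sampling probability.
    """
    rng = random.Random(seed)
    h = leverage_scores(xs)
    total = sum(h)  # equals the number of parameters (2) for this model
    probs = [hi / total for hi in h]
    idx = rng.choices(range(len(xs)), weights=probs, k=r)
    ws = [1.0 / (r * probs[i]) for i in idx]
    return weighted_fit([xs[i] for i in idx], [ys[i] for i in idx], ws)
```

Points far from the mean of x have higher leverage and are therefore more likely to be sampled; the reweighting compensates for that non-uniform sampling.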

Big Data