ICSI hosts basic, pre-competitive research of fundamental importance to computer science and engineering. Projects are chosen based on the interests of the Institute’s principal investigators and the strengths of its researchers and affiliated UC Berkeley faculty.

Recent projects are listed below; the full list of each group's projects is available via the links in the sidebar.

Science of Security

In this collaborative project, researchers at ICSI are using Carnegie Mellon University's Security Behavior Observatory (SBO) infrastructure to conduct quantitative experiments on how end users make security decisions. The results of these experiments are used to design new security mitigations and interventions, which are then iteratively evaluated in the laboratory and the field. This collaboration is designed to provide insight into how users make security decisions in situ.

Networking and Security, Usable Security and Privacy
Previous Work: Extracting Event Attributes from Unstructured Textual Data for Persistent Situational Awareness

In this collaborative project with Decisive Analytics Corporation (DAC), FrameNet researchers are developing semantic frames for representing the attributes of complex events, which permit more fine-grained analysis than other event recognition frameworks. The researchers are developing event recognition methods focused on organizations and how they plan and carry out actions. These methods are broadly applicable to actions planned and carried out by all types of organizations, such as corporations, government agencies, military units, and insurgent groups. 

A Software-Defined Internet Exchange

In this collaborative project with researchers from Georgia Tech and Princeton, ICSI researchers are finding incrementally deployable ways to leverage the power of Software-Defined Networking (SDN) to improve interdomain routing. SDN has had a profound influence on how people think about managing networks. To date, however, it has had little impact on how separately administered networks are interconnected through BGP. Since many of the current failings of the Internet are due to BGP's poor performance and limited functionality, it is imperative that these methods be developed.

Networking and Security
Previous Work: Machine Learning Methods and Large Informatics Graphs

In this project, researchers are tackling several problems with machine learning methods and large informatics graphs. First, they are looking at local algorithms and locally-biased algorithms, specifically extending local algorithms to other objective functions and characterizing the statistical properties of local algorithms. Second, they are scaling the algorithms up to larger networks, focusing on strongly-local and locally-biased methods and on implementations for graphs that do not fit into RAM.
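
The project description does not name specific algorithms. As an illustration only, one well-known strongly-local method is the "push" procedure for approximate personalized PageRank, sketched below in Python. The function name, the adjacency-list format, the non-lazy form of the push step, and the default values of alpha and eps are all assumptions made for this sketch. Starting from a seed node, it moves probability mass only through nodes whose residual is large relative to their degree, so the work depends on the neighborhood of the seed rather than on the size of the whole graph.

```python
from collections import deque


def approximate_ppr(graph, seed, alpha=0.15, eps=1e-4):
    """Approximate personalized PageRank around `seed` using local pushes.

    graph: dict mapping each node to a list of its neighbours
           (assumed undirected, no self-loops, no isolated nodes).
    alpha: teleport probability (hypothetical default).
    eps:   per-degree residual tolerance (hypothetical default).
    Returns (p, r): the approximate PageRank mass and the leftover residual.
    """
    p = {}                 # settled mass per node
    r = {seed: 1.0}        # residual mass still to be distributed
    queue = deque([seed])  # nodes whose residual may exceed the threshold

    while queue:
        u = queue.popleft()
        deg_u = len(graph[u])
        ru = r.get(u, 0.0)
        if ru < eps * deg_u:
            continue  # residual too small to be worth pushing

        # Push step: settle a fraction alpha at u, spread the rest evenly.
        p[u] = p.get(u, 0.0) + alpha * ru
        share = (1.0 - alpha) * ru / deg_u
        r[u] = 0.0
        for v in graph[u]:
            old = r.get(v, 0.0)
            r[v] = old + share
            threshold = eps * len(graph[v])
            # Enqueue v only when its residual crosses the threshold.
            if old < threshold <= r[v]:
                queue.append(v)

    return p, r
```

Each push conserves total mass, so the settled and residual values always sum to one, and the procedure stops once every node's residual is below eps times its degree. Only nodes that ever receive mass appear in the returned dictionaries, which is what makes this kind of method attractive for graphs too large to hold in memory.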

Big Data
Previous Work: Characterizing and Exploiting Tree-Like Structure in Large Social and Information Networks

In this project, researchers are developing methods to characterize and exploit "tree-like" structure in realistic social and information networks. In particular, they are focused on two related but complementary notions of tree-like-ness, as well as related heuristic variants, for graphs. These notions will be used to develop tools to characterize the manner in which realistic complex networks are coarsely tree-like, and this characterization will be used to develop tools for improved analytics on realistic networks.

Big Data
Previous Work: How Does Deep Learning Improve Speech Recognition Accuracy?

The short-term goal of this project is to understand in a deep, quantitative way why the methodology used in nearly all speech recognizers is so brittle. The long-term goal is to leverage this understanding to develop less brittle methodology that will enable more accurate speech recognition with a wider scope of applicability.

Randomized Numerical Linear Algebra (RandNLA) for Multi-Linear and Non-Linear Data

This project investigates two important, non-linear, structural settings in order to start making progress toward using RandNLA (Randomized Numerical Linear Algebra) approaches to big data analysis in situations where the underlying data exhibit non-linear structure. First, researchers investigate how to design the next generation of RandNLA algorithms that can handle data that exhibit multi-linear structures captured by tensors.

Big Data
Previous Work: Teaching Resources for Online Privacy Education (TROPE)

Researchers are developing classroom-ready teaching modules to educate young people about why and how to protect their privacy online, as well as a Teachers' Guide with background information, suggested lesson plans, and guidance on how to employ the modules in the classroom.

Audio and Multimedia, Networking and Security, Usable Security and Privacy
Previous Work: Preserving Unwritten Languages

In this project, researchers at ICSI are collaborating with Notre Dame to preserve unwritten languages in danger of disappearing. They are recording speech in a variety of genres and styles using mobile technologies. To enable productive linguistic and language-technology research in the future, they are adding respeaking, in which native speakers listen and repeat each phrase slowly and carefully, as well as oral translation, in which bilingual speakers of the language translate the recordings phrase by phrase into a widely used language such as English.

Knowledge-Aided Interface for Big Data Streams

In this collaborative project with Mod9 Technologies, researchers from ICSI's Audio and Multimedia group and ICSI's FrameNet project seek to demonstrate real-time monitoring of broadcast news streams to support a tactical operations center (TOC). A primary focus of this effort is to exploit multimedia – audiovisual data containing speech, images, and metadata such as geo-location and personal identification – and integrate it into an intuitive and informative visualization for a TOC’s use.

Audio and Multimedia
Developing Security Science from Measurement

This project aims to define foundational data-driven methodologies and the related science to create a basis for continuous and dynamic monitoring that enables adaptive approaches to mitigate and contain the spread of attacks. The basis of the approach is data on security incidents from a real large-scale production environment at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign (UIUC).

Networking and Security
Previous Work: The Berkeley Data Analysis System

In this project, researchers at ICSI are extending and applying recent work on randomized algorithms for matrix-based machine learning problems to the computational infrastructure recently developed at the AMPLab, UC Berkeley. One of the challenges in large-scale machine learning is that MapReduce/Hadoop does not perform well for iterative algorithms that are common in matrix-based machine learning. Examples of such iterative algorithms include common algorithms for least-squares approximation, least absolute deviations approximation, low-rank matrix approximation, etc.

Big Data
Robotic Vision

To perform useful tasks in everyday human environments, robots must be able to both understand and communicate the sensations they experience during haptic interactions with objects. Toward this goal, vision researchers at ICSI augmented the Willow Garage PR2 robot with a pair of SynTouch BioTac sensors to capture rich tactile signals during the execution of four exploratory procedures on 60 household objects. In a parallel experiment, human subjects blindly touched the same objects and selected binary haptic adjectives from a predetermined set of 25 labels.

Domain Adaptation

ICSI researchers are investigating the fundamental problem of visual domain adaptation, or how to deal with the common scenario of “What you see is not what you get.” When test data and training data come from differing distributions (or unsupervised methods are employed with non-stationary distributions), conventional approaches to machine learning often perform very poorly. The researchers have been exploring several approaches to this problem, including those based on conventional feature spaces that are transformed using a learned adaptation to overcome a domain shift.

Fine-grained Recognition

Recognizing objects in fine-grained domains can be extremely challenging due to the subtle differences between subcategories. Discriminative markings are not only subtle but often highly localized, and traditional object recognition approaches struggle to find them under the large pose variation often present in these domains. The ability to normalize pose based on super-category landmarks can significantly improve models of individual categories when training data is limited.

Representation Learning

Researchers at ICSI and UC Berkeley are developing new representation learning models for visual detection, leveraging advances in discriminatively trained convolutional neural networks. In 2013, they established important results related to these models, including observations of their ability to generalize to new tasks and domains and, importantly, their applicability to detection and segmentation tasks. They developed a new “Region-CNN” model (R-CNN), which outperformed all competing methods on the most important visual detection benchmark, the PASCAL challenge.

Bro Center of Expertise for the NSF Community

Researchers at ICSI and NCSA are operating a center to provide support and guidance to the NSF community on customized Bro installations that meet the specific needs of research environments. They are simultaneously making improvements to Bro that benefit the community and leveraging Bro as a deployment platform for networking research results.

Networking and Security
Service Composition in Distributed Application Design and Execution

In a collaboration with the Computer Platform Research Center - CIPI (jointly established by the Universities of Genoa and Padua, Italy), researchers are investigating the service composition paradigm for distributed applications. This paradigm can be taken as a reference when a distributed application is treated as a composite service made up of atomic services. In these cases, the application designers do not need to be programmers because they can specify the distributed applications using visual Service Creation Platforms.

Research Initiatives
Previous Work: Leverage Subsampling for Regression and Dimension Reduction

In this collaborative project among UC Berkeley, the University of Illinois at Urbana-Champaign, and ICSI, scientists are working toward an integrated treatment of statistical and computational issues. The first research thrust focuses on studying the statistical properties of subsampling estimation based on statistical leverage scores in linear regression. The second research thrust generalizes the theory and methods to nonlinear regression and dimension reduction models.
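
To make the idea of leverage-based subsampling concrete, the following Python sketch handles the simplest case: a straight-line fit y = a + b*x. It is an illustration under stated assumptions, not the project's method. The function names are hypothetical, and the scheme shown (sampling rows with replacement with probability proportional to their leverage score, then reweighting each sampled row by the inverse of its expected count) is one common formulation. For a design matrix with an intercept and a single covariate, the leverage score of row i has the closed form 1/n + (x_i - mean)^2 / Sxx, and the scores sum to the number of columns, which is 2.

```python
import random


def leverage_scores(xs):
    """Leverage scores for the design matrix [1, x] (intercept plus one covariate)."""
    n = len(xs)
    mean = sum(xs) / n
    sxx = sum((x - mean) ** 2 for x in xs)
    return [1.0 / n + (x - mean) ** 2 / sxx for x in xs]


def weighted_line_fit(xs, ys, ws):
    """Weighted least squares for y = a + b*x; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return my - slope * mx, slope


def leverage_subsample_fit(xs, ys, r, rng):
    """Fit a line on r rows sampled with probability proportional to leverage."""
    h = leverage_scores(xs)
    total = sum(h)  # equals 2 (the number of columns) up to rounding
    probs = [hi / total for hi in h]
    idx = rng.choices(range(len(xs)), weights=probs, k=r)  # with replacement
    # Reweight each sampled row by 1 / (r * p_i), the inverse of its expected count.
    ws = [1.0 / (r * probs[i]) for i in idx]
    return weighted_line_fit([xs[i] for i in idx], [ys[i] for i in idx], ws)
```

Calling leverage_subsample_fit(xs, ys, r=50, rng=random.Random(0)) on a dataset with many more than 50 rows returns an intercept and slope estimated from the subsample alone. Rows far from the mean of x have higher leverage and are therefore more likely to be sampled. The sketch assumes the subsample contains at least two distinct x values.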

Big Data
Scalable Statistics and Machine Learning for Data-Centric Science

Researchers from Lawrence Berkeley Laboratory, UC Berkeley, and ICSI are developing and applying new statistics and machine learning algorithms that can operate on real-world datasets produced by a diverse range of experimental and observational facilities. This is a critical capability in facilitating big data analysis, which will be essential for scientific progress in the foreseeable future.

Big Data