Projects

ICSI hosts basic, pre-competitive research of fundamental importance to computer science and engineering. Projects are chosen based on the interests of the Institute’s principal investigators and the strengths of its researchers and affiliated UC Berkeley faculty.

Recent projects are listed below; the full list of each group's projects is accessible via the links listed in the sidebar.

Speech Processing for Meetings

We seek to develop algorithms and systems for the recognition of speech from meetings, as well as methods for information retrieval and other applications that such recognition would make possible. Funding for this research is provided by the Swiss project IM2 (Interactive Multimodal Information Management). See the IM2 Web site and ICSI's meeting recorder project page.

Speech
Robust Automatic Transcription of Speech

This DARPA-funded program seeks to significantly improve the accuracy of several speech processing tasks (speech activity detection, speaker identification, language identification, and keyword spotting) for degraded audio sources. As part of the SRI Speech Content Extraction from Noisy Information Channels (SCENIC) Team, we are working primarily on feature extraction (drawing on our experience with biologically motivated signal processing and machine learning) and speech activity detection (drawing on our experience with speech segmentation).

Speech
Automated Low-Level Analysis and Description of Diverse Intelligence Video (ALADDIN)

Massive numbers of video clips are generated daily on many types of consumer electronics and uploaded to the Internet. In contrast to videos that are produced for broadcast or from planned surveillance, the "unconstrained" video clips produced by anyone who has a digital camera present a significant challenge for manual as well as automated analysis. Such clips can include any possible scene and events, and generally have limited quality control.

Audio and Multimedia
Maven (Malleable Array of Vector-thread ENgines)

In earlier work at MIT, Professor Asanovic's team developed the Scale vector-thread architecture and processor prototype, which combines data-level and thread-level parallel execution models in a single unified architecture. MAVEN is the second-generation vector-thread architecture, designed to scale up to hundreds of execution lanes, and with the goal of providing very high throughput at low energy for a wide variety of parallel applications.

Research Initiatives, Architecture
Monolithic Silicon Photonics for Processor-to-DRAM Interconnects

In a collaboration with the MIT Center for Integrated Photonic Systems, researchers from the Architecture Group are exploring the use of silicon photonics for processor-to-memory interconnect. Projected advances in electrical signaling seem unlikely to fulfill the memory bandwidth demands of future manycore processor chips. Monolithic silicon photonics, which integrates optical components with electrical transistors in a conventional CMOS process, is a promising new technology that could provide large improvements in achievable interconnect bandwidth.

Research Initiatives, Architecture
Color, Language, and Thought

In 1978, the World Color Survey (WCS) collected color naming data in 110 unwritten languages from around the world. The ICSI WCS staff (Paul Kay and Richard Cook of ICSI, Terry Regier of the University of Chicago) put these data into a single database, available to the scientific community. Several outside laboratories have already used this database for studies.

AI
NTL

The NTL (Neural Theory of Language) project of the AI Group works in collaboration with other units on the UC Berkeley campus and elsewhere. It combines basic research in several disciplines with applications to natural language processing systems. Basic efforts include studies of the computational, linguistic, neurobiological, and cognitive bases for language and thought, and the project continues to yield a variety of theoretical and practical findings.

AI
Semantic Web Services

The Semantic Web is an exciting vision for the evolution of the World Wide Web. Adding semantics enables structured information to be interpreted unambiguously. Precise interpretation is a prerequisite for automatic Web search, discovery, and use. Services are a particularly important component of the Semantic Web. A semantic service description language can enable a substantial advance in the quality and quantity of e-commerce transactions on the Web. The OWL Services Coalition has taken some important first steps in this direction with OWL-S.

AI
AQUAINT

Researchers in ICSI's AI Group are participating in a project to study deep inferencing techniques and corpus-based techniques for deriving the conceptual semantics needed to achieve this. This research is a collaboration with Stanford University and the University of Texas at Dallas, and is sponsored by the ARDA AQUAINT Program. Our effort is being integrated into an ambitious overall program to significantly advance the automated analysis of information.

AI
FrameNet

The FrameNet project is building a semantically rich lexicon of English and a corresponding set of annotated texts, based on more than 600 semantic frames and 130,000 sentences. Comparable FrameNet projects are underway for Spanish, German, and other languages. By providing a layered semantic representation of text, FrameNet delivers a key component of next-generation question answering, machine translation, and other natural language processing applications. Learn more on the FrameNet Web site.

AI
Stochastic Direct Reinforcement Algorithms

Researchers at ICSI are developing Stochastic Direct Reinforcement algorithms, which show promise as a superior alternative to traditional reinforcement learning methods for real-world applications.

Research Initiatives, Algorithms
Analysis of Heuristic Combinatorial Algorithms

In many practical situations, heuristic algorithms reliably give satisfactory solutions to real-life instances of optimization problems, despite evidence from computational complexity theory that the problems are intractable. Our goal is to understand this seeming contradiction, and to put the construction and evaluation of heuristic algorithms on a firmer footing. We will develop a general empirical method for selecting an optimal choice of parameters and subroutines within a well-defined heuristic algorithmic strategy.

Research Initiatives, Algorithms
Finding Conserved Protein Modules

A long-term goal of computational molecular biology is to extract, from large data sets, information about how proteins work together to carry out life processes at a cellular level. We are investigating protein-protein interaction (PPI) networks, in which the vertices are the proteins within a species and the edges indicate direct interactions between proteins. Our goal is to discover conserved protein modules: richly interacting sets of proteins whose patterns of interaction are conserved across two or more species.
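
As a rough illustration of the graph view described above, the sketch below stores two PPI networks as adjacency sets and lists the interactions inside a candidate module that also appear between the corresponding proteins of a second species. The function name `conserved_edges`, the protein identifiers, and the one-to-one `orthologs` mapping are all hypothetical choices made for this example; it is not the group's method, which the text does not specify.

```python
from itertools import combinations


def conserved_edges(module, net_a, net_b, orthologs):
    """List interactions within `module` (proteins of species A) that are
    also present between the mapped proteins in species B.

    net_a, net_b: dict mapping a protein to the set of its direct interactors.
    orthologs: dict mapping a species-A protein to its species-B counterpart
               (a simplifying one-to-one assumption made for this sketch).
    """
    conserved = []
    for p, q in combinations(sorted(module), 2):
        # Skip pairs that do not interact in species A.
        if q not in net_a.get(p, set()):
            continue
        p_b, q_b = orthologs.get(p), orthologs.get(q)
        # Keep the edge only if both proteins map and the mapped pair interacts.
        if p_b is not None and q_b is not None and q_b in net_b.get(p_b, set()):
            conserved.append((p, q))
    return conserved


# Toy networks with made-up protein names.
net_a = {"a1": {"a2", "a3"}, "a2": {"a1", "a3"}, "a3": {"a1", "a2"}}
net_b = {"b1": {"b2"}, "b2": {"b1", "b3"}, "b3": {"b2"}}
orthologs = {"a1": "b1", "a2": "b2", "a3": "b3"}

print(conserved_edges({"a1", "a2", "a3"}, net_a, net_b, orthologs))
```

In this toy case, two of the three interactions in the species-A triangle are also present in species B. A real analysis would additionally have to handle many-to-many protein correspondences and noisy interaction data.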

Research Initiatives, Algorithms
Transcriptional Regulation

Dissection of regulatory networks that control gene transcription is one of the greatest challenges of functional genomics. The Algorithms Group addressed the problem of modeling generic features of structurally but not textually related DNA motifs. The work divides into several parts: (1) A new approach to the recognition of transcription-factor binding sites, based on the principle that transcription factors divide naturally into families, and that the binding site motifs for transcription factors within the same family have common features.

Research Initiatives, Algorithms
Methods for the Analysis of High-Throughput Sequencing Data

We are currently developing methods for the design and analysis of studies that involve high-throughput sequencing technologies, such as the Solexa, 454, or SOLiD platforms.

Research Initiatives, Algorithms
Statistical Genetics and Population Genetics

We develop computational methods for inferring evolutionary and genetic characteristics from genetic data, such as recombination events, mutation rates, and human history.

Research Initiatives, Algorithms
Computational Methods for the Identification of Disease-Genotype Associations

We develop computational methods that aid in the analysis of genome-wide association studies, or other studies that involve inferring a relation between a genetic variant, such as an SNP or a copy-number variant (CNV), and a given phenotype that was measured for the studied population. These methods include haplotype inference methods, ancestry inference methods, and the incorporation of these in a statistical or machine-learning framework that is used to test for an association of a genetic marker with a phenotype.

Research Initiatives, Algorithms
Analysis of Genome-Wide Association Studies for Common Diseases

In these studies, sets of cases (individuals carrying a disease) and controls (background population) are collected and genotyped for genetic variants, normally single nucleotide polymorphisms (SNPs). Our group is collaborating closely with groups of geneticists and epidemiologists who have collected such samples. We take part in the analysis of these studies, and in some cases also in the design of the studies.
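
To make the case-control setup concrete, here is a minimal sketch of one textbook way to test a single SNP: a Pearson chi-square statistic (one degree of freedom, no continuity correction) on the 2x2 table of allele counts in cases and controls. The function name `allelic_chi_square` and the counts are invented for illustration; the text does not say which tests the group uses, and real analyses also correct for ancestry and multiple testing.

```python
def allelic_chi_square(case_risk, case_other, control_risk, control_other):
    """Pearson chi-square statistic for a 2x2 table of allele counts.

    Rows are cases and controls; columns are the two alleles of one SNP.
    Uses the closed form N * (ad - bc)^2 / (row and column total product).
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the table is empty")
    return n * (a * d - b * c) ** 2 / denom


# Made-up allele counts: 60/40 in cases versus 40/60 in controls.
stat = allelic_chi_square(60, 40, 40, 60)
print(stat)  # 8.0
```

A larger statistic indicates a stronger difference in allele frequency between cases and controls; converting it to a p-value requires the chi-square distribution with one degree of freedom.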

Research Initiatives, Algorithms
