Profile: Nelson Morgan - Part 1

Monday, November 5, 2012

This week, we'll be posting a two-part profile of Nelson Morgan, so make sure to check back for the rest of the story.

ICSI Deputy Director Nelson Morgan has led speech recognition research at the Institute since its inauguration in 1988. Morgan also served as director for thirteen years starting in 1999, the year the agreement that had established ICSI expired. Morgan volunteered for the challenge of broadening and stabilizing the Institute’s funding base, and in 2012, at the end of his tenure, the Institute is doing better financially than it has in years. But then Morgan has always enjoyed a challenge.

 

Morgan was born and raised in Buffalo, New York. Fascinated by electronics, he was “always hooking things up to other things,” he said. When he was 10, he built a simple noise recognizer for Halloween by connecting a microphone to a circuit board he had bought from Popular Electronics. When the microphone detected sound, a tape player began playing Godzilla music to scare trick-or-treaters.

Morgan’s older brother had a strong influence on his early life and interests: he introduced Morgan to rock-and-roll, which sparked a lifelong interest in sound and audio, and gave him his first tape recorder, a rare technology in the 1950s. Morgan used it to record televised news reports, which he edited – using a razor blade – to make them sound more leftist. Morgan’s brother also introduced him to the college programs that allowed him to enter college after two years of high school. At the age of 16, Morgan entered the University of Chicago as a physics major.

But he decided to leave the university in order to do “a lot of wandering,” he said, at one point living in a teepee in the woods. He eventually began managing and recording rock bands and working as a technical advisor on motion pictures, including Godfather Part II.

Most of his work was mixing audio for television commercials. One day, while adjusting the volume of a dog’s bark for a commercial, he said, “I realized this was not quite the creative experience I was expecting.”

He thought he might enjoy the audio-related career he had wandered into if he better understood the science behind it, so he began taking classes part time while continuing to work as a sound technician. Then, one summer, he took an introductory course in electronics at UC Berkeley. “It was wonderful,” he said. “It was much more exciting than anything I was doing in the studio.”

His undergraduate advisor at Berkeley suggested he apply for a National Science Foundation fellowship, which would allow him to attend school full time as a graduate student. “I decided I would write up exactly what I wanted to do – just what I wanted to do – for my research,” he said. “And if they said they’d pay for it, great, I’d be a student. And if not, I would continue doing what I was doing.”

His fellowship application was for a project to create sound effects electronically. At the time, technicians commonly simulated room reverberation with a metal plate in order to produce sound effects for movies. “These were pretty hokey-sounding, and you couldn’t adjust them for a particular room size,” he said.

He was awarded a three-year graduate fellowship from the NSF, and he decided he would try to finish his doctorate before his funding ran out. Doctorates in the physical sciences or engineering often take five years or more, but Morgan said he had an advantage: “From the first day I knew exactly what my research would be.”

Although his research was on room acoustics, he spent some of his spare time chatting about technical topics with Ben Gold, a pioneer in digital signal processing and then a visiting professor at UC Berkeley, during Gold’s office hours. Later, in the early 1990s, Gold and Morgan established a class at Berkeley that combined their varied experiences with speech processing. The class has been taught every other year since then, and Morgan and Gold developed the class outlines into a textbook, Speech and Audio Signal Processing, which was recently revised and released in a second edition with the help of Columbia Professor and ICSI alum Dan Ellis.

As he approached graduation, Morgan was offered a position by Dolby to start a digital audio laboratory. But the recession of the late 1970s forced Dolby to lay off much of its work force, and Morgan’s offer was cancelled at the last moment. He quickly found a position at National Semiconductor, where he worked on speech analysis and synthesis techniques.

In one project at the lab, short recordings of actors were used to synthesize longer pieces of audio for commercials and products such as talking soda machines. This required that the recordings be divided into voiced speech, which is produced by vibrations of the vocal cords, and unvoiced speech, which is produced by air moving past some obstruction in the vocal tract such as the teeth. To explore methods of separating these automatically, Morgan bought a book on pattern recognition and coded all the techniques in it. Neural networks, which Morgan would later use extensively in speech recognition, happened to work the best in his experiments. “We cut the time enormously by just having experts do fine-tuning on a smaller set and training the classifier from the hand-labeled data,” he said.

His next experience with neural networks was at EEG Labs, which he joined in 1984. Researchers at the lab were using scans of the brain to understand how it performs cognitive functions. It was a new experience for Morgan, whose work until then had been mostly in signal processing. “I learned a lot from them, not just about the brain, but also about pattern recognition and neural networks,” Morgan said. “That’s really where I learned about them.”

In 1986, a new computer science laboratory was incorporated in Berkeley, and word got around that it needed researchers. After a conversation with then-Director Jerry Feldman, Morgan was chosen to lead a group at ICSI that would focus on building massively parallel computers.

“But I didn’t want to be just building up systems to do what someone else was interested in,” he said. He decided the group’s work would be applied to problems in speech research. In September 1988, when the Institute was officially inaugurated, he became the leader of the Realization Group, renamed the Speech Group in 1999.