Featured Projects: Measuring Internet Misbehavior

Vern Paxson has led one wing of ICSI's Networking Group for almost a decade, pushing back against spam, credit card theft, malware, and network attacks. One central tenet of the group's work is empiricism: to truly understand a problem, you must first measure it. Two of the group's recent projects have been receiving well-deserved attention, and we're happy to describe both here. Judo, a joint project between ICSI and UC San Diego, has come up with a novel way to fight spam, and Netalyzr is an easy-to-use and very robust tool for analyzing network health.

Judo

Most spam on the Internet comes from botnets, networks of subtly compromised computers that can be hijacked remotely for malicious purposes. The Judo team (Christian Kreibich, along with Paxson and Nicholas Weaver) has studied the behavior of some of these botnets by deliberately infecting computers with bots and running them in a controlled environment made up of both real and virtual machines.

The researchers have discovered that each of these botnets can produce only a specific, limited range of spam messages, because it generates its e-mail messages from templates. By examining a botnet's output, the researchers can infer the contents of its templates; they can then turn those templates against the botnet and filter out all the spam it generates, with virtually no false positives.
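To make the idea concrete, here is a minimal sketch in Python of inferring a template from sample output and then using it as a filter. It is an illustration only, not Judo's actual algorithm: the function names `infer_template` and `is_spam` and the sample messages are hypothetical, and the sketch assumes every sample from one template has the same number of whitespace-separated words.

```python
import re


def infer_template(samples):
    """Build a regex signature from messages assumed to share one template.

    Words that are identical in every sample are treated as fixed template
    text; positions that vary become wildcards. This is a simplification
    made for illustration, not the method used by the Judo researchers.
    """
    token_rows = [sample.split() for sample in samples]
    if len({len(row) for row in token_rows}) != 1:
        raise ValueError("this sketch needs samples with equal word counts")

    parts = []
    for column in zip(*token_rows):
        if len(set(column)) == 1:
            # Same word in every sample: fixed part of the template.
            parts.append(re.escape(column[0]))
        else:
            # Word differs between samples: a variable slot in the template.
            parts.append(r"\S+")
    return re.compile(r"\s+".join(parts))


def is_spam(signature, message):
    """Flag a message only if it matches the inferred template in full."""
    return signature.fullmatch(message.strip()) is not None


# Hypothetical output captured from one bot.
samples = [
    "Buy cheap watches at http://a1.example today",
    "Buy cheap pills at http://zz9.example today",
    "Buy cheap software at http://q7.example today",
]
signature = infer_template(samples)
```

Because the signature describes the structure of the bot's output rather than suspicious words, `is_spam(signature, "Buy cheap meds at http://x.example today")` matches, while a legitimate message that merely mentions the same words, such as "Can we talk about how spammers sell cheap watches today", does not.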

This approach has significant advantages over existing spam filtering techniques. Current filters do not necessarily catch all spam from any given botnet, and because they judge messages by their content, they can also mark legitimate e-mails as spam, including the researchers' own e-mails discussing their spam filtering techniques.

Some recent press coverage has painted the results in an almost holy light. Paxson attributes some of this exaggeration to Judo's emotional resonance: it turns spam's own technology against itself. Kreibich points out that this is still a reactive measure, and therefore can never be perfect. Judo is effective only once a botnet has been captured and analyzed, and only against mail generated from the templates inferred for that botnet. This method will enable us to catch up quickly to spam in the wild, but it will not be able to prevent it.

Netalyzr

Net neutrality has been a hot topic lately, touching on subjects such as censorship, peer-to-peer networking, Internet accessibility, and more. Up until now, no one has had any quantitative sense of just what sort of neutrality, or lack thereof, the Internet currently offers. Weaver and Kreibich have developed a tool that can detect the health and openness of a network connection. In the planning stage, the team decided to make a single comprehensive diagnostic tool that was both easy to use and robust enough to detect a very wide range of network disruptions. The tool uses a Java applet to check for consistent IP addresses, correct DNS resolution, hidden proxies, port filtering, IPv6 support, and much more.
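Two of these checks, port filtering and DNS resolution, can be sketched in a few lines of Python. This is a rough illustration under our own assumptions, not Netalyzr's implementation (which runs as a Java applet against the project's own servers): the function names `probe_port` and `dns_matches` and the result labels are hypothetical.

```python
import socket


def probe_port(host, port, timeout=2.0):
    """Classify one outbound TCP connection attempt.

    Returns "open" if the connection succeeds, "refused" if the remote
    end actively rejects it, and "blocked_or_unreachable" for timeouts
    and other errors, which may indicate filtering along the path.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except OSError:
        # Covers timeouts, unreachable networks, and name lookup failures.
        return "blocked_or_unreachable"


def dns_matches(hostname, expected_addresses, resolve=socket.gethostbyname_ex):
    """Check that a name resolves only to addresses we expect.

    The expected addresses must be known in advance by the caller. The
    resolve parameter defaults to the system resolver and can be replaced
    for testing. An unexpected address may point to manipulated results.
    """
    _, _, addresses = resolve(hostname)
    return bool(addresses) and set(addresses) <= set(expected_addresses)
```

A diagnostic run would call `probe_port` for a list of well-known ports on a server the tester controls, and `dns_matches` for names whose correct addresses are known ahead of time; any "blocked_or_unreachable" result or unexpected address is a hint worth investigating rather than proof of interference.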

Netalyzr has already tested over 100,000 Internet sessions, and the data it provides have often proven surprising. At several sites it has found devices in the network imposing traffic controls that the operators themselves didn't know about; it has even identified this kind of connectivity restriction at networking conferences. The fact that even networking conferences don't always have totally open connections shows how much net access is being limited, whether by deliberate censorship or by simple hardware configuration. Netalyzr has also revealed problems with the handling of fragmentation, poor DNS performance, and deliberate manipulation of DNS results.

While the debate about the future of the Internet continues, these concrete measurements of network neutrality, security, and performance provide an invaluable resource for determining what the Internet is actually like today.

ICSI acknowledges generous NSF support for this work: grant CNS-0722035 for Netalyzr, and grants NSF-0433702 and CNS-0905631 for Judo.