Our study set out to examine the performance of the DNS, and we were aware that errors play a very important part in it. As in the 1992 study [DOK92], we found that a significant fraction of the trace traffic is caused by implementation and configuration errors. The implications of this are twofold:
To illustrate the extraordinary effect that a single misconfigured system can have on overall performance, we focus on the following example. Figure 7 shows the TTL distribution of one of the main servers for a given ccTLD.
In this case we expected to see a TTL distribution similar to that of the root server. The number of packets that carry an answer, relative to the overall number of packets (147,000 out of half a million), is consistent with that expectation (these answers correspond to gTLD servers and other name servers from its ccTLD, including its main servers). What is really striking is the huge number of short TTLs (less than 30 minutes). We saw that almost all of the approximately 100,000 short-TTL answers corresponded to two different queries of another main server for the same ccTLD. Why would a ccTLD main server announce its own address with a 5-minute TTL? Main servers have fairly static mappings, so we believe these must be two configuration errors. Yet these two errors account for one tenth of the total traffic at a name server.
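The kind of tally behind this observation can be sketched as follows. This is only a minimal illustration, not the tooling used in the study: the (query name, TTL) record layout, the function name, and the sample data are all assumptions made for the example. Only the 30-minute threshold is taken from the text.

```python
from collections import Counter

# "Short" means a TTL under 30 minutes, as in the text.
SHORT_TTL_SECONDS = 30 * 60


def summarize_short_ttls(answers, threshold=SHORT_TTL_SECONDS):
    """Count short-TTL answers and group them by query name.

    `answers` is an iterable of (query_name, ttl_seconds) pairs; this
    record layout is hypothetical and chosen only for illustration.
    Returns (total short-TTL answers, Counter keyed by query name).
    """
    per_query = Counter(name for name, ttl in answers if ttl < threshold)
    return sum(per_query.values()), per_query


# Made-up sample records, not data from the traces.
sample = [
    ("ns1.example.", 300),
    ("ns1.example.", 300),
    ("ns2.example.", 300),
    ("www.example.", 86400),
]

short_total, per_query = summarize_short_ttls(sample)
for name, count in per_query.most_common():
    print(f"{name}: {count} of {short_total} short-TTL answers")
```

Sorting the per-query counts in descending order makes it immediately visible when a handful of queries dominate the short-TTL answers, which is the pattern described above.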
Having realized the impressive effect that a single misbehaving name server can have on DNS traffic, we focused on understanding these erroneous behaviors and applied the [DOK92] error classification to the request and response traffic in our traces.