Hardening DNS with IP TTLs

Sunday, August 10, 2008

During Paul Vixie's talk at WOOT on some of the operational challenges of deploying source port randomization functionality in BIND, I started thinking of a few simple ways to harden DNS infrastructure against VU#800113 by leveraging the IP TTL value.

DNS Cache Poisoning

In order to increase the resilience of DNS against Dan Kaminsky's cache poisoning attack, source port randomization has been added to recursive DNS resolvers. Now, in addition to guessing the 16-bit TXID, the attacker must also guess the 16-bit source port that the resolver used for the particular query. While this greatly increases the amount of time the attacker must spend to successfully poison a cache, it does not completely resolve the problem.
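
To make the arithmetic concrete, here is a small sketch of the guessing space before and after source port randomization. It treats the source port as a full 16 bits, as the paragraph above does; real resolvers typically draw from a somewhat smaller port range, so the numbers are an upper bound. The names `TXID_BITS`, `PORT_BITS`, and `search_space` are made up for this illustration.

```python
# Guessing space for a blind spoofer, assuming each field is uniformly random.
TXID_BITS = 16   # DNS transaction ID
PORT_BITS = 16   # UDP source port (idealized; the usable range is smaller in practice)

def search_space(bits):
    """Number of equally likely values an attacker has to choose from."""
    return 2 ** bits

txid_only = search_space(TXID_BITS)
txid_and_port = search_space(TXID_BITS + PORT_BITS)

print(txid_only)                   # 65536
print(txid_and_port)               # 4294967296
print(txid_and_port // txid_only)  # 65536 times more guesses needed
```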

However, completely solving the underlying problem is difficult and we're still a ways off from DNSSEC. Therefore, between now and DNSSEC, it may be worth revisiting previous proposals or pursuing additional bandaids that will further raise the bar for attackers to perform cache poisoning. Some interesting approaches include XQID, Dagon's 0x20 hack, and further verification steps when an attack is detected.

Using the IP TTL Value

Several years ago, TCP RST attacks were widely publicized. The idea behind the attack was that an attacker with sufficient bandwidth could forge a large number of RSTs, hoping to choose a sequence number that fell within the window of the connection the attacker wished to reset. This attack was more successful against long-lived persistent connections, since they gave the attacker sufficient time to forge an accepted RST.

One particular application that used long-lived TCP connections was BGP peering sessions, making it a prime target. Since resetting a BGP session could cause route withdrawals/flapping, protecting these connections against attack was fairly important.

Two solutions were used to address the problem:

TCP MD5 Option: By including an MD5 digest of the packet contents and a shared key between the communicating parties, injection of valid packets is prevented unless the attacker can obtain the shared key.
IP TTL Value: Since most BGP peers are connected directly via layer-2, both communicating parties can set their IP TTL values to 255 and only accept packets with a 254/255 TTL value. An attacker's packet traversing multiple hops would have its TTL value decremented outside of the acceptable range.
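
The TTL check in the second solution boils down to a single comparison. Below is a minimal sketch of it, not any router's actual implementation; `accept_peer_packet` and `max_hops` are hypothetical names, and `max_hops=1` simply encodes the "254/255" window described above.

```python
MAX_TTL = 255  # largest value the 8-bit IP TTL field can hold

def accept_peer_packet(received_ttl, max_hops=1):
    """Accept a packet only if its TTL was decremented at most max_hops times.

    Assumes the legitimate peer always sends with TTL 255. A remote attacker
    cannot start above 255, so every extra hop pushes the received TTL
    further below the acceptable window.
    """
    if not 0 <= received_ttl <= MAX_TTL:
        raise ValueError("IP TTL must fit in 8 bits")
    return received_ttl >= MAX_TTL - max_hops

print(accept_peer_packet(255))  # True: directly connected peer
print(accept_peer_packet(254))  # True: one decrement tolerated
print(accept_peer_packet(250))  # False: packet crossed several hops
```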

This use of lower-layer header attributes, in particular the IP TTL value, to protect BGP got me thinking about similar techniques to harden DNS. How can we use the IP TTL value and its unique attributes to raise the bar for executing a DNS cache poisoning attack?

Expected IP TTL Values

The first possible technique is based on the assumption that the attacker does not have knowledge of the exact number of hops between a particular resolver and an authoritative server. Similar to the source port randomization, adding extra bits that the attacker must successfully guess can significantly raise the bar for cache poisoning.

When a resolver first contacts an authoritative nameserver, it can record the IP TTL of a received response in the cache entry for that NS RR. When receiving subsequent responses from that nameserver, the resolver can verify whether the IP TTL matches the expected value. However, routing events and topology changes may legitimately change the TTL seen between the resolver and the auth server. So if the TXID and source port match but the IP TTL does not, the resolver could enter a verification mode where it sends additional queries to validate the contents of the mismatched-TTL response. Successfully spoofing each and every one of the verification queries would be astronomically difficult. If the verification queries succeed, the resolver can update the expected TTL for that auth server.
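
The bookkeeping described above might look something like the following sketch. Everything here is hypothetical (`TtlTracker`, `Verdict`, `check`, `confirm` are names invented for illustration), and the verification queries themselves are left to the caller; this only shows the decision of when to trust a response and when to fall back to verification.

```python
from enum import Enum

class Verdict(Enum):
    ACCEPT = "accept"   # TTL matches (or first contact): use the response
    VERIFY = "verify"   # TTL mismatch: send extra verification queries first

class TtlTracker:
    """Remembers the IP TTL last seen from each authoritative nameserver."""

    def __init__(self):
        self._expected = {}  # nameserver address -> expected IP TTL

    def check(self, ns_addr, observed_ttl):
        expected = self._expected.get(ns_addr)
        if expected is None:
            # First contact: nothing to compare against, so record and accept.
            self._expected[ns_addr] = observed_ttl
            return Verdict.ACCEPT
        if observed_ttl == expected:
            return Verdict.ACCEPT
        # TXID and port already matched, but the TTL did not.
        return Verdict.VERIFY

    def confirm(self, ns_addr, observed_ttl):
        """Call only after the verification queries succeeded."""
        self._expected[ns_addr] = observed_ttl

tracker = TtlTracker()
ns = "192.0.2.53"                # documentation address, not a real server
print(tracker.check(ns, 52))     # Verdict.ACCEPT (first contact)
print(tracker.check(ns, 49))     # Verdict.VERIFY (path looks different)
tracker.confirm(ns, 49)          # verification passed: the route really changed
print(tracker.check(ns, 49))     # Verdict.ACCEPT
```

Note that the common case costs one dictionary lookup and one comparison per response, which is where the "same performance in the common case" property comes from.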

Against a completely clueless attacker, the technique would add 8 bits of entropy, as the IP TTL field is 8 bits wide. Of course, a more intelligent attacker may be able to narrow down the expected TTL to a reasonable range through inference of routing information and network probing, reducing the added entropy to only a few bits. However, inferring the expected TTL raises the level of expertise the attacker must possess to pull off the attack.
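
As a rough illustration of "only a few bits": if the attacker can narrow the TTL down to N equally likely candidates, the added entropy is log2(N). The candidate counts below are assumed examples, not measurements, and `added_entropy_bits` is a made-up helper name.

```python
import math

def added_entropy_bits(candidate_ttls):
    """Bits of entropy if the attacker must pick among this many equally likely TTLs."""
    return math.log2(candidate_ttls)

print(added_entropy_bits(256))  # 8.0 -> clueless attacker, full 8-bit field
print(added_entropy_bits(8))    # 3.0 -> attacker narrowed it to 8 plausible values
print(added_entropy_bits(1))    # 0.0 -> attacker knows the exact TTL
```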

With this method, we would maintain the same performance in the common case, and only incur a small overhead from the verification queries in the uncommon case when the IP TTL legitimately changes. Since it only requires modifications to the resolver to deploy, it may be worthwhile for the extra bits of entropy provided by the IP TTL field.

IP TTL Proximity Thresholds

Similar to the BGP TTL hacks, we can leverage the fact that the IP TTL of a packet is decremented at each hop. Under the assumption that resolvers and authoritative servers are closer (in terms of network hops) than an attacker is to the resolver (a somewhat intuitive but unsubstantiated claim), we can effectively set a proximity threshold and cut off attackers not within the proximity.

For example, the authoritative server can set its IP TTL to 255 for all responses. A resolver receiving a response would note the current TTL, effectively giving the hop distance between the resolver and auth server. If the resolver receives any responses with a lower TTL, it can discard them or act in a more paranoid manner. This essentially limits the attacker by network locality. If the attacker happens to be more hops away from the resolver than the authoritative server, he simply cannot pull off the cache poisoning attack.
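
A minimal sketch of the proximity check, assuming the authoritative server really does send every response with TTL 255. The function names and the hop counts are invented for the example; a response with a higher TTL than the baseline is not covered by the scheme as described, so the sketch only flags lower ones.

```python
INITIAL_TTL = 255  # assumed TTL the authoritative server sets on every response

def hop_distance(received_ttl):
    """Hops between sender and resolver, given the sender started at 255."""
    return INITIAL_TTL - received_ttl

def best_ttl_attacker_can_deliver(attacker_hops):
    """Even starting at 255, a spoofed packet loses one TTL per hop."""
    return INITIAL_TTL - attacker_hops

def is_suspicious(received_ttl, baseline_ttl):
    """Flag responses that arrive with a lower TTL than the recorded baseline."""
    return received_ttl < baseline_ttl

baseline = 243                          # auth server observed 12 hops away
print(hop_distance(baseline))           # 12
spoofed = best_ttl_attacker_can_deliver(15)
print(spoofed)                          # 240
print(is_suspicious(spoofed, baseline)) # True: attacker is too far away
print(is_suspicious(243, baseline))     # False: matches the real server
```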

Since this approach requires modification to both resolvers and authoritative servers, it is much less feasible to deploy than the first technique, but it is still an interesting application of the IP TTL.