Let’s start by going back in time to the 1980s. This was when one of the most widely used protocols on the Internet – DNS – was developed. In case you are new to DNS or need a refresher, take a look at this detailed post on DNS before reading further.
DNS uses UDP as the transport layer protocol, switching to TCP only in certain cases (for example, zone transfers or truncated responses). As a result, the size of a DNS message is limited to 512 bytes when using UDP. The basic DNS message begins with a fixed 12-byte header, followed by four variable-length sections:
- Questions (or queries)
- Answers
- Authority records
- Additional records
The image below illustrates a typical DNS message structure.
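As a rough sketch, the fixed 12-byte header can be unpacked with Python's standard `struct` module. The field layout follows RFC 1035; the sample header bytes below are made up for illustration:

```python
import struct

def parse_dns_header(message: bytes) -> dict:
    """Parse the fixed 12-byte DNS header (RFC 1035, section 4.1.1)."""
    ident, flags, qdcount, ancount, nscount, arcount = struct.unpack(
        "!6H", message[:12]
    )
    return {
        "id": ident,
        "qr": (flags >> 15) & 0x1,     # 0 = query, 1 = response
        "opcode": (flags >> 11) & 0xF,
        "rcode": flags & 0xF,          # only 4 bits -- one limit EDNS lifts
        "questions": qdcount,
        "answers": ancount,
        "authority": nscount,
        "additional": arcount,
    }

# A minimal query header: ID 0x1234, RD flag set, one question.
header = struct.pack("!6H", 0x1234, 0x0100, 1, 0, 0, 0)
print(parse_dns_header(header))
```

Note how the response code (rcode) is squeezed into 4 bits of the flags field; this kind of tight packing is exactly what motivated EDNS.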
DNS was developed to fit the speeds and traffic seen in the ’80s, when the Internet was accessible only to an elite few involved in research and development. However, a lot has changed since then regarding speed, traffic, and, more importantly, the way the Internet is structured. We’ve come a long way from a centralized server architecture: the Internet is now distributed and serves a global audience.
As you can see from the DNS message structure above, the DNS message in its current form doesn’t have sufficient space to carry any more information. Given this backdrop, it became vital to enhance the DNS protocol to cater to newer requirements. Hence, extension mechanisms for DNS, aka EDNS, were proposed. At a high level, EDNS allows us to overcome the size restrictions on several flag fields, return codes, and label types in the DNS header. It also allows the DNS message size to grow beyond 512 bytes (when UDP is used as the transport protocol) without the necessity of switching to TCP.
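To make the mechanism concrete, here is a minimal sketch of the EDNS0 OPT pseudo-record defined in RFC 6891. The OPT record rides in the Additional section and repurposes the CLASS field to advertise the sender's maximum UDP payload size; the 4096-byte value below is just an illustrative choice:

```python
import struct

def build_opt_record(udp_payload_size: int = 4096) -> bytes:
    """Build a minimal EDNS0 OPT pseudo-record (RFC 6891).

    The CLASS field of this pseudo-record advertises the sender's
    maximum UDP payload size, which is how EDNS raises the classic
    512-byte limit without switching to TCP.
    """
    name = b"\x00"   # root domain (empty name)
    rr_type = 41     # OPT
    ttl = 0          # extended RCODE = 0, version = 0, flags = 0
    rdlen = 0        # no EDNS options attached in this sketch
    return name + struct.pack("!HHIH", rr_type, udp_payload_size, ttl, rdlen)

opt = build_opt_record(4096)
print(opt.hex())
```

The TTL field is likewise repurposed: it carries an extended response code and a version number, lifting the 4-bit rcode limit of the original header.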
Impact of EDNS
Now that we have established why EDNS came into the picture, let’s dive right into the topic of discussion – how is this enhanced version of DNS enabling content delivery networks to deliver high performance to end users?
Content delivery networks (CDNs) ensure that an end user is served from a server geographically close to them. This is usually done in one of two ways –
- CDNs that rely on DNS and serve a unicast address: The logic that determines the closest server is based on the location of the recursive resolver from which the request originates and is built into the DNS resolution process.
- CDNs that rely on Anycast: BGP routing ensures that the user’s request reaches the CDN server closest to the end user.
The DNS experience test in Catchpoint can be used to understand the DNS resolution process used by CDNs in the first category. This test type also helps monitor the performance and availability of the DNS servers on the CDN network.
- The TLD name servers return the authoritative name server for the domain:
- The authoritative name servers return a CNAME record which points to the CDN infrastructure:
- Notice that the DNS servers from this step onwards belong to the CDN:
- The CDN authoritative name server at the final level of resolution uses the IP of the recursive resolver from where the request originated to hand out a CDN server close to the end user.
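A simplified sketch of this final selection step may help. The resolver-to-edge mapping below is entirely hypothetical (the network ranges are documentation prefixes and the edge hostnames are invented); real CDNs combine far richer geolocation and load data:

```python
import ipaddress

# Hypothetical mapping of recursive-resolver networks to nearby CDN edges.
EDGE_MAP = {
    ipaddress.ip_network("203.0.113.0/24"): "edge-mumbai.example-cdn.net",
    ipaddress.ip_network("198.51.100.0/24"): "edge-frankfurt.example-cdn.net",
}
DEFAULT_EDGE = "edge-ashburn.example-cdn.net"

def pick_edge(resolver_ip: str) -> str:
    """Pick an edge server based on the recursive resolver's IP.

    In classic DNS the authoritative server never sees the end user's
    address, so the resolver's location is the only signal available.
    """
    addr = ipaddress.ip_address(resolver_ip)
    for net, edge in EDGE_MAP.items():
        if addr in net:
            return edge
    return DEFAULT_EDGE

print(pick_edge("203.0.113.50"))
```

The key point the sketch illustrates: the answer depends on where the *resolver* sits, not where the user sits, which is precisely the assumption that breaks down in the next section.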
The diagram below illustrates the DNS resolution process when the ISP’s DNS resolver is used. The end user is served from a nearby CDN server.
With the emergence of public DNS recursive resolvers like Google DNS and OpenDNS, as well as ISPs that use a centralized DNS resolver infrastructure, the assumption that the end user and the recursive resolver are topologically close is no longer valid. For example, OpenDNS does not yet have resolvers in India, so an end user there who uses OpenDNS may have their query answered by a resolver in, say, Singapore (https://www.opendns.com/data-center-locations/). The impact: increased round-trip time and latency. The greater distance and hop count can also increase packet loss.
The diagram below illustrates the resolution process when, for example, an OpenDNS resolver is used:
To overcome this problem, recursive resolvers can pass an edns-client-subnet (ECS) EDNS0 option to forwarding resolvers, intermediate name servers, and eventually to the authoritative name servers. Authoritative name servers then use the ECS option as a hint to the end user’s network location and provide a geographically aware answer.
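For illustration, the ECS option's wire format from RFC 7871 can be sketched in a few lines of Python. The client address and prefix length below are arbitrary examples; note that only the prefix bits of the address are sent, not the full host address:

```python
import ipaddress
import struct

def build_ecs_option(client_ip: str, source_prefix: int = 24) -> bytes:
    """Encode an edns-client-subnet option (RFC 7871) for an IPv4 client.

    Only source_prefix bits of the address are included, so the
    resolver reveals the client's network, not its exact address.
    """
    addr = ipaddress.ip_address(client_ip)
    addr_bytes = addr.packed[: (source_prefix + 7) // 8]  # truncate to prefix
    family = 1        # 1 = IPv4 (2 would be IPv6)
    scope_prefix = 0  # always 0 in queries; set by the server in answers
    data = struct.pack("!HBB", family, source_prefix, scope_prefix) + addr_bytes
    return struct.pack("!HH", 8, len(data)) + data  # option-code 8 = ECS

ecs = build_ecs_option("203.0.113.50", 24)
print(ecs.hex())
```

This option is carried inside the RDATA of the OPT pseudo-record described earlier, so it travels in the Additional section of an ordinary query.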
The diagram below illustrates the change in the DNS resolution logic when the edns-client-subnet option is passed:
EDNS saves the day for CDNs relying on DNS and ensures they meet the improved performance promise.
Support for the ECS EDNS0 option has been added by CDNs like Akamai, DNS providers like Dyn and NS1, and public DNS resolvers like Google DNS. Using DNS tests along with the advanced setting to pass client subnet information, one can ensure that the network infrastructure they rely on is working well with this latest enhancement to the DNS protocol.
If you answer yes to any of the points below, you should include DNS tests where you pass the EDNS Client Subnet in your DNS monitoring strategy:
- You have users who use public DNS resolvers complaining about DNS issues.
- You use a CDN and see users being routed to distant CDN servers all the time.
- You provided your name server domains to public DNS resolvers for whitelisting to support the EDNS client subnet. Birthday attacks and cache pollution are two security concerns associated with using the EDNS client subnet. Whitelisting ensures that recursive resolvers send ECS only to whitelisted authoritative name servers and vice versa.
- You see DNS time going up after enabling support for ECS. The DNS resolution process involves a chain of resolvers and servers: stub resolvers, forwarding resolvers, recursive resolvers, intermediate servers, and authoritative servers. Since EDNS is fairly new, not all components may support it, resulting in retries and increased DNS time. A resolver adds the ECS option to its request if it supports it; the server responds with the ECS option if it supports it and ignores the option otherwise.
- You see the same CDN server IP being returned to a wide network of end users, and the server is overloaded. When using ECS, DNS entries are cached against the client subnet included in the query. If the client subnet is generic enough to cover a large number of IPs, the same CDN server IP may be returned for all of them.
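The last point, caching against a client subnet, can be made concrete with a toy sketch. The domain, subnets, and answer IPs below are invented, and a real resolver would also honor the scope prefix returned by the authoritative server and the record TTLs:

```python
import ipaddress

class EcsCache:
    """Toy cache illustrating how answers are stored per client subnet."""

    def __init__(self):
        self._entries = {}  # (qname, network) -> answer

    def put(self, qname: str, subnet: str, answer: str) -> None:
        """Cache an answer keyed by the query name and client subnet."""
        self._entries[(qname, ipaddress.ip_network(subnet))] = answer

    def get(self, qname: str, client_ip: str):
        """Return the cached answer whose subnet covers this client."""
        addr = ipaddress.ip_address(client_ip)
        for (name, net), answer in self._entries.items():
            if name == qname and addr in net:
                return answer
        return None

cache = EcsCache()
# A /8 is generic enough that very different clients share one answer.
cache.put("www.example.com", "203.0.0.0/8", "192.0.2.10")
print(cache.get("www.example.com", "203.0.113.5"))  # same edge IP...
print(cache.get("www.example.com", "203.250.1.9"))  # ...for a distant client
```

With a coarse subnet key like the /8 above, clients hundreds of miles apart hit the same cached entry, which is exactly how one CDN server ends up overloaded.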
From a monitoring perspective, it is always essential to factor in the latest changes and enhancements to protocols. Having a strategy to adopt the enhancements and a platform to test and monitor the adoption is paramount as well. Happy monitoring!