Since its inception in 1988, traceroute has gone through several variations. You might be wondering, ‘Why so many?’ The answer is simple: traceroute has always had to balance security and utility. Whenever malicious actors exploited firewall and router vulnerabilities, vendors responded with fixes that affected the traceroute algorithms. In turn, these algorithms were modified to cope with the changes in network devices. This applies to current traceroute tools too, which can report false packet loss for some IPs because of firewalls or load-balanced routers. From this point onwards, as in the cited research papers, we refer to the latter as load balancers.
See, for example, what happens when running a standard traceroute using TCP towards bing.com:
Catchpoint engineers reviewed numerous RFCs (the formal documents that define the protocols and standards of the Internet) as well as research papers. After successfully identifying a resolution for this problem, we went on to build Traceroute InSession to effectively address it. While it’s not a novel approach, it’s a single, functional tool that resolves significant frustrations for network engineers.
Here’s what happens when running Catchpoint’s InSession traceroute to bing.com (built from the open-source code linked below):
The path now shows less variability in hops and no packet loss on the destination. Read the rest of this blog post to understand why.
A brief history of Traceroute
Traceroute was created on December 20th, 1988, by Van Jacobson – one of the first inductees into the Internet Hall of Fame in 2012. His goal was to answer the troublesome question: ‘Where are the packets going?’
“I cobbled up a program to trace out the route to a host. It works by sending a udp packet with a ttl of one & listening for an icmp “time exceeded” message. If it gets one, it prints the source address from the icmp message, then bumps the ttl by one.” Van Jacobson
UDP probes were originally preferred to ICMP echo requests, presumably because, at the time, routers could legitimately refuse to send ICMP TTL exceeded messages in response to ICMP ECHO messages, as stated in the introduction of RFC792 (Internet Control Message Protocol): “The ICMP messages typically report errors in the processing of datagrams. To avoid the infinite regress of messages about messages etc., no ICMP messages are sent about ICMP messages”. Thus, although ICMP ECHO messages were suitable for pings, they were not suitable for tracing the path from a source to a destination.
Later, RFC1122 (Requirements for Internet Hosts) loosened that requirement, and thus traceroute variants using ICMP probes were born.
As time marched on, so did the spread of threats, attacks and – consequently – firewalls on the Internet. While these defensive mechanisms made the Internet more secure, they also affected the results achievable via traceroute. Some firewalls are set to filter out ICMP and UDP probes or to deprioritize ICMP echo packets. As a result, traceroute can show incomplete paths with significant packet loss or high RTT that is an artifact of the security rules configured on routers rather than a real network problem.
For this reason, tcptraceroute was introduced in mid-2001. This new traceroute offers a solution to this problem because “[…] in many cases, these firewalls will permit inbound TCP packets to specific ports that hosts sitting behind the firewall are listening for connections on. By sending out TCP SYN packets instead of UDP or ICMP ECHO packets, tcptraceroute is able to bypass the most common firewall filters”, as stated in the tcptraceroute GitHub repository.
Since then, various groups have built and maintained multiple versions of traceroute for different operating systems. On Windows, tracert exclusively uses ICMP packets. In contrast, on most Linux distributions, you can find the “modern version” of traceroute, initially written in 2001 and still maintained by Dmitry Butskoy, which supports tracing with UDP, ICMP and TCP probes. Besides these tools, multiple traceroute techniques were developed in academia to solve specific issues, like improving the accuracy of paths (Paris traceroute), accounting for NAT (Dublin traceroute) and handling IP aliasing (Pamplona traceroute).
The problem with TCP probes
The introduction of TCP in traceroute did not solve all the problems, though.
Both tcptraceroute and Butskoy’s traceroute use the well-known TCP half-open technique. Specifically, they send TCP SYN packets with increasing TTL for each hop until the SYN packets reach their destination, triggering a response which is usually a TCP RST – if nothing is listening on the destination port – or a regular TCP SYN+ACK if there is. It’s also possible to receive no response if a firewall has dropped the TCP SYN for any reason – a scenario we’ll illustrate shortly. If the destination replies with a TCP SYN+ACK, the sender will generate a TCP RST so that the TCP session is never truly established.
By mimicking a TCP session opening, this technique is unlikely to be filtered by destinations where an active service is running on the chosen destination port – such as websites on port 80. Still, as security threats grow, so do defenses – and modern firewalls and NAT configurations can interpret the SYN packets as a flood, port-scan or intrusion attempt, as described in the TCP sidecar research paper. This is particularly true for tools sending simultaneous probes, like Butskoy’s traceroute where 16 probes are sent in parallel by default.
From the perspective of a traceroute, all these factors contribute to the destinations dropping packets, leading users to believe there is packet loss and hence problems. However, any actual TCP session established on that destination won’t show any real networking problem.
Because of this issue, more tools and algorithms were developed. Paratrace, TCP sidecar, 0trace and Service traceroute took it a step further by piggybacking TCP probes on top of already established TCP sessions. As the authors of TCP Sidecar put it, “Probes sent from within TCP connections can traverse and expose the firewalls and NATs that traceroute probing cannot.”
This approach is great but has a fundamental limitation: it requires a TCP session that is already established. It is still possible to intentionally open a TCP session with the remote end and send application-level data, as camotrace does. However, this means interfering with real traffic being handled by the application server, something that traceroute traditionally tries to avoid.
There is an additional caveat regarding traceroutes built on the TCP half-open technique. Each SYN packet generated by this technique uses a different TCP source port, which is one of the fields load balancers use to route packets over different paths. If packets follow different paths, it’s possible that only one of them reaches the destination with a given TTL. This leaves the last hop of the traceroute with only one RTT value because the rest of the packets were lost. That loss is caused by the load balancers, not by packets actually being lost on the way to or from the destination.
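To make the source-port effect concrete, here is a minimal sketch of per-flow load balancing. It assumes a router that hashes the five-tuple to choose among equal-cost next hops; the `pick_next_hop` function, the hash choice and all addresses are hypothetical and do not describe any specific device.

```python
import hashlib

def pick_next_hop(src_ip, src_port, dst_ip, dst_port, proto, next_hops):
    """Hypothetical per-flow balancer: hash the five-tuple and pick one next hop."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}/{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

# Illustrative addresses only, not real devices.
NEXT_HOPS = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
SRC, DST = "192.0.2.10", "198.51.100.7"

# Half-open style: each SYN probe uses a fresh source port, so the hash
# (and therefore the chosen path) can differ from probe to probe.
half_open_paths = {
    pick_next_hop(SRC, port, DST, 443, "tcp", NEXT_HOPS)
    for port in range(40000, 40016)
}

# In-session style: every probe shares one five-tuple, so the balancer
# always selects the same next hop.
in_session_paths = {
    pick_next_hop(SRC, 40000, DST, 443, "tcp", NEXT_HOPS)
    for _ in range(16)
}

print("half-open probes used", len(half_open_paths), "path(s)")
print("in-session probes used", len(in_session_paths), "path(s)")
```

Under this assumption, probes that keep the same five-tuple always land on a single next hop, while probes with varying source ports may be spread across several.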
Catchpoint solution: Traceroute InSession
Here was the challenge: find a reliable way to determine the route and measure its performance while interfering with the application server as little as possible, all while avoiding being blocked by firewalls.
To address the reliability concerns, we chose not to reinvent the wheel. Instead, we enhanced a tool rich in features and already popular in the Linux community: Butskoy’s traceroute. Dmitry has done an excellent job writing and improving the tool over the years, and his code is easily understandable and customizable.
To bypass firewalls, we decided to leverage TCP – for all the reasons identified by researchers and developers in the past. Unlike Butskoy’s method, however, we chose to establish a TCP session from source to the destination and traceroute on top of that. This way, any firewall crossed will see a regularly established TCP session with regular data flowing. This approach also provides a degree of protection against load balancers, similar to Paris traceroute.
The most challenging part of this issue is accomplishing all the above with minimal impact on the application. Ideally, we want to be able to reach the destination with our probes, trigger a reply from the destination, and keep the TCP session as idle and short-lived as possible.
We can achieve this partially by leveraging the TCP congestion control mechanism. The idea is to exploit the fact that TCP delivers only ordered data to the application level, meaning that a chunk of data containing a gap in the stream of sequence numbers is not delivered until that gap is filled. Once the gap is identified by the destination, a TCP ACK message will be generated, communicating to the sender the sequence number of the missing packet for a re-transmission. If the TCP Fast retransmit/Fast recovery congestion control mechanism is in place, this also means that we will receive an ACK per probe because “A TCP receiver SHOULD send an immediate duplicate ACK when an out-of-order segment arrives” (RFC5681).
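The behavior described above can be sketched with a toy receiver. This is a simplified model, not a TCP implementation: the `InOrderReceiver` class is hypothetical, it assumes the receiver acknowledges every segment immediately, and it ignores windows, timers and sequence number wrap-around.

```python
class InOrderReceiver:
    """Toy TCP receiver: delivers contiguous data only and ACKs every segment."""

    def __init__(self, next_expected):
        self.next_expected = next_expected  # first sequence number still missing
        self.buffered = {}                  # out-of-order segments: seq -> payload
        self.delivered = []                 # payloads handed to the application

    def receive(self, seq, payload):
        """Process one segment and return the cumulative ACK number sent back."""
        if seq >= self.next_expected:
            self.buffered[seq] = payload
        # Deliver data only while the segment at next_expected is available.
        while self.next_expected in self.buffered:
            data = self.buffered.pop(self.next_expected)
            self.delivered.append(data)
            self.next_expected += len(data)
        return self.next_expected


receiver = InOrderReceiver(next_expected=1)

# Three one-byte probes sent beyond a gap at sequence number 1.
probe_acks = [receiver.receive(seq, b"p") for seq in (2, 3, 4)]
print("ACK numbers for the probes:", probe_acks)
print("delivered to the application:", receiver.delivered)
```

In this model each probe triggers an ACK that still points at the missing byte, and nothing reaches the application while the gap stays open.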
This, however, is not enough. The TCP ACKs generated by the destination will all carry the very same acknowledgment number, while we need a way to map each returning TCP packet to the original probe. This is crucial since Butskoy’s implementation is based on simultaneous probes, a feature we did not want to lose because it makes traceroute really fast.
To solve that, we took inspiration from tracebox, leveraging the TCP congestion control mechanism together with the Selective ACKnowledgment (SACK) option, introduced in 1996 in RFC2018 (TCP Selective Acknowledgment Options).
SACK is an option “to be sent by a data receiver to inform the data sender of non-contiguous blocks of data that have been received and queued. […] This option contains a list of some of the blocks of contiguous sequence space occupied by data that has been received and queued within the window” (RFC2018).
Each block describes an interval of contiguous sequence numbers received by the destination, represented by the first sequence number in the block (left edge) and the sequence number immediately following the last one in the block (right edge).
This means that if we send probes with a gap and incrementing sequence numbers, each probe reaching the destination will generate a different interval. Frequently, it’s straightforward to match returning ACKs with the original probes, as shown in the image below. However, this is not always the case because probes can reach the destination out of order, and the intervals can be disjoint. This required us to devise our own algorithm to correctly map ACKs to their original probes.
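As a rough illustration of how the intervals evolve, the sketch below computes the SACK blocks a receiver would report after each probe arrives. It assumes one-unit probes whose numbers stand in for sequence numbers, and blocks written as half-open intervals (left edge included, right edge excluded). The helper names are hypothetical, and the model is simplified: it reports every block in sequence order, whereas a real stack reports a limited number of blocks with the most recent one first.

```python
def sack_blocks(received):
    """Merge received one-unit probe numbers into half-open (left, right) blocks."""
    blocks = []
    for seq in sorted(set(received)):
        if blocks and blocks[-1][1] == seq:
            blocks[-1][1] = seq + 1        # probe extends the previous block
        else:
            blocks.append([seq, seq + 1])  # probe starts a new block
    return [tuple(block) for block in blocks]


def acks_for_arrivals(arrival_order):
    """Return the SACK blocks carried by the ACK that each arriving probe triggers."""
    seen = []
    acks = []
    for seq in arrival_order:
        seen.append(seq)
        acks.append(sack_blocks(seen))
    return acks


# Hypothetical arrival order: probes 8, 4 and 5 reach the destination out of order.
for probe, blocks in zip([8, 4, 5], acks_for_arrivals([8, 4, 5])):
    print("probe", probe, "-> SACK blocks", blocks)
```

Each arrival changes the reported blocks in a distinct way, which is what makes it possible to tell the ACKs apart even though they share the same acknowledgment number.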
The algorithm we came up with involves collecting the returning ACKs until the traceroute is over, ordering them by the size of the interval they cover, and then assigning the probes while taking into account the interval content.
Consider, for example, the scenario depicted in the image below, where six probes were sent and six ACKs were received and ordered by the size of the interval they cover.
The mapping algorithm starts by looking at the ACK carrying the SACK block [8-9[ and directly mapping the ACK to probe 8, since that is the only probe that could have triggered such an ACK. Indeed, the right edge (9) indicates the first sequence number yet to be received, while the left edge (8) indicates the first sequence number already received. The next ACK considered is the one carrying the intervals [4-5[, [8-9[. Since probe 8 has already been mapped to the previous ACK, the only probe that could have triggered this ACK is probe 4. The algorithm proceeds in a similar way for the rest of the received ACKs, matching all the probes that reached the destination.
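The walkthrough above can be approximated with the following sketch. It is an illustration under stated assumptions, not the code shipped in Traceroute InSession: probes are one unit long and identified by their sequence number, SACK blocks are half-open intervals, and an ACK that could match more than one unassigned probe is simply left unmapped. The function names and the sample data are hypothetical.

```python
def covered(blocks):
    """Expand half-open SACK blocks (left, right) into the set of probes they cover."""
    return {seq for left, right in blocks for seq in range(left, right)}


def map_acks_to_probes(acks):
    """Map each ACK, identified by its index in arrival order, to one probe.

    `acks` is a list in which each item holds the SACK blocks one ACK carried.
    Returns {ack_index: probe_number}; ambiguous ACKs are left out.
    """
    # Visit ACKs from the smallest covered interval to the largest.
    order = sorted(range(len(acks)), key=lambda i: len(covered(acks[i])))
    assigned = set()
    mapping = {}
    for i in order:
        candidates = covered(acks[i]) - assigned
        if len(candidates) == 1:           # exactly one probe can explain this ACK
            probe = candidates.pop()
            mapping[i] = probe
            assigned.add(probe)
    return mapping


# Hypothetical ACKs in the order they were received.
received_acks = [
    [(4, 5), (8, 9)],
    [(8, 9)],
    [(4, 6), (8, 9)],
]
print(map_acks_to_probes(received_acks))
```

Sorting by covered size means each ACK is examined only after the ACKs for smaller intervals, so the probes already assigned can be excluded from its candidates.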
There is a final caveat though: not every host on the Internet has the TCP Fast retransmit/Fast recovery congestion control mechanism in place. This means that TCP ACKs can be delayed, making it impossible to distinguish a regular packet loss from a delayed TCP ACK carrying a SACK interval that includes multiple probes, as depicted in the image below.
To address this final issue, we decided to perform an initial sequential TCP ping to the destination and then replace the final hop of the traceroute with the results of that TCP ping. Sending probes sequentially gives the destination time to generate each ACK and gives traceroute the ability to identify actual packet loss.
In summary, the new Traceroute InSession offers those troubleshooting network-related issues the following capabilities:
- Prevents false packet loss introduced by firewall and router configurations related to security.
- Ensures that packets follow a single flow, akin to a regular TCP session, to bypass load-balanced routers.
- Utilizes the TCP protocol to simulate application packet traffic.
- Obtains traceroute results as quickly as possible.
- Provides a single tool that combines the above benefits with all the features of a traditional traceroute tool.
Traceroute InSession has been available in the Catchpoint portal since the Eagle release in May 2023. Catchpoint customers can now run periodic TCP and InSession traceroutes to understand the differences between the two variants. Below is an example of the trend in packet loss targeting Bing.com, using data from Catchpoint.
You don’t have to be a Catchpoint customer to take advantage of InSession. In line with the ethos of the original author, we’ve open-sourced the enhanced version of Dmitry Butskoy’s traceroute, which can be found here.