Paris Traceroute

Paris Traceroute is a new version of the well-known network diagnosis and measurement tool.

Why should you use Paris traceroute?
Because traceroute fails in the presence of routers that employ
load balancing on packet header fields. The failures lead to the
discovery of inaccurate and incomplete paths, that may mislead operators during problem diagnosis and result in erroneous internet maps.
Paris traceroute, by controling packet header contents, obtains
a more precise picture of the actual routes that packets follow.
For a quick demonstration, look at the figure on the right. Suppose you are trying to measure the route between Src and Dst.
The true router topology is shown on the left. L is a
router that balances load across two paths, via routers A
or C. The middle of the figure shows what you might see
with classic traceroute. On the right is what you would get with Paris
traceroute...

Paris Traceroute

A brief demonstration of Paris traceroute's skills

About

Paris traceroute is a new version of the well-known network diagnosis and measurement tool. It addresses problems caused by load balancers with the initial implementation of traceroute.

Traceroute's deficiencies

Routers can spread their traffic across multiple paths using a per-packet, per-flow, or per-destination policy. In per-flow load balancing, packet header information ascribes each packet to a flow, and the router forwards all packets belonging to a same flow to the same interface. A natural flow identifier is the classic five-tuple of fields from the IP header and either the TCP or UDP headers: Source IP address, Destination IP address, Protocol, Source port, and Destination port. We found through our experiments that load balancers actually use combinations of these fields, as well as three other fields: the IP Type of Service (or TOS), and the ICMP Code and ICMP header checksum fields.
Per-flow load balancing ensures that packets from the same flow are delivered in order. Per-packet load balancing
makes no attempt to keep packets from the same flow together,
and focuses purely on maintaining an even load. Per-destination load balancing
could be seen as a coarse form of per-flow load balancing, as
it directs packets based upon the destination IP address. But,
as it disregards source information, there is no notion of a
flow per se. As seen from the measurement point of view,
per-destination load balancing is equivalent to classic routing which
is also per destination, and so we will not explore it here.
Whether a router balances load per-flow or per-packet depends
on the router manufacturer, the OS version, and how the network
operator configures it.

Where there is load balancing, there is no longer a single
route from a source to a destination. In the case of per-packet
load balancing, a given packet might take any one of a number
of possible routes. With per-flow load balancing, the notion of
a single route persists for packets belonging to a given flow,
but different flows for the same source-destination pair can
follow different routes. In this regard, designing a new traceroute
able to uncover all routes from a source to a given destination
would be a significant improvement.
Indeed, the current
traceroute is not adequate to that task, as it cannot
definitively identify one single route from among many. It suffers from
two systematic problems: it fails to discover true nodes and
links, and it may report false links. These problems are due to
the random manner in which its probe packets are directed by a
load-balancing router, or load balancer. Our
explanation of these two problems draws on the example in
Figure 1. In this example, L is a load balancer at hop 1 from
the traceroute source. The true router topology from hops 1 through 4
is shown on the left. Routers are represented as circles and each
of their interfaces is numbered. We also show the probe
packets sent with TTL 1 to 4. The packets are depicted as
yellow arrows, either above the topology, if L directs them to
A, or below, if L directs them to B. At the right side, we
present the topology that would be inferred given these probe
packets.

Paris Traceroute
Figure 1 Traceroute's deficiencies under load balancing

Missing nodes and links
By default, traceroute sends three probes per hop. We can imagine a
scenario where all three probes to distance 2 are directed to
A. Device B is not discovered, and the links (L,B) and (B,D)
can thus not be inferred. With purely random load balancing,
there is a significant probability, of 0.5³ x 2 = 0.25, that
one of the two devices, A or B, would be left undiscovered by a
traceroute through this topology. In the general case, there may
be more than two next-hop interfaces following a load balancer. For
instance, the newer Juniper routers permit up to sixteen
equal-cost paths. It is certain that classic traceroute misses
many nodes, and the possibility to infer many links.

False links
In our example, L directs the probe dedicated to hop 2 to A and the one
to hop 3 to B, which leads to the mistaken inference of a link
between A and D. Many topology inference tools send one probe
packet per hop to reduce the probing overhead. Consequently,
these tools cannot easily detect that a single hop responds
with multiple interfaces. In the default usage of traceroute
(i.e., three probes per hop), it is easier to identify instances
of hops that respond with multiple interfaces. In our example, let us
assume that by sending three probes we find both A and B at hop
2, and C and D at hop 3. A liberal approach may infer the
existence of links (A,C), (A,D), (B,C), and (B,D), which would
lead to two false links. On the other hand, a more conservative
approach would require ignoring much data, which would lead to
more missing nodes and links. There is no satisfactory
solution with the existing traceroute.
False links may be caused by
either per-packet or per-flow load balancing. In the latter
case, false links arise because traceroute relies upon the ICMP
replies to its probe packets in order to identify devices along a
route. A router that sends a Time Exceeded message encapsulates the
IP header of the packet that it discarded, plus the first 8
octets of data which, in the case of UDP, TCP, or ICMP probes,
means the first 8 octets of the relevant transport-layer
header.
Traceroute considers these 28 octets in order to match
replies to probes. In the UDP probes that it sends out,
traceroute sets a high destination port number, and systematically
varies this number with each probe so that there will be a
unique identifier in the returning ICMP packet. For ICMP
probes, traceroute uses the sequence number field as the probe
identifier, which in turn impacts the checksum of the ICMP header.
Unfortunately, as we have found, varying any field in the first
four octets of the transport-layer header amounts to
systematically varying the flow identifier with each probe.

How we correct it

We
introduce Paris traceroute, a new traceroute to respond to
load balancing routers. Its key innovation is to control the
probe packet header fields in a manner that allows all probes
towards a destination to follow the same path in the presence
of per-flow load balancing. It also allows a user to distinguish
between the presence of per-flow load balancing and per-packet load
balancing. Unfortunately, due to the random nature of
per-packet load balancing, Paris traceroute cannot perfectly
enumerate all paths in all situations. But it can do
considerably better than the classic traceroute, and it can
flag those instances where there are doubts. The problem, if one wishes
to maintain certain header fields constant, is that traceroute
still needs to be able to match response packets to their
corresponding probe packets.

Figure 2 The IP, UDP and ICMP headers. Per-flow load balancers
use the grey fields to identify a flow. Red arrows
show the fields incremented by classic traceroute.
Paris traceroute uses the green fields to identify probes

Paris traceroute does this by varying header fields that are
within the first 28 octets, but are not used for load
balancing. In the case of TCP probes, Paris traceroute varies
the sequence number. In UDP probes, it is the checksum field.
This requires manipulating the payload to yield the desired checksum,
as packets with an incorrect checksum are liable to be discarded.
In ICMP probes, it is a combination of the ICMP identifier and
the sequence number. We carefully set the value of the ICMP
identifier and sequence number to keep constant the header
checksum of all probes to a destination. Figure 2 summarizes
the IP, UDP, and ICMP header fields that are used by load
balancers, classic traceroute, and Paris traceroute. We include the ICMP
type field as used for load balancing, even if we cannot
verify it experimentally (routers only repond to one type of
probes, which is the ICMP echo request). Our experiments with
UDP, TCP, and IPSec probes do give strong evidence that routers
blindly use the first four octets after the IP header combined
with the IP fields to do load balancing.

相关文章