Timeouts on Trace Routes

Published on: July 7, 2014

What do dropped packets on a traceroute mean?

What is a traceroute?

A traceroute is a program that traces the path a packet takes from point A to point B. It does this by sending a series of packets with a deliberately small, increasing time-to-live (TTL) value. Each router that receives a packet whose TTL has run out replies with an error message, which tells the program that this router is on the path. The program sends these packets one after another until it gets a response from the final destination. Stringing the error messages together in order gives the logical path that the packet takes.
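As a rough illustration of the mechanism, here is a small Python simulation (no real packets are sent). It assumes the usual traceroute behaviour of giving each probe a time-to-live (TTL) one larger than the last, so that each router along the path in turn is the one that returns the error message. The Router class, the send_probe and traceroute functions, and the router names are all made up for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Router:
    """A hypothetical hop; `replies` models whether it answers probes."""
    name: str
    replies: bool = True


def send_probe(path: List[Router], ttl: int) -> Optional[str]:
    """Walk one probe along the path, decrementing its TTL at each router.

    Returns the name of the device that answers, or None for a timeout.
    """
    for index, router in enumerate(path):
        if index == len(path) - 1:
            # The probe reached the destination, which answers for itself.
            return router.name if router.replies else None
        ttl -= 1
        if ttl == 0:
            # TTL ran out here: the router sends the error, or stays silent.
            return router.name if router.replies else None
    return None


def traceroute(path: List[Router], max_hops: int = 30) -> List[str]:
    """Send probes with TTL 1, 2, 3, ... and record who answers each one."""
    if not path:
        return []
    hops = []
    for ttl in range(1, max_hops + 1):
        answer = send_probe(path, ttl)
        hops.append(answer if answer is not None else "*")
        if answer == path[-1].name:
            break  # the destination replied, so the trace is complete
    return hops
```

Calling traceroute on a path whose second router never answers gives a "*" for that hop while the later hops still show up, which is the pattern the rest of this article is about.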

What information can I get from it?

A traceroute can be a very useful tool for pinpointing a single failure in a network, congestion, and sometimes even routing loops. It lets you see the route a packet takes to a specific IP address and lets network engineers find out whether a specific router may be having an issue.

Why timeouts on a router aren’t always a bad thing.

Control Plane vs Data Plane

When routers receive packets addressed to them, they treat them very differently from packets that are not addressed to them. This is the separation between the control plane (the brain of the router) and the forwarding plane, also called the data plane (the arms of the router). Routers are very different machines from most servers because they were designed to do one thing: route packets. To do this they have very strong arms (forwarding plane) but very weak brains (control plane) in terms of processing power.

When a router receives a packet that is not addressed to it, it uses its arms (forwarding plane) to push the packet out of the correct interface. This is a very quick process (millionths of a second).

When a router receives a packet that is addressed to itself, the packet goes to the router's brain (control plane). This is a much slower process and, in the hands of a malicious user, can be used to attack a router and possibly bring it down. To counteract this, engineers designed the brain (control plane) to be completely separate from the arms (forwarding plane) and limited the number of packets the brain can process, so that it is much more difficult for an attacker to overload it.
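One common way to picture this kind of limit is a token bucket. The Python sketch below is only an illustration of the idea, not how any particular router implements it; the ControlPlanePolicer class and its parameters are hypothetical, and real routers enforce limits like this in vendor-specific ways.

```python
class ControlPlanePolicer:
    """Hypothetical token bucket limiting packets handed to the control plane."""

    def __init__(self, rate_per_second: float, burst: int) -> None:
        self.rate = rate_per_second   # tokens added back per second
        self.burst = burst            # maximum tokens the bucket can hold
        self.tokens = float(burst)    # start with a full bucket
        self.last_time = 0.0          # time of the previous packet, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a packet arriving at time `now` gets processed."""
        elapsed = now - self.last_time
        self.last_time = now
        # Refill for the time that has passed, never above the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        # Bucket is empty: the packet is dropped and the sender sees a timeout.
        return False
```

With a rate of 1 packet per second and a burst of 2, packets arriving at 0.0, 0.1, 0.2 and 0.3 seconds are allowed, allowed, dropped and dropped, and a packet arriving at 2.0 seconds is allowed again. Packets that are merely passing through the router never touch this limit.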

But what does this have to do with traceroutes?

Well, you see, when a traceroute packet expires at a router, the job of sending back the error message goes to the brain (control plane) of the router. These messages have a VERY low priority, which means that unless the router has nothing better to do, it will simply drop the message (a timeout). So if lots of people are sending traceroutes through that router (as there often are with high-traffic routers), it will appear that the router is having issues when in reality there may be nothing wrong with it.

When ARE timeouts on a traceroute a bad thing?

There is only one real instance when a timeout on a traceroute is a bad thing: when the timeouts continue forward in the route. By that I mean you see an individual timeout and then nothing but timeouts at the hops after it.

There are two main cases in which this happens. The first and most common is that a firewall in the route was configured to block these packets. The other is that a router is dropping packets going THROUGH it (i.e. forwarding plane (arms) packets), and this can be a VERY bad sign. It usually has one of three causes: the router is overloaded, the router has a software or physical failure, or the router is configured to do this (null routes/blackholes). This should be brought to our team's attention so that we can do our best to route around the problem, lessening the impact on your services, or investigate whether there is a null route or firewall in the way.
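The rule of thumb above (ignore isolated timeouts, worry about timeouts that never recover) can be sketched in a few lines of Python. The function name and the list-of-hops input format are assumptions made for this example. It only finds where the trailing timeouts begin; it cannot tell a firewall from a router that is really dropping traffic.

```python
from typing import List, Optional


def first_persistent_timeout(hops: List[Optional[str]]) -> Optional[int]:
    """Return the 1-based hop number where timeouts start and never recover.

    `hops` holds the responder for each hop in order, with None for a timeout.
    Returns None when the last hop answered, since any earlier timeouts were
    then isolated and can be ignored.
    """
    start = None
    for number, responder in enumerate(hops, start=1):
        if responder is None:
            if start is None:
                start = number  # possible beginning of a run of timeouts
        else:
            start = None        # a later hop answered, so the run was harmless
    return start
```

For a trace that times out at hop 2 but reaches its destination, this returns None. For a trace that answers at hops 1 and 3 and then times out from hop 4 onward, it returns 4, which is where the investigation should begin.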

Examples

A Good Trace Route


Notes:

  • Even though there is a timeout in the middle, the packet still makes it all the way to the end (note the “Trace complete”).
  • The last 3 hops are the most important: the packet comes into our edge (ten-7-4.edge1.level3.mco01.hostdime.com), then passes our core (xe1-3-core1.orl.hostdime.com), then reaches the final hop.
  • If a traceroute begins to time out after the core, then 99% of the time it's a firewall issue on the server itself.

A Bad Trace Route





Notes:

  • Even though there is a timeout in the middle, the packet still makes it another hop (4.69.202.65), meaning that the 4th hop router is not the issue.
  • We know from the last traceroute that ae-1-8-bar1.orlando1.level3.net is the next hop, meaning that's where the issue starts.
  • Also note that the trace does not finish, meaning that this device is unreachable.





A Good Trace Route that appears bad due to a firewall



Notes:

  • Just like with both the good and the bad traceroute, there is a drop in the middle, but note that the packet still makes it all the way past both our edge and core, meaning it makes it into our network.
  • 99% of the time, if the trace makes it past our edge and core, then it is a firewall/physical issue.




