Latency: Trucks, Blondes, Ferraris and Beer

Post date: Dec 4, 2013 5:53:16 PM

(originally posted May 19, 2013)

"We have a gigabit connection, so it can't be the network." Sound familiar? It is odd how even some seasoned IT staff don't know, or forget, the difference between speed and transfer rate.

So what is the difference? This is best explained by a simple example. Suppose you want to bring beer to someone 10 miles from where you are. What is the fastest way to get the beer there: in a Ferrari or in a truck?

The answer depends on whether you want to bring just a case or a whole pallet. For the case the Ferrari will be the best option, for the pallet the truck.

If the Ferrari drives 120 miles per hour, the 10 miles are done in 5 minutes. The truck, driving 60, will take twice as long and deliver in 10 minutes.

Now delivering is one thing. You expect to get something back for the beer, such as cash or a hot blonde to go with the Ferrari. Every Ferrari needs one. So we care about when the Ferrari and the truck return.

Ignoring the time to load and offload, the round trip takes the Ferrari 10 minutes and the truck 20 minutes.

So how many cases can each transport? If a pallet has 60 cases and the truck is limited to one pallet, the truck makes 3 round trips per hour and can transport 3*60=180 cases per hour. If the Ferrari can only hold one case, to leave space for the blonde, it makes 6 round trips and can transport 6 cases per hour.

The Ferrari is faster, but the truck can transport an amazing 30 times more in an hour.
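The arithmetic above can be checked with a few lines of Python. This is only a sketch of the example: the distance, speeds and loads are the ones used in the text, and the function names are made up for the illustration.

```python
# Figures taken from the beer example; load and offload time is ignored.
DISTANCE_MILES = 10


def round_trip_minutes(speed_mph):
    """Minutes needed to drive to the destination and back."""
    return 2 * DISTANCE_MILES / speed_mph * 60


def cases_per_hour(speed_mph, cases_per_trip):
    """Cases delivered in one hour of continuous round trips."""
    trips_per_hour = 60 / round_trip_minutes(speed_mph)
    return trips_per_hour * cases_per_trip


ferrari = cases_per_hour(speed_mph=120, cases_per_trip=1)
truck = cases_per_hour(speed_mph=60, cases_per_trip=60)

print(f"Ferrari: round trip {round_trip_minutes(120):.0f} min, {ferrari:.0f} cases/hour")
print(f"Truck:   round trip {round_trip_minutes(60):.0f} min, {truck:.0f} cases/hour")
print(f"The truck moves {truck / ferrari:.0f} times more per hour")
```

The round trip time plays the role of latency here, and the cases per hour play the role of bandwidth.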

For network performance we need to keep this in mind. The performance of the network is nearly always shown as Xbit/s like Gb/s or Mb/s. But this only tells us how much data it can transport in a certain timeframe. It does not tell us the speed!

The way to quantify speed is by measuring latency. Latency is the time it takes one data packet to reach its destination. If we use the familiar ping command from a shell or a Windows prompt we see something like this:

64 bytes from icmp_req=3 ttl=249 time=26.3 ms

The value time here is called the round trip time. That’s how long it takes to send a data packet to the pinged computer and back again. It is the same kind of round trip we saw in the beer example.
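If you want to collect these values rather than read them off the screen, a small script can pull the time out of each line. The sketch below is one possible way to do it, not a standard tool: the function name is hypothetical, and the pattern assumes the usual Linux format ("time=26.3 ms") or the Windows one ("time=26ms" or "time<1ms"). Halving the value to get a one-way time is also an assumption, since it only holds when the path is the same in both directions.

```python
import re

# The sample line from the ping output above.
line = "64 bytes from icmp_req=3 ttl=249 time=26.3 ms"


def round_trip_ms(ping_line):
    """Return the round trip time in milliseconds, or None if the line has none."""
    match = re.search(r"time[=<]([\d.]+)\s*ms", ping_line)
    return float(match.group(1)) if match else None


rtt = round_trip_ms(line)
print(f"round trip: {rtt} ms, roughly {rtt / 2:.2f} ms one way if the path is symmetric")
```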

So is the analogy with the truck and the car completely correct? No, not quite. The difference is that on the network the large packets travel just as fast as the small ones. It is more like a highway where the speed limit is set and every packet travels at exactly that limit. What we need to realise most of all is that speed and bandwidth are not the same.

What does influence speed?

There are several factors that influence speed. One of the most important is distance. It is not uncommon for international organisations to have systems that rely on each other but sit in different countries. Most of the journey will be over fiber optics, where the data travels at close to the speed of light (roughly two thirds of its speed in a vacuum). That is very fast, but the round trip to a machine next door is still very different from the round trip to one thousands of miles away.

Another factor, obviously, is the underlying network equipment. The quality and performance of the network interface cards, switches, firewalls etc. is crucial.
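To get a feeling for what distance alone costs, here is a rough calculation. It is a sketch under stated assumptions: light in glass fiber is taken to travel at about two thirds of its speed in a vacuum, the cable is assumed to run in a straight line, and every delay added by equipment is ignored. The distances are made up for the illustration.

```python
SPEED_OF_LIGHT_MILES_S = 186_282   # in a vacuum
FIBER_FACTOR = 2 / 3               # assumption: light in glass is roughly this much slower


def min_round_trip_ms(distance_miles):
    """Lower bound on the round trip time over fiber, ignoring all equipment."""
    one_way_s = distance_miles / (SPEED_OF_LIGHT_MILES_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000


# Hypothetical distances, only to show the order of magnitude.
for label, miles in [("next door", 0.1), ("500 miles", 500), ("5000 miles", 5000)]:
    print(f"{label:>10}: at least {min_round_trip_ms(miles):.3f} ms")
```

No amount of extra bandwidth brings these numbers down. Real round trips will be higher, because cables do not run in a straight line and every device on the way adds its own delay.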

The third factor is other use of the network. You are sending your packets across the network, but other applications are doing the same. When we overload the network, things seem to slow down. The packets however travel at the same speed. The speed of light does not all of a sudden slow down. What happens is that if there is too much traffic, packets get dropped. In our car analogy, we put vehicles on the highway, but if there is no room, they simply get dropped.

That is a great way to prevent jams, but in real life you would not want to be in a car that simply disappears. For the network it is not so bad, as there are failsafes in place that ensure that missed packets are sent again. But it takes time before they are resent. Packets being dropped (lost) is a major cause of latency. So the Xbit/s does have a relation with speed. If we have a lot of bandwidth, we decrease the chance of overloading and the packet loss and loss of speed that follow from it.


Why should we care? If I have a high bandwidth connection with bad latency, the big file I’m downloading gets here nearly as fast. The few milliseconds of difference are not noticeable to me. For the performance of applications, on the other hand, latency can become a huge issue if there is a lot of back and forth communication over a connection.
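A very simplified model makes the download case concrete. It assumes one round trip to request the file followed by a steady stream at full bandwidth, and it deliberately ignores protocol effects such as TCP slow start and window sizes, which can matter on long distance links. The file size and round trip times are hypothetical.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_s, round_trip_ms):
    """Simplified model: one round trip to ask for the file, then a steady stream."""
    return round_trip_ms / 1000 + size_bytes * 8 / bandwidth_bits_per_s


GIGABIT = 1_000_000_000      # 1 Gb/s
FILE_SIZE = 1_000_000_000    # hypothetical 1 GB file

for rtt in (1, 30, 100):     # hypothetical round trip times in ms
    print(f"RTT {rtt:>3} ms: {transfer_seconds(FILE_SIZE, GIGABIT, rtt):.3f} s")
```

In this model the latency is paid once, so it disappears next to the seconds spent moving the data.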

A recent example demonstrates this well. There was an issue with an application on a development server which took very long to start up. On production the startup was fast whereas on development the startup took over half an hour. For development that is actually a bigger issue as they have restarts much more often than on production.

They had some issues tackling this problem and when I let it slip that I dabble with performance issues I was asked to join the group trying to solve this.

What they had already established was that the delay was caused by a single query. When the query was run from the application server it took half an hour. Running the same query on the server itself (that is, logging in to a shell on the server and running the query there) returned the result within seconds. Considering how lightweight the client we used for running the query was, this pointed at the network. But the network seemed fine, with a round trip time of 3 ms. Not a very fast value, but not that bad. We had other network settings checked, such as whether it was on full duplex, the bandwidth, etc. Everything seemed fine.

Baffled, I performed the test myself. That was a good reminder to be careful with assumptions. The statement that the query took a long time had made me think it would take a long time before we would see any result. In fact the first results came within seconds, but on the badly performing development server the query took a long time to finish. The result was 50000 records and they scrolled slowly over the screen.

After some digging and googling we found the cause: the protocol used by the client performing the query was a ‘chatty’ protocol. It received the result in small chunks and it would get them sequentially. So for 50000 records, that was a lot of round trips of the client saying: "give me the next bit" and getting the next bit. We were able to increase the size of the chunks it retrieved and improved the performance dramatically.
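The effect of the chunk size is easy to estimate. The sketch below counts only the time spent waiting on round trips, using the 50000 records and the 3 ms round trip time from this case. The chunk sizes are hypothetical, since the actual values are not part of this story, and the model does not try to reproduce the half hour we measured: it leaves out whatever the client and the server did on each round trip.

```python
import math


def round_trips(records, chunk_size):
    """Number of sequential fetches needed to retrieve all records."""
    return math.ceil(records / chunk_size)


def waiting_seconds(records, chunk_size, round_trip_ms):
    """Time spent purely waiting on the network, one round trip per chunk."""
    return round_trips(records, chunk_size) * round_trip_ms / 1000


RECORDS = 50_000       # size of the result in this case
ROUND_TRIP_MS = 3      # round trip time measured in this case

for chunk_size in (1, 10, 100, 1000):   # hypothetical chunk sizes
    trips = round_trips(RECORDS, chunk_size)
    waited = waiting_seconds(RECORDS, chunk_size, ROUND_TRIP_MS)
    print(f"chunk size {chunk_size:>4}: {trips:>5} round trips, {waited:7.2f} s waiting")
```

Every tenfold increase in the chunk size cuts the number of round trips, and with it the waiting time, by a factor of ten.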

And remember, this was not even a load test. This was just an SQL query by one user. Imagine this in a load situation.

Latency is important when we have many sequential calls to another system. This can be application related, where a function makes many calls instead of one large call, or, as in the previous example, it can come from the underlying infrastructure, where there was just one call but the retrieval method was ‘chatty’. Many calls running in parallel do not have to be an issue. To sum it up, latency matters when:

    • the volume of requests is high

    • and the requests are dependent on each other (sequential)
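The difference between sequential and parallel calls can be simulated without a network. In this sketch the remote system is replaced by a sleep, and the round trip time and the number of calls are hypothetical values chosen for the illustration.

```python
import asyncio
import time

ROUND_TRIP_S = 0.01   # hypothetical 10 ms round trip, simulated with a sleep
CALLS = 20            # hypothetical number of requests


async def remote_call(i):
    """Stand-in for a request to another system; only the waiting is simulated."""
    await asyncio.sleep(ROUND_TRIP_S)
    return i


async def sequential():
    # Each call waits for the previous one, so the round trips add up.
    return [await remote_call(i) for i in range(CALLS)]


async def parallel():
    # Independent calls are in flight together, so the round trips overlap.
    return await asyncio.gather(*(remote_call(i) for i in range(CALLS)))


def timed(coro_func):
    """Run the coroutine function and return its result and the elapsed seconds."""
    start = time.perf_counter()
    result = asyncio.run(coro_func())
    return result, time.perf_counter() - start


seq_result, seq_time = timed(sequential)
par_result, par_time = timed(parallel)
print(f"sequential: {seq_time:.3f} s, parallel: {par_time:.3f} s")
```

The sequential version pays the round trip for every call, while the parallel version pays it roughly once. That only works when the calls do not depend on each other's results.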

The remedy can sometimes be cheap, like changing some settings. It can be complex, like changing your infrastructure to bring the different servers closer together with a dedicated connection. Or it can be very expensive, if you need a big change in software to account for the lack of speed.

But most of all, latency in the network cannot be neglected when dealing with performance issues. Not even when dealing with just a single user and just one query!

We don’t like our applications and protocols to be chatty any more than we like our blondes to be.