HTTP keepalive

HTTP PERSISTENCE

I was treated to a nice brunch on Father’s Day, and after a reinvigorating walk I was ready to dive back into my weekend studies.

Recently I have been reviewing my http knowledge in preparation for my F5 exam.

Part of understanding http is to look at how http works at the tcp level.

Browsers talk to Web sites over http, and the GET request/response exchange is one of its basic constructs.

A browser sends an http GET request and the Web server returns the requested content in an http response.

In http/1.0, the first widely used version of the protocol, each http request/response by default requires a new tcp connection to be established between the client machine and the server.

The graphic below shows a simplified overview of the http conversation.

http1v2

Under http/1.0, every GET/RESPONSE pair opens its own tcp session, and each open session consumes resources on both the server and the client machine.

Since each connection requires a tcp 3-way handshake to open and a further exchange to close, every http query adds extra traffic to the network.

On fast LANs this additional overhead may not be noticeable, but as more http sessions are created the network load climbs. This can become a problem over wifi and slow WAN links where there is potential contention for bandwidth.

Modern browsers support http/1.1, which solves this by keeping a tcp connection open for subsequent GET/RESPONSE exchanges from the same browser. The function that makes this happen is called “http keepalive”. In http/1.1 a connection is persistent by default unless either side sends “Connection: close”.

“Http keepalive” reduces tcp overheads associated with initiating, tracking and closing multiple connections between the same client and server.
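To put a rough number on that saving, here is a back-of-the-envelope sketch. The segment counts and the 30-object page are assumptions chosen for illustration, not values measured in any trace in this post, and real stacks often combine ACKs so the teardown can be shorter.

```python
# Assumed segment counts for one tcp connection (illustrative, not measured)
HANDSHAKE_SEGMENTS = 3  # SYN, SYN-ACK, ACK
TEARDOWN_SEGMENTS = 4   # FIN, ACK, FIN, ACK (can be fewer when ACKs are combined)


def overhead_segments(requests, requests_per_connection):
    """Return the tcp open/close segments needed to carry `requests`."""
    # Ceiling division: a partly used connection still costs a full open/close.
    connections = -(-requests // requests_per_connection)
    return connections * (HANDSHAKE_SEGMENTS + TEARDOWN_SEGMENTS)


page_requests = 30  # hypothetical page made up of 30 objects

print(overhead_segments(page_requests, 1))              # new connection per request -> 210
print(overhead_segments(page_requests, page_requests))  # one reused connection -> 7
```

Under these assumptions the open/close overhead drops from 210 segments to 7 when a single connection is reused for the whole page.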

Http keepalive is also called http persistence.
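The reuse can also be observed without a packet capture. The following is a minimal sketch using only Python's standard library, not the setup used for the traces in this post. It starts a throwaway server on 127.0.0.1 and counts how many tcp connections that server accepts for two GETs. The function name count_connections and the request paths are made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def count_connections(reuse):
    """Send two GETs to a local test server and return how many tcp
    connections the server accepted."""
    accepted = []  # one entry per accepted tcp connection

    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # lets http.server keep connections open

        def setup(self):
            # setup() runs once per accepted connection, not once per request
            accepted.append(self.client_address)
            super().setup()

        def do_GET(self):
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the demo quiet

    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        if reuse:
            # http/1.1 style: both requests share one persistent connection
            conn = http.client.HTTPConnection("127.0.0.1", port)
            for path in ("/first", "/second"):
                conn.request("GET", path)
                conn.getresponse().read()
            conn.close()
        else:
            # http/1.0 style: a fresh tcp connection for every request
            for path in ("/first", "/second"):
                conn = http.client.HTTPConnection("127.0.0.1", port)
                conn.request("GET", path)
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return len(accepted)


print(count_connections(reuse=True))   # 1
print(count_connections(reuse=False))  # 2
```

With reuse the server sees a single connection carrying both requests; without it, each request arrives on its own connection.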

An example http persistent trace

Here is a quick look at how a persistent http connection is set up.

WebResponse

The GET request indicates the browser is looking for a persistent connection; refer to the field value “Keep-Alive”.

The Web server Response shows that it supports http keep-alive.
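Some servers also advertise their limits in a separate Keep-Alive response header of the form "timeout=N, max=M". Whether the server in the trace sent one is not stated here, so the header values below are hypothetical, as is the helper name parse_keep_alive. This is only a sketch of how such a value could be read.

```python
def parse_keep_alive(value):
    """Parse a Keep-Alive header value such as 'timeout=10, max=100'."""
    params = {}
    for part in value.split(","):
        name, sep, raw = part.strip().partition("=")
        # Keep only well-formed name=integer pairs
        if sep and raw.strip().isdigit():
            params[name.strip().lower()] = int(raw.strip())
    return params


# Hypothetical response headers, not copied from the capture
response_headers = {
    "Connection": "Keep-Alive",
    "Keep-Alive": "timeout=10, max=100",
}

print(parse_keep_alive(response_headers["Keep-Alive"]))
# {'timeout': 10, 'max': 100}
```

In that header, timeout is the idle time in seconds the server is prepared to wait, and max is the number of requests it will serve on the connection.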

Response2

Following is a wireshark snippet showing multiple requests over the same tcp connection.

MultipleGet

You can see the GET/RESPONSE for two different requests from the client in frames 110/129 and 149/211.

Keepalive values for client and servers

Since tcp is a conversation that either party can close, the client and the server each maintain a keepalive timer that is independent of the other side.

The keepalive values for a client (browser) and the Web server are therefore not necessarily the same.
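Because the two timers run independently, an idle connection lasts only as long as the shorter of them. A tiny sketch of that rule follows. The function name and the timeout values are illustrative assumptions, and when both timers are equal the code arbitrarily reports the server.

```python
def first_to_close(client_idle_timeout, server_idle_timeout):
    """Return (side, seconds) for the side whose idle timer expires first."""
    if server_idle_timeout <= client_idle_timeout:
        return ("server", server_idle_timeout)
    return ("client", client_idle_timeout)


# Hypothetical timers: a patient browser and an impatient server
print(first_to_close(client_idle_timeout=120, server_idle_timeout=15))
# ('server', 15)
```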

Obviously the client can be more tolerant of longer keepalives.

Generally most browsers will have a keepalive of up to 2 minutes.

However, a server will get many tcp connections from many different user sessions. Therefore its keepalive timer must be relatively short, otherwise idle connections consume unnecessary resources on the server (the main concern being free RAM).

Servers are normally configured with a low keepalive value, often in the range of 10s to 20s.
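A rough way to see why the server timer matters: the number of idle connections held at any moment is about the arrival rate multiplied by how long each one is kept. The sketch below assumes a worst case where every connection goes idle and stays until the timer expires. The arrival rate and the per-connection memory figure are made-up numbers for illustration, not measurements.

```python
def idle_connections(new_connections_per_second, keepalive_timeout_seconds):
    """Estimate connections held idle at any moment (rate x hold time)."""
    return new_connections_per_second * keepalive_timeout_seconds


def idle_memory_mb(connections, kb_per_connection):
    """Convert a connection count into MB using an assumed per-connection cost."""
    return connections * kb_per_connection / 1024


rate = 200              # hypothetical new connections per second
kb_per_connection = 64  # hypothetical memory footprint per idle connection

for timeout in (10, 20, 120):
    held = idle_connections(rate, timeout)
    print(timeout, held, idle_memory_mb(held, kb_per_connection))
# 10 2000 125.0
# 20 4000 250.0
# 120 24000 1500.0
```

With these assumed figures, a browser-style 2 minute timer on the server would hold twelve times as many idle connections as a 10s timer.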


Observing IE11 and Chrome V52.07 keepalive

Let’s have a look at the tcp conversation using a wireshark trace. I have hidden the addresses but you can see the conversation is to tcp port 80 on the web server.

Here is a snippet from a packet decode to my blog site.

— NGINX Server – site www.hmwengineer.com – browser IE11

IE11-Blog

Referring to the wireshark capture above, after frame 212 no further requests were sent by IE to the server, and 10 seconds later, in frame 862, we can see the NGINX server closing down the connection.

So the Web server idle timeout is set to 10s.

However, on the client side 100s elapsed before IE reset that open tcp socket.

— NGINX Server – site www.hmwengineer.com – browser Chrome V52.07.x

With Chrome the behavior is different. When the session is idle, Chrome will send a tcp keepalive every 45 seconds of inactivity to the server.

Chrome-Blog

We can see this occurring in frames 459 and 523. If there are no timers on the server to close the session then Chrome can maintain long lived tcp connections.

In this case we see a reset from the Web server in frame 524, terminating this tcp session.
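Note that these probes are tcp keepalives, a transport-level feature separate from http keepalive. As a sketch of the mechanism (not Chrome's actual implementation), the snippet below enables tcp keepalive on a socket with a 45 second idle time. The function name is made up, and the TCP_KEEPIDLE and TCP_KEEPINTVL options are platform dependent, so the code only sets them where Python exposes them.

```python
import socket


def make_keepalive_socket(idle_seconds=45):
    """Create a tcp socket that sends keepalive probes when idle."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Turn tcp keepalive on for this socket
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Idle time before the first probe (available on Linux and some others)
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    # Interval between later probes, where the platform supports it
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, idle_seconds)
    return sock


sock = make_keepalive_socket()
enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print("tcp keepalive enabled:", enabled)
sock.close()
```

A probe like this carries no http data; it only checks that the other end of the tcp connection is still there.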

— Apache server – site www.nytimes.com — browser Chrome V52.07

Chrome-NY

In the above we can see that after frame 4196 there were no further requests on this socket from the client.

Chrome maintains the connection via lower-level tcp keepalives. Finally, in frame 6040, the server closes the connection.

This is followed in frame 6085 by Chrome reaching its expiry timer, which is around 300s, at which point it kills the connection.

Summary

In this article we saw that a modern browser and the server will maintain a persistent http connection under http/1.1.

The feature that enables this behaviour is called ‘http keepalive.’

We also saw an example of http/1.1 reusing the connection for subsequent GET/RESPONSE communication.

We looked at the wireshark traces and observed that the Web server keepalive was a short 10s, whereas the IE11 browser keepalive was closer to 100s.

Server keepalive should be kept small to continually free up resources for short-lived http sessions.

In the real world, an http/1.1 browser will typically open around 6 to 8 tcp connections to fetch all the css, javascript and image files it needs to render a web page properly.

The four most popular Web servers on the Internet are Apache, Google Web server (GWS), NGINX and Microsoft IIS.