FAQ: HTTP/2

Graham Morrison reviews the sequel to the most common acronym on the internet.

Haven’t we seen the acronym HTTP before?

You’ve probably seen those four humble letters so many times that you’ve become completely desensitised to their appearance. It’s the modern fashion to remove them from URLs because they’re everywhere, but they do perform a vital function. These are the letters that tell your web browser what type of resource is at the end of that link, and for the vast majority of connections, the resource at the end of that link is a web page.

You mean there are other kinds of resources?

Yes, but not so many any more. At least not ones that a web browser knows how to deal with. Web browsers used to be able to interpret all kinds of different resources. FTP is still common, for instance, and so is ‘mailto’. In the mid-90s there used to be more, as the browser was designed to aggregate all kinds of online content. Many, such as the Gopher protocol, can still be added via plugins, but these days it’s all about the web, and that means HTTP.
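To see how a program picks out those leading letters, here is a minimal sketch using Python's standard urllib.parse module. The example URLs and the scheme_of helper are made up for illustration; a real browser does far more than this.

```python
from urllib.parse import urlsplit

# Hypothetical URLs, one per kind of resource mentioned above.
urls = [
    "http://example.com/index.html",
    "ftp://ftp.example.com/pub/file.txt",
    "mailto:someone@example.com",
]


def scheme_of(url: str) -> str:
    """Return the scheme: the part of the URL before the first colon."""
    return urlsplit(url).scheme


for url in urls:
    # The scheme is what tells a client which protocol or handler to use.
    print(scheme_of(url), "->", url)
```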

What do the letters HTTP represent?

What we’ve been calling ‘resources’ are actually protocols that grab stuff. A protocol is a definition of how those resources should be formatted and transferred. The ‘P’ of HTTP is ‘protocol’, while the HTT bit is Hypertext Transfer.

Hypertext? From the early 90s?

Yes, the very same. It’s a word that’s fallen out of fashion, but its meaning is fundamental to how the world wide web works. The ‘hyper’ in hypertext is derived from the original Greek meaning of ‘over’, or ‘beyond’. Within a text file, it’s the link to another resource ‘beyond’ the limits of the current file or location. This linking is what makes the world wide web the world wide web. The Hypertext Markup Language (HTML) is the syntax and formalisation of that linking with the text that surrounds it.

So HTTP is the protocol used to send HTML?

Fundamentally, yes. At least in the beginning. The simplest implementation of HTTP would have only a single command, GET, which would request an HTML file from a server. That HTML was nearly always a static file formatted with the correct markup. Markup refers to the elements within an HTML file that tell the browser how to format the text, such as <h1>heading</h1> for a title or a heading. There are many elements and rules, and we’ve all created files like this at one time or another.
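As a rough illustration of how plain a GET request is, the sketch below builds one as text and picks apart the first line of a reply. It assumes the HTTP/1.1 message layout; the function names, the host and the canned response are invented for the example, and nothing is sent over the network.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as the bytes sent to the server."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, version
        f"Host: {host}",          # the Host header is required in HTTP/1.1
        "Connection: close",      # ask the server to close when it's done
        "",                       # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> tuple:
    """Split the first line of a response into (version, code, reason)."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = first_line.split(" ", 2)
    return version, int(code), reason


# A canned reply standing in for what a server might send back.
sample_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"<h1>heading</h1>"
)

print(build_get_request("example.com", "/index.html").decode("ascii"))
print(parse_status_line(sample_response))
```

Everything here is readable text, which is why you can type a request like this by hand into a tool such as Telnet.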

Back at the dawn of the web, everything was made up of static sites like this, simply delivering a formatted text document to your browser. But as the web has evolved, HTML has become dynamic, created by whatever is running on the web server and code running in your browser. WordPress, for example, will take the posts you insert into a database, blend them with your themes and comments, and deliver the final output to someone’s web browser, whether that’s a phone or a laptop. Tim Berners-Lee is credited as the first to implement HTTP and HTML and made the first transfer back in 1991.

There’s an add-on for Firefox that will show you when you’re using an HTTP/2 or a SPDY connection (it’s the tiny green symbol in the location field).

Is that how browser games work?

Not usually. Most of these are written in JavaScript, a scripting language that’s sent as part of a page and executed within your browser, but there are lots of other similar technologies. They can even be sent through the same connection after the original request, as part of the same session. Keeping a single connection open for several requests like this was one of the new features of HTTP/1.1.

How does all this fit into what the internet is?

It’s easy to get into a technical discussion about this, and in particular, take a deep dive into network layers. Briefly, HTTP operates at the top in a layer known as the ‘application layer’. Your web browser asks for a page and the server at the other end replies by sending it. It shares this space with many other protocols – IMAP for email, or SSH for a remote shell, for example.

Even if you have no formal computing knowledge, these protocols will be familiar precisely because they’re in the application layer, the layer closest to the user. If you look at the IRC protocol, for instance, you’ll see that it’s very simply constructed. Communication is really just a series of text messages that you can recreate manually using something like Telnet. You don’t have to worry about how your messages are encoded, or how they get from your machine to the server. This is handled by the layers beneath: transport (TCP), internet (IP) and link (Ethernet).

If all HTTP is doing is enabling a client to ask for data from a server, why does it need upgrading?

The HTTP that most of us use is version 1.1. This has been around since 1999, when Google had just eight employees and was moving from its garage office to its first real office. Just as 1.1 added features that were becoming necessary as the web grew in importance, so too does HTTP/2. It’s remarkable that the old version has lasted this long, considering what’s happened in the intervening 16 years.

But what does the new version do that’s so important?

Put simply, speed. We now know so much more about how we use the web and what the user and the web designer are trying to achieve. HTTP/2 does lots of sensible things designed to improve transfer speed and efficiency between the client and the server. For example, HTTP is no longer going to be text-based on the wire, but will use binary framing instead. It will be the same content, encoded for efficiency. HTTP/2 also compresses request and response headers (the final specification uses a purpose-built scheme called HPACK, where SPDY used DEFLATE) and multiplexes many transfers within a single connection. TLS security, which you currently use with HTTPS connections to your bank, isn’t strictly mandated by the specification, but the main browsers will only speak HTTP/2 over TLS, so in practice HTTP/2 connections are encrypted.
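To give a feel for what ‘binary instead of text’ means, here is a small sketch of the fixed nine-byte header that starts every HTTP/2 frame: a 24-bit payload length, an 8-bit type, 8 bits of flags, then one reserved bit and a 31-bit stream identifier. The stream identifier is what makes multiplexing possible, because it says which request or response each frame belongs to. The function and constant names are our own, and this is only the header, not a working HTTP/2 implementation.

```python
import struct

# Two of the frame types defined by HTTP/2 (names chosen for this sketch).
FRAME_TYPE_DATA = 0x0
FRAME_TYPE_HEADERS = 0x1


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Encode the 9-byte header that precedes every HTTP/2 frame payload."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream_id must fit in 31 bits")
    # Pack the length as a 32-bit big-endian integer and drop the top byte
    # to get the 24-bit field, then append type, flags and stream id.
    length_bytes = struct.pack(">I", length)[1:]
    return length_bytes + struct.pack(">BBI", frame_type, flags, stream_id)


def unpack_frame_header(header: bytes) -> tuple:
    """Decode a 9-byte header into (length, type, flags, stream_id)."""
    if len(header) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", header[3:9])
    stream_id &= 0x7FFFFFFF  # ignore the single reserved bit
    return length, frame_type, flags, stream_id


# Headers for frames on two different streams sharing one connection.
first = pack_frame_header(5, FRAME_TYPE_DATA, 0x1, 3)
second = pack_frame_header(12, FRAME_TYPE_HEADERS, 0x4, 5)
print(first.hex(), unpack_frame_header(first))
print(second.hex(), unpack_frame_header(second))
```

Because every frame carries its stream number, frames from several transfers can be interleaved on one connection and sorted out again at the other end.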

Are there any other advantages?

Lots! It’s a free upgrade in that the new version won’t require you to change anything, or for developers to change their APIs. The new version will just work. As fewer connections are needed, the load on your server will also be less. There’s more intelligent cache control, and the server can push data it thinks the client will need without being asked, which should improve response times. Plus, encryption becomes a first-class citizen.

Has Tim Berners-Lee had a hand in this upgrade?

Not specifically. HTTP/2 was approved in the middle of February 2015 by the Internet Engineering Steering Group. HTTP is so fundamental that no decisions are ever made quickly, and decisions like this are only made after a long and peer-reviewed appraisal process. For HTTP/2, that means 200 design issues, 17 drafts and 30 implementations. HTTP/2 is based on a technology originally developed by Google, called SPDY. SPDY modifies the HTTP transfer in similar ways, only hidden behind compatible clients and servers. Google was well placed to deliver a specification like this, considering the free bandwidth upgrade it would receive from any efficiencies, and SPDY had already been adopted by all the main browsers as an addendum to the old specification. Google is now going to withdraw SPDY from its own products to help get HTTP/2 adopted as quickly as possible.

Do I need to change my browser to use this?

Firefox has HTTP/2 enabled from version 36 onward, and Chrome supports HTTP/2 but it isn’t enabled by default. The version of Internet Explorer bundled with the latest Windows 10 beta also supports the standard. Each of these browsers only supports the encrypted (TLS) version of the protocol. Safari supports SPDY and is likely to adopt the changes necessary to add HTTP/2 support, so there should be good cross-platform adoption.

Where can I find out more?

The specification lives on GitHub at https://http2.github.io, but you can find clearer information on the HTTP Working Group’s own web portal at https://httpwg.github.io.