HTTP (Hypertext Transfer Protocol) is the main application-layer protocol of the World Wide Web, and it defines how data is exchanged there. The protocol works by passing messages between two programs: a client (which sends requests) and a server (which sends responses).

Protocols

HTTP is traditionally carried over TCP. It is also a stateless protocol: the server keeps no record of what it has already sent to a client. So if a client requests the same document again, the server will send it again.

  • HTTP/1.0 — a non-persistent TCP connection is opened, and at most one object is sent before the connection is closed.
  • HTTP/1.1 — persistent connection, with multiple objects sent over the same connection, optionally via pipelining.
  • HTTP/2 — keeps the persistent connection, but replaces pipelining with multiplexing.
  • HTTP/3 — has built-in encryption, and uses QUIC over UDP instead of TCP.

Most of the modern web uses HTTPS, which is HTTP sent over a connection encrypted with TLS.

Requests

The method is the first word of the request message, and it specifies what action we want the server to perform for us. Servers aren’t obligated to carry out every request they get (think security).

  • GET means we want to retrieve the specified resource.
  • POST sends data to the specified resource for the server to process.
  • DELETE tells the server to delete a resource.
  • PUT creates or replaces a resource in the server.
  • HEAD asks for the same headers a GET would return, but without the response body.
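
To make the "method is the first word" point concrete, here is a minimal sketch of how an HTTP/1.1 request message can be put together and how the method can be read back out of it. The helper names `build_request` and `request_method` are made up for this example, and the message is deliberately stripped down to the request line, a Host header, and an optional body.

```python
def build_request(method: str, path: str, host: str, body: bytes = b"") -> bytes:
    """Serialise a minimal HTTP/1.1 request message (illustrative helper)."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",              # Host header is required in HTTP/1.1
    ]
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line (CRLF CRLF) separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def request_method(raw: bytes) -> str:
    """Return the method, i.e. the first word of the request line."""
    request_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    return request_line.split(" ", 1)[0]


if __name__ == "__main__":
    raw = build_request("HEAD", "/index.html", "example.com")
    print(raw.decode("ascii"))
    print(request_method(raw))
```

A real client would add more headers (User-Agent, Accept, and so on), but the first word of the first line is still what tells the server which method is being asked for.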

Responses

Servers respond with a three-digit numeric status code, whose first digit gives the class of the response. Common ones include:

  • 1xx — informational
  • 2xx — success
    • 200 (OK) — for a successful response.
  • 3xx — redirection
    • 301 (Moved Permanently) — requested object moved, new location specified in the Location header of the response.
    • 304 (Not Modified) — because modern web browsers cache content and because HTTP is stateless, it is efficient not to resend a full document if it hasn’t been modified. After the document is cached, subsequent GET requests include an If-Modified-Since header carrying the last modification date (send only if modified since this date). If the document has not been modified, the server sends code 304 without the document content.
  • 4xx — client error
    • 400 (Bad Request) — generic error code. Server didn’t understand the request.
    • 404 (Not Found) — requested document doesn’t exist on the server.
  • 5xx — server error
    • 505 (HTTP Version Not Supported) — the server doesn’t support the HTTP version used in the request.
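
The 304 case is the one with real logic behind it, so here is a rough sketch of the decision a server might make for a conditional GET. The function name `conditional_get` and its arguments are assumptions made for this example; it only models a single resource with a known last-modified time and ignores other validators such as ETag.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def conditional_get(last_modified: datetime, body: bytes,
                    if_modified_since: Optional[str]) -> Tuple[int, bytes]:
    """Return (status code, body) for a GET of one resource (illustrative).

    last_modified must be timezone-aware; if_modified_since is the raw
    header value sent by the client, or None if the header is absent.
    """
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable date: behave as if no header was sent
        if since is not None:
            if since.tzinfo is None:
                # Treat a date without a usable zone as UTC.
                since = since.replace(tzinfo=timezone.utc)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= since:
                return 304, b""  # cached copy is still current: no body
    return 200, body  # first request, or the document changed: send it all


if __name__ == "__main__":
    modified = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    cached_stamp = format_datetime(modified, usegmt=True)
    print(conditional_get(modified, b"<html>...</html>", None))
    print(conditional_get(modified, b"<html>...</html>", cached_stamp))
```

The server itself still remembers nothing about the client. The client carries the date along in every request, which is how caching works on top of a stateless protocol.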

Timing

In non-persistent HTTP/1.0, fetching each object costs:

  • One RTT to initiate the TCP connection.
  • One RTT for the HTTP request and the start of its response.
  • Plus the object/file transmission time.
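
The list above works out to roughly 2 RTT plus transmission time per object. A small sketch of that arithmetic follows; the function name `non_persistent_time` and the example numbers are made up for illustration, and the model assumes objects are fetched one after another with no parallel connections.

```python
from typing import Sequence


def non_persistent_time(rtt: float, transmit_times: Sequence[float]) -> float:
    """Total fetch time when every object needs its own TCP connection.

    rtt and transmit_times are in seconds; one entry per object.
    """
    # Per object: 1 RTT (TCP setup) + 1 RTT (request/response) + transmission.
    return sum(2 * rtt + t for t in transmit_times)


if __name__ == "__main__":
    # Hypothetical page: one HTML file plus two images, 100 ms RTT,
    # 20 ms to transmit each object.
    print(non_persistent_time(0.1, [0.02, 0.02, 0.02]))
```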

In persistent HTTP/1.1, the connection is instead kept open after the initial response. Subsequent HTTP requests for any referenced objects in the page (like an embedded image) are sent over the same connection, so they skip the TCP setup RTT. For further time optimisations, browsers often open parallel TCP connections (between 2 and 6) to fetch these objects. This lets the client overlap requests much as pipelining would, so in the time of about 1 RTT it can request and receive multiple objects.