In computer networking, the User Datagram Protocol (UDP) is a transport-layer protocol. It’s connectionless, so there’s no handshaking before two processes start to communicate. Its key properties:

  • Very straightforward and lightweight.
    • To send, we just need a destination address and port number. TCP, by contrast, keeps many working parameters per connection.
  • No guarantee that a message will reach the receiving process (it may be dropped).
    • A message that does arrive is preserved as it was sent (if bits are corrupted, the checksum fails and the message is dropped).
  • Messages that do arrive at the receiving process may be reordered.
    • e.g., if we send 3 messages, we might receive them out of order (send: 1, 2, 3; receive: 3, 1, 2).
  • No congestion control, i.e., our throughput is not limited by the transport protocol.

Specification

UDP sits between the application layer and the network layer, and adds very little metadata on top of the application’s data. The header is only 8 bytes, made up of four 16-bit fields:

  • 16 bits each: source port number, destination port number (for multiplexing)
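  • 16 bits each: length of the segment (header plus data, in bytes), and a checksum.
    • The length field caps a segment at the largest unsigned 16-bit number, 65 535 bytes, including the 8-byte header. Over IPv4, the 20-byte IP header reduces the usable payload to 65 507 bytes.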
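
To make the header layout concrete, here is a minimal sketch in Go that packs and unpacks the four 16-bit fields in network byte order. The UDPHeader type and the Marshal/ParseUDPHeader names are our own for illustration; they are not part of any standard library.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// UDPHeader mirrors the four 16-bit fields of the 8-byte UDP header.
// (Hypothetical type, defined only for this sketch.)
type UDPHeader struct {
	SrcPort  uint16
	DstPort  uint16
	Length   uint16 // header plus payload, in bytes
	Checksum uint16
}

const udpHeaderLen = 8

// Marshal encodes the header in network byte order (big-endian).
func (h UDPHeader) Marshal() []byte {
	b := make([]byte, udpHeaderLen)
	binary.BigEndian.PutUint16(b[0:2], h.SrcPort)
	binary.BigEndian.PutUint16(b[2:4], h.DstPort)
	binary.BigEndian.PutUint16(b[4:6], h.Length)
	binary.BigEndian.PutUint16(b[6:8], h.Checksum)
	return b
}

// ParseUDPHeader decodes the first 8 bytes of a segment.
func ParseUDPHeader(b []byte) (UDPHeader, error) {
	if len(b) < udpHeaderLen {
		return UDPHeader{}, errors.New("segment shorter than 8 bytes")
	}
	return UDPHeader{
		SrcPort:  binary.BigEndian.Uint16(b[0:2]),
		DstPort:  binary.BigEndian.Uint16(b[2:4]),
		Length:   binary.BigEndian.Uint16(b[4:6]),
		Checksum: binary.BigEndian.Uint16(b[6:8]),
	}, nil
}

func main() {
	payload := []byte("hello")
	h := UDPHeader{
		SrcPort: 50000,
		DstPort: 53,
		Length:  uint16(udpHeaderLen + len(payload)),
	}
	raw := h.Marshal()
	fmt.Printf("% x\n", raw) // the 8 header bytes in hex

	parsed, err := ParseUDPHeader(raw)
	fmt.Printf("%+v %v\n", parsed, err)
}
```

The checksum is left at 0 here; computing it is a separate step.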

At minimum, to send a UDP packet, we only need to know the destination IP address and destination port number. All UDP packets with the same destination address/port but different source addresses/ports will be directed to the same socket (assuming they arrive).

i.e., any UDP packet that arrives for a specific destination address/port is always delivered to the same socket, regardless of who sent it.
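
As a rough illustration of this demultiplexing rule, the sketch below models the lookup as a map keyed only by the destination address/port. The endpoint and demuxTable types are hypothetical and only model the idea; a real operating system's socket lookup is more involved.

```go
package main

import "fmt"

// endpoint is an (IP, port) pair. Hypothetical type for this sketch.
type endpoint struct {
	IP   string
	Port uint16
}

// demuxTable maps a destination endpoint to a socket identifier.
// The source endpoint is deliberately not part of the key.
type demuxTable map[endpoint]int

// lookup returns the socket that should receive a packet from src to dst.
func (t demuxTable) lookup(src, dst endpoint) (int, bool) {
	_ = src // the sender plays no role in choosing the socket
	id, ok := t[dst]
	return id, ok
}

func main() {
	server := endpoint{IP: "192.0.2.10", Port: 9000}
	table := demuxTable{server: 7} // socket 7 is bound to the server endpoint

	a := endpoint{IP: "198.51.100.1", Port: 40001}
	b := endpoint{IP: "203.0.113.5", Port: 40002}

	sockA, _ := table.lookup(a, server)
	sockB, _ := table.lookup(b, server)
	fmt.Println(sockA == sockB) // both senders reach the same socket
}
```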

Checksum

The point of the checksum is to detect errors in transmitted data. The sender computes the checksum and sends it with the packet. The receiver re-computes the checksum over the packet data and compares it with the checksum it received. If they don’t match, most UDP implementations drop the packet.

The procedure at the sender is:

  • Set the checksum field to 0.
  • Add all 16-bit words in the segment using one’s complement addition. The operands are a pseudo header (for IPv4: source/destination IP address, a zero padding byte, the protocol number (always 17 for UDP), and the UDP length of the packet), the UDP header itself, and the contents of the packet. If the data has an odd number of bytes, a zero byte is appended for the calculation only.
    • One’s complement addition: binary addition where any overflow is wrapped around and added to the result again, repeating until there’s no overflow.
  • Invert (NOT) each bit of the sum. This is the checksum. If the result is 0, it is sent as all 1s instead, because in IPv4 a checksum field of 0 means that no checksum was computed.

At the receiver side, we add every 16-bit word in the same way, leaving the received checksum field as is. If the sum is all 1s (equivalently, if inverting it gives 0), then our packet is likely uncorrupted.
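
The checksum procedure can be sketched in Go as follows. This is a minimal illustration for IPv4 only, and the function names (sum16, fold, onesSum, Checksum, Verify) are our own. It assumes the segment already holds the 8-byte UDP header followed by the data.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// sum16 adds b to acc as big-endian 16-bit words.
// An odd trailing byte is treated as if padded with a zero byte.
func sum16(acc uint32, b []byte) uint32 {
	for i := 0; i+1 < len(b); i += 2 {
		acc += uint32(binary.BigEndian.Uint16(b[i:]))
	}
	if len(b)%2 == 1 {
		acc += uint32(b[len(b)-1]) << 8
	}
	return acc
}

// fold wraps carries out of the low 16 bits back into the sum
// until no overflow remains (one's complement addition).
func fold(acc uint32) uint16 {
	for acc>>16 != 0 {
		acc = (acc & 0xffff) + (acc >> 16)
	}
	return uint16(acc)
}

// onesSum sums the IPv4 pseudo header and the whole UDP segment.
func onesSum(src, dst [4]byte, segment []byte) uint16 {
	pseudo := make([]byte, 12)
	copy(pseudo[0:4], src[:])
	copy(pseudo[4:8], dst[:])
	pseudo[8] = 0  // zero padding
	pseudo[9] = 17 // protocol number for UDP
	binary.BigEndian.PutUint16(pseudo[10:], uint16(len(segment)))
	return fold(sum16(sum16(0, pseudo), segment))
}

// Checksum expects the checksum field (bytes 6-7) to be zero.
func Checksum(src, dst [4]byte, segment []byte) uint16 {
	c := ^onesSum(src, dst, segment)
	if c == 0 {
		c = 0xffff // 0 is reserved for "no checksum" in IPv4
	}
	return c
}

// Verify sums the segment with the received checksum left in place.
func Verify(src, dst [4]byte, segment []byte) bool {
	return onesSum(src, dst, segment) == 0xffff
}

func main() {
	src := [4]byte{192, 0, 2, 1}
	dst := [4]byte{192, 0, 2, 2}
	payload := []byte("hello")

	seg := make([]byte, 8+len(payload))
	binary.BigEndian.PutUint16(seg[0:], 50000)            // source port
	binary.BigEndian.PutUint16(seg[2:], 53)               // destination port
	binary.BigEndian.PutUint16(seg[4:], uint16(len(seg))) // length
	copy(seg[8:], payload)
	binary.BigEndian.PutUint16(seg[6:], Checksum(src, dst, seg))

	fmt.Println(Verify(src, dst, seg)) // intact segment
	seg[9] ^= 0x01                     // flip one payload bit
	fmt.Println(Verify(src, dst, seg)) // corrupted segment
}
```

A single flipped bit always changes the sum, so the second check fails. Two errors that cancel out in the sum would go unnoticed.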

We use a checksum because the lower layers do not guarantee error detection end to end. That said, the protection provided by the checksum is fairly weak (a valid checksum can still result from a faulty packet).

The checksum is optional in UDP over IPv4. If not used, the field is set to 0. Over IPv6, it is mandatory.

Network programming

In general, we must actively assume that UDP is unreliable. This has major implications if we need reliability (as in distributed systems): we have to build it ourselves at the application layer, e.g., with sequence numbers, acknowledgements, and retransmissions. Contrast this with TCP, which enforces reliability at the transport layer.
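
As one small piece of application-level reliability, here is a sketch of a receiver-side buffer that restores the sending order using sequence numbers. The Resequencer type is hypothetical. It assumes the sender tags each message with an increasing sequence number, and it ignores sequence-number wraparound and retransmission of lost messages.

```go
package main

import "fmt"

// Resequencer buffers out-of-order messages and releases them in
// sequence order. (Hypothetical type for this sketch.)
type Resequencer struct {
	next    uint32            // next sequence number to deliver
	pending map[uint32][]byte // messages that arrived early
}

// NewResequencer expects first to be the first sequence number in use.
func NewResequencer(first uint32) *Resequencer {
	return &Resequencer{next: first, pending: make(map[uint32][]byte)}
}

// Accept records one received message and returns every message that
// can now be delivered in order. Duplicates are ignored.
func (r *Resequencer) Accept(seq uint32, msg []byte) [][]byte {
	if seq < r.next {
		return nil // already delivered
	}
	if _, dup := r.pending[seq]; dup {
		return nil // already buffered
	}
	r.pending[seq] = msg

	var out [][]byte
	for {
		m, ok := r.pending[r.next]
		if !ok {
			break // gap: wait for the missing message
		}
		out = append(out, m)
		delete(r.pending, r.next)
		r.next++
	}
	return out
}

func main() {
	r := NewResequencer(1)
	// Messages sent as 1, 2, 3 but received as 3, 1, 2.
	arrivals := []uint32{3, 1, 2}
	for _, seq := range arrivals {
		msg := []byte(fmt.Sprintf("message %d", seq))
		for _, m := range r.Accept(seq, msg) {
			fmt.Println(string(m))
		}
	}
}
```

If a message never arrives, everything after it stays buffered, so a full scheme also needs a timeout or a retransmission request.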

Applications

UDP can be preferable to TCP for a few reasons:

  • UDP allows for finer application-level control over what data is sent and when. No congestion control means an application can maintain a minimum sending rate. Reliability schemes (incl. the level of reliability) can also be handled at the application level.
  • No handshaking (unlike TCP), so no connection-setup delay.
  • No connection state, so fewer parameters to keep track of.
  • Small packet overhead (an 8-byte header, versus at least 20 bytes for TCP).

UDP has been adopted by real-time communication services (think voice or video chat). This is because occasional lost packets are largely imperceptible to (or at least tolerated by) users, and because speed of transmission matters. Other protocols that run over UDP include DNS, DHCP, NTP, and QUIC.

Socket programming

UDP sockets are called datagram sockets.

In C

We use the following syscalls:

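  • To create a socket, we do socket(<ADDRESS FAMILY>, SOCK_DGRAM, getprotobyname("udp")->p_proto), where the address family is AF_INET (IPv4) or AF_INET6 (IPv6). Passing IPPROTO_UDP or 0 as the last argument is equivalent. This returns a file descriptor we can operate on.
  • On the server-side, we need to bind() the socket as usual, but there is no listen() or accept(). We receive messages with recvfrom(), which also gives us the sender’s address so that we can reply.
  • We don’t need to connect() on the client-side, because UDP is connectionless. This means we can just start communicating. We do need to know the address/port we’re sending to, and can fill in the respective structs with getaddrinfo(). We send messages with sendto(). We can only use send() if we first call connect() on the socket, which for UDP just records a default destination and exchanges no packets.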

In Go

Go’s UDP networking is provided in the net package in the standard library. The general programming procedure is:

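  • Resolve the address, with addr, err := net.ResolveUDPAddr("udp", addrstring)
    • This returns a *net.UDPAddr type.
  • Open a socket with net.ListenUDP("udp", addr) on the server, or net.DialUDP("udp", nil, addr) on the client.
    • Both return a *net.UDPConn type.
  • We read messages with n, peer, err := conn.ReadFromUDP(buf), where buf is some pre-allocated byte slice, n is the number of bytes read, and peer is the sender’s address.
  • We write with conn.Write() on a socket from DialUDP, or with conn.WriteToUDP(b, peer) on a socket from ListenUDP. In both cases we pass in a byte slice.
    • An easy way to build the message is to write into a bytes.Buffer and pass its Bytes().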
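
Putting these steps together, here is a minimal sketch of a UDP echo exchange on the loopback interface. The echoOnce helper is our own name, and the 2048-byte buffers and 2-second timeout are arbitrary choices for the example. Because UDP gives no delivery guarantee, the client sets a read deadline rather than waiting forever.

```go
package main

import (
	"bytes"
	"fmt"
	"net"
	"time"
)

// echoOnce starts a one-shot echo server on a free loopback port,
// sends msg to it from a client socket, and returns the reply.
func echoOnce(msg string) (string, error) {
	// Port 0 asks the operating system to pick a free port.
	addr, err := net.ResolveUDPAddr("udp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	server, err := net.ListenUDP("udp", addr)
	if err != nil {
		return "", err
	}
	defer server.Close()

	// Server side: read one datagram and send it back to its sender.
	go func() {
		buf := make([]byte, 2048)
		n, peer, err := server.ReadFromUDP(buf)
		if err != nil {
			return // socket closed or read failed
		}
		server.WriteToUDP(buf[:n], peer)
	}()

	// Client side: DialUDP fixes the destination, so Write and Read work.
	client, err := net.DialUDP("udp", nil, server.LocalAddr().(*net.UDPAddr))
	if err != nil {
		return "", err
	}
	defer client.Close()

	var out bytes.Buffer
	out.WriteString(msg)
	if _, err := client.Write(out.Bytes()); err != nil {
		return "", err
	}

	// Give up if no reply arrives in time.
	client.SetReadDeadline(time.Now().Add(2 * time.Second))
	reply := make([]byte, 2048)
	n, err := client.Read(reply)
	if err != nil {
		return "", err
	}
	return string(reply[:n]), nil
}

func main() {
	reply, err := echoOnce("ping")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(reply)
}
```

Note that no packets are exchanged by DialUDP itself; the first packet on the wire is the one sent by Write.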