Layer 1: Physical

As a professor of mine once said, “That’s for the electrical engineers to worry about.”

Layer 2: Data-Link

Move information from one device to another, over the physical layer.

The IEEE (Institute of Electrical and Electronics Engineers) has this mostly figured out.

The IEEE publishes the “802” family of standards. Each standard focuses on an area, and has many sub-topics. At the data-link layer, there are two big ones:

802.3 “Ethernet”
802.11 “Wireless LAN” (Wi-Fi)

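Also of interest is 802.1, home of network access control (802.1X) and the Link Layer Discovery Protocol (802.1AB). (Bluetooth and Zigbee also have IEEE roots, both under 802.15: Bluetooth was standardized as 802.15.1, though the Bluetooth SIG maintains it now, and Zigbee builds on 802.15.4.)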

The key thing to know about layer 2 is Frames. Each frame carries the device identifier (aka “MAC (media access control) address”) of the device that sent it and of the device it is sent to. The link between two devices, over which frames are sent, is a “network segment”. It is possible to have multiple devices on one network segment (de-conflicted with collision detection and random exponential back-off), but present-day wired Ethernet usually has only two devices per segment, and switches to connect segments. (Wi-Fi is the exception: the air is still a shared medium.) A collection of connected network segments is a network.
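To make the frame idea concrete, here is a minimal sketch that pulls the addresses out of an Ethernet II header. It assumes the common layout of 6 bytes destination MAC, 6 bytes source MAC, and a 2-byte EtherType, with no VLAN tag. The function names and the example frame are made up for illustration.

```python
import struct


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as the usual colon-separated hex string."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet_header(frame: bytes) -> dict:
    """Split an Ethernet II frame into addresses, EtherType, and payload.

    Assumes an untagged frame (no 802.1Q VLAN tag) with the preamble
    and frame check sequence already stripped.
    """
    if len(frame) < 14:
        raise ValueError("frame is shorter than the 14-byte Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "ethertype": ethertype,
        "payload": frame[14:],
    }


# Hypothetical frame: broadcast destination, made-up source, EtherType 0x0800 (IPv4).
example = bytes.fromhex("ffffffffffff" "02005e100001" "0800") + b"payload"
header = parse_ethernet_header(example)
print(header["src"], "->", header["dst"], hex(header["ethertype"]))
```

The destination comes first on the wire, which lets a switch start making its forwarding decision before the rest of the frame has arrived.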

Ethernet and Wi-Fi frames carry payloads up to the MTU (Maximum Transmission Unit), which is typically 1500 bytes for Ethernet. Higher-level protocols must split their data streams into acceptable chunks.

Layer 3: Network

This is where things get fun - devices may communicate with devices on different networks, as long as those networks are connected.

Endpoints will send to routers, which must determine the best route to get to the destination network. Routers speak IPv4, IPv6, and ICMP, and maybe some other stuff. Endpoints are, in a sense, locked-in to whatever the routers speak.

Chunks of data at this layer are called “packets”. IP can split (fragment) a datagram across multiple packets, such that each packet fits in a frame. Networks that carry packets are sometimes called “packet-switched networks”.
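As a rough sketch of how that splitting works in IPv4: fragment offsets are counted in 8-byte units, so every fragment except the last carries a multiple of 8 bytes. The code below only plans the fragments (offset, size, "more fragments" flag) and does not build real packets. It assumes a 20-byte IPv4 header with no options, and the function name is my own.

```python
from typing import List, Tuple

IPV4_HEADER_LEN = 20  # assumption: no IP options


def plan_fragments(payload_len: int, mtu: int) -> List[Tuple[int, int, bool]]:
    """Return (offset_in_8_byte_units, data_len, more_fragments) per fragment."""
    # Largest data size per fragment, rounded down to a multiple of 8.
    max_data = (mtu - IPV4_HEADER_LEN) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    position = 0
    while position < payload_len:
        size = min(max_data, payload_len - position)
        more = position + size < payload_len
        fragments.append((position // 8, size, more))
        position += size
    return fragments


# A 4000-byte payload over a link with a 1500-byte MTU.
plan = plan_fragments(4000, 1500)
for offset, size, more in plan:
    print(f"offset={offset} size={size} more_fragments={more}")
```

With these numbers the plan is three fragments of 1480, 1480, and 1040 bytes, at offsets 0, 185, and 370.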

Layer 4: Transport

Layer 3 will get stuff there, achieving device-to-device communication. If we want process-to-process communication, we’ll need another layer of addressing - Ports!

Just as we’re locked-in to IP in layer 3 (That’s what the routers speak), we’re kinda locked-in to UDP/TCP (that’s what the firewalls speak). Other things are possible, if the firewalls / operating systems support them.

UDP

A UDP header is 8 bytes long, made up of four 2-byte values:

Source Port (optional; zero if unused)
Destination Port
Length (header plus data, in bytes)
Checksum (optional in IPv4, mandatory in IPv6)
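The header is small enough to pack and unpack by hand. This is a minimal sketch using Python's struct module. It leaves the checksum at zero (which in IPv4 means "no checksum computed"), and the function names are my own.

```python
import struct

# Four unsigned 16-bit fields in network (big-endian) byte order.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix the payload with a UDP header; checksum is left at zero."""
    length = UDP_HEADER.size + len(payload)  # length covers header + data
    checksum = 0  # assumption: IPv4, where zero means "not computed"
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_header(datagram: bytes) -> dict:
    """Read the four header fields back out of a datagram."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack(
        datagram[:UDP_HEADER.size]
    )
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER.size:length],
    }


datagram = build_udp_datagram(0, 53, b"hello")
fields = parse_udp_header(datagram)
print(fields)
```

A real checksum would be computed over a pseudo-header taken from the IP layer plus the UDP header and data, which is more than this sketch attempts.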

TCP

TCP is more complicated and slower, but provides reliability (re-transmit until the destination acknowledges) and in-order delivery.

If I take notes on TCP, it will get its own page, to detail the handshake + teardown process.

Suffice to say that you can open a TCP socket and dump data into it, and it’ll come out the other side.
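Here is a small sketch of exactly that, run entirely over loopback: a throwaway echo server in a background thread, and a client that dumps bytes in and reads them back. The port is picked by the OS, the function names are mine, and it is only meant for short messages (a large one could fill the socket buffers, since the client sends everything before it starts reading).

```python
import socket
import threading


def echo_once(server: socket.socket) -> None:
    """Accept one connection and echo everything back until the peer is done."""
    conn, _ = server.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the client finished sending
                break
            conn.sendall(data)


def round_trip(message: bytes) -> bytes:
    """Send a message through a loopback TCP connection and return the echo."""
    with socket.create_server(("127.0.0.1", 0)) as server:
        port = server.getsockname()[1]  # port 0 asked the OS to pick one
        worker = threading.Thread(target=echo_once, args=(server,), daemon=True)
        worker.start()
        chunks = []
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            client.shutdown(socket.SHUT_WR)  # signal "no more data from me"
            while True:
                data = client.recv(4096)
                if not data:
                    break
                chunks.append(data)
        worker.join(timeout=5)
    return b"".join(chunks)


received = round_trip(b"hello over TCP")
print(received)
```

Note that the reads loop until the stream ends: TCP delivers a byte stream, not messages, so a single recv is not guaranteed to return everything that was sent.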

Layer 5: Magic

Above layer 4, processes can do just about whatever they want, and the list of protocols grows dramatically.

Some things are dirt simple - Telnet sends (mostly) raw ASCII characters over a TCP socket - while some things build on multiple additional layers. DoH (DNS over HTTPS) is three more layers deep (TLS, HTTP, then DNS).

The key concept here is nesting - anything that provides a generic transport capability (such as UDP) can hold anything else, including additional generic transport protocols.
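Nesting is easy to show with a toy. The sketch below wraps a payload in made-up headers (a 4-byte layer tag plus a 2-byte length) and then peels them off again. These are not the real TCP, TLS, HTTP, or DNS formats. The header layout and function names are invented purely to show that each layer treats everything inside it as opaque bytes.

```python
import struct
from typing import List, Tuple

# Toy header, NOT a real protocol: 4-byte layer tag + 2-byte payload length.
TOY_HEADER = struct.Struct("!4sH")


def encapsulate(tag: str, payload: bytes) -> bytes:
    """Wrap the payload in one more toy layer."""
    return TOY_HEADER.pack(tag.encode("ascii"), len(payload)) + payload


def decapsulate(data: bytes) -> Tuple[str, bytes]:
    """Strip the outermost toy layer, returning its tag and its contents."""
    raw_tag, length = TOY_HEADER.unpack(data[:TOY_HEADER.size])
    tag = raw_tag.rstrip(b"\x00").decode("ascii")  # short tags are null-padded
    start = TOY_HEADER.size
    return tag, data[start:start + length]


# Build from the inside out, the way DoH stacks up: DNS in HTTP in TLS in TCP.
message = b"example query"
for layer in ["DNS", "HTTP", "TLS", "TCP"]:
    message = encapsulate(layer, message)

# Peel from the outside in.
peeled: List[str] = []
inner = message
for _ in range(4):
    tag, inner = decapsulate(inner)
    peeled.append(tag)

print(peeled, inner)
```

The outermost layer is the last one added, so the receiver sees the layers in the reverse of the order the sender built them.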

WireGuard

WireGuard is worth calling out specifically. It presents, at the OS level, what looks like a layer 3 connection (a virtual network interface with an IP address). Processes may open TCP connections through this virtual NIC.

When the WireGuard NIC receives an IP packet from the OS, it encrypts the whole packet (adding an encryption layer), adds a UDP layer, and then passes it to the “real” (it could be another fake one!) layer-3 stack.

All together, your WireGuard frame looks like this:

FRAME {
    IP (outer) {
        UDP {
            ENCRYPTION {
                IP (inner) {
                    TCP (probably) {
                        MAGIC { application-specific stuff here }
                    }
                }
            }
        }
    }
}
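All that wrapping costs bytes, which is why the tunnel's MTU has to be smaller than the real link's. Below is a rough sketch of the arithmetic, assuming the usual sizes: a 20-byte outer IPv4 header (40 for IPv6) with no options, an 8-byte UDP header, a 16-byte WireGuard data-message header, and a 16-byte authentication tag. The function names are my own.

```python
# Per-packet overhead added by the tunnel, in bytes.
OUTER_IP_HEADER = {"ipv4": 20, "ipv6": 40}  # assumption: no IP options/extensions
UDP_HEADER = 8
WIREGUARD_DATA_HEADER = 16  # message type + receiver index + counter
AUTH_TAG = 16               # authentication tag appended to the ciphertext


def tunnel_overhead(outer: str) -> int:
    """Bytes wrapped around each inner packet for the given outer IP version."""
    return OUTER_IP_HEADER[outer] + UDP_HEADER + WIREGUARD_DATA_HEADER + AUTH_TAG


def inner_mtu(link_mtu: int, outer: str) -> int:
    """Largest inner IP packet that still fits in one outer packet."""
    return link_mtu - tunnel_overhead(outer)


for version in ("ipv4", "ipv6"):
    print(version, tunnel_overhead(version), inner_mtu(1500, version))
```

The IPv6 case works out to 1420 on a 1500-byte link, which matches the MTU that wg-quick picks by default.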