TCP

The internet moves bytes. Most of those bytes travel over TCP — the Transmission Control Protocol — yet the protocol is easy to take for granted. You call fetch(), the browser fetches the page, and somewhere underneath a stream of data reliably crossed the network. How?

TCP answers a deceptively hard question: how do you build reliable, ordered, connection-oriented communication over an unreliable, unordered packet network? The answer involves sequence numbers, acknowledgments, timers, flow control, and a carefully specified state machine. This article walks through all of it.

#Where TCP Fits

TCP lives at the Transport layer (layer 4 in the OSI model, layer 3 in the simplified TCP/IP model). It sits between the application and the IP layer:

Transport layer (L4): end-to-end delivery between processes. TCP provides reliable, ordered, flow-controlled byte streams. UDP offers low-overhead datagrams. QUIC (HTTP/3) combines TLS and transport into a single layer over UDP.

IP delivers packets — individually routed, possibly reordered, possibly dropped. TCP builds a byte stream on top of that: ordered, lossless, and flow-controlled.

Its connectionless sibling, UDP, skips those guarantees for lower overhead — useful for real-time audio/video (WebRTC), DNS queries, and QUIC. But wherever reliability and order matter, TCP is the default choice.

#The TCP Segment

Every unit of work in TCP is called a segment. A segment has a fixed-format header (minimum 20 bytes) followed by optional extensions and the application payload.

[Diagram: TCP segment header, 32 bits wide (bit positions 0 to 31).]

Sequence Number (32 bits, for example 3,141,592,653): the byte offset of the first data byte in this segment. TCP is a byte-stream protocol, and the sequence number tracks which byte of the stream this segment starts at. On a SYN segment it is the Initial Sequence Number (ISN), chosen randomly to avoid clashes with old connections.

A few header fields are worth calling out in detail.

Sequence and Acknowledgment Numbers

These two 32-bit numbers are the heart of TCP's reliability guarantee. TCP treats the data as a continuous byte stream identified by sequence numbers. The sequence number says "the first byte in this segment is byte N of the stream." The acknowledgment number says "I have received everything up to byte M — send me M next."

Client sends:   seq=1000, data=500 bytes   → bytes 1000–1499
Server replies: ack=1500                   → "got up to 1499, send 1500"

Client sends:   seq=1500, data=500 bytes   → bytes 1500–1999
Server replies: ack=2000                   → "got up to 1999, send 2000"
Sidenote: Cumulative acknowledgment — one field covers the whole stream

This design means a single ACK covers everything received so far — a dropped packet causes a gap, and the receiver simply does not advance its ACK number past the gap until the retransmission fills it.
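As a rough illustration of cumulative acknowledgment, the sketch below computes the ACK number a receiver would send after a set of segments arrives. It is a simplification: the function name `cumulative_ack` is made up for this example, segments are assumed to line up exactly with no overlap, and sequence-number wraparound is ignored.

```python
def cumulative_ack(next_expected: int, segments: list[tuple[int, int]]) -> int:
    """Return the ACK number after receiving (seq, length) segments in any order."""
    # Buffer segments keyed by their starting sequence number.
    pending = dict(segments)
    ack = next_expected
    # Advance only while the segment that starts at the current ACK has arrived.
    while ack in pending:
        ack += pending.pop(ack)
    return ack


# Both segments arrived: the ACK covers the whole stream so far.
print(cumulative_ack(1000, [(1000, 500), (1500, 500)]))  # 2000
# Segment 1500-1999 was lost: the ACK stalls at the gap.
print(cumulative_ack(1000, [(1000, 500), (2000, 500)]))  # 1500
```

The second call shows the gap behaviour: bytes 2000 to 2499 are buffered, but the ACK number stays at 1500 until the missing segment is retransmitted.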

Flags

The six classic control bits in the flags field determine the purpose of a segment (RFC 3168 later added two more, ECE and CWR, for explicit congestion notification):

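| Flag | Name        | Used for                          |
| ---- | ----------- | --------------------------------- |
| SYN  | Synchronize | Opening a connection              |
| ACK  | Acknowledge | Carrier of acknowledgment numbers |
| FIN  | Finish      | Closing a connection (graceful)   |
| RST  | Reset       | Aborting a connection immediately |
| PSH  | Push        | Hint to deliver data to app now   |
| URG  | Urgent      | Urgent data pointer valid         |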

Most segments in a normal data exchange carry only the ACK flag. SYN and FIN each consume one sequence number (they are treated as a 1-byte virtual payload).
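To make the header layout concrete, here is a minimal sketch that decodes the fixed 20-byte header with Python's `struct` module. The function name `parse_tcp_header` and the sample port numbers are made up for illustration; options, checksum validation, and the ECN bits are left out.

```python
import struct

# Bit masks for the six classic flags (low bits of the 16-bit offset/flags word).
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (network byte order); options are ignored."""
    src, dst, seq, ack, off_flags, window, _checksum, _urgent = struct.unpack(
        "!HHIIHHHH", raw[:20]
    )
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # data offset is counted in 32-bit words
        "flags": [name for name, bit in FLAG_BITS.items() if off_flags & bit],
        "window": window,
    }


# A hand-built SYN-ACK header: data offset 5 words, flags SYN|ACK (0x12).
syn_ack = struct.pack(
    "!HHIIHHHH", 443, 51000, 3_141_592_653, 1001, (5 << 12) | 0x12, 65535, 0, 0
)
print(parse_tcp_header(syn_ack))
```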

Window Size and Flow Control

The window size field is TCP's built-in throttle. A receiver declares how many bytes it can currently buffer. The sender must not have more than that many unacknowledged bytes outstanding.

Receiver buffer = 65535 bytes
Sender may have at most 65535 bytes "in flight" (sent, not yet ACKed)

The Window Scale option (negotiated during the handshake) multiplies this field by a power of two, enabling windows up to 1 GiB — essential for high-bandwidth, high-latency links.
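The arithmetic behind that limit is short enough to check directly. The helper below is an illustrative sketch (the name `effective_window` is not from any library); it assumes the standard maximum shift of 14 bits.

```python
def effective_window(window_field: int, scale_shift: int) -> int:
    """Receive window in bytes: the 16-bit header field shifted left by the scale."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field is 16 bits")
    if not 0 <= scale_shift <= 14:
        raise ValueError("window scale shift is limited to 14")
    return window_field << scale_shift


print(effective_window(65535, 0))   # 65535 (no scaling)
print(effective_window(65535, 14))  # 1073725440, just under 1 GiB
```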

#The Three-Way Handshake

Before any data flows, the two endpoints must synchronise their Initial Sequence Numbers and confirm that both directions of communication work. TCP does this with three segments — the three-way handshake.

[Interactive diagram: the three-way handshake and the four-way teardown, stepped one segment at a time. Initial state: the client is CLOSED, the server is in LISTEN, and no segment has been sent yet.]

Why Three Steps?

Two steps are not enough. If the client sent SYN and the server replied with SYN-ACK and nothing more, the server would never learn whether its reply arrived: its own SYN, and the ISN it carries, would go unacknowledged. The third step (the client's ACK of the server's SYN-ACK) confirms that both directions work and that each side knows the other's ISN.
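The sequence and acknowledgment numbers carried by the three segments can be traced with a few lines of code. This is only a bookkeeping sketch: `three_way_handshake` is a hypothetical helper, the segments are plain dictionaries, and nothing is sent on a network.

```python
MOD = 2**32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list[dict]:
    """Return the three handshake segments with their seq/ack numbers."""
    # Step 1: the client announces its ISN.
    syn = {"from": "client", "flags": "SYN", "seq": client_isn, "ack": None}
    # Step 2: the server announces its own ISN and acknowledges the client's SYN.
    syn_ack = {"from": "server", "flags": "SYN,ACK", "seq": server_isn,
               "ack": (client_isn + 1) % MOD}
    # Step 3: the client acknowledges the server's SYN.
    ack = {"from": "client", "flags": "ACK", "seq": (client_isn + 1) % MOD,
           "ack": (server_isn + 1) % MOD}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```

Each ACK number is the peer's ISN plus one because a SYN consumes one sequence number.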

Normal: server allocates a half-open socket when it receives SYN.

SYN flood: attacker sends millions of SYN segments with spoofed
source IPs → server fills its backlog queue → legitimate connections
are refused.

SYN cookies: server encodes the connection state into the ISN
(a cryptographic hash of src/dst addresses and ports plus a timestamp).
No state is stored until the ACK arrives and the cookie is verified.
Sidenote: SYN cookies — defending against SYN flood attacks
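The idea can be sketched in a few lines. This is a deliberately simplified illustration, not how any particular kernel lays out its cookies: real implementations also pack an MSS index and a time counter into specific bits. The names `SECRET`, `make_cookie`, and `check_cookie` are hypothetical, and HMAC-SHA256 is an assumed choice of hash.

```python
import hashlib
import hmac

SECRET = b"hypothetical-server-secret"  # assumption: a random per-server key in practice
MOD = 2**32


def make_cookie(four_tuple: tuple, client_isn: int, time_slot: int) -> int:
    """Derive a 32-bit server ISN from connection identifiers and a coarse timestamp."""
    msg = repr((four_tuple, client_isn, time_slot)).encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def check_cookie(ack_number: int, four_tuple: tuple, client_isn: int,
                 time_slot: int, max_age: int = 1) -> bool:
    """Validate the final ACK without any stored state: recompute and compare."""
    cookie = (ack_number - 1) % MOD  # the client ACKs cookie + 1
    return any(
        cookie == make_cookie(four_tuple, client_isn, time_slot - age)
        for age in range(max_age + 1)
    )


conn = ("198.51.100.7", 51000, "203.0.113.1", 443)
cookie = make_cookie(conn, client_isn=1000, time_slot=42)
# The ACK arrives one time slot later and still validates.
print(check_cookie((cookie + 1) % MOD, conn, client_isn=1000, time_slot=43))  # True
```

The server stores nothing between sending the SYN-ACK and receiving the ACK; everything it needs is recomputed from the ACK segment itself.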

ISN Randomisation

The Initial Sequence Number is not zero. TCP requires it to be chosen pseudo-randomly (RFC 6528 recommends a cryptographic hash). This prevents two hazards:

  1. Old duplicate segments — leftover segments from a previous connection with the same 4-tuple being mistaken for new data.
  2. Blind injection attacks — an off-path attacker guessing the sequence number and injecting fake data.

#Four-Way Connection Teardown

TCP is full-duplex. Each direction is closed independently with a FIN + ACK exchange. That is why graceful close takes four segments instead of three.

Client sends FIN   → "I'm done sending data."
Server sends ACK   → "Understood; but I may still send to you."
   … server finishes its own sends …
Server sends FIN   → "I'm done too."
Client sends ACK   → "Connection fully closed."
Sidenote: Half-close — each direction is independent

In practice the server often has no data left and combines its ACK and FIN into a single segment, making it look like three segments — but conceptually it is always two independent half-closes.
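A half-close is visible from the sockets API. In the sketch below, which runs entirely over the loopback interface, the client calls `shutdown(SHUT_WR)` to send its FIN while keeping its receive side open, and the server replies after it sees end-of-stream. The function names `serve_once` and `half_close_demo` are illustrative.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection, read until the peer's FIN, then reply and close."""
    conn, _ = listener.accept()
    with conn:
        received = bytearray()
        while chunk := conn.recv(4096):  # recv returns b"" once the client's FIN arrives
            received += chunk
        # The client-to-server direction is closed, but we can still send.
        conn.sendall(b"got %d bytes" % len(received))
    # Leaving the with-block closes the socket, which sends the server's FIN.


def half_close_demo() -> bytes:
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(b"hello")
            client.shutdown(socket.SHUT_WR)  # send FIN; the receive side stays open
            reply = bytearray()
            while chunk := client.recv(4096):
                reply += chunk
        worker.join()
        return bytes(reply)


print(half_close_demo())  # b'got 5 bytes'
```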

TIME_WAIT

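After sending the final ACK, the active closer does not immediately move to CLOSED. It waits for 2×MSL (Maximum Segment Lifetime). RFC 793 sets MSL to 2 minutes, but implementations typically use 30–60 s, so the wait is 60–120 s in practice; Linux, for example, uses a fixed 60 s TIME_WAIT.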

Two reasons:

  1. Ensure the final ACK was received. If it was lost, the remote will retransmit its FIN, and the active closer needs to be alive to re-send the ACK.
  2. Absorb stale duplicates. Any segment from this connection that was delayed in the network will expire before the port is reused — preventing it from contaminating a new connection with the same 4-tuple.

TIME_WAIT is why a server that was restarted quickly sometimes cannot immediately reclaim its port. SO_REUSEADDR relaxes this restriction for server sockets.
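Setting the option is a one-line call made before `bind`. The sketch below shows the usual pattern for a listening socket; `make_listener` is a hypothetical helper, and port 0 simply asks the OS for any free port.

```python
import socket


def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket that may bind while old connections sit in TIME_WAIT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind() to have any effect on the bind itself.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock


listener = make_listener()
print(listener.getsockname())
listener.close()
```

The exact semantics of `SO_REUSEADDR` differ between operating systems, so treat this as the common Unix-style usage rather than a portable guarantee.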

#TCP State Machine

Every TCP endpoint is always in exactly one of eleven states. Transitions happen in response to incoming segments, API calls (connect, close), or timers.

[Interactive diagram: the TCP state machine, with separate paths for the active opener (client) and the passive opener (server).]

CLOSED: no connection exists. This is the initial and final state, and no socket resources are in use. A passive open (server) transitions to LISTEN; an active open (client) sends SYN and moves to SYN_SENT.

The Normal Client Path

CLOSED → SYN_SENT → ESTABLISHED → FIN_WAIT_1 → FIN_WAIT_2 → TIME_WAIT → CLOSED

The Normal Server Path

CLOSED → LISTEN → SYN_RECEIVED → ESTABLISHED → CLOSE_WAIT → LAST_ACK → CLOSED

The CLOSING state appears only in the rare simultaneous close scenario — both sides send FIN at virtually the same time.
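One way to make the state machine tangible is a transition table keyed by (state, event). The sketch below covers only the transitions needed for the normal paths and the simultaneous close; the event names such as `recv_syn_ack` and `timeout_2msl` are labels invented for this example, not terms from the specification.

```python
# (current state, event) -> next state. A subset of the full TCP state machine.
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN_SENT",
    ("CLOSED", "passive_open"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN_RECEIVED",
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",
    ("SYN_RECEIVED", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_1", "recv_fin"): "CLOSING",  # simultaneous close
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSING", "recv_ack"): "TIME_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
}


def run(state: str, events: list[str]) -> list[str]:
    """Apply events in order and return every state visited, including the start."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError means an invalid transition
        path.append(state)
    return path


client_events = ["active_open", "recv_syn_ack", "close", "recv_ack", "recv_fin", "timeout_2msl"]
print(" → ".join(run("CLOSED", client_events)))
```

Feeding the server-side events (`passive_open`, `recv_syn`, `recv_ack`, `recv_fin`, `close`, `recv_ack`) through the same table reproduces the server path.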

#Reliability: Retransmission and Timeouts

When a segment is lost, the sender eventually notices that no ACK arrived, and retransmits. Two mechanisms trigger this:

Retransmission timeout (RTO): A timer starts when a segment is sent. If no ACK arrives before it fires, the segment is retransmitted. The timeout is computed dynamically from observed round-trip times using Jacobson's algorithm (standardized in RFC 6298).
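The RTO computation can be sketched as a small class that follows the update rules in RFC 6298: a smoothed RTT, an RTT variance, and a timeout of SRTT plus four times the variance. The class name `RtoEstimator` and its parameters are illustrative; the 1 s floor is the RFC's recommendation, and real stacks often use a lower one.

```python
class RtoEstimator:
    """Smoothed RTT and retransmission timeout, following the RFC 6298 update rules."""

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variance
    K = 4          # variance multiplier

    def __init__(self, min_rto: float = 1.0, max_rto: float = 60.0,
                 clock_granularity: float = 0.001) -> None:
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.granularity = clock_granularity
        self.srtt: float | None = None
        self.rttvar: float | None = None
        self.rto = 1.0  # initial RTO before any measurement

    def on_rtt_sample(self, rtt: float) -> float:
        if self.srtt is None:
            # First measurement initialises both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variance is updated first, using the old smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + max(self.granularity, self.K * self.rttvar)
        self.rto = min(self.max_rto, max(self.min_rto, rto))
        return self.rto

    def on_timeout(self) -> float:
        """Exponential backoff: double the RTO each time the timer fires."""
        self.rto = min(self.max_rto, self.rto * 2)
        return self.rto


estimator = RtoEstimator(min_rto=0.2)
for sample in (0.100, 0.100, 0.300):
    print(round(estimator.on_rtt_sample(sample), 4))
```

A jittery sample (the 0.300 s one above) inflates the variance term, so the timeout grows quickly when the path becomes unpredictable and shrinks slowly once it settles.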

Fast retransmit: If the sender receives three duplicate ACKs (the same ACK number repeated), it infers a hole in the stream and retransmits immediately, without waiting for the timer. This is faster because duplicate ACKs usually arrive within about one round-trip time, well before the retransmission timer would fire.

Sender transmits:  seg 1, seg 2, seg 3, seg 4, seg 5
Network drops:     seg 2

Receiver got seg 1: ACK=2  (normal)
Receiver got seg 3: ACK=2  (dup — still waiting for 2)
Receiver got seg 4: ACK=2  (dup)
Receiver got seg 5: ACK=2  (dup × 3 → sender retransmits seg 2)
Sidenote: Fast retransmit — three duplicate ACKs trigger early recovery
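The sender-side rule in the trace above amounts to counting repeats of the same ACK number. A minimal sketch, using segment numbers instead of byte offsets to match the trace; `fast_retransmit_trigger` is a hypothetical name, and recovery after the retransmission is not modelled.

```python
def fast_retransmit_trigger(acks: list[int], threshold: int = 3) -> list[int]:
    """Return the ACK numbers that trigger a fast retransmit, in order."""
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            # Fire exactly once, when the duplicate count reaches the threshold.
            if dup_count == threshold:
                retransmits.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


# One normal ACK=2 followed by three duplicates: retransmit segment 2.
print(fast_retransmit_trigger([2, 2, 2, 2]))  # [2]
# Only two duplicates: not enough evidence yet.
print(fast_retransmit_trigger([2, 2, 2]))     # []
```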

#Congestion Control

Flow control prevents the receiver from being overwhelmed. Congestion control prevents the network from being overwhelmed. They work on the same lever — the amount of in-flight data — but with different signals.

TCP maintains a congestion window (cwnd) alongside the receiver's window. The effective window is min(cwnd, rwnd).

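Slow start: cwnd begins small (1 MSS in the classic description; modern stacks commonly start at 10 MSS, per RFC 6928) and doubles every RTT until either loss is detected or a threshold (ssthresh) is reached. MSS is the Maximum Segment Size, typically 1460 bytes.

Congestion avoidance: Above ssthresh, cwnd increases by 1 MSS per RTT (linear growth) instead of doubling.

On loss: cwnd is cut. Exactly how depends on the TCP variant (Reno, CUBIC, BBR…); classic Reno, for example, halves cwnd after a fast retransmit and falls back to slow start after a timeout.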

cwnd growth during slow start:   1 → 2 → 4 → 8 → … (exponential)
cwnd growth during avoidance:    8 → 9 → 10 → 11 → … (linear)

Modern variants like TCP CUBIC (Linux default) and TCP BBR (Google) use more sophisticated models that achieve higher throughput on long-fat-network paths.

#Putting It All Together

A single GET https://example.com/ involves:

  1. DNS — resolve example.com to an IP.
  2. TCP three-way handshake — open a connection to port 443.
  3. TLS handshake — negotiate encryption (over the established TCP connection).
  4. HTTP/1.1 or HTTP/2 request — bytes flow as TCP segments, each ACKed.
  5. Data transfer — congestion control and flow control regulate the pace.
  6. TCP four-way teardown (or Connection: keep-alive for reuse).

Every reliability guarantee you rely on in a web browser, API client, or SSH session flows from the mechanisms above — sequence numbers, ACKs, retransmissions, flow control, the handshake, and the state machine.

The transport layer is invisible until it goes wrong. At that point, knowing exactly what state each side believes it is in, what a SYN cookie is, or why TIME_WAIT cannot be skipped makes all the difference.