Rebuilding TLS, Part 3 — Building Our First Handshake
We get rid of the pre-shared key assumption, build a simple key exchange handshake, and discover why key agreement alone still does not give us real TLS.
Overview: Where We Are and What Is Still Missing
In the previous part of this series, we made our fake secure channel much less fake.
We started with the broken encrypted transport from Part 1, added integrity with HMAC, added sequence numbers to make the record layer less naive, and then moved to AEAD — the approach modern systems usually use to protect records.
At that point, our protocol could already do something meaningful:
encrypt application data
detect tampering
reject modified records
keep some minimal record-layer state
That was a real step forward.
But it still relied on one very unrealistic assumption:
both sides already shared the secret keys
And that is exactly what we need to remove now.
Because a real secure protocol cannot stop at protecting data after the keys already exist. It also has to answer one of the harder questions first:
if client and server do not already share a secret, how can they create one over an insecure network in the first place?
That is the goal of this part.
We are going to build the next missing layer of the protocol: the handshake.
The architecture of this step is simple:
Client                                  Server
------                                  ------
Handshake messages  <--------->  Handshake messages
        |                                |
        v                                v
  shared secret                    shared secret
        |                                |
        +-----------> HKDF <-------------+
                       |
                       v
                 session keys
                       |
                       v
           protected application data
The idea is to let the connection create fresh key material dynamically instead of starting with a hardcoded application key.
We will implement that in three steps.
First, we will build a handshake with classic Diffie-Hellman, where the shared prime and base are still explicit and visible in the protocol. Then we will replace that version with X25519 to show how modern protocols simplify the same idea. After that, we will use HKDF to derive proper session keys from the raw shared secret.
That will take us one big step closer to the shape of real TLS.
But still not all the way.
Because even if both sides manage to derive the same fresh session keys, one critical problem will remain: they still do not know who is on the other side.
And that is where this part is heading.
A Very Short Note on Public Key Exchange
The basic idea of public key exchange is simple.
Two sides communicate over an insecure network. They exchange some public information. And from that exchange, both sides derive the same shared secret — without ever sending that secret directly over the wire.
That is the key point.
The network can be fully visible.
An observer can see all handshake messages.
But the observer still should not be able to derive the same secret.
That is exactly the kind of mechanism we need now.
Until this point in the series, our protocol always started with a secret that already existed. Public key exchange changes that. It gives the connection a way to create fresh shared key material dynamically.
In this article, I do not want to go deep into the mathematics behind it. I only want to use the core idea as the next building block of the protocol.
If you want the deeper intuition behind why this works, I already wrote about it here:
For now, the main idea we need is this:
each side contributes its own private value
both sides exchange some public values
both sides derive the same shared secret
that secret can then become the basis for session keys
So let’s build that first in the most explicit way, with classic Diffie-Hellman where the shared public parameters are still visible in the handshake.
Implementation Part 1 — Our First Handshake with Classic Diffie-Hellman
(The whole code can be found here: https://github.com/DmytroHuzz/rebuilding_tls/tree/main/part_3/v1_classic_dh_handshake )
Now let’s build the first real handshake in the series.
I want to start with classic Diffie-Hellman, not because this is the final form we want to keep, but because it makes the mechanics of key exchange much more visible.
In this version, both sides work with the same public parameters:
a prime p
a generator g
These values are not secret. In our implementation, the client sends them in the handshake, which makes the whole mechanism more explicit on the wire. That is exactly what I want at this stage. Before we hide the details behind a cleaner modern primitive, I want to make the structure fully visible.
The actual secret material comes from somewhere else:
the client chooses a private exponent a
the server chooses a private exponent b
From those private values, both sides compute public values:
the client computes A = g^a mod p
the server computes B = g^b mod p
Then they exchange A and B.
And this is the key step:
the client computes s = B^a mod p
the server computes s = A^b mod p
Both sides end up with the same shared secret, without ever sending that secret directly over the network.
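To make the arithmetic concrete, here is a tiny worked example. The numbers (p = 23, g = 5, a = 6, b = 15) are my own toy values for illustration, far too small to be secure, but they show both sides arriving at the same secret.

```python
# Toy parameters for illustration only: far too small to be secure.
p, g = 23, 5

a = 6    # client's private exponent (never sent)
b = 15   # server's private exponent (never sent)

A = pow(g, a, p)   # client's public value, sent to the server
B = pow(g, b, p)   # server's public value, sent to the client

client_secret = pow(B, a, p)   # B^a mod p
server_secret = pow(A, b, p)   # A^b mod p

print(A, B, client_secret, server_secret)   # 8 19 2 2
```

An observer sees p, g, A = 8 and B = 19, but never a or b. With a 2048-bit prime, recovering the exponents from those public values is the hard part.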
In diagram form, the handshake looks like this:
Client                                   Server
------                                   ------
choose private a
compute A = g^a mod p
ClientHello(p, g, A)  --------->
                                         choose private b
                                         compute B = g^b mod p
                      <---------         ServerHello(B)
compute s = B^a mod p                    compute s = A^b mod p
That is our first real handshake.
Until now, the protocol always started with a secret key that already existed.
Now the connection itself creates the secret.
That is a major shift.
The raw Diffie-Hellman math
At the lowest level, the core operations are very small. That is one of the nice things about starting with classic Diffie-Hellman: the whole idea is still visible in a few functions.
import os

# RFC 3526 Group 14: 2048-bit MODP prime
DH_PRIME = int(
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD1"
    "29024E088A67CC74020BBEA63B139B22514A08798E3404DD"
    "EF9519B3CD3A431B302B0A6DF25F14374FE1356D6D51C245"
    "E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED"
    "EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE45B3D"
    "C2007CB8A163BF0598DA48361C55D39A69163FA8FD24CF5F"
    "83655D23DCA3AD961C62F356208552BB9ED529077096966D"
    "670C354E4ABC9804F1746C08CA18217C32905E462E36CE3B"
    "E39E772C180E86039B2783A2EC07A28FB5C55DF06F4C52C9"
    "DE2BCBF6955817183995497CEA956AE515D2261898FA0510"
    "15728E5A8AACAA68FFFFFFFFFFFFFFFF",
    16,
)
DH_GENERATOR = 2


def generate_private_exponent() -> int:
    # 256 random bits from the operating system's CSPRNG.
    return int.from_bytes(os.urandom(32), "big")


def compute_public_value(private: int, g: int, p: int) -> int:
    # g^private mod p
    return pow(g, private, p)


def compute_shared_secret(peer_public: int, private: int, p: int) -> int:
    # peer_public^private mod p
    return pow(peer_public, private, p)
This is the whole core idea in code:
private exponent stays local
public value goes on the wire
shared secret is derived independently on both sides
That is the heart of Diffie-Hellman.
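The handshake code below calls int_to_bytes, bytes_to_int, and hex_preview, which are not shown in this article. Here is a minimal sketch of what such helpers could look like. The bodies are my assumption for illustration; the versions in the repository may differ.

```python
def int_to_bytes(n: int) -> bytes:
    """Encode a non-negative integer as big-endian bytes (minimal length)."""
    length = max(1, (n.bit_length() + 7) // 8)
    return n.to_bytes(length, "big")


def bytes_to_int(data: bytes) -> int:
    """Decode big-endian bytes back into an integer."""
    return int.from_bytes(data, "big")


def hex_preview(data: bytes, limit: int = 16) -> str:
    """Return the first `limit` bytes as hex, for readable log output."""
    suffix = "..." if len(data) > limit else ""
    return data[:limit].hex() + suffix
```

A value from a 2048-bit group usually serializes to 256 bytes this way, which is one reason the classic DH messages look heavy on the wire.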
Client side
def client_handshake(sock) -> bytes:
    """Perform the client side of the classic DH handshake.

    The client picks the public parameters (p, g) and sends them to the
    server along with its own public DH value. The server uses those
    parameters to compute its own public value and sends it back.

    Returns the shared secret as bytes.
    """
    # The client chooses p and g. These are PUBLIC, not secret.
    # Anyone on the wire can see them, and that is perfectly fine.
    # The security of DH depends on the hardness of the discrete
    # logarithm problem, not on hiding p and g.
    p = DH_PRIME
    g = DH_GENERATOR
    print(" Public parameters (chosen by client, sent to server):")
    print(f" p = {str(p)[:40]}... ({p.bit_length()} bits)")
    print(f" g = {g}")

    # Step 1: Generate client's private exponent and public value.
    # The private exponent is the ONE thing that stays secret.
    client_private = generate_private_exponent()
    client_public = compute_public_value(client_private, g, p)
    client_public_bytes = int_to_bytes(client_public)

    # Step 2: Build ClientHello with p, g, and our public value.
    # All three are public. The private exponent is NOT included.
    p_bytes = int_to_bytes(p)
    g_bytes = int_to_bytes(g)
    client_hello = encode_message(
        [
            (TAG_DH_P, p_bytes),
            (TAG_DH_G, g_bytes),
            (TAG_DH_PUBLIC, client_public_bytes),
        ]
    )

    # Step 3: Send the ClientHello to the server.
    send_record(sock, client_hello)

    # Step 4: Receive ServerHello with the server's public value.
    server_hello_raw = recv_record(sock)
    fields = decode_message(server_hello_raw)
    server_public_bytes = None
    for tag, value in fields:
        if tag == TAG_DH_PUBLIC:
            server_public_bytes = value
    if server_public_bytes is None:
        raise ValueError("ServerHello missing DH public value")
    server_public = bytes_to_int(server_public_bytes)
    print(" <- Received ServerHello")
    print(f" Server public value B: {hex_preview(server_public_bytes)}")

    # Step 5: Compute the shared secret.
    # shared = B^a mod p = (g^b)^a mod p = g^(ab) mod p
    shared_int = compute_shared_secret(server_public, client_private, p)
    shared_bytes = int_to_bytes(shared_int)
    return shared_bytes
On the client side, the flow is:
choose a private exponent
compute the public value
send p, g, and the client’s public value inside ClientHello
receive the server’s public value
derive the shared secret
That is the first point in the series where the client does not begin with the application key. It participates in creating it.
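The handshake relies on encode_message and decode_message to put tagged fields on the wire. Their format is not shown here, so the following is only a sketch of one plausible tag-length-value layout. The tag numbers and the 1-byte tag plus 2-byte length header are my assumptions, not necessarily what the repository uses.

```python
import struct

# Hypothetical tag numbers; the real constants live in the repository.
TAG_DH_P = 1
TAG_DH_G = 2
TAG_DH_PUBLIC = 3


def encode_message(fields: list[tuple[int, bytes]]) -> bytes:
    """Serialize (tag, value) pairs as: 1-byte tag, 2-byte length, value."""
    out = bytearray()
    for tag, value in fields:
        out += struct.pack("!BH", tag, len(value))
        out += value
    return bytes(out)


def decode_message(data: bytes) -> list[tuple[int, bytes]]:
    """Parse the format produced by encode_message, rejecting truncation."""
    fields = []
    offset = 0
    while offset < len(data):
        if offset + 3 > len(data):
            raise ValueError("truncated field header")
        tag, length = struct.unpack_from("!BH", data, offset)
        offset += 3
        if offset + length > len(data):
            raise ValueError("truncated field value")
        fields.append((tag, data[offset:offset + length]))
        offset += length
    return fields
```

The length checks matter even in a toy protocol: handshake messages arrive before any key exists, so nothing about them is authenticated yet and the parser has to treat them as untrusted input.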
Server side
def server_handshake(sock) -> bytes:
    """Perform the server side of the classic DH handshake.

    The server receives p, g, and client_public from the ClientHello,
    uses those parameters to generate its own keypair, and sends its
    public value back.

    Returns the shared secret as bytes.
    """
    # Step 1: Receive ClientHello and parse p, g, and client's public value.
    # The server does NOT assume any particular p or g. It uses whatever
    # the client proposes. (In a production system, the server would
    # validate that p is a safe prime and g is a proper generator.
    # We skip that here for clarity.)
    client_hello_raw = recv_record(sock)
    fields = decode_message(client_hello_raw)
    p_bytes = None
    g_bytes = None
    client_public_bytes = None
    for tag, value in fields:
        if tag == TAG_DH_P:
            p_bytes = value
        elif tag == TAG_DH_G:
            g_bytes = value
        elif tag == TAG_DH_PUBLIC:
            client_public_bytes = value
    if p_bytes is None:
        raise ValueError("ClientHello missing DH prime (p)")
    if g_bytes is None:
        raise ValueError("ClientHello missing DH generator (g)")
    if client_public_bytes is None:
        raise ValueError("ClientHello missing DH public value (A)")

    # Deserialize the parameters from bytes.
    p = bytes_to_int(p_bytes)
    g = bytes_to_int(g_bytes)
    client_public = bytes_to_int(client_public_bytes)

    # Step 2: Generate server's private exponent
    # (used with the p and g received from the client).
    server_private = generate_private_exponent()

    # Step 3: Compute server's public value.
    server_public = compute_public_value(server_private, g, p)
    server_public_bytes = int_to_bytes(server_public)

    # Step 4: Send ServerHello with our public value.
    # Only B is sent; p and g are already known from the ClientHello.
    server_hello = encode_message(
        [
            (TAG_DH_PUBLIC, server_public_bytes),
        ]
    )
    send_record(sock, server_hello)

    # Step 5: Compute the shared secret.
    # shared = A^b mod p = (g^a)^b mod p = g^(ab) mod p
    shared_int = compute_shared_secret(client_public, server_private, p)
    shared_bytes = int_to_bytes(shared_int)
    return shared_bytes
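The server code above skips parameter validation for clarity. As a rough idea of what a stricter server might do, here is a sketch under my own assumptions: instead of testing primality at runtime, it pins the one group the server trusts and rejects degenerate public values. The function names are hypothetical and not part of the article's code.

```python
def validate_dh_parameters(p: int, g: int, expected_p: int, expected_g: int) -> None:
    """Accept only the single group this endpoint is configured to trust."""
    if p != expected_p or g != expected_g:
        raise ValueError("unexpected DH group parameters")


def validate_peer_public(peer_public: int, p: int) -> None:
    """Reject degenerate public values such as 0, 1, and p - 1."""
    if not 2 <= peer_public <= p - 2:
        raise ValueError("peer DH public value out of range")
```

The range check is cheap and worth having: a peer value of 1 would force the shared secret to 1, and p - 1 would force it to 1 or p - 1, no matter which private exponent the honest side picked.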
The server does the mirror image:
receive p, g, and the client’s public value
choose its own private exponent
compute its own public value
send that value back in ServerHello
derive the same shared secret from the client’s public value
So at the end of the handshake, both sides have the same secret — but that secret was never transmitted directly.
That is the big win.
After this step, the connection can create fresh shared key material dynamically.
That is a much more realistic foundation.
But it is also still awkward.
Not conceptually awkward — educationally this version is very useful — but operationally awkward. We now have explicit p and g in the handshake, which is nice for understanding the mechanism, but clunky for a modern protocol design.
That is exactly why the next step will replace this version with X25519.
Implementation Part 2 — Simplifying the Handshake with X25519
(The whole code can be found here: https://github.com/DmytroHuzz/rebuilding_tls/tree/main/part_3/v2_x25519_handshake )
The classic Diffie-Hellman version was useful because it made the mechanics of the handshake fully visible.
But, as noted at the end of the previous section, it is also operationally clunky: there are more moving parts in the handshake, more explicit protocol fields, and more visible math than modern protocols usually want to expose directly.
So now we keep the same core idea and simplify the workflow.
That is where X25519 comes in.
The conceptual goal stays exactly the same:
both sides generate ephemeral private/public key pairs
both sides exchange public keys
both sides derive the same shared secret
that secret will later become the basis for session keys
What changes is the shape of the handshake.
We no longer need to carry an explicit prime and generator through the protocol. We no longer manually perform modular exponentiation with visible p and g. X25519 gives us the same public-key exchange idea in a much cleaner modern form.
That is why I wanted this section right after the classic DH version.
Classic DH makes the mechanism visible.
X25519 shows what the modern streamlined version looks like.
Client-side handshake structure
Here is the current client handshake implementation:
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey,
    X25519PublicKey,
)
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat


def client_handshake(sock) -> bytes:
    """Perform the client side of the X25519 handshake.

    Returns the 32-byte shared secret.
    """
    print("\n[handshake] Client: starting X25519 handshake")

    # Step 1: Generate an ephemeral X25519 keypair.
    # "Ephemeral" means we create a fresh keypair for this session only.
    # The private key never leaves this process and is discarded after use.
    client_private = X25519PrivateKey.generate()
    client_public = client_private.public_key()
    client_public_bytes = client_public.public_bytes(Encoding.Raw, PublicFormat.Raw)

    # Step 2: Send ClientHello with our public key.
    client_hello = encode_message(
        [
            (TAG_X25519_PUBLIC, client_public_bytes),
        ]
    )
    send_record(sock, client_hello)

    # Step 3: Receive ServerHello with the server's public key.
    server_hello_raw = recv_record(sock)
    fields = decode_message(server_hello_raw)
    server_public_bytes = None
    for tag, value in fields:
        if tag == TAG_X25519_PUBLIC:
            server_public_bytes = value
    if server_public_bytes is None:
        raise ValueError("ServerHello missing X25519 public key")

    # Deserialize the server's public key from raw bytes.
    server_public = X25519PublicKey.from_public_bytes(server_public_bytes)

    # Step 4: Compute the shared secret.
    # X25519(client_private, server_public) = X25519(server_private, client_public)
    # This is the elliptic-curve equivalent of g^(ab) mod p from v1.
    shared_secret = client_private.exchange(server_public)
    return shared_secret
I like this version because it makes the transition very clear.
The client code no longer has to think about p and g at all. It just performs the handshake, derives the shared secret, and returns it. That is exactly the point of this stage in the series: the workflow becomes smaller, but the underlying purpose stays the same.
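To see both halves of the exchange without any sockets, here is a small in-memory sketch. It uses the same `cryptography` calls as the client code and simply plays the client and the server in one process. The variable names are mine, and the "network" is just two byte strings.

```python
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey,
    X25519PublicKey,
)
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Each side generates a fresh (ephemeral) keypair.
client_private = X25519PrivateKey.generate()
server_private = X25519PrivateKey.generate()

# Only the raw 32-byte public keys would travel over the network.
client_public_bytes = client_private.public_key().public_bytes(
    Encoding.Raw, PublicFormat.Raw
)
server_public_bytes = server_private.public_key().public_bytes(
    Encoding.Raw, PublicFormat.Raw
)

# Each side rebuilds the peer's public key from bytes and runs the exchange.
client_secret = client_private.exchange(
    X25519PublicKey.from_public_bytes(server_public_bytes)
)
server_secret = server_private.exchange(
    X25519PublicKey.from_public_bytes(client_public_bytes)
)

print(len(client_public_bytes), len(client_secret), client_secret == server_secret)
# 32 32 True
```

The server side of the real handshake is the mirror image of the client: receive the client's 32 bytes, send its own 32 bytes, call exchange().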
What changed conceptually
Compared to the classic DH version, the protocol has become simpler in three important ways.
1. No explicit shared public parameters in the handshake
In the previous version, the client sent the prime and generator so the whole structure of classic Diffie-Hellman stayed visible.
Now that goes away.
X25519 already gives us a fixed, standard structure for the exchange, so the handshake only needs to carry the public key material.
That makes the protocol smaller and cleaner.
2. The public values are much more compact
In the classic DH version, the public values were tied to a large prime-field construction and looked much heavier in the protocol.
In this version, the public keys are just 32 bytes.
That is a huge practical simplification.
3. The code starts to look more like real modern protocol code
This line from the comments says it well:
generate(), exchange(), done.
That is exactly the feeling this section should create.
We are still doing public-key exchange.
We are still deriving a shared secret.
But the implementation shape is now much closer to what modern systems actually use.
What this version still does not solve
Even after switching to X25519, this version is still simplified:
there is still no authentication
the shared secret is not yet turned into session keys
there is still no record-layer encryption using the new keys
In the next step, we will add HKDF and derive proper working session keys from it.
That is where the handshake starts to connect back to the record protection we built earlier.
Implementation Part 3 — Deriving Session Keys with HKDF
(The whole code can be found here: https://github.com/DmytroHuzz/rebuilding_tls/tree/main/part_3/v3_hkdf_session_keys)
At this point, both the classic Diffie-Hellman version and the X25519 version give us the same kind of output:
a shared secret that both sides can compute independently.
That is already a big step forward compared to the pre-shared-key model from the previous parts. The connection can now create fresh key material dynamically instead of starting with one hardcoded application key.
But there is still one important design question left:
should we use that raw shared secret directly as the application key?
For a toy demo, we probably could.
But even here, that would be the wrong direction.
Because a cleaner protocol separates these two ideas:
the handshake creates a shared secret
the protocol derives working session keys from that secret
That is exactly where HKDF comes in.
HKDF is a key-derivation function. Its job is not to invent secrecy out of nowhere, but to take existing secret material and turn it into keys that are better structured and easier to use safely inside the protocol.
So instead of treating the X25519 output as “the AES key,” we will use HKDF to derive proper session keys from it.
That already makes the protocol feel much closer to real TLS.
What changes conceptually
The structure now becomes:
X25519 shared secret
|
v
HKDF
|
v
session key material
|
v
protected application data
This is an important shift.
Before this step, the handshake produced something secret and we could have stopped there.
After this step, the handshake produces an input to a key schedule.
That is a much better protocol design.
Why this matters
There are two main reasons to do this.
1. The raw shared secret is handshake output, not final protocol state
The shared secret is the result of key exchange. That does not automatically mean it should be used directly as the application-data key.
Protocols usually want a cleaner boundary:
handshake result first
working keys second
2. We can derive keys for different purposes
Once we introduce a key-derivation step, we are no longer forced into “one secret for everything.”
Even in this toy protocol, that opens the door to a much more realistic design.
For example, instead of one single AEAD key, we can derive:
client → server key
server → client key
That is already much closer to how real secure protocols think.
Deriving the keys
In the current implementation, HKDF takes the X25519 shared secret and stretches it into 64 bytes of key material.
Then that material is split into two 32-byte keys:
one for traffic from client to server
one for traffic from server to client
That gives us directional keys instead of one shared application key for both directions.
Here is the key schedule:
# key_schedule_x25519.py
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF


def derive_session_keys(shared_secret: bytes) -> tuple[bytes, bytes]:
    # Stretch the shared secret into 64 bytes of key material.
    key_material = HKDF(
        algorithm=hashes.SHA256(),
        length=64,
        salt=None,
        info=b"toy-tls-part-3-x25519",
    ).derive(shared_secret)
    # First half: client -> server. Second half: server -> client.
    client_to_server_key = key_material[:32]
    server_to_client_key = key_material[32:]
    return client_to_server_key, server_to_client_key
I like this step a lot because it is small in code, but it changes the protocol mindset in an important way.
We are no longer thinking:
handshake gives us the key
We are now thinking:
handshake gives us secret material, and the protocol derives the keys it actually wants to use
That is a much stronger model.
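If you are curious what HKDF does internally, here is a standard-library sketch of the extract-then-expand construction from RFC 5869. This is my own reimplementation for illustration, not the code the article uses. The real key schedule should keep calling the `cryptography` HKDF class.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, length: int, salt: bytes = b"", info: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    # Extract: an empty salt is replaced by hash_len zero bytes.
    prk = hmac.new(salt or b"\x00" * hash_len, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks T(1), T(2), ... until enough output exists.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session_keys(shared_secret: bytes) -> tuple[bytes, bytes]:
    """Same split as the key schedule above: first half c->s, second half s->c."""
    material = hkdf_sha256(shared_secret, 64, info=b"toy-tls-part-3-x25519")
    return material[:32], material[32:]


c2s, s2c = derive_session_keys(b"\x11" * 32)
print(len(c2s), len(s2c), c2s != s2c)   # 32 32 True
```

The info string acts as a label. Changing it gives unrelated keys from the same shared secret, which is what makes it possible to derive several purpose-specific keys from one handshake result.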
A small but important detail
Notice that the two sides must interpret the derived keys consistently.
If the client treats the first 32 bytes as the client → server key, then the server must do the same. Otherwise the channel will immediately break.
So now the handshake is not only producing shared secret material. It is also establishing a shared rule for how that material becomes working traffic keys.
That is another reason protocols need structure, not just primitives.
Connecting HKDF back to the record layer
Now we can finally connect this part back to what we built earlier.
In Part 2, we already built an AEAD-protected record layer. But that record layer still depended on hardcoded keys.
Now that changes.
The AEAD layer no longer starts with a static key from configuration.
It receives fresh traffic keys from the handshake.
So the protocol shape becomes:
Handshake -> X25519 shared secret -> HKDF -> directional session keys -> AEAD protected records
That is a major milestone in the series.
At this point, the protocol no longer just looks secure because we wrapped some bytes in encryption. It now has a real high-level structure:
first establish shared key material
then derive traffic keys
then use those keys to protect application data
That is already much closer to the shape of real TLS.
Using the new session keys
Once the keys are derived, the record layer can use them directly.
Conceptually, the flow now looks like this:
Client
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
    client.connect((HOST, PORT))
    print(f"Connected to {HOST}:{PORT}")

    # ==========================================
    # PHASE 1: HANDSHAKE
    # ==========================================
    # New in Part 3: the handshake dynamically establishes session keys.
    # No pre-shared secret needed.
    # (Here client_handshake() also runs the key schedule and returns
    # both directional keys.)
    client_write_key, server_write_key = client_handshake(client)

    # Per-direction record counters, starting at zero for a new connection.
    send_seq = 0
    recv_seq = 0

    # ==========================================
    # PHASE 2: APPLICATION DATA
    # ==========================================
    # The record layer now uses HKDF-derived keys instead of hardcoded ones.
    # The record format is the same as Part 2 Stage 3 (AEAD).
    # `request` holds the plaintext request bytes, built elsewhere (not shown).

    # --- Send request (encrypted with client_write_key) ---
    protected = protect_record(client_write_key, send_seq, request)
    send_record(client, protected)
    send_seq += 1

    # --- Receive response (decrypted with server_write_key) ---
    raw_response = recv_record(client)
    try:
        response = unprotect_record(server_write_key, recv_seq, raw_response)
        recv_seq += 1
        print(f"\n Decrypted response:\n {response.decode('utf-8')}")
    except Exception as e:
        print(f"\n *** REJECTED: {e} ***")
    print("\nDone.")
use client_write_key to protect outgoing application data
use server_write_key to unprotect incoming application data
Server
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((HOST, PORT))
    server.listen(1)
    print(f"Listening on {HOST}:{PORT}")
    conn, addr = server.accept()
    with conn:
        # ==========================================
        # PHASE 1: HANDSHAKE
        # ==========================================
        client_write_key, server_write_key = server_handshake(conn)

        # Per-direction record counters, starting at zero for a new connection.
        send_seq = 0
        recv_seq = 0

        # ==========================================
        # PHASE 2: APPLICATION DATA
        # ==========================================
        # --- Receive request (decrypted with client_write_key) ---
        raw_request = recv_record(conn)
        try:
            request = unprotect_record(client_write_key, recv_seq, raw_request)
            recv_seq += 1
        except Exception as e:
            print(f"\n *** REJECTED: {e} ***")
            print(" Connection closed, refusing to process invalid data.")
        else:
            # --- Send response (encrypted with server_write_key) ---
            response = (
                "HTTP/1.1 200 OK\r\n"
                "Content-Type: text/plain\r\n"
                "Content-Length: 13\r\n\r\n"
                "hello, client"
            ).encode("utf-8")
            protected = protect_record(server_write_key, send_seq, response)
            send_record(conn, protected)
            send_seq += 1
    print("\nDone.")
use client_write_key to unprotect incoming client traffic
use server_write_key to protect outgoing server traffic
That means the two directions are now separated.
This is cleaner than one symmetric application key shared blindly by both directions, and it makes the protocol feel more deliberate.
Even in this simplified version, that is a meaningful step.
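To show why the direction of a key matters, here is a sketch of protect_record and unprotect_record. The real record format comes from Part 2 and is not repeated in this article, so everything here is an assumption for illustration: ChaCha20-Poly1305 as the AEAD, a nonce built from the sequence number, and the sequence number as associated data.

```python
import os
import struct

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305


def _nonce_and_aad(seq: int) -> tuple[bytes, bytes]:
    seq_bytes = struct.pack("!Q", seq)        # 8-byte big-endian counter
    return b"\x00" * 4 + seq_bytes, seq_bytes  # 12-byte nonce, AAD


def protect_record(key: bytes, seq: int, plaintext: bytes) -> bytes:
    """Encrypt one record; returns ciphertext followed by a 16-byte tag."""
    nonce, aad = _nonce_and_aad(seq)
    return ChaCha20Poly1305(key).encrypt(nonce, plaintext, aad)


def unprotect_record(key: bytes, seq: int, record: bytes) -> bytes:
    """Decrypt one record; raises InvalidTag on any mismatch."""
    nonce, aad = _nonce_and_aad(seq)
    return ChaCha20Poly1305(key).decrypt(nonce, record, aad)


# Stand-ins for the two HKDF-derived directional keys.
client_write_key = os.urandom(32)
server_write_key = os.urandom(32)

record = protect_record(client_write_key, 0, b"GET / HTTP/1.1")
print(unprotect_record(client_write_key, 0, record))   # b'GET / HTTP/1.1'

try:
    unprotect_record(server_write_key, 0, record)       # wrong direction
except InvalidTag:
    print("rejected: record was not protected with this key")
```

With a nonce derived from a per-direction counter like this, a single shared key would mean both sides encrypt their first record under the same key and nonce. Separate directional keys avoid that reuse without any extra coordination.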
What this step really gave us
By adding HKDF, we improved the protocol in a way that is easy to underestimate.
We did not just “derive another key.”
We made the protocol architecture cleaner.
Now the handshake and the traffic layer are connected in a more principled way:
the handshake creates shared secret material
the key schedule turns that material into working keys
the record layer consumes those keys
This is a much better model than treating the raw X25519 result as the final answer.
And it brings us one step closer to real TLS, where key derivation is not an optional detail, but one of the central pieces of the protocol design.
But we are still not secure
And now we arrive at the uncomfortable but necessary part.
Even with:
a real handshake
X25519
HKDF
fresh directional session keys
AEAD-protected records
the protocol still cannot be considered secure enough.
Why?
Because all of this still says nothing about who is on the other side.
The handshake can successfully create shared secrets.
HKDF can successfully derive traffic keys.
The record layer can successfully protect application data.
And an attacker can still sit in the middle and run two separate handshakes.
That is the next lesson.
Still Not Secure — The Man-in-the-Middle Problem
At this point, our protocol already looks much more serious than the one we started with.
We now have:
a real handshake
fresh shared secrets
X25519 instead of a pre-shared application key
HKDF-derived session keys
AEAD-protected application records
That is a long way from the fake secure channel in Part 1.
But it is still not enough.
The missing piece is one of the most important ideas in this whole series:
key exchange is not authentication
That sentence is easy to read quickly and move on from. But it is worth stopping here, because this is exactly where many protocols fail.
Our handshake proves that both sides can derive the same shared secret.
What it does not prove is:
who is actually on the other side.
And that difference is the whole problem.
The attack
Imagine an active attacker sitting between the client and the server.
Let’s call her Mallory.
The client thinks it is talking to the server.
The server thinks it is talking to the client.
But Mallory intercepts the handshake and replaces the exchanged public keys with her own.
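In simplified form, the flow looks like this (A and B are the real public keys, M is Mallory's):
Client                      Mallory                      Server
------                      -------                      ------
ClientHello(A)  --------->  keeps A
                            ClientHello(M)  --------->
                            keeps B         <---------   ServerHello(B)
                <---------  ServerHello(M)
secret from (a, M)          secrets from (m, A), (m, B)  secret from (b, M)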
And now something very important happens.
The handshake still “works.”
But it works in the wrong way.
the client ends up with a shared secret with Mallory
the server ends up with a different shared secret with Mallory
and Mallory now has one valid secure channel to each side
From the point of view of the client and the server, everything looks normal:
key exchange succeeded
keys were derived
encrypted records verify correctly
AEAD tags are valid
And yet the protocol has already failed.
Because Mallory can now:
decrypt the client’s traffic
read it or modify it
re-encrypt it toward the server
receive the server’s response
read it or modify it
re-encrypt it back toward the client
Neither side can detect this.
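The attack is easy to reproduce in a few lines. The sketch below uses classic Diffie-Hellman with deliberately tiny toy numbers of my own choosing (p = 23, g = 5), so every value can be checked by hand. The same logic applies unchanged to X25519.

```python
# Toy group for illustration only: far too small to be secure.
p, g = 23, 5

client_priv, server_priv = 6, 15
mallory_priv_c, mallory_priv_s = 3, 4   # Mallory's keys facing client / server

client_pub = pow(g, client_priv, p)          # sent by the client, intercepted
server_pub = pow(g, server_priv, p)          # sent by the server, intercepted
mallory_pub_c = pow(g, mallory_priv_c, p)    # delivered to the client instead
mallory_pub_s = pow(g, mallory_priv_s, p)    # delivered to the server instead

# Each honest side computes a secret with the key it actually received.
client_secret = pow(mallory_pub_c, client_priv, p)
server_secret = pow(mallory_pub_s, server_priv, p)

# Mallory computes both secrets from the public values she intercepted.
mallory_client_secret = pow(client_pub, mallory_priv_c, p)
mallory_server_secret = pow(server_pub, mallory_priv_s, p)

print(client_secret, mallory_client_secret)   # 6 6
print(server_secret, mallory_server_secret)   # 3 3
print(client_secret == server_secret)         # False
```

Both handshakes succeed, and both honest sides hold a perfectly valid secret. It is just shared with the wrong party, and nothing in the math tells them so.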
In The Next Article — Building the Certificate Infrastructure
The handshake only proves one thing:
“I computed a shared secret with whoever sent me this public key.”
It does not prove:
“This public key came from the server I actually intended to talk to.”
That is the missing half.
To fix this, the client needs a way to verify that the public key it receives during the handshake actually belongs to the server it wanted to talk to.
That is where the next layer enters:
certificates
signatures
trust chains
certificate authorities
In other words, this is where the protocol must stop proving only that “someone” is there and start proving who that someone is.
That is exactly what the next article will build.
Summary
Our protocol now has secrecy against passive observers.
It has integrity for protected records.
It has fresh session keys.
But it still does not have identity.
And without identity, a correct shared secret with the wrong party is still a protocol failure.
That is the deeper lesson of Part 3.
Part 1 taught us:
confidentiality is not integrity
Part 2 taught us:
protecting records is not the same thing as establishing trust
And now Part 3 adds the next lesson:
key exchange is not authentication
We will solve that in the next article!
Final Code
The full code for this part is available here:
GitHub: https://github.com/DmytroHuzz/rebuilding_tls/tree/main/part_3




