Building Own MAC — Part 2: Fixing AES (and accidentally reinventing CMAC)
Why intuition fails in cryptography
In the previous series we finished AES and its modes. And in the previous article revealed why it is still not secure.
We can encrypt messages.
We can decrypt messages.
We can ship.
And then reality does what reality always does:
It slaps you.
Because encryption solves one problem:
“If you don’t know the key, you can’t read the message.”
It does not solve this problem:
“If you don’t know the key, you can’t change the message.”
So we face the classic situation:
✅ we have secrecy
❌ we still don’t have trust
And now we do what every engineer does when something breaks:
we try to patch it fast.
Let’s go through the exact fixes your brain naturally generates at 2am.
Most of them fail.
Some fail spectacularly.
And each failure forces one new design constraint… until we reinvent a real solution.
Goal (simple and brutal)
We want the receiver to be able to say:
✅ accept
❌ reject
based on one small extra value.
No philosophy. No “maybe”.
Just: does this message deserve to exist?
Fix attempt #1 — “Just hash the plaintext”
Classic.
C = Enc(M)
tag = Hash(M)
send (C, tag)Receiver decrypts C, recomputes Hash(M), compares.
Break (instant)
Hashes have no secrets.
Attacker changes message → attacker recomputes hash → sends new pair.
✅ receiver accepts
✅ attacker wins
✅ we learn nothing except pain
Takeaway
If the attacker can compute your tag, it’s not authentication.
It’s a checksum.
So the tag must involve a secret.
Fix attempt #1.5 — “Put the hash inside the encrypted message”
Okay, fine.
Let’s hide the hash.
payload = M || Hash(M)
C = Enc(payload)
send CReceiver decrypts, extracts M and the embedded hash, recomputes, compares.
This feels smart.
It’s not.
Break (modes bite you)
In malleable modes (CTR / OFB / CFB), the attacker can flip bits in ciphertext to flip predictable bits in plaintext.
So they flip bits in the message part…
and flip corresponding bits in the embedded hash part.
They don’t need to know the key.
They don’t need to know the hash function.
They don’t need to understand anything.
They just move bits.
Receiver decrypts:
M' || Hash(M')Hash matches. Receiver accepts.
0_o
Takeaway
Hiding a checker does not make it a check.
If your “verification data” lives inside the thing being protected, it can be modified together with it.
We need the tag to be outside the encrypted payload and unforgeable.
Fix attempt #2 — “Encrypt the hash separately”
Next idea:
C = Enc(M)
tag = Enc(Hash(M))
send (C, tag)Now the attacker can’t see the hash, can’t recompute it.
So… done?
Not even close.
Break (you verified… what exactly?)
Now you have two encrypted blobs.
And the receiver has to answer a painful question:
What exactly are we proving by comparing these?
Do we verify the plaintext?
The ciphertext?
The padding?
The length?
The context (endpoint / protocol version / message type)?
Nothing is bound. Everything is implied.
And implied security is not security.
Also: even if you try to “compare decrypted hash to recomputed hash”, you’re back to the earlier issue — encryption modes are malleable and protocol context is not bound.
Takeaway
Encrypting a checker is not the same as verifying a message.
We need a single verification value, not two blobs that “seem related”.
Fix attempt #2.5 — “Encrypt the ciphertext and compare”
Okay. Let’s remove ambiguity.
C = Enc(M)
tag = Enc(C)
send (C, tag)Receiver decrypts tag → gets C' → compares C' == C.
Now we have:
a secret
a deterministic check
a clean yes/no
This looks decent.
And it still fails.
Break (you authenticated bytes, not meaning)
This check proves exactly one thing:
“The ciphertext blob wasn’t modified.”
It does not prove:
that this ciphertext belongs to this endpoint
that it’s intended for this message type
that it’s valid in this context
that it’s fresh
that it’s not a replay
So the attacker doesn’t need to forge anything.
They just replay a valid (C, tag) where it causes damage.
“Transfer $10” becomes “Transfer $10 again.”
Or becomes “Approve something you approved yesterday.”
You built a very strong proof of internal consistency… and zero proof of intent.
Takeaway
Consistency is not authenticity.
Authentication must bind meaning and context, not just bytes.
Fix attempt #3 — “Use a ciphertext artifact as the tag”
Another idea people try:
“Maybe the last ciphertext block depends on everything. Let’s use it.”
C = Enc(M)
tag = last_block(C)Break (structural)
This “tag” depends on:
mode internals
padding
message length
where the last block boundary falls
Truncation, extension, prefixes… the whole thing becomes ambiguous.
Takeaway
Artifacts are not tags.
We need a tag function that we control, end-to-end.
Fix attempt #4 — “Fine. Let’s build a tag ourselves.”
At this point it’s pretty clear that all “attach something and encrypt it” hacks are cursed.
So let’s stop patching symptoms and define what we actually need.
We need a function that takes:
a secret key
Ka message
Mof any length
and produces:
a fixed-size tag (something like 16 bytes)
So not:
Block → Block (that’s what AES gives us)
Message → Message (that’s what modes give us)
but:
Message → Block
And yes, AES still sounds useful, because it’s keyed and strong.
The only problem is… AES doesn’t speak “message”. It speaks “one block”.
So we have to build a “message-to-block machine” out of “block-to-block”.
Which means: state.
We invent a small internal value X (one AES block), and we feed the message block-by-block:
start with some known initial state
X0combine the current block with the state
run AES
repeat
The most natural combine operation is XOR (it keeps the size).
So the first honest design becomes:
X0 = 0
for each block Bi in message:
Xi = AES(K, Xi-1 XOR Bi)
tag = XnThis is the first time we’re no longer “encrypting a message”.
We’re doing something else:
the output is fixed-size no matter how long the message is
you can’t decrypt the tag back into the message
we are folding the message into a state
That’s not encryption. That’s compression (in the cryptographic sense).
Now: does it work?
It works almost perfectly… until you remember messages are not fixed-length.
And the moment message length varies, this construction starts bleeding
Now: break it.
The break: variable-length messages
This construction is essentially “CBC-MAC”.
It works for fixed-length messages.
It breaks for variable length.
Reason: the output tag is a valid internal state, so extension/splicing tricks become possible unless you bind the message length / finalization rules.
(And yes, this is a known and very real class of failures: CBC-MAC on variable-length messages is insecure.)
Takeaway
The end of the message must be cryptographically bound.
We must make “this is the final block” unforgeable.
Fix attempt #5 — “Fix the ending (this is where real engineering starts)”
So what exactly breaks in Fix #4?
Not the chaining itself.
The break is the ending:
messages can end on a block boundary or not
messages can have different lengths
the final state of one message must not become a reusable internal state for another
So we add finalization rules that make “this is the end” cryptographically real.
Here is the mental model (extended pseudocode):
Step A — Generate two subkeys (K1, K2)
We derive subkeys from AES itself:
L = AES(K,0^128)
K1 = dbl(L)
K2 = dbl(K1)Where dbl() is “shift-left-by-1-bit, and if a carry falls off, XOR a constant (Rb) into the last byte”.
This is not random ceremony — it gives us two distinct “domains” for the last block.
Step B — Split message into blocks
B1..B(n-1),last = split_into_16B_blocks(M)Now two cases:
Case 1 — last block is FULL (exactly 16 bytes)
M_last = last XOR K1Case 2 — last block is PARTIAL (0..15 bytes)
Pad it first (append 0x80 then zeros), then:
last_padded = pad_7816_4(last)
M_last = last_padded XOR K2Step C — Run the chaining as before, but finalize with M_last
X = 0^128
for i in 1..(n-1):
X = AES(K, X XOR Bi)
tag = AES(K, X XOR M_last)That’s the “fixed ending” logic in full.
A very important clarification: there is nothing to decrypt here
At this point, it’s worth stopping for a second and being explicit.
The output of Fix attempt #5 — the tag — is not encrypted data.
It is:
not reversible
not meant to be decrypted
not carrying the message inside it
It is the final internal state of a compression process.
Verification works like this:
The receiver already has the message
MThe receiver recomputes the tag from scratch using the same algorithm and the same key
The receiver compares the two tags
If they match → accept
If they don’t → reject
There is no “decryption step” for the tag, because authentication is not a transformation — it’s a decision.
This is a crucial mental shift:
Encryption answers: “What was the message?”
MAC answers: “Can I trust this message?”
Trying to “decrypt a MAC” is like trying to “decrypt a checksum”.
There is nothing there to recover.
If you feel uncomfortable that the tag can’t be decrypted — good. That discomfort means you’ve stopped thinking about authentication as encryption.
Name reveal: CMAC (and what we accidentally reinvented)
So what did we just build?
We built a compression function: message → fixed-size tag
(not zip compression — cryptographic compression: folding data into state)
We built chaining: a state that evolves block-by-block.
We built a CBC-style MAC core:
X = AES(K, X XOR Bi).We discovered the “variable length” landmine, and fixed it with:
finalization
domain separation for the last block (full vs partial)
subkeys
K1,K2padding for the partial last block
And once you assemble those pieces, the whole thing has a name:
CMAC — Cipher-based Message Authentication Code.
CMAC is what remains after you remove all the broken variants.
Python implementation (AES-CMAC)
We’ll use an existing crypto library (because we are not trying to spend 3 weeks reimplementing AES here).
Install:
pip install pycryptodomeAnd here is a minimal CMAC implementation (plus a self-test with NIST vectors):
"""
AES-CMAC (CMAC) — final implementation.
- Uses PyCryptodome for AES-ECB (the block primitive).
- Implements CMAC per NIST SP 800-38B:
* Subkeys K1/K2 derived from L = AES_K(0^128)
* CBC-style chaining with IV=0
* Special last-block handling:
- full last block -> XOR K1
- partial last block -> pad (7816-4) then XOR K2
Install:
pip install pycryptodome
"""
from __future__ import annotations
from Crypto.Cipher import AES
from Crypto.Hash import CMAC as CMAC_LIB
BLOCK_SIZE = 16
RB = 0x87 # Rb for 128-bit CMAC
def _xor(a: bytes, b: bytes) -> bytes:
if len(a) != len(b):
raise ValueError("XOR requires equal-length inputs.")
return bytes(x ^ y for x, y in zip(a, b))
def _left_shift_1(block: bytes) -> bytes:
"""Shift a 128-bit block left by 1 bit."""
if len(block) != BLOCK_SIZE:
raise ValueError("Expected 16-byte block.")
out = bytearray(BLOCK_SIZE)
carry = 0
for i in range(BLOCK_SIZE - 1, -1, -1):
out[i] = ((block[i] << 1) & 0xFF) | carry
carry = (block[i] >> 7) & 1
return bytes(out)
def _pad_7816_4(partial: bytes) -> bytes:
"""
ISO/IEC 7816-4 padding: append 0x80 then zeros to reach 16 bytes.
Used only when the last block is partial (len < 16).
"""
if len(partial) >= BLOCK_SIZE:
raise ValueError("pad_7816_4 expects len(partial) < 16.")
return partial + b"\x80" + b"\x00" * (BLOCK_SIZE - len(partial) - 1)
def _dbl(block: bytes) -> bytes:
"""
GF(2^128) doubling used for CMAC subkeys.
dbl(x) = (x<<1) if MSB(x) = 0
(x<<1) XOR Rb if MSB(x) = 1
"""
if len(block) != BLOCK_SIZE:
raise ValueError("Expected 16-byte block.")
shifted = _left_shift_1(block)
if block[0] & 0x80: # MSB(x) = 1
shifted = shifted[:-1] + bytes([shifted[-1] ^ RB])
return shifted
def _generate_subkeys(aes_ecb_encrypt) -> tuple[bytes, bytes]:
"""
Subkeys:
L = AES_K(0^128)
K1 = dbl(L)
K2 = dbl(K1)
"""
L = aes_ecb_encrypt(bytes(BLOCK_SIZE))
K1 = _dbl(L)
K2 = _dbl(K1)
return K1, K2
def aes_cmac(key: bytes, msg: bytes) -> bytes:
"""
Compute AES-CMAC tag (16 bytes).
key: 16/24/32 bytes (AES-128/192/256)
msg: arbitrary bytes
"""
if len(key) not in (16, 24, 32):
raise ValueError("AES key must be 16, 24, or 32 bytes.")
aes = AES.new(key, AES.MODE_ECB)
def aes_ecb_encrypt(block: bytes) -> bytes:
if len(block) != BLOCK_SIZE:
raise ValueError("AES-ECB expects 16-byte blocks.")
return aes.encrypt(block)
K1, K2 = _generate_subkeys(aes_ecb_encrypt)
# Number of blocks (CMAC treats empty message as one partial block)
n = (len(msg) + BLOCK_SIZE - 1) // BLOCK_SIZE
if n == 0:
n = 1
# Split: first n-1 full blocks, and a last block (0..16 bytes)
blocks = [msg[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE] for i in range(n - 1)]
last = msg[(n - 1) * BLOCK_SIZE:] # may be empty, partial, or full
# Prepare final block with domain separation
if len(last) == BLOCK_SIZE:
m_last = _xor(last, K1)
else:
m_last = _xor(_pad_7816_4(last), K2)
# CBC-like chaining with IV=0 on blocks[0..n-2]
X = bytes(BLOCK_SIZE)
for b in blocks:
if len(b) != BLOCK_SIZE:
raise ValueError("Internal error: non-full intermediate block.")
X = aes_ecb_encrypt(_xor(X, b))
# Final tag
T = aes_ecb_encrypt(_xor(X, m_last))
return T
# ---------------------------
# Verification helpers/tests
# ---------------------------
def _hx(s: str) -> bytes:
return bytes.fromhex(s.replace(" ", "").replace("\n", ""))
def verify_against_library(key: bytes, msg: bytes) -> None:
"""Cross-check our CMAC vs PyCryptodome's CMAC."""
mine = aes_cmac(key, msg)
lib = CMAC_LIB.new(key, ciphermod=AES)
lib.update(msg)
theirs = lib.digest()
assert mine == theirs, (
"CMAC mismatch!\n"
f"mine = {mine.hex()}\n"
f"theirs = {theirs.hex()}"
)
def self_test() -> None:
"""
Known CMAC values for the famous NIST key + messages from SP 800-38B context.
We *also* verify with PyCryptodome CMAC to avoid any “vector confusion”.
"""
key = _hx("2b7e151628aed2a6abf7158809cf4f3c")
msg1 = _hx("6bc1bee22e409f96e93d7e117393172a") # 16 bytes
msg2 = _hx("ae2d8a571e03ac9c9eb76fac45af8e51") # 16 bytes
msg3 = _hx("30c81c46a35ce411e5fbc1191a0a52ef") # 16 bytes
msg4 = _hx("f69f2445df4f9b17ad2b417be66c3710") # 16 bytes
tests = [
(b"", "bb1d6929e95937287fa37d129b756746"), # 0
(msg1, "070a16b46b4d4144f79bdd9dd04a287c"), # 16
(msg1 + msg2, "ce0cbf1738f4df6428b1d93bf12081c9"), # 32
(msg1 + msg2 + msg3[:8], "dfa66747de9ae63030ca32611497c827"), # 40
(msg1 + msg2 + msg3 + msg4, "51f0bebf7e3b9d92fc49741779363cfe") # 64
]
for m, expected_hex in tests:
got = aes_cmac(key, m).hex()
assert got == expected_hex, f"Vector mismatch: got {got}, expected {expected_hex}"
verify_against_library(key, m)
if __name__ == "__main__":
self_test()
print("AES-CMAC OK (vectors + library cross-check passed)")
The final version is on github: https://github.com/DmytroHuzz/build_own_mac
One last observation (and the bridge to Part 3)
Look at what we built.
This tag function:
takes arbitrary-length input
updates a small internal state block by block
produces a fixed-size output
That is… suspiciously familiar.
It looks like the shape of a hash function.
So in the next article we’ll do the same trick again:
Instead of forcing AES to behave like a compressor, let’s start from a primitive that is already a compressor.
And we’ll see if we can reinvent a hash-based MAC in the same “no names until the end” style.


