Building Own MAC — Part 2: Fixing AES (and accidentally reinventing CMAC)

Why intuition fails in cryptography

Dmytro Huz

Jan 19, 2026

In the previous series we finished AES and its modes. And in the previous article revealed why it is still not secure.

We can encrypt messages.

We can decrypt messages.

We can ship.

And then reality does what reality always does:

It slaps you.

Because encryption solves one problem:

“If you don’t know the key, you can’t read the message.”

It does not solve this problem:

“If you don’t know the key, you can’t change the message.”

So we face the classic situation:

✅ we have secrecy

❌ we still don’t have trust

And now we do what every engineer does when something breaks:

we try to patch it fast.

Let’s go through the exact fixes your brain naturally generates at 2am.

Most of them fail.

Some fail spectacularly.

And each failure forces one new design constraint… until we reinvent a real solution.

Goal (simple and brutal)

We want the receiver to be able to say:

✅ accept
❌ reject

based on one small extra value.

No philosophy. No “maybe”.

Just: does this message deserve to exist?

Fix attempt #1 — “Just hash the plaintext”

Classic.

C   = Enc(M)
tag = Hash(M)
send (C, tag)

Receiver decrypts C, recomputes Hash(M), compares.

Break (instant)

Hashes have no secrets.

Attacker changes message → attacker recomputes hash → sends new pair.

✅ receiver accepts

✅ attacker wins

✅ we learn nothing except pain

Takeaway

If the attacker can compute your tag, it’s not authentication.

It’s a checksum.

So the tag must involve a secret.

Fix attempt #1.5 — “Put the hash inside the encrypted message”

Okay, fine.

Let’s hide the hash.

payload = M || Hash(M)
C = Enc(payload)
send C

Receiver decrypts, extracts M and the embedded hash, recomputes, compares.

This feels smart.

It’s not.

Break (modes bite you)

In malleable modes (CTR / OFB / CFB), the attacker can flip bits in ciphertext to flip predictable bits in plaintext.

So they flip bits in the message part…

and flip corresponding bits in the embedded hash part.

They don’t need to know the key.

They don’t need to know the hash function.

They don’t need to understand anything.

They just move bits.

Receiver decrypts:

M' || Hash(M')

Hash matches. Receiver accepts.

0_o

Takeaway

Hiding a checker does not make it a check.

If your “verification data” lives inside the thing being protected, it can be modified together with it.

We need the tag to be outside the encrypted payload and unforgeable.

Fix attempt #2 — “Encrypt the hash separately”

Next idea:

C   = Enc(M)
tag = Enc(Hash(M))
send (C, tag)

Now the attacker can’t see the hash, can’t recompute it.

So… done?

Not even close.

Break (you verified… what exactly?)

Now you have two encrypted blobs.

And the receiver has to answer a painful question:

What exactly are we proving by comparing these?

Do we verify the plaintext?
The ciphertext?
The padding?
The length?
The context (endpoint / protocol version / message type)?

Nothing is bound. Everything is implied.

And implied security is not security.

Also: even if you try to “compare decrypted hash to recomputed hash”, you’re back to the earlier issue — encryption modes are malleable and protocol context is not bound.

Takeaway

Encrypting a checker is not the same as verifying a message.

We need a single verification value, not two blobs that “seem related”.

Fix attempt #2.5 — “Encrypt the ciphertext and compare”

Okay. Let’s remove ambiguity.

C   = Enc(M)
tag = Enc(C)
send (C, tag)

Receiver decrypts tag → gets C' → compares C' == C.

Now we have:

a secret
a deterministic check
a clean yes/no

This looks decent.

And it still fails.

Break (you authenticated bytes, not meaning)

This check proves exactly one thing:

“The ciphertext blob wasn’t modified.”

It does not prove:

that this ciphertext belongs to this endpoint
that it’s intended for this message type
that it’s valid in this context
that it’s fresh
that it’s not a replay

So the attacker doesn’t need to forge anything.

They just replay a valid (C, tag) where it causes damage.

“Transfer $10” becomes “Transfer $10 again.”

Or becomes “Approve something you approved yesterday.”

You built a very strong proof of internal consistency… and zero proof of intent.

Takeaway

Consistency is not authenticity.

Authentication must bind meaning and context, not just bytes.

Fix attempt #3 — “Use a ciphertext artifact as the tag”

Another idea people try:

“Maybe the last ciphertext block depends on everything. Let’s use it.”

C   = Enc(M)
tag = last_block(C)

Break (structural)

This “tag” depends on:

mode internals
padding
message length
where the last block boundary falls

Truncation, extension, prefixes… the whole thing becomes ambiguous.

Takeaway

Artifacts are not tags.

We need a tag function that we control, end-to-end.

Fix attempt #4 — “Fine. Let’s build a tag ourselves.”

At this point it’s pretty clear that all “attach something and encrypt it” hacks are cursed.

So let’s stop patching symptoms and define what we actually need.

We need a function that takes:

a secret key K
a message M of any length

and produces:

a fixed-size tag (something like 16 bytes)

So not:

Block → Block (that’s what AES gives us)
Message → Message (that’s what modes give us)

but:

Message → Block

And yes, AES still sounds useful, because it’s keyed and strong.

The only problem is… AES doesn’t speak “message”. It speaks “one block”.

So we have to build a “message-to-block machine” out of “block-to-block”.

Which means: state.

We invent a small internal value X (one AES block), and we feed the message block-by-block:

start with some known initial state X0
combine the current block with the state
run AES
repeat

The most natural combine operation is XOR (it keeps the size).

So the first honest design becomes:

X0 = 0
for each block Bi in message:
    Xi = AES(K, Xi-1 XOR Bi)
tag = Xn

This is the first time we’re no longer “encrypting a message”.

We’re doing something else:

the output is fixed-size no matter how long the message is
you can’t decrypt the tag back into the message
we are folding the message into a state

That’s not encryption. That’s compression (in the cryptographic sense).

Now: does it work?

It works almost perfectly… until you remember messages are not fixed-length.

And the moment message length varies, this construction starts bleeding

Now: break it.

The break: variable-length messages

This construction is essentially “CBC-MAC”.

It works for fixed-length messages.

It breaks for variable length.

Reason: the output tag is a valid internal state, so extension/splicing tricks become possible unless you bind the message length / finalization rules.

(And yes, this is a known and very real class of failures: CBC-MAC on variable-length messages is insecure.)

Takeaway

The end of the message must be cryptographically bound.

We must make “this is the final block” unforgeable.

Fix attempt #5 — “Fix the ending (this is where real engineering starts)”

So what exactly breaks in Fix #4?

Not the chaining itself.

The break is the ending:

messages can end on a block boundary or not
messages can have different lengths
the final state of one message must not become a reusable internal state for another

So we add finalization rules that make “this is the end” cryptographically real.

Here is the mental model (extended pseudocode):

Step A — Generate two subkeys (K1, K2)

We derive subkeys from AES itself:

L  = AES(K,0^128)
K1 = dbl(L)
K2 = dbl(K1)

Where dbl() is “shift-left-by-1-bit, and if a carry falls off, XOR a constant (Rb) into the last byte”.

This is not random ceremony — it gives us two distinct “domains” for the last block.

Step B — Split message into blocks

B1..B(n-1),last = split_into_16B_blocks(M)

Now two cases:

Case 1 — last block is FULL (exactly 16 bytes)

M_last = last XOR K1

Case 2 — last block is PARTIAL (0..15 bytes)

Pad it first (append 0x80 then zeros), then:

last_padded = pad_7816_4(last)
M_last      = last_padded XOR K2

Step C — Run the chaining as before, but finalize with M_last

X = 0^128
for i in 1..(n-1):
    X = AES(K, X XOR Bi)

tag = AES(K, X XOR M_last)

That’s the “fixed ending” logic in full.

A very important clarification: there is nothing to decrypt here

At this point, it’s worth stopping for a second and being explicit.

The output of Fix attempt #5 — the tag — is not encrypted data.

It is:

not reversible
not meant to be decrypted
not carrying the message inside it

It is the final internal state of a compression process.

Verification works like this:

The receiver already has the message M
The receiver recomputes the tag from scratch using the same algorithm and the same key
The receiver compares the two tags

If they match → accept

If they don’t → reject

There is no “decryption step” for the tag, because authentication is not a transformation — it’s a decision.

This is a crucial mental shift:

Encryption answers: “What was the message?”
MAC answers: “Can I trust this message?”

Trying to “decrypt a MAC” is like trying to “decrypt a checksum”.

There is nothing there to recover.

If you feel uncomfortable that the tag can’t be decrypted — good. That discomfort means you’ve stopped thinking about authentication as encryption.

Name reveal: CMAC (and what we accidentally reinvented)

So what did we just build?

We built a compression function: message → fixed-size tag
(not zip compression — cryptographic compression: folding data into state)
We built chaining: a state that evolves block-by-block.
We built a CBC-style MAC core: X = AES(K, X XOR Bi).
We discovered the “variable length” landmine, and fixed it with:
- finalization
- domain separation for the last block (full vs partial)
- subkeys K1, K2
- padding for the partial last block

And once you assemble those pieces, the whole thing has a name:

CMAC — Cipher-based Message Authentication Code.

CMAC is what remains after you remove all the broken variants.

Python implementation (AES-CMAC)

We’ll use an existing crypto library (because we are not trying to spend 3 weeks reimplementing AES here).

Install:

pip install pycryptodome

And here is a minimal CMAC implementation (plus a self-test with NIST vectors):

"""
AES-CMAC (CMAC) — final implementation.

- Uses PyCryptodome for AES-ECB (the block primitive).
- Implements CMAC per NIST SP 800-38B:
    * Subkeys K1/K2 derived from L = AES_K(0^128)
    * CBC-style chaining with IV=0
    * Special last-block handling:
        - full last block  -> XOR K1
        - partial last block -> pad (7816-4) then XOR K2

Install:
    pip install pycryptodome
"""

from __future__ import annotations
from Crypto.Cipher import AES
from Crypto.Hash import CMAC as CMAC_LIB

BLOCK_SIZE = 16
RB = 0x87  # Rb for 128-bit CMAC


def _xor(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("XOR requires equal-length inputs.")
    return bytes(x ^ y for x, y in zip(a, b))


def _left_shift_1(block: bytes) -> bytes:
    """Shift a 128-bit block left by 1 bit."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("Expected 16-byte block.")
    out = bytearray(BLOCK_SIZE)
    carry = 0
    for i in range(BLOCK_SIZE - 1, -1, -1):
        out[i] = ((block[i] << 1) & 0xFF) | carry
        carry = (block[i] >> 7) & 1
    return bytes(out)


def _pad_7816_4(partial: bytes) -> bytes:
    """
    ISO/IEC 7816-4 padding: append 0x80 then zeros to reach 16 bytes.
    Used only when the last block is partial (len < 16).
    """
    if len(partial) >= BLOCK_SIZE:
        raise ValueError("pad_7816_4 expects len(partial) < 16.")
    return partial + b"\x80" + b"\x00" * (BLOCK_SIZE - len(partial) - 1)


def _dbl(block: bytes) -> bytes:
    """
    GF(2^128) doubling used for CMAC subkeys.

    dbl(x) = (x<<1)           if MSB(x) = 0
             (x<<1) XOR Rb    if MSB(x) = 1
    """
    if len(block) != BLOCK_SIZE:
        raise ValueError("Expected 16-byte block.")
    shifted = _left_shift_1(block)
    if block[0] & 0x80:  # MSB(x) = 1
        shifted = shifted[:-1] + bytes([shifted[-1] ^ RB])
    return shifted


def _generate_subkeys(aes_ecb_encrypt) -> tuple[bytes, bytes]:
    """
    Subkeys:
      L  = AES_K(0^128)
      K1 = dbl(L)
      K2 = dbl(K1)
    """
    L = aes_ecb_encrypt(bytes(BLOCK_SIZE))

    K1 = _dbl(L)
    K2 = _dbl(K1)

    return K1, K2


def aes_cmac(key: bytes, msg: bytes) -> bytes:
    """
    Compute AES-CMAC tag (16 bytes).

    key: 16/24/32 bytes (AES-128/192/256)
    msg: arbitrary bytes
    """
    if len(key) not in (16, 24, 32):
        raise ValueError("AES key must be 16, 24, or 32 bytes.")

    aes = AES.new(key, AES.MODE_ECB)

    def aes_ecb_encrypt(block: bytes) -> bytes:
        if len(block) != BLOCK_SIZE:
            raise ValueError("AES-ECB expects 16-byte blocks.")
        return aes.encrypt(block)

    K1, K2 = _generate_subkeys(aes_ecb_encrypt)

    # Number of blocks (CMAC treats empty message as one partial block)
    n = (len(msg) + BLOCK_SIZE - 1) // BLOCK_SIZE
    if n == 0:
        n = 1

    # Split: first n-1 full blocks, and a last block (0..16 bytes)
    blocks = [msg[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE] for i in range(n - 1)]
    last = msg[(n - 1) * BLOCK_SIZE:]  # may be empty, partial, or full

    # Prepare final block with domain separation
    if len(last) == BLOCK_SIZE:
        m_last = _xor(last, K1)
    else:
        m_last = _xor(_pad_7816_4(last), K2)

    # CBC-like chaining with IV=0 on blocks[0..n-2]
    X = bytes(BLOCK_SIZE)
    for b in blocks:
        if len(b) != BLOCK_SIZE:
            raise ValueError("Internal error: non-full intermediate block.")
        X = aes_ecb_encrypt(_xor(X, b))

    # Final tag
    T = aes_ecb_encrypt(_xor(X, m_last))
    return T


# ---------------------------
# Verification helpers/tests
# ---------------------------

def _hx(s: str) -> bytes:
    return bytes.fromhex(s.replace(" ", "").replace("\n", ""))


def verify_against_library(key: bytes, msg: bytes) -> None:
    """Cross-check our CMAC vs PyCryptodome's CMAC."""
    mine = aes_cmac(key, msg)
    lib = CMAC_LIB.new(key, ciphermod=AES)
    lib.update(msg)
    theirs = lib.digest()
    assert mine == theirs, (
        "CMAC mismatch!\n"
        f"mine   = {mine.hex()}\n"
        f"theirs = {theirs.hex()}"
    )


def self_test() -> None:
    """
    Known CMAC values for the famous NIST key + messages from SP 800-38B context.
    We *also* verify with PyCryptodome CMAC to avoid any “vector confusion”.
    """
    key = _hx("2b7e151628aed2a6abf7158809cf4f3c")

    msg1 = _hx("6bc1bee22e409f96e93d7e117393172a")  # 16 bytes
    msg2 = _hx("ae2d8a571e03ac9c9eb76fac45af8e51")  # 16 bytes
    msg3 = _hx("30c81c46a35ce411e5fbc1191a0a52ef")  # 16 bytes
    msg4 = _hx("f69f2445df4f9b17ad2b417be66c3710")  # 16 bytes

    tests = [
        (b"", "bb1d6929e95937287fa37d129b756746"),                       # 0
        (msg1, "070a16b46b4d4144f79bdd9dd04a287c"),                     # 16
        (msg1 + msg2, "ce0cbf1738f4df6428b1d93bf12081c9"),              # 32
        (msg1 + msg2 + msg3[:8], "dfa66747de9ae63030ca32611497c827"),   # 40
        (msg1 + msg2 + msg3 + msg4, "51f0bebf7e3b9d92fc49741779363cfe") # 64
    ]

    for m, expected_hex in tests:
        got = aes_cmac(key, m).hex()
        assert got == expected_hex, f"Vector mismatch: got {got}, expected {expected_hex}"
        verify_against_library(key, m)


if __name__ == "__main__":
    self_test()
    print("AES-CMAC OK (vectors + library cross-check passed)")

The final version is on github: https://github.com/DmytroHuzz/build_own_mac

One last observation (and the bridge to Part 3)

Look at what we built.

This tag function:

takes arbitrary-length input
updates a small internal state block by block
produces a fixed-size output

That is… suspiciously familiar.

It looks like the shape of a hash function.

So in the next article we’ll do the same trick again:

Instead of forcing AES to behave like a compressor, let’s start from a primitive that is already a compressor.

And we’ll see if we can reinvent a hash-based MAC in the same “no names until the end” style.

Dmytro’s Substack

Discussion about this post

Ready for more?