m0leconCTF2025 crypto/Guess Me

- 7 mins read

Write-up — Challenge Guess Me (m0leconCTF 2025)

Introduction

Guess Me is an authentication challenge that runs 5 rounds of an online protocol. Each round hides a randomly-permuted secret derived from the string m0leCon. The service accepts a hex nonce field that can contain multiple 16-byte nonces concatenated together, plus additional_data, ciphertext and a tag. A subtle combination of a custom PRF, a deterministic stream-cipher-like encryption and a buggy padding check yields a tag oracle that is exploitable to recover the server’s permutation quickly.


TL;DR

The service authenticates messages using a custom construction: an AES-based mask + 10-round S-box/bit-permutation PRF, XOR-based stream encryption and a tag computed over padded (nonce || additional_data) and ciphertext blocks. A padding bug in the server’s decrypt implementation makes the server distinguish between invalid tag and invalid padding (it returns False vs b"Invalid padding"). By precomputing, for each candidate permutation of "m0leCon", the unique nonce that makes a chosen tag equal to 16 zero bytes, we can query many candidate nonces in one request (by concatenating them). The server replies differently depending on whether any of those nonces matched — this gives a group-testing oracle that identifies the correct permutation in ~13 queries (≤16 allowed). After finding the key, we forge a valid authenticated ciphertext for "next round please" and advance; 5 rounds yield the flag.


Challenge Description

Connect to the service:

nc guess_me.challs.m0lecon.it 12164

Protocol (interactive):

  • The server runs 5 rounds.
  • Each round: you may send up to 16 attempts. Each attempt the server reads four hex fields: nonce, additional_data, ciphertext, tag (the nonce field may contain concatenated 16-byte nonces).
  • The server computes an expected tag over (nonce, additional_data, ciphertext) and compares it against the supplied tag. If the tag matches, it attempts to decrypt and unpad the ciphertext; if any decrypted plaintext equals b"next round please" the round succeeds. Responses are textual: "Tag is invalid", "This message does not seem ok :(", "There you go!", etc.

Theory — PRF + stream cipher + buggy padding

High-level primitives used by the service (from the challenge source):

  • A 16‑byte AES-ECB mask: mask = AES(key).encrypt(sha256(block_index).digest()). The mask is split into left and right halves (each 16 bytes because the code encrypts a 32‑byte digest and uses full 32 bytes as mask).

  • A 10‑round custom permutation: each round applies a 4‑bit S-box to each nibble then applies a fixed 128‑bit bit permutation. This permutation is used inside the PRF. The PRF is essentially prf(i, key, data) = xor(perm(data ^ left), right) where left/right come from AES(key, sha256(i)).

  • Encryption: the message is PKCS#7-padded (but note the challenge’s padding function returns empty padding for empty messages because of modulus behaviour); ciphertext blocks are created as ct_block = keystream_block ^ plaintext_block where keystream_block = prf(block_index, key, nonce).

  • Tag: tag = prf(-1, key, tag_val) where tag_val is the XOR-accumulation of ad and ct blocks after applying prf with special block indices. See enc_tag in the source.

Crucial implementation bug: on tag verification the service uses compare_digest(expected_tag, tag) and returns False on mismatch. If the tag matches, the code proceeds to decrypt and then tries to unpad; if the unpadding check fails it raises an exception that is caught higher up and reported as b"Invalid padding" (which evaluates truthily), while a tag mismatch yields False. This observable behavioural difference forms a binary oracle: any successful MAC verification but invalid padding prints a different response than a MAC failure. This is the padding-oracle-style leak we use.


Reconnaissance of the protocol

Two facts make our attack possible and efficient:

  1. The secret key is derived from a permutation of m0leCon shuffled by the server each round. There are exactly 7! = 5040 possibilities — small enough to enumerate. The key material is sha256(permutation).digest()[:16].

  2. For a fixed additional_data (the challenge expects b"pretty please") and a fixed target tag T (we choose 16 zero bytes), for each candidate key k there exists exactly one nonce n(k) such that when nonce=n(k) and the ciphertext is empty, the computed tag equals T. We can precompute n(k) by inverting the final PRF on T and XORing with the precomputed contribution from additional_data. This allows batching candidate nonces into one nonce field by simple concatenation. The server iterates over the list of nonces and returns an aggregated truthy/falsey result: if any supplied nonce produced a matching tag, the server’s decrypt pipeline will yield a non-False value and the response will differ. This is the group-testing primitive.


Vulnerability & Attack Idea

The vulnerability is not in AES or the PRF itself but in the combination of:

  • small keyspace (permutations of a short string),
  • the ability to submit multiple nonces at once (concatenated), and
  • a subtle side-channel in the server’s response that distinguishes “invalid tag” from “invalid padding”.

These allow an attacker to treat the server as a group-testing oracle: send one request with nonces for a subset S of candidates; the server’s response tells you whether the true key is in S. Repeating a binary partition search over 5040 candidates identifies the key in at most ⌈log2(5040)⌉ ≈ 13 queries (the service allows up to 16 attempts per round). Once the key is recovered we compute an authenticated ciphertext for "next round please" and advance; 5 rounds yield the flag.


Step-by-Step Exploitation

Precomputation (offline)

  • Enumerate all 5040 permutations of m0leCon and derive 16‑byte AES keys: sha256(perm)[:16].
  • For each key k, compute the mask values used by prf(-1, ...) and prf(1338, ...) (or the indices the implementation uses) by running AES on the corresponding SHA‑256 digests. Use these to invert prf(-1, k, T) for the chosen target tag T = 0x00*16. The inversion uses permute_block_inv and XOR with left mask (see prf_inv in the solver). Then compute the unique nonce n(k) by XORing this inverted value with the precomputed contribution C[k] coming from additional_data (this mirrors n_zero computation in the exploit). The solver performs these steps to obtain a list n_zero[k] for all keys.

Group testing (online)

  • Start with the full candidate set of 5040 indices.
  • Partition candidates into two halves S and S_comp and build nonce_concat = concat(n_zero[k] for k in S) (hex). Send one attempt with nonce = nonce_concat, additional_data = "pretty please", ciphertext = "" (empty), and tag = 0x00*16.
  • The server will parse nonce_concat into 16-byte chunks and run decrypt for each nonce. If any nonce leads to a MAC match, decrypt returns a truthy value (possibly b"Invalid padding"), triggering the "This message does not seem ok :(" branch instead of "Tag is invalid". From this response you know whether the true key is in S or S_comp. Update the candidate set accordingly. Repeat until one candidate remains. The solver uses this to identify the key in ≤ 13 attempts per round.

Final forging

  • When the key is found, reuse the corresponding nonce n_zero[key] (or any nonce that works) to encrypt the plaintext b"next round please" using the stream construction and compute its enc_tag. Send the full tuple (nonce, additional_data, ciphertext, tag) to the server; the server will verify and accept the round, allowing progression to the next round. After five successful rounds the server prints the flag.

Solve Script

A condensed excerpt (conceptual) of the core loop:

# precompute n_zero[k] for every key k
candidates = list(range(5040))
while len(candidates) > 1:
    S = candidates[:len(candidates)//2]
    nonce_concat = ''.join(n_zero[k].hex() for k in S)
    send(nonce=nonce_concat, ad=pretty_please, ct='', tag=zero_tag)
    response = recv_line()
    if 'invalid' in response.lower():
        candidates = candidates[len(candidates)//2:]
    else:
        candidates = S
# keys[candidates[0]] is the correct AES key -> forge final message

Result

Using the described method the solver completes the 5 rounds and the service returns the flag:

ptm{7his_ch4ll3ng3_is_s0_2020}

This exact result was obtained by the exploit bundled with the repository.


Conclusion

This challenge is a compact exercise in combining small keyspace brute-force with a subtle oracle. The three building blocks that make it solvable are:

  1. Small keyspace (permutations of a 7‑letter string) so complete enumeration is feasible.
  2. Concatenated nonces: the server accepts multiple nonces in a single request, enabling batch queries.
  3. Behavioural leak: different server responses for MAC‑failure vs. padding‑failure provide a binary oracle that tells whether any tested nonce matched the MAC.

The correct mitigation would be straightforward: return the exact same error on MAC failure and on subsequent padding/unpadding errors, and avoid accepting arbitrarily concatenated nonces (or bound and validate the nonce list). The solver shows an efficient group-testing implementation that recovers the server key quickly and forges valid authenticated messages to complete the protocol.

written by NumberOreo