Category: Cryptography
Flag: txsaw{georges_perec}
Challenge Description
This chall’s got a bit of history to it.
First, crack this initial cryptogram. Now, apply OSINT tools to find who authors that original script.
flag format: txsaw{first_last} such as: txsaw{john_scalzi}
Analysis
The provided file was an ASCII text file containing a single long line of ciphertext, which strongly suggested a classical pen-and-paper style cipher rather than anything modern. Because the text had normal word spacing and punctuation positions but unreadable letters, the useful model was a monoalphabetic substitution. Instead of hand-solving it, the solve used a hillclimbing script that scored candidate plaintexts with a mix of quadgram frequencies and wordfreq Zipf scores so that readable English would rise to the top.
Solution
from pathlib import Pathimport randomimport mathfrom collections import Counter
from wordfreq import zipf_frequency
def load_ciphertext(path): return Path(path).read_text().splitlines()
def build_quadgram_model(corpus_paths): corpus = "" for path in corpus_paths: try: corpus += Path(path).read_text().lower() except Exception: pass corpus = "".join(ch for ch in corpus if ch.isalpha() or ch == " ") quad = Counter() for i in range(len(corpus) - 3): q = corpus[i : i + 4] if " " in q: continue quad[q] += 1 total = sum(quad.values()) or 1 quad_log = {k: math.log10(v / total) for k, v in quad.items()} floor = math.log10(0.01 / total) return quad_log, floor
def score_text(pt, quad_log, floor): score = 0.0 words = pt.split() for w in words: if len(w) >= 2: score += zipf_frequency(w, "en") for i in range(len(pt) - 3): q = pt[i : i + 4] if " " in q: continue score += quad_log.get(q, floor) return score
def decrypt(text, keymap): alphabet = "abcdefghijklmnopqrstuvwxyz" out = [] for ch in text: if ch in alphabet: out.append(keymap[ch]) else: out.append(ch) return "".join(out)
def make_initial_key(cipher_letters): alphabet = "abcdefghijklmnopqrstuvwxyz" english_freq_order = "etaoinshrdlucmfwypvbgkjqxz" freq = Counter(ch for ch in cipher_letters if ch.isalpha()) cipher_order = "".join([c for c, _ in freq.most_common()]) for c in alphabet: if c not in cipher_order: cipher_order += c key = {c: english_freq_order[i] for i, c in enumerate(cipher_order)} return key
def hillclimb(cipher_text, quad_log, floor, restarts=6, iterations=12000, seed=0): random.seed(seed) alphabet = "abcdefghijklmnopqrstuvwxyz" best_key = None best_score = -1e18 best_plain = None
cipher_letters = "".join(ch for ch in cipher_text if ch.isalpha() or ch == " ")
for r in range(restarts): key = make_initial_key(cipher_letters) letters = list(alphabet) for _ in range(60): a, b = random.sample(letters, 2) key[a], key[b] = key[b], key[a]
cur_key = key.copy() cur_plain = decrypt(cipher_letters, cur_key) cur_score = score_text(cur_plain, quad_log, floor)
for T in [20.0, 10.0, 5.0, 2.0, 1.0, 0.5]: for _ in range(iterations // 8): c1, c2 = random.sample(letters, 2) cur_key[c1], cur_key[c2] = cur_key[c2], cur_key[c1] new_plain = decrypt(cipher_letters, cur_key) new_score = score_text(new_plain, quad_log, floor) if new_score > cur_score or random.random() < math.exp( (new_score - cur_score) / T ): cur_score = new_score else: cur_key[c1], cur_key[c2] = cur_key[c2], cur_key[c1]
if cur_score > best_score: best_score = cur_score best_key = cur_key.copy() best_plain = decrypt(cipher_letters, best_key)
print( f"restart {r + 1}/{restarts}: score {cur_score:.2f} best {best_score:.2f}" )
return best_key, best_score, best_plain
def decrypt_lines(lines, keymap): alphabet = "abcdefghijklmnopqrstuvwxyz" out_lines = [] for line in lines: out = [] for ch in line: low = ch.lower() if low in alphabet: dec = keymap[low] if ch.isupper(): dec = dec.upper() out.append(dec) else: out.append(ch) out_lines.append("".join(out)) return out_lines
def main(): lines = load_ciphertext("/home/LIGHT/Downloads/texsaw_idiosyncratic_french/ciphertext.txt") cipher_text = " ".join(lines).lower() cipher_text = "".join(ch for ch in cipher_text if ch.isalpha() or ch == " ")
quad_log, floor = build_quadgram_model( [ "/usr/share/dict/words", "/home/LIGHT/Downloads/Adventure.txt", ] )
key, score, plain = hillclimb( cipher_text, quad_log, floor, restarts=6, iterations=12000, seed=3 ) print("best score", score) print("best plain snippet:") print(plain[:600]) print("key:") print(" ".join(f"{c}->{key[c]}" for c in "abcdefghijklmnopqrstuvwxyz"))
dec_lines = decrypt_lines(lines, key) print("--- decrypted full file ---") for line in dec_lines: print(line)
if __name__ == "__main__": main()python solve_subst.pyrestart 1/6: score -528.20 best -528.20restart 2/6: score -528.20 best -528.20restart 3/6: score -528.20 best -528.20restart 4/6: score -528.20 best -528.20restart 5/6: score -528.20 best -528.20restart 6/6: score -528.20 best -528.20best score -528.198841989987best plain snippet:noon rings out a wasp making an ominous sound a sound akin to a klaxon or a tocsin flits about augustus who has had a bad night sits up blinking and purblind oh what was that word is his thought that ran through my brain all night that idiotic word that hard as id try to pun it down was always just an inch or two out of my grasp fowl or foul or vow or voyal a word which by association brought into play an incongruous mass and magma of nouns idioms slogans and sayings a confusing amorphous outpouring which i sought in vain to control or turn off but which wound around my mind a whirlwind of akey:a->n b->m c->l d->k e->j f->i g->h h->g i->f j->q k->d l->c m->b n->a o->e p->y q->x r->w s->v t->u u->t v->s w->r x->z y->p z->o--- decrypted full file ---Noon rings out. A wasp, making an ominous sound, a sound akin to a klaxon or a tocsin, flits about. Augustus, who has had a bad night, sits up blinking and purblind. Oh what was that word (is his thought) that ran through my brain all night, that idiotic word that, hard as I'd try to pun it down, was always just an inch or two out of my grasp - fowl or foul or Vow or Voyal? - a word which, by association, brought into play an incongruous mass and magma of nouns, idioms, slogans and sayings, a confusing, amorphous outpouring which I sought in vain to control or turn off but which wound around my mind a whirlwind of a cord, a whiplash of a cord, a cord that would split again and again, would knit again and again, of words without communication or any possibility of combination, words without pronunciation, signification or transcription but out of which, notwithstanding, was brought forth a flux, a continuous, compact and lucid flow: an intuition, a vacillating frisson of illumination as if caught in a flash of lightning or in a mist abruptly rising to unshroud an obvious sign - but a sign, alas, that would last an instant only to vanish for good.That plaintext gave a perfect OSINT pivot because its opening sentence is distinctive enough to search verbatim. The quote search immediately returned a University of Cambridge page titled Geroges Perec Excerpt, and the matching passage on that page identified the source as A Void. The final confirmation came from the page itself, which names Georges Perec and even notes the lipogrammatic gimmick of avoiding the letter e, exactly matching the challenge title’s censoring of French.
from duckduckgo_search import DDGS
def main(): quote = ( "Noon rings out. A wasp, making an ominous sound, a sound akin to a klaxon " "or a tocsin, flits about. Augustus, who has had a bad night, sits up blinking " "and purblind." ) with DDGS() as ddgs: results = ddgs.text(f'"{quote}"', max_results=12) for i, r in enumerate(results, start=1): print(f"{i}. {r.get('title')}") print(r.get("href")) print(r.get("body")) print("-")
if __name__ == "__main__": main()python search_quote.py/home/LIGHT/Downloads/texsaw_idiosyncratic_french/search_quote.py:11: RuntimeWarning: This package (`duckduckgo_search`) has been renamed to `ddgs`! Use `pip install ddgs` instead. with DDGS() as ddgs:1. Geroges Perec Excerpt - University of Cambridgehttp://www-control.eng.cam.ac.uk/hu/Perec.htmlNoon rings out. ... Augustus, who has had a bad night, sits up blinking and purblind.-import urllib.request
def main(): url = "http://www-control.eng.cam.ac.uk/hu/Perec.html" with urllib.request.urlopen(url, timeout=20) as resp: data = resp.read().decode("utf-8", errors="replace") print(data)
if __name__ == "__main__": main()python fetch_source.py<TITLE>Geroges Perec Excerpt</TITLE>
<H1>Georges Perec Excerpt</H1><p><H2>From "A Void"</H2><p><H2>(one of Haig's favourite passages)</H2><p>
Noon rings out. A wasp, making an ominous sound, a sound akin to a klaxonor a tocsin, flits about. Augustus, who has had a bad night, sits upblinking and purblind. Oh what was that word (is his thought) that ran throughmy brain all night, that idiotic word that, hard as I'd try to pun it down,was always just an inch or two out of my grasp - fowl or foul or Vow orVoyal? - a word which, by association, brought into play an incongruous massand magma of nouns, idioms, slogans and sayings, a confusing, amorphousoutpouring which I sought in vain to control or turn off but which woundaround my mind a whirlwind of a cord, a whiplash of a cord, a cord thatwould split again and again, would knit again and again, of words withoutcommunication or any possibility of combination, words without pronunciation,signification or transcription but out of which, notwithstanding, was broughtforth a flux, a continuous, compact and lucid flow: an intuition, a vacillatingfrisson of illumination as if caught in a flash of lightning or in a mistabruptly rising to unshroud an obvious sign - but a sign, alas, that would lastan instant only to vanish for good.<p><i>(transl. by Gilbert Adair, who succeeded in preserving the idiosyncracyof the original French... the letter "e" doesnot appear even once in thebook!)</i>At that point the flag format was straightforward: convert the author’s name to first_last, giving txsaw{georges_perec}.