AmateursCTF 2023: Chinese Remainder Theorem, Carnival Games and 2048-bit Integers

Authors
  • enscribe

Intro

High school CTF team View Source and I participated in AmateursCTF 2023, placing 2nd both overall and in the student division. Although there were over 64 challenges to tackle throughout the four-day submission period, I personally only put emphasis on the OSINT and algorithm categories. Within these categories lay an interesting challenge: the gcd-query series, which I solved with an implementation of a very special algorithm. This was my process (paired alongside a lengthy analogy)!


gcd-query-v1

solver: enscribe
author: skittles1412
points: 475
category: algo
solves: 43
I wonder if this program leaks enough information for you to get the flag with less than 2048 queries... It probably does. I'm sure you can figure out how.
nc amt.rs 31692

We're initially provided with an attachment main.py and a remote server amt.rs:31692. The server component contains the following:

v1 Attachment

Let's go over step-by-step what this server is up to:

  • For ten iterations, a long x is created by pycrypto's getRandomInteger(n), which returns a random integer with up to $n$ bits in length. Here $n = 2048$, which is an absolutely mindbendingly large number: up to 617 digits long! You absolutely do not want to see what 617 digits looks like in decimal.
  • For each iteration of x, the user gets prompted to enter two integers n and m. Once the assertion that m > 0 is passed, n and m are passed into a function gcd(x + n, m), which returns the greatest common divisor of x + n and m. This occurs for 1412 iterations.
  • After the iterations have completed, the user is then prompted to guess the value of x. If the guess is correct, the next iteration of x begins. This process is repeated nine more times until the flag is printed.
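To make the protocol concrete, here is a minimal local stand-in for one round of the server as described in the bullets above. This is a sketch, not the actual main.py: the class name `GcdOracle`, its methods, and the use of `secrets.randbits` in place of pycrypto's getRandomInteger are all assumptions made for illustration.

```python
import secrets
from math import gcd


class GcdOracle:
    """Hypothetical local model of one round of the challenge server."""

    def __init__(self, bits=2048, max_queries=1412):
        # Stand-in for getRandomInteger(bits): a random integer of up to `bits` bits.
        self._x = secrets.randbits(bits)
        self.remaining = max_queries

    def query(self, n, m):
        """Return gcd(x + n, m), mirroring what the server answers per (n, m) pair."""
        assert m > 0, "the server rejects non-positive m"
        if self.remaining <= 0:
            raise RuntimeError("out of queries")
        self.remaining -= 1
        return gcd(self._x + n, m)

    def guess(self, value):
        """Return True if the guess matches the hidden x."""
        return value == self._x


oracle = GcdOracle(bits=16, max_queries=3)
print(oracle.query(0, 2 * 3 * 5 * 7))  # always some divisor of 210
```

Having a local model like this makes it possible to test query strategies offline before touching the remote.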

Here is a quick visual depicting what's going on:

Paying close attention to the right side of this graphic, we can see that there's only a couple specific points at which we can interact with the server: when we pick the n and m to send, and when we guess the value of x. The question now is: what values should we be picking for n and m which reveal the most information about x, and how do we use this information to obtain its actual value?

The Chinese Remainder Theorem

Recall: The modulo operation gives the remainder of Euclidean division (division with remainder) of one number by another. For example, $2 = 12 \bmod 5$.
However, this is different from the congruence modulo relation, represented by the congruence symbol $\equiv$ and often expressed as $a \equiv b \pmod{m}$. When two numbers $a$ and $b$ are congruent modulo $m$, it means that:
  1. $a$ and $b$ have the same remainder when divided by $m$
  2. $a - b$ is divisible by $m$ (i.e. $m \mid (a - b)$)
  3. There is an integer $k$ such that $a = km + b$
As such, $12 \equiv 2 \pmod{5}$ is true, but $12 = 2 \pmod{5}$ is obviously false.
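The difference between the two notions is easy to poke at in Python. In this small sketch, `congruent` is a hypothetical helper name; `%` is Python's built-in remainder operator, which always returns a non-negative result for a positive modulus.

```python
def congruent(a, b, m):
    """True when a and b are congruent modulo m, i.e. m divides a - b."""
    return (a - b) % m == 0


print(12 % 5)               # 2, the remainder of 12 divided by 5
print(congruent(12, 2, 5))  # True, since 12 - 2 = 10 is divisible by 5
print(12 == 2)              # False, plain equality is a different statement
print((-1) % 5)             # 4, negative values wrap around to a positive remainder
```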

We start with a concept called a "system of congruences." A system of congruences is a set of equations of the form $x \equiv a_i \pmod{m_i}$, where $a_i$ and $m_i$ are integers. The $m_i$ values are called the moduli of the system. Here's a quick example of this:

$$\begin{cases} x &\equiv 1 \pmod{2} \\ x &\equiv 2 \pmod{3} \\ x &\equiv 3 \pmod{5} \end{cases}$$

In this system, we have three congruences with moduli $2$, $3$, and $5$. The goal is to find a value for $x$ that satisfies all three congruences simultaneously.

Thus, we can apply the Chinese Remainder Theorem:

Chinese Remainder Theorem: Given pairwise coprime integers $n_1, n_2, \ldots, n_k$ and arbitrary integers $a_1, a_2, \ldots, a_k$, the system of simultaneous congruences
$$\begin{cases} x &\equiv a_1 \pmod{n_1} \\ x &\equiv a_2 \pmod{n_2} \\ \vdots \\ x &\equiv a_k \pmod{n_k} \end{cases}$$
has a solution, and the solution is unique modulo $N = n_1 n_2 \cdots n_k$.
Note: Although the Chinese Remainder Theorem is often stated with pairwise coprime moduli (meaning that for a set of moduli $M = \{n_1, n_2, \ldots, n_k\}$, $\gcd(n_i, n_j) = 1$ for all $i \neq j$), it can be extended to non-coprime moduli. However, doing so does not guarantee a solution; this will become increasingly relevant as we get towards our implementation process.
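For the pairwise coprime case, the theorem comes with a standard constructive recipe: weight each remainder by the product of the other moduli times that product's modular inverse. The sketch below applies it to the example system from earlier; `crt_coprime` is a name made up here, and `pow(a, -1, n)` for modular inverses assumes Python 3.8 or newer.

```python
from math import prod


def crt_coprime(remainders, moduli):
    """Solve x = a_i (mod n_i) for pairwise coprime n_i; return (x, N)."""
    N = prod(moduli)
    x = 0
    for a_i, n_i in zip(remainders, moduli):
        N_i = N // n_i            # product of all the other moduli
        inv = pow(N_i, -1, n_i)   # inverse of N_i modulo n_i (Python 3.8+)
        x += a_i * N_i * inv      # contributes a_i mod n_i and 0 mod the others
    return x % N, N


print(crt_coprime([1, 2, 3], [2, 3, 5]))  # (23, 30)
```

Sure enough, 23 leaves remainders 1, 2, and 3 when divided by 2, 3, and 5, and every other solution differs from it by a multiple of 30.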

You may be asking: what the hell does this have to do with guessing the giant integer that we've been given? Well, I've concocted a little example here to demonstrate how we can use this theorem to our advantage.

The Modular Arithmetic Nerd's Favorite Carnival Game

Let's say little Bob over at the bottom right goes to a carnival game booth and is asked to guess a number on a ball behind the operator. Obviously, since we're omnipotent observers in this fantastical 2D universe of cute little cartoon circle people, we know that the number is $x = 727$. However, Bob doesn't know shit. He's really good at modular arithmetic though, so he'll have a lot of fun with this one.

Bob's told that he can give the operator a piece of paper with two integers of his arbitrary choice: n and m. As long as m is above 0, the operator will always give him back a piece of paper with n and m passed into gcd(x + n, m). However, the operator's shift is about to end soon, and he estimates that he'll probably accept only about three pieces of paper from Bob until he closes shop.

Bob goes back to his table. He's flabbergasted. How in the world is he going to guess that number with only three pieces of information?

He rummages around his little noggin and recollects himself. Let's see what he's thinking:

Uh... thanks, I guess? Well, he has a good point, but since I guarantee that nobody read it (because it's too long for the average CTF player's attention span) I'll give a brief TL;DR here.

Bob's saying that per the definition of a "greatest common divisor," in the scenario $d = \gcd(a, b)$, both $a \bmod d = 0$ and $b \bmod d = 0$ are true. Since we're given the function d = gcd(x + n, m), we can therefore say that $(x + n) \bmod d = 0$ and $m \bmod d = 0$.

We can introduce an integer $k$ into the mix and rewrite $(x + n) \bmod d = 0$ as $\frac{(x + n)}{d} = k$. Let's algebraify this up to get it into the form we want:

$$\begin{aligned} \frac{(x + n)}{d} &= k \\ (x + n) &= kd \\ x &\equiv kd - n \pmod{d} \\ x &\equiv -n \pmod{d} \end{aligned}$$

Replacing $d$ with the gcd() function:

$$x \equiv -n \pmod{\gcd(x + n, m)}$$

Doesn't that look very, very familiar to the system of congruences that we were talking about earlier? Now, all we need to do is decide what values of $n$ and $m$ to pick.
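The relation is easy to sanity-check numerically. This sketch reuses the hidden value 727 from the carnival example; the value of `m` and the range of offsets are arbitrary choices made only for this check.

```python
from math import gcd

x = 727                   # the hidden value in the running example
m = 2 * 3 * 5 * 7 * 11    # an arbitrary m, chosen just for this check

for n in range(-5, 6):
    d = gcd(x + n, m)
    # d divides both arguments of the gcd...
    assert (x + n) % d == 0 and m % d == 0
    # ...and therefore x is congruent to -n modulo d.
    assert x % d == (-n) % d
    print(n, d)
```

Every offset passes, including the misses where the gcd is 1 and the congruence carries no information.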

Bob's decided that three attempts are nowhere near enough to do anything reasonable with a fixed offset $n$. He's discovered something a bit more clever: what if you changed the value of $n$ every time? Doing so provides information about the offset of $x$ from 0 modulo each GCD. He's selected the following values for $n$:

$$\begin{cases} n_1 &= 0 \\ n_2 &= -1 \\ n_3 &= -2 \end{cases}$$
Note: Bob's chosen negative values for $n_2$ and $n_3$ because of the earlier relation established, $x \equiv -n \pmod{\gcd(x + n, m)}$. Making $n$ negative creates positive remainders.

For $m$, Bob chooses a very large primorial:

Primorial: For the $n$th prime number $p_n$, the primorial $p_n\#$ is defined as the product of the first $n$ primes:
$$p_n\# = \prod_{k=1}^n p_k$$
where $p_k$ is the $k$th prime number.

Primorials have a special property: since they're the product of the first $n$ primes, they're guaranteed to have a lot of distinct prime factors. When thrown into the gcd() function, this will give us tons of information about the prime factors of $x + n$ (and therefore about $x$) since we're a lot more likely to get a hit (a miss would be if $\gcd(x + n, m) = 1$).

Bob's ended up deciding on $m = p_{11}\#$. He pulls out his laptop and calculates it with Python:

Calculating Primorials
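A minimal sketch of how such a calculation might look, using plain trial division. The helper names `first_primes` and `primorial` are made up for this example and are not necessarily what Bob typed.

```python
def first_primes(count):
    """Return the first `count` primes by trial division (fine for small counts)."""
    primes = []
    candidate = 2
    while len(primes) < count:
        # candidate is prime if no smaller prime up to its square root divides it
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


def primorial(n):
    """Product of the first n primes, written p_n# in the text."""
    result = 1
    for p in first_primes(n):
        result *= p
    return result


print(first_primes(11))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print(primorial(11))     # 200560490130
```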

Bob's now ready to go! He walks up to the operator and hands him his pieces of paper. The operator hastily hands him back three pieces of paper with the resulting GCDs:
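Since we're omnipotent observers, we can play the operator ourselves and work out what ends up on those pieces of paper. This sketch assumes the values from the story: the hidden number 727, Bob's three offsets, and $m = p_{11}\# = 200560490130$.

```python
from math import gcd

x = 727             # the number on the ball, known only to the operator
m = 200560490130    # p_11#, the product of the first 11 primes

for n in (0, -1, -2):
    print(n, gcd(x + n, m))
# 0 1
# -1 66
# -2 145
```

The first query is a miss (727 is prime and larger than 31), while 726 = 2 · 3 · 11² and 725 = 5² · 29 each share several primes with the primorial.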

Now he knows that:

$$\begin{cases} x \equiv 0 \pmod{1} \\ x \equiv 1 \pmod{66} \\ x \equiv 2 \pmod{145} \end{cases}$$

and he can apply the Chinese Remainder Theorem to solve for $x$. Bob opens his laptop back up and runs the following code:

Solving the Carnival Puzzle
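One way to solve Bob's system is to fold the congruences together two at a time with the extended Euclidean algorithm. The sketch below is a generic version that also tolerates non-coprime moduli (and the useless modulus 1); `egcd` and `crt_merge` are illustrative names, not taken from any particular script.

```python
def egcd(a, b):
    """Extended Euclid: return (g, s, t) with a*s + b*t == g == gcd(a, b)."""
    old_r, r, old_s, s, old_t, t = a, b, 1, 0, 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def crt_merge(r1, m1, r2, m2):
    """Combine x = r1 (mod m1) and x = r2 (mod m2); moduli need not be coprime."""
    g, s, _ = egcd(m1, m2)
    if (r2 - r1) % g:
        raise ValueError("congruences are inconsistent")
    step = m2 // g
    k = ((r2 - r1) // g * s) % step   # how many multiples of m1 to add to r1
    return r1 + m1 * k, m1 * step     # new remainder, lcm(m1, m2)


x, modulus = 0, 1
for remainder, mod in [(0, 1), (1, 66), (2, 145)]:
    x, modulus = crt_merge(x, modulus, remainder, mod)
print(x, modulus)  # 727 9570
```

Note that the answer is only pinned down modulo 9570. It matches the number on the ball because 727 happens to be smaller than that combined modulus.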

Bob's got the number! Congratulations, Bob!

Implementation

Hopefully through this example, you've gained a bit of intuition on where CRT is derived from, why we chose those particular values, and why it works. Now, let's apply this to the actual challenge.

Here is the script that I used to solve this challenge. It's very straightforward and readable in comparison to other scripts I've seen, so I felt it was redundant to go through the step-by-step process. I've added comments to explain what's going on.

gcd-query-v1 Solve
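As a companion, here is an offline sketch of the same idea that runs against a local oracle instead of the remote. It is not the original solve script: the function names, the scaled-down sizes (256 bits and 256 queries instead of 2048 and 1412), the prime bound of 2000, and the choice of consecutive offsets 0, -1, -2, ... are all assumptions. Because the primorial is squarefree, each answer can be split into per-prime congruences, which sidesteps the non-coprime moduli problem entirely.

```python
import random
from math import gcd, prod


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def recover(query, num_queries, primes, bits):
    """Recover x from num_queries answers of query(n, m) = gcd(x + n, m)."""
    m = prod(primes)                  # squarefree, so every answer is too
    residue_mod = {}                  # prime p -> x mod p
    for i in range(num_queries):
        d = query(-i, m)              # d divides x - i, hence x = i (mod d)
        for p in primes:
            if d % p == 0:
                residue_mod.setdefault(p, i % p)
    # Fold the prime congruences together; the moduli are distinct primes.
    x, modulus = 0, 1
    for p, r in residue_mod.items():
        t = ((r - x) * pow(modulus, -1, p)) % p   # Python 3.8+
        x, modulus = x + modulus * t, modulus * p
    if modulus.bit_length() <= bits:
        raise ValueError("not enough information to pin down x")
    return x


BITS, QUERIES = 256, 256              # scaled-down stand-ins for 2048 and 1412
secret = random.getrandbits(BITS)
oracle = lambda n, m: gcd(secret + n, m)
guess = recover(oracle, QUERIES, primes_up_to(2000), BITS)
print(guess == secret)  # True
```

With consecutive offsets, every prime smaller than the number of queries is guaranteed to divide one of the shifted values, so this scaled-down run always collects a combined modulus well above 256 bits. Larger primes only hit some of the time, which is where the extra margin comes from.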

Let's run the script on the remote server:

We've solved gcd-query-v1!

gcd-query-v1: amateursCTF{probabilistic_binary_search_ftw}

gcd-query-v2

solver: enscribe
author: hellopir
points: 481
category: algo
solves: 34
I thought that skittles1412's querying system wasn't optimized enough, so I created my own. My system is so much more optimized than his!
nc amt.rs 31693

Of course there's a continuation. Let's see what attachment we're given now:

v2 Attachment

It seems that they haven't changed much. The only things that are different are:

  • getRandomInteger()'s n value has been reduced from 2048 to 128 bits (~39 digits)
  • We no longer need to complete ten iterations of different random integers; now it's only one iteration of a single random integer
  • We only get 16 iterations of gcd() instead of 1412

Well, first step is to try and rerun the same script that we used for gcd-query-v1 with some minor edits:

gcd-query-v2 Solve Attempt

Well, that didn't work. We're correctly parsing input and a number is being generated, but for some reason the server is telling us to "get better lol".

I added some print statements to see what we were getting in our moduli and remainder arrays:

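Wow, check out that moduli array... that's not even nearly enough prime factors to accurately apply CRT. CRT only pins $x$ down modulo the combined modulus, and here that falls short of the 128 bits we need. Let's increase the primorial then for increased chances: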

Changing Primorial
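To see why a bigger primorial helps, it's worth measuring how many bits of $x$ the 16 answers actually determine. The sketch below does this locally for a random 128-bit value; the three prime bounds are arbitrary illustrations rather than the values used in the actual solve, and `known_bits` is a hypothetical helper name.

```python
import random
from math import gcd, prod


def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def known_bits(x, num_queries, prime_bound):
    """Bit length of the lcm of all gcd answers, i.e. of the CRT modulus."""
    m = prod(primes_up_to(prime_bound))
    combined = 1
    for i in range(num_queries):
        d = gcd(x - i, m)
        combined = combined * d // gcd(combined, d)   # lcm(combined, d)
    return combined.bit_length()


x = random.getrandbits(128)
for bound in (31, 1_000, 100_000):
    # CRT recovers x exactly only once this count exceeds 128.
    print(bound, known_bits(x, 16, bound))
```

The count can only grow as the bound increases, because a larger primorial is a multiple of every smaller one. Whether it clears 128 bits on a given run is still down to luck, which fits the "increased chances" framing.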

Let's try running the script again:

We've managed to solve the entire challenge with only 16 queries!

gcd-query-v2: amateursCTF{crt_really_is_too_op...wtf??!??!?!?must_be_cheating!!...i_shouldn't've_removed_query_number_cap.}

Afterword

Thanks to everyone from les amateurs for hosting this CTF! I had a lot of fun solving these challenges and I hope to see more from you guys in the future. I'd also like to credit Quasar, SuperBeetleGamer, and flocto for helping me wrap my head around CRT in general throughout the process of writing this (because I almost always learn along the way). I hope you learned something like I did!
