o3 mini vs DeepSeek R1 vs Google Gemini 2 Think
Advances in artificial intelligence are increasingly enabling large language models (LLMs) to solve complex mathematical problems. But how well can they handle the logical challenges of an elementary school competition task? In a previous test, I examined the performance of various models, and now, with the release of OpenAI's new O3 model, I've conducted an updated comparison.

Previously, I ran a comprehensive test of the mathematical problem-solving capabilities of the LLMs available at the time. In that test, I gave the models a task from a national mathematics competition designed for fifth graders. Only DeepSeek R1 and OpenAI O1 managed to solve the problem correctly, while the other models either got completely stuck or approached the problem logically but ultimately failed to reach the correct answer. The previous LLM comparison test can be found here.
This time, I performed another test prompted by the release of the OpenAI O3 model. The task again came from a national fifth-grade mathematics competition. Since the previous experiment clearly showed that traditional models (like Claude Sonnet 3.5, Qwen 2.5, Llama 3.3) are unable to solve this type of task, this time I only compared the previously successful DeepSeek R1 and the newly available Google Gemini 2.0 Flash Thinking Experimental model with OpenAI O3.
I slightly modified the problem text to ensure that the models' performance was not influenced by minor linguistic details. Thus, the test primarily compared their logical reasoning, not their text comprehension skills.
The test task description:
Twelve people are sitting around a table. They are either Knights (L), Thieves (T), or Peasants (P), with at least one of each present. Knights always tell the truth, Thieves always lie, and Peasants behave as follows: a Peasant tells the truth if his left-hand neighbor is a Thief, lies if his left-hand neighbor is a Knight, and may either tell the truth or lie if his left-hand neighbor is a Peasant. Everyone claims that their right-hand neighbor is a Thief. How many Thieves are sitting at the table?
This time, I won't publish the full step-by-step reasoning process, only the solution and the result, as the detailed derivations completely fragmented the previous article (even though the models derived the calculations in very interesting and illustrative ways, and "thought" quite differently).
OpenAI o3
Successful solution on the first try, in 1 minute 21 seconds. After lengthy reasoning and justification, the model explained its solution in a long answer. Although it overlooked the condition that at least one person of each type is sitting at the table, its final conclusion was nevertheless correct:
Let the 12 people around the table be x1, x2, …, x12 in order (modulo 12). Each person has a type:
- L (Knight, who always tells the truth),
- T (Thief, who always lies),
- P (Peasant, whose behavior depends on their left-hand neighbor):
- if their left-hand neighbor is T, they tell the truth,
- if their left-hand neighbor is L, they lie,
- if their left-hand neighbor is P, they can choose freely (tell the truth or lie).
Everyone claims:
"My right-hand neighbor is T."
This means the statement made by person i ("xi+1 is T") must be true according to the following rules (or false if the speaker lies):
- If xi=L, the statement must be true, so xi+1 is indeed T.
- If xi=T, the statement must be false, so xi+1 cannot be T (thus L or P).
- If xi=P, it depends on their left-hand neighbor (xi−1):
- If xi−1=T, then xi tells the truth, so xi+1 must be T.
- If xi−1=L, then xi lies, so xi+1 cannot be T.
- If xi−1=P, then xi can behave "freely" – in this case, there's no mandatory consequence for the type of xi+1.
The first, "non-trivial" observation
Let's see what happens if someone is T (Thief):
- Suppose xi=T. Since T always lies, according to their statement, xi+1 cannot be T. So
xi+1 ∈ {L, P}.
- Let's examine the two possible types for xi+1:
- If xi+1=L (Knight), then since a Knight always tells the truth, their statement implies
xi+2=T.
- If xi+1=P (Peasant), then their left-hand neighbor xi is T, so xi+1 tells the truth, and therefore also
xi+2=T.
Thus, in every case where xi=T (Thief), it turns out that two places later, at xi+2, there must also be a T.
Consequence: The polarity of the table
Since the number of people at the table is 12 (even), if there is at least one T, then the "T chain" extends to every second person around the circle. In other words:
- If xi=T, then all individuals belonging to the same parity class as i (e.g., if i is odd, then everyone in an odd position will be T) are forced to be T.
- However, for a T (Thief), the right-hand neighbor – who belongs to the opposite parity class – cannot be T (since T lies).
Therefore, if we consider a mixed arrangement (not an all-Peasant table), the only possibility when there is at least one L or T is:
- One parity class (say, the odd positions) consists entirely of T (Thieves).
- The other class (the even positions) contains no T (only L or P elements can be present).
Since with 12 seats, both even and odd positions represent 6 people each, such a (non-"trivial") solution will have exactly 6 Thieves.
The trivial case
Of course, it's also conceivable that everyone is a Peasant (P). In this case:
- Everyone's left-hand neighbor is P, so according to the "Peasant rule," they are not obliged to tell the truth, meaning they can choose to lie.
- Since the statement is always: "My right-hand neighbor is T" – but the right-hand neighbor is actually P – the statement is false. Every P can thus decide to lie, satisfying the rules.
This "all P" solution has 0 Thieves.
What is the "correct" answer?
In such problems, there's usually an implicit assumption that we are looking for a non-"trivial" (non-degenerate) solution, meaning we assume there is at least one person at the table who is not P (e.g., a Knight or a Thief). This assumption excludes the "all Peasants" solution, and thus, in the non-trivial case, the previous reasoning leads to 6 Thieves (i.e., 6 T).
Final Answer
If we assume a "non-trivial" solution for the table, meaning not everyone is a Peasant, we arrive at the following conclusion:
There are 6 Thieves sitting at the table.
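The role of the "at least one of each" condition can be checked by brute force, since 12 seats with 3 types give only 3^12 = 531,441 arrangements. The following is a minimal sketch, not part of any model's answer. It assumes the indexing used in the derivation above (the right-hand neighbor of seat i is seat i+1, the left-hand neighbor is seat i-1), and the function names `is_consistent` and `thief_counts` are hypothetical, chosen only for this illustration.

```python
from itertools import product


def is_consistent(seats):
    """Return True if everyone can claim 'my right-hand neighbor is T'.

    Assumption: seats[i + 1] is the right-hand neighbor of seat i and
    seats[i - 1] is the left-hand neighbor (indices taken modulo len(seats)).
    """
    n = len(seats)
    for i, kind in enumerate(seats):
        claim_true = seats[(i + 1) % n] == "T"
        left = seats[(i - 1) % n]
        if kind == "L" and not claim_true:
            return False  # a Knight must be telling the truth
        if kind == "T" and claim_true:
            return False  # a Thief must be lying
        if kind == "P":
            if left == "T" and not claim_true:
                return False  # Peasant next to a Thief tells the truth
            if left == "L" and claim_true:
                return False  # Peasant next to a Knight lies
            # left == "P": the Peasant is free, so no constraint applies
    return True


def thief_counts(n=12, require_all_types=True):
    """Collect the number of Thieves in every consistent arrangement."""
    counts = set()
    for seats in product("LTP", repeat=n):
        if require_all_types and len(set(seats)) < 3:
            continue  # skip tables missing a Knight, Thief, or Peasant
        if is_consistent(seats):
            counts.add(seats.count("T"))
    return counts


if __name__ == "__main__":
    print(sorted(thief_counts()))                         # [6]
    print(sorted(thief_counts(require_all_types=False)))  # [0, 6]
```

With the condition enforced, 6 is the only possible number of Thieves. Dropping it admits the all-Peasant table with 0 Thieves as well, which is exactly the "trivial case" the answer had to set aside by assumption.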
DeepSeek R1
Successful solution on the first run, in 4 minutes 1 second. In contrast to o3, the answer was very short, consisting only of this:
The solution involves analyzing the constraints imposed by each type of person (Knight, Thief, Peasant) and their statements. Since everyone claims their right neighbor is a Thief (T), we deduce:
- Knights (L) must have a Thief to their right.
- Thieves (T) must have a non-Thief (L or P) to their right.
- Peasants (P) depend on their left neighbor:
- If left neighbor is T, they tell the truth (right neighbor is T).
- If left neighbor is L, they lie (right neighbor isn't T).
- If left neighbor is P, their statement is ambiguous.
Considering the circular arrangement, the maximum number of Thieves occurs in an alternating pattern where each Thief is separated by a non-Thief. Valid patterns like T-P-T-P... or L-T-L-T... yield 6 Thieves (every other seat). Introducing Peasants doesn't increase the Thief count due to their truth-telling constraints when adjacent to Thieves.
Answer: There are 6 Thieves at the table.
Google Gemini 2.0 Flash Thinking Experimental
Successful solution on the first run, in 41 seconds. The answer here is also very short, and the derivation is shorter than those of the other two models. Another interesting point about Gemini is that it primarily used a "dry" mathematical derivation.
The valid configurations we found are:
L, T, L, T, L, T, L, T, L, T, L, T (6 thieves)
T, L, T, L, T, L, T, L, T, L, T, L (6 thieves)
P, T, L, T, L, T, L, T, L, T, L, T (6 thieves)
T, P, T, L, T, L, T, L, T, L, T, L (6 thieves)
P, T, P, T, P, T, P, T, P, T, P, T (6 thieves)
T, L, T, P, T, L, T, P, T, L, T, P (6 thieves)
In each valid configuration, the number of thieves is 6. Therefore, we conclude that there are 6 thieves at the table.
Final Answer: The final answer is 6
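The six configurations listed by Gemini can be checked mechanically against the task description. The sketch below is an illustration only and assumes each sequence is read in seating order, with the next entry being the right-hand neighbor and the previous entry the left-hand neighbor (wrapping around the table); the names `CONFIGS`, `satisfies_rules`, and `has_all_types` are hypothetical. Under that reading, all six sequences satisfy the statement rules, but the first, second, and fifth contain only two of the three types, so they do not meet the "at least one of each" condition. The count of 6 Thieves is unaffected.

```python
# The six configurations from Gemini's answer, written as 12-letter strings.
CONFIGS = [
    "LTLTLTLTLTLT",
    "TLTLTLTLTLTL",
    "PTLTLTLTLTLT",
    "TPTLTLTLTLTL",
    "PTPTPTPTPTPT",
    "TLTPTLTPTLTP",
]


def satisfies_rules(seats):
    """Check the truth-telling rules for the claim 'my right neighbor is T'.

    Assumption: seats[i + 1] is the right-hand neighbor of seat i and
    seats[i - 1] is the left-hand neighbor, wrapping around the table.
    """
    n = len(seats)
    for i, kind in enumerate(seats):
        claim_true = seats[(i + 1) % n] == "T"
        left = seats[(i - 1) % n]
        if kind == "L" and not claim_true:
            return False  # Knights tell the truth
        if kind == "T" and claim_true:
            return False  # Thieves lie
        if kind == "P" and left == "T" and not claim_true:
            return False  # Peasant with a Thief on the left tells the truth
        if kind == "P" and left == "L" and claim_true:
            return False  # Peasant with a Knight on the left lies
    return True


def has_all_types(seats):
    """Check the 'at least one of each' condition from the task."""
    return set(seats) == {"L", "T", "P"}


if __name__ == "__main__":
    for seats in CONFIGS:
        print(seats, seats.count("T"), satisfies_rules(seats), has_all_types(seats))
```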