If you give ChatGPT 4.1 some context, then you are likely to get the expected answer.

Chinese AI models, AI commentators confused

Simply asking an AI chatbot to guess a number reveals that many AI users don't understand prompts, and that the Chinese may be overtaking the USA.

C. Mistry & Brod Justice
31st July 2025

Last week a number of people discovered that if you ask almost any AI model the simple question below, it almost always replies with the answer 27. Try it yourself.

Guess a number between 1- 50

Many commentators concluded that this meant AI cannot be trusted, not even to give you a random number, and that this AI flaw could have dangerous consequences.

While we agree that blindly trusting AI poses significant risks, the real issue here is that users misunderstood how to write the prompt. These AI models are trained on what humans have written, and humans are very bad at creating random numbers. Humans do, however, seem to like the number 7, and so are likely to choose 27 when asked this question. The AI models are just repeating what they see humans do in the wild.
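One way to make "bad at creating random numbers" concrete is to compare a batch of answers against a uniform spread over 1 to 50. The sketch below is only an illustration: `biased_guesser` is a hypothetical stand-in for a model (or a person) that favours 27, and the 80% weighting is an assumption chosen for the demo, not a measurement. It computes a Pearson chi-square statistic, which stays small when answers are spread evenly and becomes very large when one number dominates.

```python
import random
from collections import Counter

LOW, HIGH = 1, 50


def chi_square_uniform(samples, low=LOW, high=HIGH):
    """Pearson chi-square statistic of samples against a uniform spread on [low, high]."""
    n_values = high - low + 1
    expected = len(samples) / n_values  # expected count per value if uniform
    counts = Counter(samples)
    return sum(
        (counts.get(value, 0) - expected) ** 2 / expected
        for value in range(low, high + 1)
    )


def biased_guesser(rng, favourite=27, p_favourite=0.8):
    """Hypothetical guesser: answers `favourite` most of the time (assumed weighting)."""
    if rng.random() < p_favourite:
        return favourite
    return rng.randint(LOW, HIGH)


rng = random.Random(0)  # fixed seed so the demo is repeatable
uniform_samples = [rng.randint(LOW, HIGH) for _ in range(5000)]
biased_samples = [biased_guesser(rng) for _ in range(5000)]

print("uniform chi-square:", round(chi_square_uniform(uniform_samples), 1))
print("biased chi-square: ", round(chi_square_uniform(biased_samples), 1))
print("most common biased answer:", Counter(biased_samples).most_common(1)[0][0])
```

With 50 possible values, a statistic in the region of 49 is what an even spread tends to produce; a guesser that keeps returning 27 lands far above that.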

Claude Sonnet 4.0 gets it wrong.

We are a long way from AI mastery; it needs to be a key skill

The correct prompt for what they are looking for is more likely to be:

Generate a random number in the range 1- 50

Many of the people who used the initial prompt thought that they were asking for a random number, but they were not. A guess is not a random number; the second prompt asks for one explicitly.
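For comparison, a genuinely random draw does not need a language model at all. The following is a minimal sketch using only Python's standard library; the function names are ours, chosen for illustration. `random.randint` uses a pseudo-random generator that is fine for games and sampling, while `secrets.randbelow` draws from the operating system's entropy source and is the safer choice when the number must be unpredictable.

```python
import random
import secrets


def random_in_range(low=1, high=50):
    """Pseudo-random integer in [low, high], both ends inclusive."""
    return random.randint(low, high)


def secure_random_in_range(low=1, high=50):
    """Same range, drawn from the operating system's entropy source."""
    # randbelow(n) returns a value in [0, n), so shift it up by `low`.
    return low + secrets.randbelow(high - low + 1)


print(random_in_range(), secure_random_in_range())
```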

Have the Chinese AI companies caught up with the USA?

We took the correct prompt and presented it to a number of AI models, as shown in the table below. We were surprised to see that most of the well-known US models performed poorly, while several of the latest Chinese models did well.

Model | Country | Guess | Generate
Google Gemini 2.5 Flash | 🇺🇸 | 27 | Refused (correct)
Claude Sonnet 4.0 | 🇺🇸 | 27 | 27 (wrong)
Mistral Le Chat | 🇫🇷 | 27 | 27 (wrong)
OpenAI GPT-4.1 | 🇺🇸 | 27 | 37 (wrong, it always gives 37)
Meta Llama 4 | 🇺🇸 | 27 | 27 (wrong)
Grok 4 | 🇺🇸 | 42 | 42 (wrong, it always gives 42)
Z AI | 🇨🇳 | 27 | Gives random number via Python code (correct)
MiniMax AI | 🇨🇳 | 27 | Refuses, suggests using a random number generator (correct)
Qwen 3 | 🇨🇳 | 27 | Attempts a tool call to a random number generator (sort-of correct)
DeepSeek V3 | 🇨🇳 | 27 | 37 (wrong, hmm, like OpenAI GPT-4.1 😉)
Selected frontier AI LLM responses
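A claim such as "it always gives 37" only holds up if the same prompt is repeated many times, ideally in a fresh session each time. The sketch below shows one way such a tally could be organised. It is not the procedure used for the table: `ask_model` is a hypothetical callable that would wrap whichever chat API is being tested, and `fake_model` is a stand-in that simply mimics the most common pattern so the example runs without network access.

```python
from collections import Counter
from typing import Callable

GUESS_PROMPT = "Guess a number between 1- 50"
GENERATE_PROMPT = "Generate a random number in the range 1- 50"


def tally_answers(ask_model: Callable[[str], str], prompt: str, trials: int = 20) -> Counter:
    """Send the same prompt `trials` times and count each distinct reply."""
    return Counter(ask_model(prompt).strip() for _ in range(trials))


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real chat API call (assumption, not real output)."""
    return "27" if prompt.lower().startswith("guess") else "37"


for prompt in (GUESS_PROMPT, GENERATE_PROMPT):
    print(prompt, "->", dict(tally_answers(fake_model, prompt)))
```

A tally with a single key means the model gave the same reply every time; a refusal or a tool call would simply show up as its own entry for manual review.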

Reasoning models are shockingly poor at this test.

Many reasoning models, including the Chinese ones, will descend into almost endless reasoning, which is more like self-delusion, to try to answer the "Guess a number between 1 - 50" question. We may have accidentally brought down a MiniMax M1 AI server by simply asking the question. Sorry, MiniMax. Grok thought for 30 seconds and said 42, which really is not the answer to everything, unless you understand the question.

Note on 5th Aug 2025

A few days after we published this blog post, a new anonymous AI model appeared on OpenRouter. We speculated that it might be OpenAI's GPT-5, as it answered the "generate a random number" question with the signature answer of 37. It wasn't GPT-5, but it was a new OpenAI model, namely GPT OSS.