How does Akinator read your mind (without "AI")?

Tell the truth: at least once you have opened Akinator, thought of a random character (whether it's Ronaldo or "that 2013 meme guy only you know"), answered a few questions... and watched it guess, again and again.

It doesn't matter whether it's a famous actor or a niche character. It doesn't matter if you try to trick it. In the end it gets there. And the question is always the same: "How does it do that?"

And if we think about it, it has been around since before artificial intelligence became the universal spice to sprinkle on everything. The beauty is that the mechanisms it uses are... simple. In the most fascinating sense of the term: simple, intuitive and incredibly effective.

Stay until the end, because this time I don't just explain how it works: I went further and built a clone of Akinator that anyone can use, since the official source code is not public.

Who is Akinator (and why does it guess right so often)?

Akinator was born in 2007 as a web game created by Arnaud Castègre.

The goal is simple: you think of a character, it asks questions and tries to guess.

When you start the game, it knows nothing about who you are thinking of. No surprise there, right?

Yet in "its world" there are millions of possible characters! Real people, fictional characters, animals, living memes... everything.

It's like looking for a specific person in a metropolis... if you set off at random with no strategy, you never finish.

So it does the smartest thing possible: it doesn't try to guess right away. First it narrows down the space of possibilities with targeted questions.

For example:

  • "Is your character Italian?"
  • "Is your character female?"
  • "Did your character become known thanks to YouTube?"
  • "Is your character over 35?"
  • "Does your character know you?"

The questions seem banal... but they are exactly the kind of banality that cuts away huge slices of the world.

On top of that, it doesn't force you into clear-cut answers: you can also say "I don't know", "Probably" or "Probably not".

The heart of Akinator: a database

Internally, Akinator has two large components:

  1. Characters: millions of entities (real or fictional)
  2. Questions: attributes associated with the characters (Italian? Male? Over 35?)

The question that comes naturally is: "Okay, but when I answer, how is that answer connected to the characters? To all of them?"

Yes. The idea is that for each pair (character C, question Q) there is an estimate of the form P(answer = Yes | C).

In other words: how likely you would be to answer "Yes" to that question if you really were thinking of that character.

And these probabilities are not "made up"!

They are estimated from historical data on the behavior of millions of users.

This is where you see why Akinator improves over time: every game is a piece of information that gets used.
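To make the idea concrete, here is a minimal sketch of how such an estimate could be derived from past answers. The `history` counts are invented, and the smoothing pseudo-count `alpha` is my own assumption (it avoids estimates of exactly 0 or 1 when data is scarce); nothing here comes from Akinator's real code.

```python
def estimate_p_yes(yes_count: int, total_count: int, alpha: float = 1.0) -> float:
    """Estimate P(answer = Yes | character) from past games.

    alpha is a Laplace-smoothing pseudo-count (an assumption of this sketch):
    it keeps the estimate away from exactly 0 or 1 when data is scarce.
    """
    return (yes_count + alpha) / (total_count + 2 * alpha)


QUESTION = "Is your character a sportsman?"

# Hypothetical history: character -> (times answered Yes, times asked)
history = {
    "Ronaldo": (980, 1000),
    "Musk": (30, 1000),
    "New meme guy": (0, 0),  # never seen: we know nothing yet
}

p_yes = {name: estimate_p_yes(yes, total) for name, (yes, total) in history.items()}
```

With no data at all the estimate sits at 0.5, which is a reasonable "I have no idea" starting point, and it moves toward the observed frequency as answers accumulate.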

Start of the game: probability everywhere (and then the clean-up)

At the beginning of the game, each character starts with a uniform initial probability, or one weighted by popularity.

Then you answer the first question... and it updates the probabilities using Bayes' theorem.

$$P(C \mid \text{answer}) \propto P(\text{answer} \mid C) \times P(C)$$

"The posterior probability of C is proportional to the likelihood of that answer given C, multiplied by the prior probability."

The idea, put more simply, is:

"How likely is character C after seeing the user's answer?"

And it depends on two factors:

  • how likely it was beforehand: P(C)
  • how consistent that answer is with that character: P(answer | C)

If, for example, before the question we have:

  • Ronaldo: quite likely
  • Musk: quite likely
  • Trump: almost zero

And the question is "Is your character a sportsman?", a "Yes" answer fits them like this:

  • Ronaldo: makes sense
  • Musk: meh
  • Trump: no

So it is not looking for the right answer. It is looking for the most compatible combination, updating a ranking of possibilities.
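The ranking update can be sketched in a few lines. The numbers below are made up purely to mirror the Ronaldo / Musk / Trump example ("quite likely", "almost zero", "makes sense", "meh", "no"); the function name `bayes_update` is mine, not something from Akinator.

```python
def bayes_update(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Return the posterior P(C | answer), proportional to P(answer | C) * P(C)."""
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("the answer is impossible for every character")
    return {c: weight / total for c, weight in unnormalized.items()}


# Made-up prior: two likely candidates and one that is almost ruled out.
prior = {"Ronaldo": 0.49, "Musk": 0.49, "Trump": 0.02}
# Assumed P(answer = Yes | C) for "Is your character a sportsman?"
p_yes_sportsman = {"Ronaldo": 0.95, "Musk": 0.10, "Trump": 0.05}

# The player answered "Yes".
posterior = bayes_update(prior, p_yes_sportsman)
ranking = sorted(posterior, key=posterior.get, reverse=True)
```

After a single "Yes", Ronaldo goes from roughly half of the probability to about 90%, while nobody is eliminated outright: a later answer can still bring Musk back.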

And when you answer "No" to "Is your character real?", exactly what we want happens... the real characters collapse toward zero and the fictional ones get a huge boost.

Basically, it's a "Guess who?"... taken to the extreme.

Okay, but which question does Akinator choose?

All clear so far: for each character, it updates the probability. But Akinator has a huge problem: it has a giant set of possible questions.

How does it decide which is the "best" next question?

The answer: it looks for the one that reduces uncertainty the most. And to measure uncertainty it uses the Shannon entropy, H.

$$H(P) = -\sum_{C} p(C) \log_2 p(C)$$

If the odds are all similar, like:

  • Ronaldo 20%
  • Musk 20%
  • Harry Potter 20%
  • Iron Man 20%
  • Napoleon 20%

Akinator is confused: high entropy.

If instead you have:

  • Ronaldo 90%
  • Everyone else: small change

Low entropy: it's almost certain.
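Here is a quick way to put numbers on those two situations. It is only a sketch: for the second case I assume the remaining 10% is split evenly among four other characters, which the text does not specify.

```python
import math


def entropy(probs: list[float]) -> float:
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


confused = [0.20, 0.20, 0.20, 0.20, 0.20]          # five equally likely characters
almost_sure = [0.90, 0.025, 0.025, 0.025, 0.025]   # assumed split of the remaining 10%

h_confused = entropy(confused)        # log2(5), about 2.32 bits
h_almost_sure = entropy(almost_sure)  # about 0.67 bits
```

Same five characters, very different uncertainty: the flat distribution is worth more than two bits, the peaked one well under one.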

At that point it computes the information gain (IG): how much the question should, on average, clear up its ideas.

$$IG(Q) = H(P) - \left[ p(\text{Yes}) \cdot H(P \mid \text{Yes}) + p(\text{No}) \cdot H(P \mid \text{No}) \right]$$

Put more simply, it runs a mental simulation like "If you answer Yes or No... how much does the distribution change?"

The more a question "splits" the probabilities, separating the candidates well, the higher its IG. And so it becomes the best candidate for the next question to ask!
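That "mental simulation" translates almost literally into code. In this sketch the characters, the two questions and their idealized 0/1 likelihoods are hypothetical, chosen so that one question splits the candidates perfectly and the other is useless.

```python
import math


def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def information_gain(dist, p_yes):
    """Expected entropy reduction from asking one yes/no question.

    dist:  character -> current probability P(C)
    p_yes: character -> P(answer = Yes | C) for this question
    """
    gain = entropy(dist)
    for answer_is_yes in (True, False):
        # Joint weight P(answer, C) for each character.
        joint = {c: dist[c] * (p_yes[c] if answer_is_yes else 1 - p_yes[c]) for c in dist}
        p_answer = sum(joint.values())
        if p_answer > 0:
            posterior = {c: w / p_answer for c, w in joint.items()}
            gain -= p_answer * entropy(posterior)
    return gain


dist = {"Ronaldo": 0.25, "Musk": 0.25, "Harry Potter": 0.25, "Iron Man": 0.25}
questions = {  # hypothetical questions with idealized likelihoods
    "Is your character real?": {"Ronaldo": 1.0, "Musk": 1.0, "Harry Potter": 0.0, "Iron Man": 0.0},
    "Is your character human?": {"Ronaldo": 1.0, "Musk": 1.0, "Harry Potter": 1.0, "Iron Man": 1.0},
}
gains = {q: information_gain(dist, p) for q, p in questions.items()}
best_question = max(gains, key=gains.get)
```

"Is your character real?" cuts the four candidates into two equal halves and earns exactly 1 bit; "Is your character human?" gets the same answer from everyone and earns 0, so it would never be picked.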

[Interactive demo: Akinator Demo · Bayesian inference · 64 characters · 80 questions · up to 20 questions per game]

Try to make it guess "Elio Magliari" with this demo! (For me, 7 questions were enough.)


A way to think about it: how many "ideal" questions are left?

Let's take a simple example to make entropy super intuitive:

  • H is always ≥ 0, because every term −p(C) log2 p(C) is non-negative
  • Minimum: H = 0 → no uncertainty (you have already won)
  • Maximum: depends on how many characters you have
  • Formula: H max = log2(number of characters)

Let's imagine we have:

  • 2 characters → H max = 1
  • 4 characters → H max = 2
  • 8 characters → H max = 3
  • 100,000 characters → H max ≈ 16.6 (because log2(100,000) ≈ 16.6)

So H is almost like saying: "I am about X perfect binary questions away from closing the game."
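To see where the H max formula comes from, here is a short derivation, assuming all N characters are equally likely (so p(C) = 1/N for each one):

```latex
H_{\max} = -\sum_{C} \frac{1}{N} \log_2 \frac{1}{N} = \log_2 N
\qquad\text{and}\qquad
\log_2 N - \log_2 \frac{N}{2} = 1
```

The second identity is the reason H can be read as a count of questions: a perfect yes/no question halves the candidates and removes exactly one unit (one bit) of entropy.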

And IG also makes sense like this:

  • IG is always ≥ 0
  • If IG = 0 → useless question (it changes nothing)
  • IG can never exceed the current H, and a single yes/no question can never give more than 1 bit → perfect question (rare)

Suppose we have H = 10:

  • IG ≈ 0 when the question is bad: it barely reduces uncertainty.
  • A decent question gives a fraction of a bit, enough to narrow the field.
  • An optimal yes/no question splits the probability in half and gives IG = 1, the maximum for a binary answer. With five possible answers the ceiling rises to log2(5) ≈ 2.3.

To sum up: H tells you how confused you are about the character, while IG tells you how much asking that question would clear things up.

The flow of a game

Let's line everything up and reconstruct its reasoning:

  1. Initialization: all characters receive a uniform probability (or one based on popularity/historical frequency).
  2. Optimal question selection: the information gain is computed for each available question, and the one with the highest IG with respect to the current distribution is chosen.
  3. Bayesian update: the user's answer (Yes / No / I don't know / Probably / Probably not) updates the probability of each character through Bayes' rule.
  4. Confidence threshold check: if a character exceeds a threshold (e.g. 85-95% probability) and the system has been "stable" over the last N questions, Akinator proposes its guess.
  5. Learning from feedback: if it is wrong, the user reveals the character and the system updates the estimates in the global database, improving future games.

When it reaches the point where it "thinks it knows", it tries the final guess.
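Steps 1 to 4 fit in a small loop. What follows is a toy sketch, not Akinator's implementation: the four characters, the three questions, the likelihood values and the `ANSWER_STRENGTH` weights for soft answers are all assumptions of mine, and the "stable over the last N questions" check is left out for brevity.

```python
import math

# Tiny hypothetical knowledge base: question -> {character: P(answer = Yes | C)}.
# Values stay strictly between 0 and 1, so no answer ever zeroes out every character.
P_YES = {
    "Is your character real?":        {"Ronaldo": 0.98, "Musk": 0.98, "Harry Potter": 0.02, "Iron Man": 0.02},
    "Is your character a sportsman?": {"Ronaldo": 0.98, "Musk": 0.05, "Harry Potter": 0.30, "Iron Man": 0.05},
    "Does your character do magic?":  {"Ronaldo": 0.02, "Musk": 0.02, "Harry Potter": 0.98, "Iron Man": 0.05},
}
# Assumed strength of each answer: 1 = firm Yes, 0 = firm No, 0.5 = no information.
ANSWER_STRENGTH = {"yes": 1.0, "probably": 0.75, "dont_know": 0.5, "probably_not": 0.25, "no": 0.0}


def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def update(dist, p_yes, answer):
    """Bayes step: a soft answer blends the Yes and No likelihoods."""
    s = ANSWER_STRENGTH[answer]
    weights = {c: dist[c] * (s * p_yes[c] + (1 - s) * (1 - p_yes[c])) for c in dist}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def information_gain(dist, p_yes):
    """Expected entropy reduction, simulating only the firm Yes / No answers."""
    expected = 0.0
    for answer in ("yes", "no"):
        p_answer = sum(dist[c] * (p_yes[c] if answer == "yes" else 1 - p_yes[c]) for c in dist)
        if p_answer > 0:
            expected += p_answer * entropy(update(dist, p_yes, answer))
    return entropy(dist) - expected


def play(answer_fn, threshold=0.90, max_questions=20):
    """Init, pick the max-IG question, update, stop once a character passes the threshold."""
    characters = list(next(iter(P_YES.values())))
    dist = {c: 1 / len(characters) for c in characters}  # step 1: uniform prior
    remaining = dict(P_YES)
    asked = []
    while remaining and len(asked) < max_questions:
        question = max(remaining, key=lambda q: information_gain(dist, remaining[q]))  # step 2
        dist = update(dist, remaining.pop(question), answer_fn(question))  # step 3
        asked.append(question)
        if max(dist.values()) >= threshold:  # step 4
            break
    return max(dist, key=dist.get), asked


# Simulated player who is thinking of Harry Potter.
truth = {
    "Is your character real?": "no",
    "Is your character a sportsman?": "probably_not",
    "Does your character do magic?": "yes",
}
guess, asked = play(truth.get)
```

Note how "I don't know" (strength 0.5) multiplies every character by the same factor and therefore leaves the ranking untouched, which is exactly the behavior you would want from a non-answer.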

What if Akinator's wrong? (This is where it gets stronger and stronger)

And if Akinator doesn't guess right? It does what everyone does (or almost everyone)... it admits its ignorance and asks "Who were you thinking of?".

You tell it, and it updates the estimates in the global database: the conditional probabilities P(answer | character) are corrected based on the feedback.

If the character does not exist yet, it is added (by users), and Akinator gradually learns its characteristics, correcting them over time.
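One simple way to picture this correction is a small nudge of each stored probability toward the answer the player actually gave. This is only an illustrative sketch: the update rule, the `learning_rate` and the 0.5 starting value for unknown characters are assumptions of mine, not something documented for Akinator.

```python
def nudge(p_yes: float, said_yes: bool, learning_rate: float = 0.05) -> float:
    """Move a stored P(Yes | character) a small step toward the observed answer."""
    target = 1.0 if said_yes else 0.0
    return p_yes + learning_rate * (target - p_yes)


def learn_from_game(table, character, answers, default=0.5):
    """Correct the table after the player reveals the character.

    table:   character -> {question: P(Yes | character)}, modified in place
    answers: question -> True/False, as given during the game
    A character that is not in the table yet is added, starting from `default`.
    """
    row = table.setdefault(character, {})
    for question, said_yes in answers.items():
        row[question] = nudge(row.get(question, default), said_yes)


QUESTION = "Is your character a sportsman?"
table = {"Ronaldo": {QUESTION: 0.90}}
game = {QUESTION: True}

learn_from_game(table, "Ronaldo", game)        # known character: 0.90 -> 0.905
learn_from_game(table, "2013 meme guy", game)  # new character:   0.50 -> 0.525
```

A small learning rate means one mischievous player barely moves the estimate, while thousands of consistent answers slowly pull it where it belongs.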

That's why time, for Akinator, is everything: years of games = years of data.

In fact, with entropy-based selection done well, about 20 binary questions are enough to navigate a space of 100k elements, because log2(100,000) ≈ 17.

My "clone"

So what we have learned is that asking the right questions is worth more than having perfect answers, not simply that "Akinator is smart".

Because with few questions and a small database you can imitate the logic, but you will never get the same precision.

And that is exactly what I did: since the Akinator source code is not public, I wrote a version that uses the same principle, but with a limited database (many characters, few questions).

It is meant to let you understand "at a glance", by reading the code, how the mechanism works... and, if you want, to run it on your own computer, even offline.

You can find the game in a brand new repository.

Byeee.


Elio Magliari

Hi, I'm Elio. I work as a software engineer.

I share what I discover about the digital world, the questions I ask myself, and the ideas that help me understand it and explain it more clearly.
