How does Akinator read your mind (without "AI")?

Tell me the truth: at least once you have opened Akinator, thought of a random character (whether it's Ronaldo or "that 2013 meme guy only you know"), answered a few questions... and watched it guess right.

It doesn't matter whether it's a famous actor or a niche character. It doesn't matter if you try to trick him. At the end of the day he gets there, and the inevitable question is always: "How does he do that?"

And if we think about it, he was around even before artificial intelligence became the universal spice to throw on anything. The beauty is that the mechanisms he uses are... simple. In the most fascinating sense of the term: simple, intuitive and incredibly effective.

Stay until the end, because this time I don't just explain how it works: I went further and created a clone of Akinator accessible to all, since the official source code is not public.

Who is Akinator (and why does he guess right so often)?

Akinator was born in 2007 as a web game created by Arnaud Castègre.

The goal is simple: you think of a character, he asks questions and tries to guess.

When you start the game, he knows nothing about who you are thinking of. Not surprising, right?

Yet in "his world" there are millions of possible characters! Real people, fictional characters, animals, living memes... everything.

It's like looking for a specific person in a metropolis... if you set off at random without any strategy, you will never finish.

So he does the smartest thing possible: he doesn't try to guess right away. First he narrows down the space of possibilities with targeted questions.

For example:

  • "Is your character Italian?"
  • "Is your character female?"
  • "Did your character become known thanks to YouTube?"
  • "Is your character over 35?"
  • "Does your character know you?"

The questions seem banal... but they are exactly the kind of banality that cuts off huge slices of the world.

In addition, he doesn't force you to give clear-cut answers: you can also say "I don't know," "Probably," or "Probably not."

The heart of Akinator: a database

Internally, Akinator has two large components:

  1. Characters: millions of entities (real or fictional)
  2. Questions: attributes associated with characters (Italian? Male? Over 35?)

The question that comes naturally is: "Okay, but when I answer, how is that answer connected to the characters? To all of them?"

Yes. The idea is that for each pair (character C, question Q) there is an estimate of the form P(answer = Yes | C).

In other words: how likely it is that, if the character really were C, you would answer "Yes" to that question.

And these probabilities are not "invented"!

They are estimated from historical data on the behavior of millions of users.

This is where you understand why Akinator improves over time: each game is a piece of information that gets used.
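To make the idea of "estimated from historical data" concrete, here is a minimal sketch. The log format, the names `history` and `estimate_p_yes`, and all the numbers are my own assumptions for illustration; the real Akinator database is not public, so this is just the simplest frequency estimate that fits the description.

```python
from collections import Counter

QUESTION = "Is he a sportsman?"

# Hypothetical log of past games: (character, question, answered_yes).
history = (
    [("Ronaldo", QUESTION, True)] * 19
    + [("Ronaldo", QUESTION, False)] * 1
    + [("Musk", QUESTION, True)] * 2
    + [("Musk", QUESTION, False)] * 8
)

def estimate_p_yes(history):
    """Estimate P(answer = Yes | C) for every (character, question) pair seen."""
    yes, total = Counter(), Counter()
    for character, question, answered_yes in history:
        total[(character, question)] += 1
        if answered_yes:
            yes[(character, question)] += 1
    # Plain relative frequency: yes answers divided by all answers.
    return {pair: yes[pair] / total[pair] for pair in total}

p_yes = estimate_p_yes(history)
print(p_yes[("Ronaldo", QUESTION)])  # 19 "Yes" out of 20 answers
print(p_yes[("Musk", QUESTION)])     # 2 "Yes" out of 10 answers
```

The more games are logged for a pair, the more stable its estimate becomes, which is one way to read "improves over time".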

Start the game: probability everywhere (and then you clean up)

At the beginning of the game, each character starts with either a uniform initial probability or one weighted by popularity.

Then you answer the first question... and he updates the probabilities using Bayes' theorem.

$$P(C \mid \text{answer}) \propto P(\text{answer} \mid C) \times P(C)$$

"The posterior probability of C is proportional to the likelihood of that answer given C, multiplied by the prior probability."

The idea, put more simply, is:

"How likely is character C after seeing the user's answer?"

And it depends on two factors:

  • how likely the character was before the answer: P(C)
  • how consistent that answer is with that character: P(answer | C)

If, for example, before the question:

  • Ronaldo: quite likely
  • Musk: quite likely
  • Trump: almost zero

Then comes the question "Is he a sportsman?", and a "Yes" fits them like this:

  • Ronaldo: makes sense
  • Musk: meh
  • Trump: no

So he's not looking for the right answer. He's looking for the most compatible combination, and he updates a ranking of possibilities.

And when you answer "No" to "Is it real?"... the real characters collapse towards zero and the fictional ones get a huge boost.

Basically, it's "Guess Who?"... taken to the extreme.
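The Ronaldo / Musk / Trump example can be sketched in a few lines. The numbers are made up by me to stand in for "quite likely", "almost zero", "makes sense", "meh" and "no"; the function name `bayes_update` is also just illustrative.

```python
def bayes_update(prior, likelihood):
    """Return P(C | answer) given P(C) and P(answer | C) for each character."""
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    # Dividing by the total turns "proportional to" into real probabilities.
    return {c: weight / total for c, weight in unnormalized.items()}

# Assumed priors before the question.
prior = {"Ronaldo": 0.49, "Musk": 0.49, "Trump": 0.02}
# Assumed P(Yes | C) for "Is he a sportsman?".
p_yes = {"Ronaldo": 0.95, "Musk": 0.20, "Trump": 0.05}

posterior = bayes_update(prior, p_yes)  # the user answered "Yes"
ranking = sorted(posterior, key=posterior.get, reverse=True)
print(ranking)
```

Ronaldo and Musk start level, but after a single "Yes" Ronaldo holds most of the probability mass. Nobody is eliminated outright, which is what lets the system recover from an odd answer.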

Okay, but what question does Akinator choose?

All clear so far: after each answer, the probability of every character gets updated. But Akinator has a huge problem: he has a giant set of possible questions.

How does he decide which next question is the "best"?

The answer is: he looks for the one that reduces uncertainty the most. And to measure uncertainty he uses the Shannon entropy, H.

$$H(P) = -\sum_{C} p(C) \log_2 p(C)$$

If the odds are all similar, like:

  • Ronaldo 20%
  • Musk 20%
  • Harry Potter 20%
  • Iron Man 20%
  • Napoleon 20%

Akinator is confused: high entropy.

If you have:

  • Ronaldo 90%
  • Everyone else: small change

Low entropy: he's almost certain.
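Here is a quick check of the two situations. For the second one I have to assume how the remaining 10% is split (evenly across four other characters); the article only says the rest is small change.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits; terms with p = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

confused = [0.20, 0.20, 0.20, 0.20, 0.20]
# Assumption: Ronaldo at 90%, the other 10% spread over four characters.
almost_sure = [0.90, 0.025, 0.025, 0.025, 0.025]

h_confused = entropy(confused)        # equals log2(5), a bit over 2 bits
h_almost_sure = entropy(almost_sure)  # below 1 bit
print(f"{h_confused:.2f} vs {h_almost_sure:.2f}")
```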

At that point he calculates the information gain (IG): how much the question, on average, should clarify his ideas.

$$IG(Q) = H(P) - \left[ p(\text{Yes}) \cdot H(P \mid \text{Yes}) + p(\text{No}) \cdot H(P \mid \text{No}) \right]$$

Put more simply, he runs a mental simulation like: "If you answer Yes or No... how much does the distribution change?"

The more a question "splits" the probabilities, separating the candidates well, the more IG rises. And consequently it becomes the best candidate for the next question to ask!
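A small sketch of that "mental simulation", restricted to Yes/No answers. The characters, the three questions and their 0/1 likelihoods are invented for the example (real estimates would be softer), and the function names are mine.

```python
import math

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def information_gain(dist, p_yes):
    """IG(Q) = H(P) - [p(Yes) H(P|Yes) + p(No) H(P|No)]."""
    gain = entropy(dist)
    for answered_yes in (True, False):
        # Joint weight of "character is c" and "user gives this answer".
        joint = {c: dist[c] * (p_yes[c] if answered_yes else 1 - p_yes[c])
                 for c in dist}
        p_answer = sum(joint.values())
        if p_answer > 0:
            posterior = {c: w / p_answer for c, w in joint.items()}
            gain -= p_answer * entropy(posterior)
    return gain

dist = {"Ronaldo": 0.25, "Musk": 0.25, "Harry Potter": 0.25, "Iron Man": 0.25}
# Hypothetical P(Yes | C) tables, one per question.
questions = {
    "Is it real?": {"Ronaldo": 1.0, "Musk": 1.0, "Harry Potter": 0.0, "Iron Man": 0.0},
    "Is it a wizard?": {"Ronaldo": 0.0, "Musk": 0.0, "Harry Potter": 1.0, "Iron Man": 0.0},
    "Is it a character?": {"Ronaldo": 1.0, "Musk": 1.0, "Harry Potter": 1.0, "Iron Man": 1.0},
}

gains = {q: information_gain(dist, table) for q, table in questions.items()}
best = max(gains, key=gains.get)
print(best, gains)
```

"Is it real?" cuts the field exactly in half, so it wins. "Is it a wizard?" only isolates one candidate, and "Is it a character?" gets the same answer for everyone, so it tells him nothing.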

[Interactive demo: Akinator Demo 🧞 · Bayesian inference · 64 characters · 80 questions · up to 20 questions per game]

Try to get this demo to guess "Elio Magliari"! (For me, 7 questions are enough.)


One way to think about it: how many "ideal" questions are missing?

Let's take a simple example to make entropy super intuitive:

  • H is always ≥ 0. Why? Every probability is at most 1, so each term -p(C) log2 p(C) is non-negative
  • Minimum: H = 0 → no uncertainty (you have already won)
  • Maximum: depends on how many characters you have
  • Formula: H max = log2(number of characters)

Let's imagine we have:

  • 2 characters → H max = 1
  • 4 characters → H max = 2
  • 8 characters → H max = 3
  • 100,000 characters → H max ≈ 16.6 (because log2(100,000) ≈ 16.6)

So H is almost like saying: "I'm about X perfect binary questions away from closing the game."
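The numbers in the list are easy to verify. The helper names below are mine, and "ideal questions" here means perfect Yes/No questions that each remove exactly one bit.

```python
import math

def max_entropy(n_characters):
    """Entropy in bits of a uniform distribution over n_characters."""
    return math.log2(n_characters)

def ideal_questions_left(entropy_bits):
    """Perfect Yes/No questions needed to remove entropy_bits of uncertainty."""
    return math.ceil(entropy_bits)

for n in (2, 4, 8, 100_000):
    h = max_entropy(n)
    print(f"{n:>7} characters: H max = {h:.1f} bits, "
          f"{ideal_questions_left(h)} ideal questions")
```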

And IG also makes sense like this:

  • IG is always ≥ 0
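  • If IG = 0 → useless question (it changes nothing)
  • Theoretical maximum: IG can never exceed the current H, and it is also capped by log2(number of possible answers), so a pure Yes/No question gives at most 1 bit → perfect question (rare)

Say we have H = 10 and Yes/No questions:

  • IG ≈ 0 when the question is bad, so it barely reduces uncertainty.
  • With a decent question you usually get a fraction of a bit, enough to narrow the field.
  • The best questions get close to IG ≈ 1: they split the probability mass in half, maximizing information gain.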

So, to summarize: H indicates how confused he is about the character, while IG indicates how much asking that question would clarify his ideas about the character.

The flow of a game

Let's put everything in order and reconstruct his reasoning:

  1. Initialization: all characters receive a uniform probability (or one based on popularity/historical frequency).
  2. Optimal question selection: the information gain is calculated for each available question, and the one with the maximum IG with respect to the current distribution is chosen.
  3. Bayesian update: the user's answer (Yes / No / I don't know / Probably) updates the probability of each character through Bayes' rule.
  4. Confidence threshold check: if a character exceeds a threshold (e.g. 85-95% probability) and the system has been "stable" over the last N questions, Akinator proposes his guess.
  5. Learning from feedback: if he is wrong, the user reveals the character and the system updates the estimates in the global database, improving future games.

When he gets to the point where he "thinks he knows", he tries the final guess.
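Steps 1 to 4 fit in a short loop. This is a sketch under my own assumptions, not Akinator's code: the toy table `P_YES`, the 0.85 threshold and all names are invented, only Yes/No answers are handled, and the "stable over the last N questions" check and step 5 are left out to keep it short.

```python
import math

# Hypothetical toy database: P(Yes | character) for each question.
P_YES = {
    "Is it real?": {"Ronaldo": 0.95, "Musk": 0.95, "Harry Potter": 0.05, "Iron Man": 0.05},
    "Is it a sportsman?": {"Ronaldo": 0.95, "Musk": 0.10, "Harry Potter": 0.20, "Iron Man": 0.10},
    "Does it use magic?": {"Ronaldo": 0.05, "Musk": 0.05, "Harry Potter": 0.95, "Iron Man": 0.05},
}

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def likelihood(question, character, answer):
    p = P_YES[question][character]
    return p if answer == "yes" else 1 - p

def update(dist, question, answer):
    """Step 3: Bayesian update followed by normalization."""
    weights = {c: dist[c] * likelihood(question, c, answer) for c in dist}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def information_gain(dist, question):
    gain = entropy(dist)
    for answer in ("yes", "no"):
        p_answer = sum(dist[c] * likelihood(question, c, answer) for c in dist)
        if p_answer > 0:
            gain -= p_answer * entropy(update(dist, question, answer))
    return gain

def play(answer_fn, threshold=0.85, max_questions=20):
    characters = list(next(iter(P_YES.values())))
    dist = {c: 1 / len(characters) for c in characters}  # step 1: uniform prior
    remaining = sorted(P_YES)
    asked = []
    while remaining and len(asked) < max_questions:
        if max(dist.values()) >= threshold:               # step 4: confident enough
            break
        question = max(remaining, key=lambda q: information_gain(dist, q))  # step 2
        remaining.remove(question)
        asked.append(question)
        dist = update(dist, question, answer_fn(question))
    return max(dist, key=dist.get), asked

# Simulated player who is thinking of Harry Potter and answers honestly.
truth = {"Is it real?": "no", "Is it a sportsman?": "no", "Does it use magic?": "yes"}
guess, asked = play(lambda question: truth[question])
print(guess, asked)
```

With this toy table the loop opens with "Is it real?", because it splits the four characters most evenly, and stops as soon as one character crosses the threshold.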

What if Akinator's wrong? (This is where it gets stronger and stronger)

And if Akinator doesn't guess? He does what everyone does (or almost everyone)... he admits his ignorance and asks: "Who were you thinking of?"

You tell him, and he updates the estimates in the global database: the conditional probabilities P(answer | character) are corrected/adjusted based on the feedback.

If the character does not exist, it is added (by users), and Akinator gradually learns its characteristics, correcting them over time.
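One simple way to picture this learning step is a table of answer counts that every finished game adds to. The table layout, the smoothing constant and the function names are assumptions of mine; how the real database stores and corrects its estimates is not public.

```python
# Hypothetical global database: answer counts per (character, question).
counts = {
    ("Ronaldo", "Is he a sportsman?"): {"yes": 98, "total": 100},
}

def p_yes(character, question, smoothing=1.0):
    """Smoothed estimate of P(Yes | character); 0.5 when nothing is known yet."""
    entry = counts.get((character, question), {"yes": 0, "total": 0})
    return (entry["yes"] + smoothing) / (entry["total"] + 2 * smoothing)

def learn_from_game(revealed_character, answers):
    """Fold one finished game into the counts; new characters are created on the fly."""
    for question, answered_yes in answers.items():
        entry = counts.setdefault((revealed_character, question),
                                  {"yes": 0, "total": 0})
        entry["total"] += 1
        entry["yes"] += int(answered_yes)

before = p_yes("2013 meme guy", "Is he a sportsman?")  # unknown character
learn_from_game("2013 meme guy", {"Is he a sportsman?": False})
after = p_yes("2013 meme guy", "Is he a sportsman?")   # shifted towards "No"
print(before, after)
```

A brand new character starts from "no idea" and drifts towards the right values game after game, while a well-known one barely moves when a single user gives an odd answer.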

That's why time, for him, is everything: years of games = years of data.

In fact, with entropy-based selection done well, about 20 binary questions are enough to navigate a space of 100k elements, because log2(100,000) ≈ 17.

My "clone"

So we've figured out that asking the right questions is worth more than having perfect answers, and not simply that "Akinator is smart."

Because with a few questions and a poor database, you can imitate the logic, but you will never get the same precision.

And that's exactly what I did: since the Akinator source code is not public, I wrote a version that uses the same principle, but with a limited database (many characters, few questions).

It's what you need to understand "at a glance", by reading the code, how the mechanism works... and, if you want, to run it on your computer, even offline.

You can find the game in the brand new repository.

Byeee.


Elio Magliari

Hi, I'm Elio. I work as a software engineer.

I share what I discover about the digital world, the questions I ask myself, and the ideas that help me understand it and explain it more clearly.
