In June of 2014, after 64 glorious years, the reign of the mighty Turing test came to an end.

Or did it?

Eugene Goostman

“Eugene Goostman”, a chatbot designed to mimic the speech patterns of a 13-year-old Ukrainian boy with a pet guinea pig, became the first machine in history to trick the judges administering the Turing test into thinking it was a human.

Or some of them, anyway…

Yes, Eugene managed to convince 33% of the judges administering the Turing test that it was human. But the other 67% still caught the play.

You heard that right; Eugene convinced 10 out of 30 judges, which doesn’t sound quite so impressive, but hey, rules are rules…

The Turing test


“I propose to consider the question, ‘Can machines think?'”

So posited Alan Turing in his 1950 paper “Computing Machinery and Intelligence”, in which he described his now-famous process for distinguishing true AI from cleverly programmed chatbots.

When Turing designed his “imitation game” test in 1950, he estimated that it would take about 50 years for someone to design a program that could fool the interrogator 30% of the time.

He overestimated the pace of global AI progress by well over a decade.


With the amazing profusion of advancements in machine learning since the turn of the millennium, the problem of how to properly quantify true artificial intelligence has become an increasingly interesting one.

Where exactly does “computing” end and “thinking” begin?

Alan Turing’s “imitation game” test remains the gold standard for answering that very question to this day, which is extremely impressive, given the progress computer science has made since Turing’s day.

The premise of the test lies in the question of what exactly it means to “think”.


The fact that Turing was grappling with such questions at a time when computer technology was in its very infancy stands as testament to his genius.

*** Fun fact: The abstract model of computation Turing described back in 1936, now known as the “Turing machine”, remains the standard theoretical model of a computer. ***

Turing believed that defining what it means to “think” was an impossible task, and that instead the question should be whether a machine could meet a predetermined set of requirements, namely, his test.

It was, in fact, quite an elegant answer to an unanswerable question.

The process described was relatively simple:

An interrogator would ask questions of both a machine and a human, both of which were invisible to him/her.

Sort of like “The Dating Game” but less creepy.


The questions would be asked and answered by text displayed on a screen and visible to the interrogator.

The interrogator would ask identical questions to both, and attempt to discern which was the computer and which was the human based on the two sets of responses.


The interrogator was given a fixed amount of time (5 minutes) in which to make a determination as to the nature of the two entities he/she was speaking to.

Far from being hindrances, the test’s reliance on human judgment and its vulnerability to human error were actually its strengths, since the entire point was to exploit human error to “win” the game.

*** Fun fact: Early AI attempts were often easy to spot, since they never made typos or spelling or grammar mistakes. ***

A machine was said to have “passed” the Turing test if it was able to fool over 30% of the judges involved in the experiment.
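The pass/fail arithmetic above is simple enough to sketch in a few lines of Python. This is our own illustrative snippet, not code from the actual contest; the function name and the 30% threshold encoding are our assumptions about how the rule is usually stated.

```python
def passes_turing_test(fooled: int, total_judges: int, threshold: float = 0.30) -> bool:
    """Return True if the fraction of judges fooled exceeds the threshold.

    Hypothetical helper: the 2014 event reportedly used "more than 30%
    of judges fooled" as the pass criterion.
    """
    return fooled / total_judges > threshold

# Eugene Goostman's reported result: 10 of 30 judges fooled (about 33%).
print(passes_turing_test(10, 30))  # True: 10/30 is roughly 0.333, just over 0.30
```

Note how narrow the margin is: convincing exactly 9 of 30 judges (30% on the nose) would not clear the “more than 30%” bar.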

Obviously, Turing’s process had its shortcomings, something the nerds of the world were only too happy to point out ad nauseam.

The Loebner Prize

In 1990, Hugh Loebner launched an updated version of the Turing test as part of an ongoing artificial intelligence contest called “The Loebner Prize”, which continues to be held annually to this day.

Loebner’s update included more time for the judges to ask questions and receive answers, as well as more interrogators to have a turn evaluating the program.

While Loebner shifted the parameters of the original Turing test in minor ways, the core principles of the test and its process have remained largely unchanged.

And, when one considers that Turing’s bar of fooling the interrogator just 30% of the time in a five-minute chat was meant as a prediction rather than a mark of true intelligence, we think our friend Eugene Goostman and his creators, Vladimir Veselov, Eugene Demchenko, and Sergey Ulasen, might want to bring it down a couple of notches.


The way things are going, there is no question in our minds that the Turing test will be beaten decisively in the near future, but we are simply not prepared to hand that belt to Eugene just yet.

Sign up to Phrasee’s weekly newsletter. It’s awesome. We promise.