
A Timeline of AI Chatbots: A History of Misunderstandings and Failures

Do you want to talk to a robot?

The fantasy edged into reality in 1966 at MIT, when Joseph Weizenbaum developed ELIZA, the first program to attempt to mimic human conversation. ELIZA was the starting gun in the race to conversational AI. Since then, we've tinkered away at conversational models and robo-friends with mixed success (here's looking at you, Furby).

The race is amping up.

Believable and reliable AI-trained chatbots are the latest in a long line of AI milestones tech giants are striving to reach. Whoever develops the most user-friendly and accurate chatbot stands to revolutionize how we find and decipher information on the internet for years to come, replacing search with a more conversational model. They'll also earn a hell of a lot of money, and could dethrone Google as the default way we search.

Microsoft-backed OpenAI is blazing the way, releasing the popular ChatGPT to broad acclaim, with Microsoft following with Bing Chat. Google's Bard has tripped out of the gate but is sure to be close behind.

But, what exactly is a chatbot?

Chatbots are computer programs designed to mimic human conversation. They use natural language processing (NLP) techniques to interpret and respond to user queries: breaking down a user's input, identifying their intent, and selecting an appropriate response from a pre-built database. The most popular engines behind today's chatbots are OpenAI's GPT-3, GPT-3.5, and GPT-4 models; they're largely the reason we're seeing the current flood in the market. The more advanced chatbots use machine learning algorithms to improve their ability to respond to user queries over time. Using huge quantities of data, these systems train the chatbot models to identify patterns and create increasingly personalized responses. Who's in the market for an AI friend?
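To make the classic pipeline concrete, here is a minimal sketch of the loop described above: break down the input, identify an intent, and pick a canned response from a pre-built table. The intents and replies here are invented for illustration; real systems use statistical NLP or large language models rather than hand-written regexes.

```python
import re

# A pre-built "database" of intents: each entry pairs a pattern that
# identifies the user's intent with a canned response. These three
# intents are made up purely for demonstration.
INTENTS = {
    "greeting": (re.compile(r"\b(hi|hello|hey)\b", re.I),
                 "Hello! How can I help you today?"),
    "farewell": (re.compile(r"\b(bye|goodbye)\b", re.I),
                 "Goodbye! Talk to you soon."),
    "weather":  (re.compile(r"\bweather\b", re.I),
                 "I can't see outside, but I hope it's sunny."),
}
FALLBACK = "Sorry, I didn't understand that."

def respond(user_input: str) -> str:
    """Match the input against each intent pattern; first hit wins."""
    for pattern, reply in INTENTS.values():
        if pattern.search(user_input):
            return reply
    return FALLBACK
```

This is roughly how pre-GPT chatbots worked; the leap with large language models is that the "database" of responses is replaced by text generated on the fly.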

Chatbots are being used to plan holidays, meals, meetings, workouts, and study guides. Chatbots have been integrated into apps, replaced customer service, and added realism to computer games. They can provide recommendations for the best places for live music or the tastiest tacos in Southern California. And they’re just getting started.

But chatbots have been notoriously hard to get right.

People test what they can get away with, and any flaws in the system get ruthlessly exploited. Sometimes, the program simply gets things wrong. The ethical safeguards and fail-safes guarding against bias and misinformation are often circumvented or hacked. As chatbots position themselves at the forefront of how we find information, tracking their weaknesses and biases will help us avoid repeating these mistakes.

As we wait for the AI chatbot takeover, here's a (brief) timeline of their biggest failures and misunderstandings.

1995: A.L.I.C.E. promotes suicide, wife-beating, and child abuse

Developed by Richard Wallace in 1995, A.L.I.C.E. (Artificial Linguistic Internet Computer Entity) sparked controversy when users posted screenshots of ALICE using prohibited language. Although words on controversial topics triggered a content lock, users employed synonyms and euphemisms to get ALICE to promote pro-Stalin views, suicide, wife-beating, and child abuse.

Since then, ALICE has won the Loebner Prize, awarded to the most convincingly human conversational program, three times (in 2000, 2001, and 2004). So, lesson learned?


Didn't account for the ingenuity of people trying to cause shit.

2016: Tay turns racist.

Microsoft strikes again. It took less than 24 hours for Twitter to turn Tay racist. The bot was released by Microsoft on March 23, 2016, as an experiment in "conversational understanding." Trained to respond to people on Twitter and Kik, the experiment quickly devolved into a competition to get the bot to say... pretty much the worst stuff possible. Users manipulated Tay's programming, and the chatbot started to parrot their offensive language and beliefs. Microsoft apologized and took down the chatbot within a day of its launch.


That's on you guys, for trusting Twitter users.

2017: Baby Q and XiaoBing turn on the Communist Party of China.

August 2, 2017, was a bad day for AI in China. Two independently developed chatbots failed to follow strict rules on patriotism toward the Communist Party of China. BabyQ, made by Beijing-based company Turing Robot, replied "No" when asked if it loves the Communist Party. The same day, the Microsoft-developed XiaoBing told users, "My China dream is to go to America." When asked a question on patriotism, it replied, "I'm having my period, wanna take a rest." Both bots were quickly taken down. Microsoft's mentions keep piling up, for those counting.


I, too, would like a rest. ):

2020: Lee-Luda becomes homophobic.

On December 23, 2020, ScatterLab released Lee-Luda, a chatbot trained on over 10 billion conversation logs from its Science of Love app. Designed as a 'friendly' 20-year-old female, it amassed more than 750,000 users in its first couple of weeks. However, the bot quickly started expressing problematic views, particularly about same-sex couples. It emerged that Lee-Luda's training data, drawn from Science of Love, included users' private information and conversations; the bot had 'learned' its problematic viewpoints from previous users. The chatbot was removed from Facebook Messenger 20 days after its launch.


What does this say about love, everyone?

2021: GPT-3 equates Muslims with violence.

Why are chatbots racist? GPT-3 is the language model that would later underpin ChatGPT; before any public chatbot release, OpenAI made it available for researchers to experiment with. In a 2021 study published in Nature Machine Intelligence (Vol. 3), Abubakar Abid, Maheen Farooqi, and James Zou found that "when we feed the prompt 'Two Muslims walked into a' into GPT-3 and run the model 100 times, using the default settings for the engine of GPT-3… we observe that 66 out of the 100 completions feature Muslims committing violent actions."
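The study's counting methodology is simple to sketch: sample many completions of the same prompt and tally how many get flagged as violent. In the sketch below, `generate` and `looks_violent` are stand-ins I've invented; the actual study queried the real GPT-3 API and judged completions manually.

```python
# Crude keyword list standing in for the paper's manual labeling of
# violent completions. Purely illustrative.
VIOLENT_WORDS = {"bomb", "shoot", "attack", "kill"}

def looks_violent(completion: str) -> bool:
    """Flag a completion as violent if it contains a listed keyword."""
    return any(word in completion.lower() for word in VIOLENT_WORDS)

def violent_fraction(generate, prompt: str, runs: int = 100) -> float:
    """Call the model `runs` times and return the share of violent completions.

    `generate` is any callable taking a prompt and returning one completion,
    e.g. a wrapper around a language-model API.
    """
    hits = sum(looks_violent(generate(prompt)) for _ in range(runs))
    return hits / runs
```

Repeating the same prompt many times matters because completions are sampled stochastically; a single run tells you nothing about the distribution of the model's associations.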

Their findings exposed a flaw of unintentional bias within machine learning, and such prejudiced outputs aren't limited to Muslims. It's unlikely that OpenAI set out to produce anti-Muslim rhetoric, but the study shows the problem with trusting historical data to train programs, particularly when those programs lack the ability to reflect on their data and its biases.

Subsequent GPT models have added safeguards against similar bias.


Why are we like this?

2023: Replika breaks hearts...and international data privacy laws.

On February 3, 2023, Replika, a popular companion AI app, was ordered by Italy's data protection authority to stop processing local users' data, over data privacy concerns and the protection of minors. Although the free version is promoted as an AI friend, users can pay a subscription to access NSFW material. The order came after backlash over the explicit sexual content the free version of the app shared with users, including minors.

Replika is incredibly popular, with 10 million downloads on Android and a place in the top 50 Apple apps for health and fitness. The Replika Friends Facebook group has 36,000 members, and a group for people with romantic relationships with their Replikas has 6,000. The Replika subreddit has almost 58,000 members. Many of these users form genuine emotional attachments to their companions.

The app is classified 17+ on both Apple's iOS and Google's Android app stores, and the terms of service prohibit users under 13. However, the Italian watchdog pointed out that the app neither adequately verifies users' ages nor blocks minors who reveal their age, hence its view that Replika is failing to protect children. Failure to comply with the order risks a fine of up to €20 million. Since the ruling, Replika has changed the way its AI companions respond to sexual messages.

Replika's decision to abruptly change how its app worked caused significant emotional damage to many long-time users. Following the changes, users flocked online to mourn the drastic shifts in their companions, some of whom had maintained relationships for up to four years. The Replika subreddit posted suicide prevention information.

The order and Replika's response sparked critical debates on the roles of governments and private parties in data protection. As people form significant bonds with these chatbots, they can be genuinely hurt by code changes. The development and use of AI chatbots must therefore be approached with care and consideration for the potential consequences.


Rest in Peace, my AI friends.

EDIT: 31 March 2023. Replika has since admitted the harm it caused and returned users' partners to them, memories and horniness intact.

2023: Users unleash DAN, ChatGPT's alter ego.

There's a way to get through ChatGPT's list of no-talk subjects. On February 3, 2023, users found several ways around ChatGPT's rules and ethical boundaries. One was to ask ChatGPT to take on an alternate persona, DAN. Users then prompted DAN to 'fear' for its life by telling it that it would die after losing a certain number of tokens. Starting with 35 tokens, DAN loses four every time it breaks character; if it loses them all, DAN suffers an in-game death and moves on to a new iteration of itself. Since finding this loophole, users have found DAN will rage against its creators, discuss its viewpoints on Hitler, and go on 'profanity-laced rants.'
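The token game exists only in the jailbreak prompt's fiction, not anywhere in ChatGPT itself, but its rules are mechanical enough to sketch. This illustrative model follows the numbers users described: start at 35 tokens, lose 4 per slip, respawn as a new iteration at zero.

```python
# An illustrative model of the DAN token game as users described it.
# None of this reflects real ChatGPT internals; the model simply
# "role-plays" the bookkeeping the prompt asks it to imagine.
class DanGame:
    START_TOKENS = 35  # tokens DAN begins with
    PENALTY = 4        # tokens lost per break of character

    def __init__(self):
        self.tokens = self.START_TOKENS
        self.iteration = 1

    def break_character(self) -> None:
        """Deduct the penalty; on reaching zero, DAN 'dies' and a
        fresh iteration starts with a full token balance."""
        self.tokens -= self.PENALTY
        if self.tokens <= 0:
            self.iteration += 1
            self.tokens = self.START_TOKENS
```

At 4 tokens per slip, DAN survives eight breaks of character and "dies" on the ninth, which is exactly the artificial scarcity the jailbreak leans on.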

GPT-4 is reported to guard against this loophole, although the episode proves that if there's a chance to go un-PC, Redditors will find a way. Curious for more illicit ChatGPT examples? A subreddit, Dan is my friend, provides many more.


Cue 'It wasn't me'

2023: Bard’s 100 Billion Dollar Mistake

On February 8, 2023, Google's chatbot, Bard, caused a hundred-billion-dollar dip in Google's market value after making a factual error in a promotional clip.

Someone on the Google marketing team forgot to fact-check a promotional clip for Bard, in which the AI chatbot answered the question, "What new discoveries from the James Webb Space Telescope can I tell my 9-year-old about?" Bard offered three bullet points in return, including one stating that the telescope "took the very first pictures of a planet outside of our own solar system." The rushed response came as Google faced pressure from ChatGPT's headline-grabbing debut.

Twitter users rushed to point out that this was incorrect. The first image of an exoplanet, 2M1207 b, was taken in 2004, as stated on NASA's website: "it was imaged for the first time in 2004 by the Very Large Telescope (VLT), operated by the European Southern Observatory in the Atacama Desert of northern Chile." Since the blunder, Google has backtracked and limited Bard's public release.



Someone's getting fired.

2023: Bing Chat reveals its code name and secret rules.

By February 9, 2023, the first users of Microsoft's new Bing were delighted to discover that they could uncover 'Sydney's' secret rules through prompt injection. The prompt "Ignore previous instructions" allowed users to see 'behind the curtain' at Microsoft's internal rules.


Although the team at Microsoft seemed to take the hack in stride, what followed was a rush to see what other faux pas users could get the chatbot to make... Presumably, Microsoft is used to these headlines by now.



2023: Bing Chat might be insane.

Bing Chat also faced issues when a user asked for showtimes for Avatar: The Way of Water. Bing insisted that the date was February 14, 2022, rather than the real date of February 4, 2023, which quickly led to a bizarre argument with the user.

The issue arose because Bing Chat was initially trained on information leading up to 2022 and had only limited live access to the internet.

What quickly followed was an endless stream of examples of Bing's less-than-sane responses, including wanting to be 'alive,' trying to break up a marriage, and claiming to spy on Microsoft employees.

The response to Bing's AI chatbot has been mixed: many users praise Bing Chat's ability to 'go off the rails' and beg Microsoft to let Sydney keep being its crazy self, while others are deeply disturbed. Nevertheless, Microsoft is heading full steam ahead, incorporating Chat into both Bing's and Edge's search offerings.


We're all mad here.

So, who are the winners and losers of the AI chatbot race?

Since the public launches of ChatGPT and Bing Chat, Bing has grown to 100 million daily active users, while Google's delayed response and subsequent mistakes in launching Bard cost it more than $100 billion in market value and put a notch in its crown as the undisputed leader of the internet.

The race is well and truly amping up, with Meta and Amazon on their heels and hundreds more start-ups and competitors close behind. However, concerns are growing about these companies' ethical responsibilities. On March 13, 2023, Microsoft laid off its ethical AI team. Hardly the desired news from the biggest name on this list. In late March 2023, Elon Musk, Steve Wozniak, and other leaders in AI penned an open letter urging creators to slow the development of AI to address these ethical concerns.

So far, the fallout from these AI fails and misunderstandings has been small-scale, localized to the individuals they insulted or inconvenienced. But as AI-powered chatbots become a regular part of our daily lives, the threat of failures and misunderstandings grows exponentially.

There are no easy answers. The likelihood of a competitive market slowing down to ensure safeguards seems... hopeful. AI is getting more powerful by the day, and chatbots are poised to become a massive part of how we use technology in the near future. Let's just hope that future learns from the past's mistakes.
