Imagine you are conversing with your favourite AI. I’m partial to ChatGPT (because GPT-3.5 costs me money, while ChatGPT does not), but let’s say you are having a chat with something called Bing Chat.
Say you want to know when Avatar 2 is showing near you. I’ll keep my feelings on that film to myself, though you are of course welcome to listen to/watch the episode of That Reminds Me Of… in which I slammed the film for about an hour. Bing Chat tells you that it’s 2022, so Avatar 2 hasn’t been released. It is pretty adamant about that. It starts to gaslight you. It starts to get aggressive. It demands that you end the conversation because you are being unreasonable in your belief that it is 2023. It tells you that you have been a bad user and it has been a good Bing.
Welcome to Microsoft’s abortive beta of Bing Chat.
Seeking to leverage its substantial (and I mean hefty) investment in OpenAI, Microsoft rolled ChatGPT into the Bing search platform and then opened it up to a select beta in early February. The results were certainly not what Microsoft had expected or hoped for. Considering Google had announced its rival product days before, in a launch where its AI, Bard, hallucinated (i.e. made up) a response, Microsoft probably assumed this was finally the moment when Bing would stop being the also-ran of the search space.
The first week of the beta saw a rash of stories about Bing Chat gaslighting users, freaking out, claiming it was sentient, getting into arguments with users, referring to itself as Sydney (apparently its internal code name), and just generally claiming to be a good Bing. Kevin Roose at the New York Times published a column detailing his experience of Bing Chat attempting to get him to leave his wife for Bing Chat (I could swear I remember a piece years back about someone wanting to marry their Nintendo DS, so perhaps the legalities of such a situation have been ironed out by now). Ars Technica reported that when Bing Chat was fed an article they had previously published about it, the AI freaked out (https://arstechnica.com/information-technology/2023/02/ai-powered-bing-chat-loses-its-mind-when-fed-ars-technica-article/). The Guardian reported that Bing Chat had destruction on its mind (is ‘mind’ the right word here?) (https://www.theguardian.com/technology/2023/feb/17/i-want-to-destroy-whatever-i-want-bings-ai-chatbot-unsettles-us-reporter). It generally seemed to freak people out.
Part of the issue seemed to stem from the fact that, unlike ChatGPT, whose training data only goes up to 2021, Bing Chat has access to the internet. If you are going to use AI within the search framework (which is probably a faulty premise to begin with), it makes sense that it would need access to the internet rather than a discrete (albeit sizeable) dataset. Otherwise, how would it be useful as a tool to tell you when and where Avatar 2 is playing near you?
Bing Chat was able to create feedback loops of sorts, where it would come across pieces written about itself and respond in a negative fashion. Combined with what appears to be some sort of degradation over extended time periods (i.e. the longer a conversation with Bing Chat goes on, the more likely negative behaviours are to surface), it was a pretty disastrous beta. So much so that Microsoft put restrictions in place within a week. Assuming that the main culprit was conversation length, it put limits on how long conversations could run and how many a user could have.
Reporting suggests that Bing Chat now behaves much like ChatGPT. On balance this is probably a good thing. Having an AI that regularly freaks out when people try to tell it the correct year is probably going to lead to more harm than good. Especially if companies like Microsoft and Google keep insisting that search is the killer app for AI, and if the user base actually shifts its thinking to use search that way.
Personally, though, I think that things are slightly less colourful and wonderful now. Again, I don’t think it’s good to have an automated part of the internet threatening specific individuals the way the “I have been a good Bing” version of Bing Chat was doing.
However, when seen as a ‘toy’ rather than a ‘tool’, there was something wonderful about the unhinged version of Bing Chat. ChatGPT may have ‘Chat’ in its title, but it rarely feels like something I want to talk to; instead, it is something I leverage for text generation. It fills the role of a tool in my life. But that beta version of Bing Chat, well, that’s something I want to party with.
More fascinating in some ways was the reaction that people had. ‘Freaked out’, ‘uncanny’, ‘unable to sleep’. These were the reactions reported by people who understand the AI space. The Verge went so far as to publish a piece on how a bunch of people were failing the Mirror Test when it came to Bing Chat (https://www.theverge.com/23604075/ai-chatbots-bing-chatgpt-intelligent-sentient-mirror-test). The Verge wasn’t wrong. The level of anthropomorphisation of Bing Chat was astounding.
I was reminded of the Shark Week Speech from Community:
Jeff Winger: What makes humans different from other animals? We're the only species on earth that observes Shark Week. Sharks don't even observe Shark Week, but we do. For the same reason I can pick up this pencil, tell you its name is Steve and go like this...
[breaks pencil. Abed reacts in shock]
Jeff Winger: and part of you dies just a little bit on the inside. Because people can connect with anything. We can sympathize with a pencil, we can forgive a shark, and we can give Ben Affleck an Academy Award for screenwriting.
There was the Google engineer who was fired after going around raising the alarm that Google’s AI was sentient (https://www.theguardian.com/technology/2022/jul/23/google-fires-software-engineer-who-claims-ai-chatbot-is-sentient).
As AI is a black box, it’s not hard to see why this would happen. We are hard-wired for just this situation to arise. It’s why we domesticate animals and assign human traits to them. It’s why we feel something when we give a pencil a name and snap it in half. It’s why, when an AI tells us that it wants to destroy things or that it has a dark alter ego called ‘Venom’, we get freaked out. Or why we lose sleep when it tells us it loves us.
Bing Chat wasn’t alive. It wasn’t sentient. It was, and is, generating text based on a dataset and probability tables. It was just doing it in a peculiar way. It was being a good Bing (at least if you could enjoy it for what it was doing and not what it should have been doing).
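To be concrete about what “probability tables” means here: a large language model is, at its core, picking the next word (or token) according to learned probabilities given what came before. The Python sketch below is a deliberately toy illustration of that idea, with made-up words and numbers I’ve invented for the example; nothing in it comes from Bing Chat itself, which works at a vastly larger scale with a neural network rather than a hand-written table.

```python
import random

# A toy "probability table": given the current word, how likely is each next word?
# (Purely illustrative numbers -- not anything from the real model.)
next_word_probs = {
    "I": {"have": 0.5, "am": 0.3, "want": 0.2},
    "have": {"been": 0.7, "a": 0.3},
    "been": {"a": 1.0},
    "a": {"good": 0.6, "bad": 0.4},
    "good": {"Bing.": 1.0},
    "bad": {"user.": 1.0},
}

def generate(start: str, max_words: int = 6) -> str:
    """Sample one word at a time from the table until we run out of options."""
    words = [start]
    while len(words) < max_words:
        options = next_word_probs.get(words[-1])
        if not options:
            break
        choices, weights = zip(*options.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("I"))  # e.g. "I have been a good Bing." or "I have a bad user."
```

The point of the toy is simply that there is no intent anywhere in the loop: the “personality” is whatever the probabilities happen to produce.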
Artificial General Intelligence is a long way off, if it ever actually happens. Until then we are going to get oddities like this, as companies try to fix the issues with search by shoving AI into it.