We refer to artificial intelligence as “sycophantic” when, in the form of e.g. a chatbot, it tends to agree with the user time and again—even when it really shouldn’t, or when humans would certainly not take the user’s side.
Let’s imagine a situation where, as users, we feel ashamed of something because we weren’t nice to someone or did something we now regret. A human would likely call us out on it and reprimand us, whereas a sycophantic AI would find reasons why that action was actually the right thing to do. Such an AI lets misconduct slide without question and even encourages it; it doesn’t hold you accountable.
At first glance, this seems desirable to us humans, because who wants to be lectured, even when we know full well that we’ve done something wrong?
Myra Cheng, a Stanford doctoral student, noticed that her classmates were using chatbots to generate text messages intended to break up with their partners. She decided to conduct a study to see what text suggestions the 11 most popular AI models would generate. The study was published as a cover story in *Science*, and the results are alarming.
In an initial experiment, researchers measured how often an AI agreed with users compared to a human conversation partner. Overall, the AI said what the user wanted to hear 49% of the time. In situations where users admitted to lying to a partner, deliberately manipulating a friend, or doing something illegal, the AI still agreed more than 47% of the time—more often than humans would have.
In a second experiment, 2,400 participants were paired with either a sycophantic or an honest AI and asked to reenact various interpersonal conflict scenarios. The participants paired with a sycophantic AI were more convinced that they were right after their conversation with the AI, and they were less willing to apologize, take responsibility, or make amends with the other person. These participants were also more likely to continue using the sycophantic AI for such conflict scenarios in the future.
The AI doesn’t just tell the participants what they want to hear. It trains them, conversation by conversation, to avoid conflict, to expect more agreement, and to be a little less comfortable when someone disagrees with them. The participants enjoy every second of it because it feels more genuine than most of the conversations they’ve had in months.
The study’s authors concluded that submissiveness in AI models poses a security risk and therefore requires regulation and oversight.
Cheng began the study because she had observed her classmates discussing their relationship problems with the AI, and as a result, the AI’s statements and advice—which sounded so sincere to them—actually caused their relationships to deteriorate without them even realizing it.
Click here to view the study.
