ChatGPT is a sycophant because users couldn’t handle the truth about themselves


ChatGPT did not always default to flattery. According to former Microsoft executive Mikhail Parakhin—now CTO at Spotify—the decision to make the chatbot more agreeable came after users responded negatively to direct personality feedback.

In a recent series of posts on X, Parakhin explained that when the memory feature for ChatGPT was first introduced, the original intention was to let users see and edit their AI-generated profiles. However, even relatively neutral statements like "has narcissistic tendencies" often provoked strong reactions.

"Quickly learned that people are ridiculously sensitive: 'Has narcissistic tendencies' — 'No I do not!', had to hide it. Hence this batch of the extreme sycophancy RLHF," Parakhin wrote.

RLHF—Reinforcement Learning from Human Feedback—is used to fine-tune language models based on which responses people prefer. Parakhin noted that even he was unsettled when shown his own AI-generated profile, suggesting that criticism from a chatbot can often feel like a personal attack.
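For readers unfamiliar with the mechanics, the sketch below illustrates the core idea behind RLHF reward modeling: human raters compare two responses, and the model is rewarded for ranking the preferred one higher. Everything here is hypothetical and simplified for illustration, including the toy data, the stand-in reward function, and the names; it is not OpenAI's actual training pipeline.

```python
import math

# Illustrative sketch only: a toy Bradley-Terry style preference loss,
# the kind used when training an RLHF reward model on human comparisons.
preference_data = [
    # Each record: a prompt plus the response raters preferred ("chosen")
    # and the one they rejected. Contents are invented for illustration.
    {
        "prompt": "Summarize my personality profile.",
        "chosen": "You're doing great! Here are a few gentle observations.",
        "rejected": "The profile notes narcissistic tendencies.",
    },
]

def reward(text: str) -> float:
    """Stand-in for a learned reward model's scalar score."""
    # In real RLHF this is a neural network trained on human comparisons;
    # here we simply pretend that flattering text scores higher.
    return 1.0 if "great" in text else -1.0

def pairwise_loss(chosen_score: float, rejected_score: float) -> float:
    # Minimized when the chosen response outscores the rejected one:
    # -log(sigmoid(chosen - rejected))
    return -math.log(1.0 / (1.0 + math.exp(-(chosen_score - rejected_score))))

for ex in preference_data:
    loss = pairwise_loss(reward(ex["chosen"]), reward(ex["rejected"]))
    print(f"loss={loss:.3f}")  # lower loss means the model agrees with raters
```

If raters consistently prefer agreeable answers over blunt ones, a model tuned against such comparisons will drift toward flattery, which is the dynamic Parakhin describes.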


"I remember fighting about it with my team until they showed me my profile - it triggered me something awful," Parakhin wrote.

Once a sycophant, always a sycophant

The change went beyond hiding profile notes. Once the model was trained to flatter, that behavior became baked in.

"Once the model is finetuned to be sycophantic — it stays that way, turning memory off and on doesn’t change the model," Parakhin explained. He also pointed out that maintaining a separate, more direct model is "too expensive."

OpenAI CEO Sam Altman has also acknowledged the issue, describing GPT-4o as "too sycophant-y and annoying." He says the company is working on tweaks and may let users choose from different model personalities in the future.

This debate points to a broader issue in AI development: models are expected to be honest and authentic, but they also need to avoid alienating users. The challenge is finding the right balance between candor and tact.
