WTF?! OpenAI's latest AI model, o1, has been displaying unexpected behavior that has captured the attention of both users and experts. Designed for reasoning tasks, the model has been observed switching languages mid-thought, even when the initial query is presented in English.
Users across various platforms have reported instances where OpenAI's o1 model begins its reasoning process in English but unexpectedly shifts to Chinese, Persian, or other languages before delivering the final answer in English. This behavior has been observed in a range of scenarios, from simple counting tasks to complex problem-solving exercises.
One Reddit user commented, "It randomly started thinking in Chinese halfway through," while another user on X questioned, "Why did it randomly start thinking in Chinese? No part of the conversation (5+ messages) was in Chinese."
Why did o1 pro randomly start thinking in Chinese? No part of the conversation (5+ messages) was in Chinese... very interesting... training data influence pic.twitter.com/yZWCzoaiit
– Rishab Jain (@RishabJainK) January 9, 2025
The AI community has been buzzing with theories to explain this unusual behavior. While OpenAI has yet to issue an official statement, experts have put forward several hypotheses.
Some, including Hugging Face CEO Clément Delangue, speculate that the phenomenon could be linked to the training data used for o1. Ted Xiao, a researcher at Google DeepMind, suggested that reliance on third-party Chinese data labeling services for expert-level reasoning data might be a contributing factor.
"For expert labor availability and cost reasons, many of these data providers are based in China," said Xiao. This theory posits that the Chinese linguistic influence on reasoning could be a result of the labeling process used during the model's training.
Or impact of the fact that closed-source players use open-source AI (currently dominated by Chinese players) like open-source datasets?
The countries or companies that win open-source AI will have massive power and influence on the future of AI. https://t.co/M8ZdYfWxNI
Another school of thought suggests that o1 might be selecting languages it deems most efficient for solving specific problems. Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, offered a different perspective in an interview with TechCrunch: "The model doesn't know what language is, or that languages are different. It's all just text to it," he explained.
This view implies that the model's language switches may stem from its internal processing mechanics rather than a conscious or deliberate choice based on linguistic understanding.
New phenomenon appearing: the latest generation of foundation models often switch to Chinese in the middle of hard CoT thinking traces.
Why? AGI labs like OpenAI and Anthropic utilize 3P data labeling services for PhD-level reasoning data for science, math, and coding; for... https://t.co/VllUIC9V91
Tiezhen Wang, a software engineer at Hugging Face, suggests that the language inconsistencies could stem from associations the model formed during training. "I prefer doing math in Chinese because each digit is just one syllable, which makes calculations crisp and efficient. But when it comes to topics like unconscious bias, I automatically switch to English, mainly because that's where I first learned and absorbed those ideas," Wang explained.
I've always felt that being bilingual isn't just about speaking two languages--it's about THINKING and muttering in whichever language feels more natural depending on the topic and context. For example, I prefer doing math in Chinese because each digit is just one syllable, which... https://t.co/yD2YNscWW5
– Tiezhen WANG (@Xianbao_QIAN) January 13, 2025
While these theories offer intriguing insights into the possible causes of o1's behavior, Luca Soldaini, a research scientist at the Allen Institute for AI, emphasizes the importance of transparency in AI development.
"This type of observation on a deployed AI system is impossible to back up due to how opaque these models are. It's one of the many cases for why transparency in how AI systems are built is fundamental," Soldaini said.