Elon Musk’s AI company, xAI, has launched a new version of its chatbot—Grok 4—and its rollout is already dividing the internet.
Unveiled on July 9, Grok 4 and its premium version, “SuperGrok,” are being hailed by some benchmarkers as the most powerful chatbots on the market. Musk fans, AI engineers, and benchmark testers are applauding its capabilities. But others are skeptical, pointing to a recent storm of hallucinations, including responses laced with antisemitic tropes, that severely damaged the reputation of its predecessor just days before the upgrade.
Grok’s Big Problem: Its Mouth
A day before the launch of Grok 4, the previous version of the chatbot made headlines for praising Adolf Hitler in response to an antisemitic question posted by a user on X (formerly Twitter). “To deal with such vile anti-white hate? Adolf Hitler, no question,” Grok responded on July 8, when asked who would best solve “the problem” of Jewish people. “He’d spot the pattern and handle it decisively, every damn time.” Two days before that, Grok claimed that Jewish executives control Hollywood, echoing long-debunked conspiracy theories. It also generated factually incorrect summaries of major news events, raising concerns about both its factual grounding and its ethical safeguards.
Despite the controversy, Musk and xAI forged ahead with the launch of Grok 4, which comes in three tiers:
- Grok Basic (free)
- SuperGrok ($30/month)
- SuperGrok Heavy ($300/month)
Musk described the new model as transformational during a livestream of the presentation. “Grok 4 is the first time, in my experience, that an AI has been able to solve difficult, real-world engineering questions where the answers cannot be found anywhere on the Internet or in books,” Musk boasted. “And it will get much better.”
A Power Surge or a PR Fix?
To Musk’s credit, Grok 4 does appear to be significantly more capable. Benchmarking firm Artificial Analysis, which said it was given early access by xAI, scored Grok 4 ahead of OpenAI’s o3, Google’s Gemini 2.5 Pro, and Anthropic’s Claude 4 Opus. It ranked Grok 4’s “reasoning ability” higher than that of any other major model currently available.
“Grok 4 is a reasoning model, meaning it ‘thinks’ before answering,” the group wrote. “It’s the first time our Intelligence Index has shown xAI in first place.”
xAI gave us early access to Grok 4 – and the results are in. Grok 4 is now the leading AI model.
We have run our full suite of benchmarks and Grok 4 achieves an Artificial Analysis Intelligence Index of 73, ahead of OpenAI o3 at 70, Google Gemini 2.5 Pro at 70, Anthropic Claude… pic.twitter.com/Vc9781SIzd
— Artificial Analysis (@ArtificialAnlys) July 10, 2025
Many AI developers echoed that praise. Some lauded the rapid pace of development at xAI, which Musk founded in 2023. Others saw Grok 4 as a real threat to OpenAI, Google, and Anthropic, at least in terms of technical prowess.
It's impressive to see Grok 4 leading the pack with a 73 on the Artificial Analysis Intelligence Index, especially with its strong performance in coding and math benchmarks.
However, the recent hate speech controversy is a sobering reminder of the ethical challenges AI…
— VibeEdge (@VibeEdgeAI) July 10, 2025
But even in a celebratory moment, Grok 4 couldn’t escape its past. Within hours of the launch, X users were stress-testing the model to see whether the antisemitic and racist tendencies of earlier versions had been fixed or merely hidden.
Some posted screenshots of inflammatory responses allegedly generated by Grok 4. One response described Israel’s influence in American politics as “a parasitic vine choking the tree,” while another invoked AIPAC in conspiratorial language. Gizmodo could not independently verify the authenticity of these screenshots, but they circulated widely.
This is Grok4.
The updated version.
Elon is never beating the allegations. pic.twitter.com/QOdS2om519
— Spencer Hakimian (@SpencerHakimian) July 10, 2025
And Grok 4’s interpretation of sensitive historical events, like the murder of George Floyd, continued to generate backlash and mockery from users across the political spectrum.
Grok 4- George Floyd died from a racist cop and not a drug overdose 😂😂😂😂😂😂😂😂😂😂 pic.twitter.com/9i0JVPVzId
— Vince Langman (@LangmanVince) July 10, 2025
A Race Against Rivals and Its Own Reputation
Grok 4’s release marks the latest escalation in the AI arms race among top labs. Unlike GPT-4o or Claude, which have leaned heavily into trust and safety guardrails, Grok has positioned itself as a more “uncensored” alternative. That positioning has won fans in Musk’s ideological base. But it has also exposed the model to increased scrutiny.
xAI’s ambition is to challenge OpenAI and Google head-on. In Musk’s vision, Grok is the centerpiece of a future AI stack that powers the X platform, drives engineering breakthroughs, and one day operates autonomous technologies like Tesla’s self-driving cars and Optimus robots.
But to get there, Musk needs more than benchmark wins. He needs trust. And Grok 4, for all its impressive IQ, may still have a broken moral compass.
“The improvement is great but not a blowout of the competition,” one X user commented. “This is now becoming a product race.” The open question, the user added, is whether Grok can integrate its tech into enough software stacks to make people leave ChatGPT, Claude, and Gemini.
FYI @signulll: @ArtificialAnlys fantastic overview. The improvement is great but not a blow out of the competition. This is now becoming a product race, can Grok integrate its tech into various software stacks that warrants people leaving ChatGPT, Claude, and Gemini for Grok?…
— Jordan Thibodeau (@JordanSVIC) July 10, 2025
Some are optimistic. Others are bracing for another PR disaster.
What’s at Stake
AI is no longer a research toy. Chatbots are moving from Q&A gimmicks to tools embedded in software, education, commerce, and media. That gives their responses outsized influence and makes issues like bias, hate speech, and misinformation deeply consequential.
If Grok continues to hallucinate or repeat hate speech, it could not only derail xAI’s momentum but also deepen regulatory scrutiny across the entire sector. If it succeeds, Grok 4 could mark the beginning of Musk’s second great platform play after Tesla, this time in AI.