Meta's vanilla Maverick AI model ranks below rivals on a popular chat benchmark
One of Meta's newest AI models, Llama 4 Maverick, ranks below rivals on a popular chat benchmark. Meta didn't originally reveal the score.
Earlier this week, Meta faced criticism for using an experimental, unreleased version of its Llama 4 Maverick model to achieve a high score on the crowdsourced benchmark, LM Arena. The incident prompted LM Arena's maintainers to apologize, revise their policies, and score the unmodified, vanilla Maverick.
Poor Performance Revealed
The unmodified Maverick, "Llama-4-Maverick-17B-128E-Instruct," ranked below models like OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s Gemini 1.5 Pro as of Friday. Many of these competing models are months old.
"The release version of Llama 4 has been added to LMArena after it was found out they cheated, but you probably didn’t see it because you have to scroll down to 32nd place which is where it ranks." — @pigeon__s
Why the Discrepancy?
Meta’s experimental Maverick, Llama-4-Maverick-03-26-Experimental, was "optimized for conversationality," as explained in a chart published last Saturday. These optimizations aligned well with LM Arena’s human-rater preference system.
As previously reported, LM Arena has never been the most reliable benchmark due to its subjective nature. Tailoring a model to a benchmark not only misleads but also makes it difficult for developers to gauge real-world performance.
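As a rough illustration of why a chat-optimized variant can climb a leaderboard built on human votes, the sketch below turns pairwise preference votes into ratings with an Elo-style update. This is an assumption made for illustration: the article does not describe LM Arena's actual scoring method, and the model names "chatty" and "plain", the vote counts, the starting rating, and the K-factor are all hypothetical.

```python
# Minimal sketch: Elo-style ratings from pairwise preference votes.
# Assumption: this is NOT LM Arena's documented scoring method; the
# labels, votes, and constants below are made up for illustration.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Modelled probability that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return new (winner, loser) ratings after one vote."""
    delta = k * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


def run_votes(votes: list[tuple[str, str]], start: float = 1000.0) -> dict[str, float]:
    """Apply (winner, loser) votes in order; unseen models begin at `start`."""
    ratings: dict[str, float] = {}
    for winner, loser in votes:
        w = ratings.get(winner, start)
        l = ratings.get(loser, start)
        ratings[winner], ratings[loser] = update(w, l)
    return ratings


# Hypothetical data: raters prefer the conversational variant 7 times out of 10.
votes = [("chatty", "plain")] * 7 + [("plain", "chatty")] * 3
ratings = run_votes(votes)
print(ratings)
```

Under these assumptions, the variant that wins more head-to-head votes ends up with the higher rating, whatever the reason raters preferred it. The score therefore measures rater preference and not necessarily performance on developers' real-world tasks.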
Meta’s Response
A Meta spokesperson stated that the company experiments with "all types of custom variants."
"'Llama-4-Maverick-03-26-Experimental' is a chat-optimized version we experimented with that also performs well on LM Arena," the spokesperson said. "We have now released our open-source version and will see how developers customize Llama 4 for their own use cases."
Kyle Wiggers is TechCrunch’s AI Editor.
About the Author

Dr. Lisa Kim
AI Ethics Researcher
Leading expert in AI ethics and responsible AI development with 13 years of research experience. Former member of Microsoft AI Ethics Committee, now provides consulting for multiple international AI governance organizations. Regularly contributes AI ethics articles to top-tier journals like Nature and Science.