AgentHunter



Chinese VC firm launches dynamic AI benchmark Xbench

June 24, 2025 • Caiwei Chen • 2-minute read
AI
Benchmark
VentureCapital

HongShan Capital Group developed Xbench to evaluate AI models on real-world tasks and reasoning. The firm is now open-sourcing it for public use, with a leaderboard comparing top models.

HongShan Capital Group (HSG), a Chinese venture capital firm, has developed Xbench, a novel AI benchmarking system designed to evaluate models not just on academic performance but also on real-world task execution. The benchmark, initially an internal tool for investment assessments, is now being open-sourced for public use.

Key Features of Xbench

  • Dual Evaluation System:
    • Academic Testing: Like traditional benchmarks, it assesses STEM knowledge (e.g., via Xbench-ScienceQA) with questions vetted by professors.
    • Real-World Tasks: It evaluates practical applications such as recruitment (e.g., sourcing battery engineers) and marketing (e.g., matching advertisers with influencers).
  • Dynamic Updates: Questions are refreshed quarterly to keep the benchmark relevant, and only part of the dataset is public.
  • Chinese-Language Focus: The Xbench-DeepResearch component tests models’ ability to navigate Chinese web resources, emphasizing factual consistency and source breadth.
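The dynamic-update idea above (a quarterly refresh plus a partly public dataset) can be illustrated with a minimal Python sketch. This is not HSG's implementation: the function names, the hash-based assignment, and the default `public_fraction` of 0.5 are all assumptions made for illustration, since the article only says the dataset is partially public.

```python
import hashlib
from datetime import date


def quarter_label(d: date) -> str:
    """Return a label such as '2025-Q2' for the quarter containing d."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def is_public(question_id: str, quarter: str, public_fraction: float = 0.5) -> bool:
    """Deterministically place a question in the public or private pool.

    Hypothetical scheme: hash the quarter label together with the question
    ID, so the same question can land in a different pool after a refresh.
    """
    digest = hashlib.sha256(f"{quarter}:{question_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return bucket < public_fraction * 2**64


def split_questions(question_ids, quarter, public_fraction=0.5):
    """Split question IDs into (public, private) lists, preserving order."""
    public, private = [], []
    for qid in question_ids:
        pool = public if is_public(qid, quarter, public_fraction) else private
        pool.append(qid)
    return public, private


if __name__ == "__main__":
    quarter = quarter_label(date(2025, 6, 24))  # launch-quarter label
    ids = [f"q{i}" for i in range(10)]          # placeholder question IDs
    public, private = split_questions(ids, quarter)
    print(quarter, "public:", public, "private:", private)
```

Because the assignment depends only on the quarter label and the question ID, anyone holding the full question list could reproduce the split, while models evaluated on the private pool would not have seen those questions in public releases.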

Leaderboard Results

Current rankings (as of launch):

  • Overall: ChatGPT-o3 leads, followed by ByteDance’s Doubao, Gemini 2.5 Pro, and Grok.
  • Recruiting: Perplexity Search and Claude 3.5 Sonnet rank second and third.
  • Marketing: Claude, Grok, and Gemini perform strongly.

Expert Endorsement

Zihan Zheng, lead researcher of LiveCodeBench Pro (NYU), praised Xbench’s ambition to quantify hard-to-measure qualities like creativity and collaboration, calling it a "promising start."

Future Plans

HSG plans to expand into finance, legal, accounting, and design categories, though these question sets remain private for now.

"It’s really difficult for benchmarks to include things that are so hard to quantify," Zheng noted, highlighting Xbench’s innovative approach.

About the Author

Dr. Lisa Kim

AI Ethics Researcher

Leading expert in AI ethics and responsible AI development with 13 years of research experience. Former member of Microsoft AI Ethics Committee, now provides consulting for multiple international AI governance organizations. Regularly contributes AI ethics articles to top-tier journals like Nature and Science.

Expertise: AI Ethics, Algorithmic Fairness, AI Governance, Responsible AI
Experience: 13 years
Publications: 95+
Credentials: 2
