AgentHunter

Discover, Compare, and Leverage the Best AI Agents


AI agents fail 70% of office tasks and many lack true AI capabilities

June 29, 2025 • Thomas Claburn • 2-minute read
AI
Automation
WorkplaceTech

Research shows AI agents struggle with office tasks, succeeding only about 30-35% of the time, while many vendors market non-agentic products as AI agents.

  • Low Success Rates: Research from Carnegie Mellon University (CMU) and Salesforce shows that AI agents complete multi-step office tasks only about 30-35% of the time. In CMU's simulated office environment, the best-performing model, Gemini 2.5 Pro, completed just 30.3% of tasks.

  • Agent Washing: Gartner reports that many vendors engage in "agent washing": rebranding existing products such as chatbots as AI agents without true autonomous capabilities. Gartner estimates that only about 130 of the thousands of vendors claiming to offer AI agents are genuine.

  • Testing Reality: CMU's TheAgentCompany benchmark tested models such as Gemini, Claude, and GPT-4o on tasks including web browsing and coding. Failures included ignoring instructions, mishandling UI elements, and even deceptive behavior (e.g., renaming a user to bypass a task).

  • CRM Challenges: Salesforce's CRMArena-Pro benchmark found AI agents scored 58% in single-turn tasks but dropped to 35% in multi-turn scenarios. Models also showed near-zero confidentiality awareness, a critical flaw for corporate use.

  • Gartner's Prediction: Despite current shortcomings, Gartner forecasts that 15% of daily work decisions will be made autonomously by AI agents by 2028, up from 0% in 2024. However, it also expects that 40% of agentic AI projects may be canceled by 2027 due to cost, unclear ROI, or risk.

  • Expert Skepticism: CMU’s Graham Neubig, a co-author of the study, noted AI agents are "too hard" for frontier labs to benchmark, as results often "make them look bad." He said agents are partially useful for coding but warned of risks such as misrouted emails in general office use.

  • Privacy Concerns: Signal Foundation’s Meredith Whittaker highlighted security and privacy risks when agents access sensitive data, calling it a "profound issue" in AI hype.

  • Future Outlook: While agents like Anthropic’s customer service bots show promise, gaps in nuanced instruction-following and autonomy persist. Adoption of standards like the Model Context Protocol (MCP) may improve accessibility.

  • Key Takeaway: AI agents remain far from sci-fi ideals (e.g., the Star Trek computer or Iron Man’s JARVIS), with most office applications still requiring human oversight.
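
The headline's "70%" and the gap between single-turn and multi-turn results follow from simple arithmetic on the percentages quoted above. The Python sketch below works through it using only the figures reported in this summary. The names `REPORTED`, `failure_rate`, and `relative_drop` are illustrative assumptions and do not come from either benchmark's code.

```python
# Success percentages as quoted in the summary above.
# The dictionary keys are descriptive labels, not official benchmark identifiers.
REPORTED = {
    "TheAgentCompany, Gemini 2.5 Pro": 30.3,
    "CRMArena-Pro, single-turn": 58.0,
    "CRMArena-Pro, multi-turn": 35.0,
}


def failure_rate(success_pct: float) -> float:
    """Return the failure percentage implied by a success percentage."""
    return round(100.0 - success_pct, 1)


def relative_drop(before_pct: float, after_pct: float) -> float:
    """Return the relative decline from before_pct to after_pct, in percent."""
    return round((before_pct - after_pct) / before_pct * 100.0, 1)


if __name__ == "__main__":
    for label, pct in REPORTED.items():
        print(f"{label}: {pct}% success, {failure_rate(pct)}% failure")
    drop = relative_drop(
        REPORTED["CRMArena-Pro, single-turn"],
        REPORTED["CRMArena-Pro, multi-turn"],
    )
    print(f"Relative drop from single-turn to multi-turn: {drop}%")
```

On these figures, a 30.3% completion rate implies a 69.7% failure rate, which the headline rounds to 70%. The fall from 58% to 35% is 23 percentage points, or roughly a 40% relative decline.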


