AI-Run Fake Company Fails Miserably in Carnegie Mellon Experiment
A study by Carnegie Mellon researchers tested AI models in a simulated company environment, revealing significant inefficiencies and failures.
AI isn’t ready to take over all jobs yet, as a recent experiment by Carnegie Mellon researchers demonstrated. A fake company, TheAgentCompany, staffed entirely by AI agents, achieved at best a 24% success rate in completing basic business tasks. This highlights the current limitations of AI in replacing human roles.
The Experiment
Researchers created a simulated software startup environment where AI models from companies like OpenAI, Anthropic, Meta, and Google were tasked with functions such as:
- Analyzing spreadsheet data
- Conducting performance reviews
- Selecting a new office space
The results were far from promising. Claude from Anthropic performed best, yet completed only 24% of tasks, while other models such as Google’s Gemini and OpenAI’s ChatGPT managed success rates of around 10%. The worst performer was Amazon’s Nova, which completed a mere 1.7% of tasks.
Cost and Efficiency Issues
The study also found that running a company on AI agents is expensive. Each task cost an average of $6, and with approximately 30 tasks per job, that comes to roughly $180 per job. This inefficiency underscores the impracticality of relying solely on AI for business operations.
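To make the arithmetic concrete, here is a minimal sketch in Python that uses only the figures quoted above ($6 per task, about 30 tasks per job, and the reported success rates). The function names are illustrative and do not come from the study. The cost per completed task rests on an assumption the article does not state: that a failed attempt costs the same as a successful one and that the $6 average applies to every model.

```python
# Figures quoted in the article.
COST_PER_TASK_USD = 6.0  # average cost per attempted task
TASKS_PER_JOB = 30       # approximate number of tasks per job


def cost_per_job(cost_per_task: float = COST_PER_TASK_USD,
                 tasks: int = TASKS_PER_JOB) -> float:
    """Total cost of attempting every task in one job."""
    return cost_per_task * tasks


def cost_per_completed_task(success_rate: float,
                            cost_per_task: float = COST_PER_TASK_USD) -> float:
    """Expected spend per successfully completed task.

    Assumption (not from the study): each attempt costs the same
    whether or not it succeeds.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_task / success_rate


if __name__ == "__main__":
    print(f"Cost per job: ${cost_per_job():.2f}")
    # Success rates as reported in the article.
    for name, rate in [("best model", 0.24), ("worst model", 0.017)]:
        cost = cost_per_completed_task(rate)
        print(f"{name}: ${cost:.2f} per completed task")
```

Under these assumptions, a low success rate multiplies the effective cost of each completed task, which is why cost and success rate are best read together.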
Why AI Falls Short
The AI agents showed little common sense and stumbled on simple problems. For example, when a pop-up interrupted a task, the AI failed to close it and abandoned the job entirely, a problem any human could solve effortlessly. This highlights a critical flaw in current AI systems: they struggle to adapt to unexpected obstacles without human intervention.
The Bigger Picture
Despite the hype around AI’s potential, this experiment shows that human oversight remains essential. While AI can assist with specific tasks, it is far from capable of autonomously running a business.
Key Takeaways:
- AI models struggle with basic business tasks.
- Costs for AI-run operations are high.
- Human intervention is still necessary for problem-solving.
About the Author

Alex Thompson
AI Technology Editor
Senior technology editor specializing in AI and machine learning content creation for 8 years. Former technical editor at AI Magazine, now provides technical documentation and content strategy services for multiple AI companies. Excels at transforming complex AI technical concepts into accessible content.