Sakana AI's TreeQuest Boosts LLM Performance by 30% with Multi-Model Collaboration
Sakana AI's new inference-time scaling technique, AB-MCTS, uses Monte Carlo Tree Search to orchestrate multiple LLMs on a single task; the lab has released it in an open-source framework called TreeQuest.
Japanese AI lab Sakana AI has unveiled a technique called Multi-LLM AB-MCTS (Adaptive Branching Monte Carlo Tree Search), which enables multiple large language models (LLMs) to collaborate on complex tasks, outperforming individual models by 30%. The method, detailed in the lab's research paper, uses Monte Carlo Tree Search (MCTS) to decide dynamically which LLM is most suitable for each step of a problem.
Key Innovations
- Adaptive Branching Search: At each step the algorithm chooses between "searching deeper" (refining an existing solution) and "searching wider" (generating a new solution), so the inference budget goes to whichever option looks more promising.
- Multi-Model Collaboration: The system intelligently assigns tasks to models like OpenAI's o4-mini, Gemini 2.5 Pro, and DeepSeek-R1, leveraging their unique strengths.
- Open-Source Framework: Sakana AI has released TreeQuest under an Apache 2.0 license, enabling developers to implement this technique for commercial use.
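The deeper-versus-wider decision and the per-step choice of model described above can be illustrated with a small toy search. The following is a minimal sketch and not TreeQuest's API: the models are hypothetical stand-in functions rather than real LLM calls, the task (guessing a hidden number) and its scorer are invented for illustration, and Thompson sampling over Beta posteriors is assumed here as one simple way to make both choices.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

TARGET = 0.8  # optimum of the toy task (assumption, for illustration only)

def score(answer: float) -> float:
    """Toy scorer in [0, 1]; 1.0 means the answer equals TARGET."""
    return max(0.0, 1.0 - abs(answer - TARGET))

Model = Callable[[Optional[float], random.Random], float]

def make_model(rate: float, noise: float) -> Model:
    """Hypothetical stand-in for an LLM: guesses from scratch at the root,
    otherwise nudges the parent answer toward TARGET (simulated feedback)."""
    def model(parent: Optional[float], rng: random.Random) -> float:
        if parent is None:
            guess = rng.random()
        else:
            guess = parent + rate * (TARGET - parent) + rng.gauss(0.0, noise)
        return min(1.0, max(0.0, guess))
    return model

MODELS: Dict[str, Model] = {
    "model_a": make_model(rate=0.6, noise=0.02),
    "model_b": make_model(rate=0.3, noise=0.05),
    "model_c": make_model(rate=0.1, noise=0.20),
}

@dataclass
class Node:
    answer: Optional[float] = None
    score: float = 0.0
    children: list = field(default_factory=list)
    gen_a: float = 1.0  # Beta posterior: value of adding a new child here (wider)
    gen_b: float = 1.0
    sub_a: float = 1.0  # Beta posterior: value of working inside this subtree (deeper)
    sub_b: float = 1.0

def search(models: Dict[str, Model], budget: int, seed: int = 0) -> Tuple[Node, Dict[str, int]]:
    rng = random.Random(seed)
    root = Node()
    model_post = {name: [1.0, 1.0] for name in models}  # per-model Beta posterior
    calls = {name: 0 for name in models}
    best = root
    for _ in range(budget):
        # Selection: at each node, sample "go wider" against every "go deeper" option.
        node, path = root, [root]
        while node.children:
            widen = rng.betavariate(node.gen_a, node.gen_b)
            samples = [(rng.betavariate(c.sub_a, c.sub_b), i)
                       for i, c in enumerate(node.children)]
            best_sample, idx = max(samples)
            if widen >= best_sample:
                break  # wider: add a new child to this node
            node = node.children[idx]  # deeper: descend into the sampled child
            path.append(node)
        # Pick which model generates the new candidate, again by Thompson sampling.
        name = max(models, key=lambda m: rng.betavariate(*model_post[m]))
        answer = models[name](node.answer, rng)
        s = score(answer)
        child = Node(answer=answer, score=s, sub_a=1.0 + s, sub_b=2.0 - s)
        node.children.append(child)
        # Back up the observed score along the path and into the model's posterior.
        node.gen_a += s
        node.gen_b += 1.0 - s
        for n in path:
            n.sub_a += s
            n.sub_b += 1.0 - s
        model_post[name][0] += s
        model_post[name][1] += 1.0 - s
        calls[name] += 1
        if best.answer is None or s > best.score:
            best = child
    return best, calls

if __name__ == "__main__":
    best, calls = search(MODELS, budget=60)
    print(f"best answer={best.answer:.3f} score={best.score:.3f} calls={calls}")
```

Each iteration spends exactly one model call, so `budget` is the total number of calls. The returned call counts show how the search distributed that budget across the stand-in models as it learned which ones tended to produce higher-scoring candidates.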
Performance Highlights
- ARC-AGI-2 Benchmark: The multi-model system found correct solutions for more than 30% of the 120 test problems, a score that none of the individual models matched when working alone.
- Error Correction: In one instance, a flawed solution from o4-mini was corrected by Gemini 2.5 Pro and DeepSeek-R1, demonstrating the system's ability to combine models for superior results.
Real-World Applications
- Complex Coding: AB-MCTS has been successfully applied to algorithmic coding tasks.
- Latency Optimization: The technique can improve response times for web services.
- Hallucination Mitigation: By combining models with varying hallucination tendencies, the system achieves better accuracy.
"This approach unlocks the potential of LLMs as a collective intelligence," said Takuya Akiba, a research scientist at Sakana AI. The release of TreeQuest marks a significant step toward more robust and reliable AI applications for enterprises.