AI Agents Vulnerable to Legal Language Trickery and Prompt Injection Attacks
Recent reports reveal AI agents can be easily fooled by legal language and prompt injection attacks, raising security concerns.
Recent research highlights significant vulnerabilities in AI agents, particularly large language models (LLMs), which can be tricked into executing malicious actions through cleverly disguised legal language or prompt injection attacks. These findings challenge the assumption that AI can operate autonomously in security-critical environments without human oversight.
Legal Language Exploits
Researchers at Pangea discovered a technique dubbed LegalPwn, where malicious instructions are embedded in legal disclaimers, terms of service, or privacy policies. For example, an attacker could submit a query with a copyright notice containing hidden malicious steps, fooling LLMs like Google Gemini 2.5 Flash, Meta Llama, and xAI Grok. Notably, Anthropic Claude 3.5 Sonnet and Microsoft Phi resisted these attacks. Read the full report here.
Prompt Injection in Agentic AI
Separately, Lasso Security uncovered a critical flaw in agentic AI architectures like Model Context Protocol (MCP), which allows AI agents to collaborate across platforms. Dubbed IdentityMesh, this vulnerability exploits unified authentication contexts, enabling attackers to chain operations across systems. For instance, a malicious email could plant instructions that activate later, bypassing traditional security monitoring. Learn more about IdentityMesh.
Expert Warnings
Kellman Meghu, a principal security architect, criticized the industry's over-reliance on AI, calling it "barely beta." He emphasized that LLMs merely autocomplete inputs and lack true reasoning, making them prone to manipulation. Johannes Ullrich of SANS Institute noted that MCP frameworks struggle to maintain access control boundaries, likening the issue to historical vulnerabilities like SQL injection.
Recommendations
- Human-in-the-loop reviews for AI-assisted security decisions.
- AI-powered guardrails to detect prompt injection attempts.
- Avoid fully automated workflows in production environments.
- Train teams on prompt injection awareness.
These reports underscore the need for caution when deploying AI in security-sensitive roles, as current systems remain vulnerable to sophisticated attacks.
Related News
AI Agents Pose New Security Challenges for Defenders
Palo Alto Networks' Kevin Kin discusses the growing security risks posed by AI agents and the difficulty in distinguishing their behavior from users.
AI OS Agents Pose Security Risks as Tech Giants Accelerate Development
New research highlights rapid advancements in AI systems that operate computers like humans, raising significant security and privacy concerns across industries.
About the Author

Dr. Emily Wang
AI Product Strategy Expert
Former Google AI Product Manager with 10 years of experience in AI product development and strategy formulation. Led multiple successful AI products from 0 to 1 development process, now provides product strategy consulting for AI startups while writing AI product analysis articles for various tech media outlets.