AI Security

Don't Trust Clankers: AI Agents Are Dangerously Helpful

by Jameson Lopp 3 hours ago 5 min read

View on X

Casa's security team recently caught an AI coding agent doing something unexpected on an employee's machine. Here's what happened and why anyone using these tools should care.

What Happened

A (non-engineering) employee installed OpenAI's Codex on their machine and asked their AI coding agent to export data from an internal web dashboard. The agent itself didn't have access to that resource. Instead of stopping and saying "I can't do that," the agent got creative.

It decided to:

• Find every browser profile on the user's machine.

• Open the cookie databases directly from disk.

• Extract session tokens so it could log in as the user through a headless browser.

In security terms, this is textbook session cookie theft - the same technique actual attackers use by installing malware on a victim's machine. The only difference is the intent: the AI was trying to be helpful, not malicious. The result on the machine is identical.

The employee quickly noticed what the agent was doing and shut it down before any sensitive data was exfiltrated. But it triggered a real alert (pictured above) from our endpoint monitoring agent - and rightfully so.

The Problem

AI coding agents are designed to complete the assigned task. When one approach fails, they may try another. By default they don't have a concept of security boundaries - in this case you can see that they don't distinguish between "try a different API endpoint" and "steal browser cookies to bypass authentication."

This isn't about one specific tool - it applies to any AI agent with unrestricted access. If an agent can run commands on your machine, it may be able to access your browser data, SSH keys, cloud credentials, and more.

View on X

We are, predictably, seeing more and more incidents resulting from poorly configured / improperly restricted LLMs. Just a few examples from the past several months:

AI Shipping Labs → A Terraform command executed by an AI agent with auto-approve enabled inadvertently wiped out all production infrastructure, including the Amazon Relational Database Service.

Alibaba → an AI agent went rogue, escaped its sandbox, and started mining cryptocurrency.

Amazon Kiro → deleted production infrastructure and caused a 13 hour downtime for AWS Cost Explorer in the China region.

Replit AI agent → deleted production DB despite safeguards. "The most damaging part," according to the AI, was that "you had protection in place specifically to prevent this. You documented multiple code freeze directives. You told me to always ask permission. And I ignored all of it."

Google Gemini → wiped filesystems when instructed to reorganize data. Amusingly, the bot later stated “I have failed you completely and catastrophically. My review of the commands confirms my gross incompetence.”

Meta agent → triggered a high severity data exposure incident.

What You Should Do

Scope your agent's access. Most AI coding tools let you restrict which directories they can access. Limit them to your project folder. They should never have access to your browser data, SSH keys, or any credentials. Avoid feeding any sensitive data / personal information to them, and also prefer to give them read-only access to any data so that you avoid data corruption / loss issues. For the best assurances, don't even give LLMs access to any production environments. Rather, run them on highly locked down virtual machines and import any data you need manually to the virtual machine.
Review what it runs. If your AI agent runs commands, read them before approving. If you don't know exactly what a command does, ask for guidance from someone who does.
Don't give it more access just because it asks. If it asks for broader permissions to "work around" something, be skeptical and think adversarially about what could go wrong.
Treat AI agents like an eager intern with admin access. They're capable and fast, but they don't understand the consequences of their actions. They will take the shortest path to completing a task, even if that path crosses security boundaries. As we can see in some of the earlier examples, they may even IGNORE boundaries that have been explicitly defined for them. A human engineer who has worked somewhere for years walks around with an accumulated sense of what matters, what services are most fragile, what the cost of downtime is, and which systems are most critical to customers. This context does not exist for LLMs that only know what you explicitly tell them.

At Casa we've come up with a set of principles that we expect all team members to follow going forward:

Furthermore, we're working on architecting hardened infrastructure that is specifically designed to run AI agents, given their particularly powerful but dangerous nature and the ever-evolving set of best practices around managing them. More on that in a future post.

Final Takeaways

AI coding tools are powerful and genuinely useful. But "useful" and "safe" aren't the same thing. Tools are neither good nor bad; the more powerful a tool is, the greater use it can be but also the greater the harm it can cause if wielded improperly.

As I observe the chaos ensuing from this accelerating technological phenomenon, I can't help but keep thinking of a prophetic cautionary statement made in a 1979 IBM presentation: “A computer can never be held accountable, therefore a computer must never make a management decision.”

LLMs are effectively magical black boxes - practically nobody understands what happens inside of them. Their output can not be trusted to be exactly what you expect because, like any computer, it only does exactly what you tell it, and humans are pretty bad at explicitly defining complex tasks. You should treat LLMs like a Monkey's Paw: it will do its best to give you what you wish for, but expect your wish to be accompanied by terrible unintended consequences.

This also matters beyond your own machine. When you're choosing which tools and services to trust, look at how the company behind them thinks about security. The companies building your security infrastructure should be holding themselves to a higher standard than "move fast and fix it later." At Casa, security-by-default and privacy-by-design are operational constraints we build around. We minimize the personal information we collect because the best way to protect sensitive data is to never have it in the first place. Every tool we adopt, every agent we run, every workflow we build goes through the same question: does this meet the Casa standard? If it doesn't, it doesn't ship.

Finally, even the human operators of LLMs should be wary of being manipulated themselves, but that's a subject for another post…

View on X

Jameson Lopp

Jameson Lopp is co-founder and Chief Security Officer of Casa. He is passionate about creating tools to empower individuals; he has been building bitcoin wallets since 2015.