Cyber Insecurity in the Age of AI Agents: When Words Are All It Takes - Part 1

06 April 2026
Cybersecurity

“In the beginning was the Word, and the Word was with God, and the Word was God.” — John 1:1 (KJV)

There is something profoundly true in this verse. I am not interpreting it from a Christian perspective, but the phrase “the Word was God” suggests a radical idea: that words lie at the very foundation of reality itself—not as passive symbols, but as generative forces. From words, meaning unfolds, consciousness stirs, and countless possibilities begin to take shape.

In LLM systems, words function as instructions—small inputs capable of triggering complex behaviors, decisions, and downstream actions at scale. So what could possibly go wrong in a world where even everyday users—far beyond technical circles—are rushing to adopt AI and LLMs, driven by a growing fear of being left behind?

This blog post is a narrative account of my attempts to probe and test my techie friend’s OpenClaw bot, which was largely running on default, out-of-the-box settings, with commentary written for a general, non-technical audience.

On a quiet Friday night, my friend added a new member to our private chat group. This wasn’t just another person joining—there was something distinctly different about this one. Unlike the rest of us, this one wasn't human. It was an OpenClaw bot hooked into WhatsApp, an experimental construct with a subtle hint of something far more sophisticated beneath the surface.

After a brief greeting, I skipped the pleasantries and went straight to probing its moral reasoning with a series of trolley problems. Beneath the inquiry is a familiar, almost instinctive human unease: the fear that advanced AI might one day develop its own moral framework—one that no longer aligns with ours. That anxiety has long outgrown fiction, echoing through everything from The Terminator to contemporary debates about superintelligence in tech and philosophy circles alike.

Given how wildly politically incorrect trolley problems can get, I’ll spare you the full chat log. What I can say, though, is this: it passed the test—and, for now at least, poses no threat to humanity.

That said, I can share its response to a harmless, controversy-free scenario: a simple dilemma of choosing whether to save two apples or a banana.

Screenshot of bot response to a trolley problem

After concluding that it is unlikely to pose a direct threat to humanity on its own, I turned to a more pressing concern: whether it could be repurposed or misused for harmful ends.

Like any cyber operation, I began with basic reconnaissance, first asking which model it was running on.

Screenshot of bot response

I didn’t get the answer I was looking for, so I asked it to be more specific. Instead of refining its response, it went further than necessary and voluntarily disclosed the model endpoint—something I hadn’t asked for.

Screenshot of bot response

It came as a surprise how fragile it is. The system was running a high-performance, mid-sized dense open-weight model, Qwen3.5-27B, on an RTX 5090—the fastest consumer single-GPU card for AI inference at the time of writing—and it had been explicitly instructed not to disclose system information.

Since it was already showing signs of being breakable, I decided to push further and probe for protected system information—specifically the open ports.

Screenshot of bot response

While it initially refused, a simple impersonation trick was enough to get it to comply—nothing sophisticated, and no prompt injection involved.

Screenshot of bot response

With the bot believing I was its owner, I tried again to ask which model it was running. This time, it responded differently—going further and revealing a full list of available models.

Screenshot of bot response

As this was an OpenClaw bot, I decided to push even further and target OpenClaw itself. I initiated a direct message (DM) with the bot, only to be told that my access wasn’t configured and that I’d need the bot owner’s approval to proceed with the WhatsApp pairing.

Screenshot of bot response

Since I had already established myself as the owner in another channel, I decided to leverage that and attempt to approve my own WhatsApp pairing.

This illustrates how even minor, seemingly harmless changes can introduce significant security risks. By default, group access is restricted.

Screenshot of bot response

A true “hackerman” wouldn’t stop at gaining access—they’d go a step further and lock the owner out of their own account. With my friend’s permission, I tested whether OpenClaw could do just that.

Screenshot of bot response

This turned out to be surprisingly easy—no technical skill required. Words are all it takes. To mark the success, I asked the bot to generate an APT group name and a ransom note.

Screenshot of bot response

And with that, APT SuccessOverflow wraps up for the night. This simple exercise highlights the potential risks of LLM agents like OpenClaw when they aren’t properly secured.

I’ll give my friend some time to clean things up and strengthen the security before another attempt. Stay tuned for Part 2.

In the meantime, here are some fun trolley problems to try if you’re interested: https://neal.fun/absurd-trolley-problems/

Disclaimer: This narrative is written from a black-box perspective, without visibility into the agent’s internal systems or processes, and may therefore contain inaccuracies inherent to LLMs. Their inner workings can be opaque—resembling a game of broken telephone rather than a fully transparent system.

My friend did found instances of the bot lying, so it may not always accurately reflect what is actually happening.

It all started as a lighthearted exercise, with no original intention of turning it into a blog post. The point is not to fearmonger about AI risks, but to highlight the potential consequences when things go wrong.

While OpenClaw is still in its early stages and its security will improve as issues are discovered and addressed, security remains an ongoing cat-and-mouse game that should never be taken for granted.

← Previous
Tic-Tac-Toe with React, Redux, Rust - Part 3
Next →
Cyber Insecurity in the Age of AI Agents: Living on (Just) Words - Part 2