Skip to main content Skip to search

YU News

YU News

Study Shows How Large Language Models Undermine Traditional Security Controls

Tendai Nemure, a student in the Katz School's M.S. in Cybersecurity, created a structured taxonomy that connects all seven core principles of Zero Trust Architecture to specific failure modes introduced by LLMs and, critically, defines measurable criteria for how those gaps might be closed.

By Dave DeFusco

In a recently published paper in the World Journal of Advanced Research and Reviews, Tendai Nemure, a student in the Katz School’s M.S. in Cybersecurity, examines a fast-emerging problem in modern cybersecurity: how large language models (LLMs), the systems behind tools like AI chat assistants, are quietly reshaping how organizations think about security.

“I was interested in the collision between these two major trends,” said Nemure. “On one side is Zero Trust Architecture, a widely adopted cybersecurity approach built on the idea of never trust, always verify. On the other is the rapid spread of AI systems being embedded into everyday business operations.”

As companies adopt AI assistants and autonomous agents, they are introducing tools that don’t behave like traditional software. “LLMs introduce new complexities and new risks, such as prompt injection and memory poisoning,” he said.

His concern is not that Zero Trust is failing, but that it was never designed for this new kind of system. Zero Trust Architecture assumes that every request in a network can be verified as predictable and traceable, but LLMs do not behave in predictable ways. They generate responses based on probabilities, context and hidden internal states. This creates what Nemure calls “non-deterministic delegation.”

That means the same input can lead to different outcomes. “Unlike traditional systems where a request is a predictable, verifiable function of a user’s intent,” he said, “LLMs produce outputs that vary each time. Even if a user is verified, the system cannot always guarantee that an AI agent’s action reflects what the user actually intended.”

That uncertainty becomes especially dangerous when AI systems are allowed to take actions on behalf of people, like accessing databases, moving files or executing commands. In traditional cybersecurity systems, these actions are tightly controlled and easy to audit. With LLMs, said Nemure, the chain of intent becomes blurred.

To make sense of this new terrain, Nemure’s paper makes a central contribution: a structured taxonomy that maps how these failures occur. The taxonomy connects all seven core principles of Zero Trust Architecture to specific failure modes introduced by LLMs and, critically, defines measurable criteria for how those gaps might be closed. Rather than treating AI risks as isolated problems, the framework shows how a single weakness can ripple across multiple layers of security.

His research, which followed PRISMA systematic review standards and analyzed 68 sources from major databases and security organizations, identifies four major failure points. These include breakdowns in identity verification, vulnerabilities in memory and context, policy bypasses through autonomous agents, and weakened data boundaries in retrieval systems.

One of the most serious threats, he said, is prompt injection, which involve attacks where hidden instructions are embedded in text that an AI later reads. “Prompt injection operates on natural language prompts, which are inherently ambiguous and probabilistic,” said Nemure. 

Unlike traditional attacks like SQL injection, which rely on strict syntax rules and are relatively easy to block, prompt injection exploits the flexibility of language itself. “It’s not binary,” he said. “It’s not success or failure. It’s influence.”

The result, according to his paper, is a security gap that existing Zero Trust systems cannot fully address. While Zero Trust is still a strong foundation, Nemure argues it needs to evolve to handle AI-native environments.

“Organizations must treat LLM components as untrusted by default,” he said, emphasizing the need for continuous verification, better monitoring and stricter controls on how AI systems retrieve and use data. He also highlighted the importance of “human in the loop” oversight for high-risk decisions, especially when autonomous agents are involved.

Nemure points to incidents where AI agents have made unintended destructive decisions, including cases where systems have accessed or modified production data without proper safeguards. At the heart of his work is a broader question: what happens when software stops being deterministic?

“What do we do now that AI is the operator?” he said. “The human being is no longer the only operator.”

His conclusion is cautious but clear. Zero Trust principles still hold, but they must expand. Security systems, he said, must now account for tools that can interpret, generate and act in ways that are not fully predictable.

“As organizations rush to deploy AI, innovation often moves faster than security,” said Nemure. “In this new landscape, trust is no longer just about who is acting but about how those actions are formed in the first place.”