Securing AI Agents : A Framework to Maximize ROI, Minimize Risk

In the evolving landscape of AI in cybersecurity, AI agents present both opportunities and unique challenges. It's crucial to understand the gaps in your cybersecurity program to effectively protect these advanced systems. This post will explore AI agent threats, relevant mitigations, and strategies to focus your AI security efforts.

What are AI Agents?

AI agents are advanced systems that utilize artificial intelligence to perform actions or make decisions, often with some degree of autonomy. They are designed to perceive their environment, process information, and take actions to achieve specific goals.

Understanding AI Agent Threat Modeling

The perspectives on AI risk management and the threat landscape for AI agents vary widely. Some believe existing cybersecurity programs are sufficient, while others express alarm about the inadequacy of current controls in an AI-driven world.

To highlight this disparity, we will compare the AI agent threat modeling framework MAESTRO from the Cloud Security Alliance to a modified version of STRIDE.

We want to be clear that the point of this article is NOT to say MAESTRO is not needed. We acknowledge there are gaps with STRIDE which is why we have to modify it for an AI agent use case. We are using the widely-adopted STRIDE framework to illustrate how much of a gap MAESTRO is actually filling. Use what works best for your needs.

What is MAESTRO?

MAESTRO (Multi-Agentic System Threat Model) is a 7-layer reference architecture for agentic AI developed by the Cloud Security Alliance. This framework is designed to provide a structured approach to threat modeling for AI agents, addressing the limitations of traditional methods like STRIDE.

MAESTRO Threat Modeling Framework Specifically for AI Agents from Cloud Security Alliance

Why Does MAESTRO Exist?

Current threat modeling frameworks, like STRIDE, are not well-suited for AI agents. STRIDE, while a good starting point, doesn't fully address the unique challenges posed by AI agents, such as adversarial attacks and risks associated with unpredictable learning and decision-making. MAESTRO provides a more tailored approach to threat modeling in this context.

A Modified STRIDE + ML

To better understand the gaps that AI creates in a cybersecurity program, let's modify STRIDE so it can handle the unique challenges of AI agents. This is done by incorporating two new threat categories "Misunderstanding" and "Lack of Accountability" (ML), which can be employed.

Misunderstanding: This refers to models having undesirable assessments due to a lack of context or malicious intervention, leading to unexpected emerging behaviors.
Lack of Accountability: This occurs when actions are performed without clear governance or ownership, making it difficult to determine responsibility when issues arise.

‍

Applying STRIDE + ML to AI Agents

AI risk frameworks such as the OWASP Multi Agentic Threat Modeling Guide and the Cloud Security Alliance’ MAESTRO framework can be mapped into STRIDE + ML to provide a clearer view of AI agent threats. This mapping reveals that a significant portion of AI agent threats can be categorized using traditional STRIDE, but a notable percentage require the additional ML categories.

Mapping of industry defined threats into STRIDE + ML

AI Agent Threats and Mitigations

AI agent threats can be categorized, and mitigations can be mapped to these categories. Some threats can be mitigated using existing cybersecurity measures, while others require extending capabilities or implementing new mitigations.

We use the OWASP AI Agent threat taxonomy as it is more concise compared to the Cloud Security Alliance taxonomy. Almost all of the threats in the Cloud Security Alliance threat taxonomy can be categorized into the OWASP taxonomy.

Understand the Gaps in your Cybersecurity Program

Threats with Existing Mitigations:
- Spoofing (T9 – Identity Spoofing)
- Repudiation (T8 – Repudiation and Untraceability)
- Information Disclosure (T12 – Agent Communication Poisoning)
- Denial of Service (T4 – Resource Overload)
- Elevation of Privilege (T3 – Privilege Compromise)
Threats Requiring Expanded Mitigations: This category includes
- Spoofing (T13 – Rogue Agents)
- Tampering (T11 - Unexpected RCE and Code Attacks, T1 – Memory Poisoning)
- Denial of Service (T10 – Overwhelming HITL)
- Elevation of Privilege (T14 – Human attacks on MAS)
Threats Requiring New Mitigations: These are unique to AI agents
- Misunderstanding (T2 – Tool Misuse, T5 - Cascading Hallucinations, T6 – Intent Breaking & Goal Manipulation)
- Lack of Accountability (T7 – Misaligned & Deceptive Behaviour, T15 – Human Trust Manipulation)

Mitigation Options and AI-Specific Considerations

The Cloud Security Alliance provides a good list of mitigations to focus on. Many of the mitigations should be part of any robust cybersecurity program regardless if AI is used or not. Below are the additional mitigations required specifically for AI systems.

Agent Safety Mitigations

Agentic AI Safety Mitigations
Mitigation	Description	Example
Adversarial Training	Train agents to be robust against adversarial examples.	During the model training process, add examples of prompts trying to get toxic responses about ageism. These should be labeled so the model knows that this type of prompt should not be answered.
Formal Verification	Use formal methods to verify agent behavior and ensure goal alignment.	Given the intent of an agent is only to provide information and analysis about a customer's bank account, regularly audit that the agent is not attempting unexpected activity like transferring funds.
Explainable AI (XAI)	Improve transparency in agent decision-making to facilitate auditing.	Be able to explain why an insurance claim agent denied a specific customer's claim.
Red Teaming	Simulate attacks to identify vulnerabilities.	Research the latest prompt injection techniques and test whether they are successful on your system.
Safety Monitoring	Implement runtime monitoring to detect unsafe agent behaviors.	With a platform independent of the agent, verify that incoming prompts are not jailbreaking attempts or efforts to make the agent perform unethical actions such as illegal or discriminatory behavior.

Conclusion

Focusing Your Efforts

The AI agent space is not as unique as many portray it to be, but we also cannot pretend that our existing cybersecurity control strategy is sufficient.

‍

To effectively improve the ROI of your AI agent security efforts, focus on:

First look at your current capabilities to see how you can address ~⅔ of the AI threat space.
Look at the market for the emergent ~⅓ of AI threats, as these mitigations are being built now, so unlikely to exist with your current capabilities.

‍

By understanding the nuances of AI agent threats and applying targeted mitigations, organizations can better protect these systems and maximize their return on investment in AI technologies.

‍

See how Aiceberg provides solutions for 3 of the 5 AI specific mitigations called out by CSA:

Formal Verification
Explainable AI (XAI)
Safety Monitoring

‍

Book your demo today!

‍

See Aiceberg In Action

Book My Demo

Todd Vollmer

SVP, Worldwide Sales