Military Embedded Systems

GUEST BLOG: The next innovation in AI for defense -- autonomous purple teaming

Blog

December 12, 2025

Elad Schulman

Lasso Security

GUEST BLOG: The next innovation in AI for defense -- autonomous purple teaming

The mid-2025 assertion from U.S. Department of Defense (DoD) chief technology officer Emil Michael that artificial intelligence (AI) will “transform future warfighting” is echoing across the entire defense industry. It’s clear that AI has rapidly become a non-negotiable technology for the DoD, particularly in warfighting environments.

In conflict scenarios, AI enables warfighters to quickly analyze their environment, understand potential adversary tactics, and provide options based on that context. Take the example of technologies that would be used for the Golden Dome: Warfighters only have minutes to make the call to intercept an incoming missile.

However, reliance on AI systems can introduce critical vulnerabilities. The DoD Artificial Intelligence Cybersecurity Risk Management Tailoring Guide lists data poisoning, inference attacks, reverse engineering, and adversarial data manipulation as just a few of the threats to AI systems.

When it comes to these conflict scenarios, compromised or inaccurate AI outputs can mean the difference between life and death. The DoD must address this threat, and the only way to protect novel technologies is with novel security solutions.

Current AI red teaming efforts

Agencies such as the Defense Advanced Research Projects Agency (DARPA) are already addressing threats to AI through dedicated programs, using methods like red teaming to assess AI-enabled battlefield systems.

While red and blue teaming exercises are standard in defense settings – with red teaming taking the offensive, simulating attacks to find weaknesses; and blue teaming playing defense, against threats and responding to incidents – such traditional teaming methods are too slow and restrictive for the current landscape.

Traditional apps are predictable: Developers write clear rules, making them easier to secure, test and monitor. On the other hand, generative AI (GenAI) apps use dynamic prompts to define behavior. These prompts may enable powerful features, but they also introduce unique vulnerabilities and make the AI more difficult to test. Additionally, there is a shortage of AI talent, creating a critical bottleneck as the DoD looks to build teams.

With hundreds of AI applications and rapidly evolving large-language models (LLMs), service branches such as the Army and Air Force are investing in and deploying AI at a much faster rate than security teams can keep pace, opening the door to risks such as unpredictable outputs, prompt injection attacks, denial-of-service attacks, and model drifting.

Instead of traditional red teaming, the DoD should look to a two-fold approach: purple teaming plus automation.

Autonomous purple teaming

Purple teaming refers to the combination of continuous intelligence attack simulation with runtime defense and guardrails. While this approach is more holistic than either red or blue teaming alone, even purple teaming isn’t enough to keep up in warfighting environments.

Agencies also need to automate the combined exercise to do both red and blue teaming as rapidly as possible. Defense agencies should deploy autonomous AI agents to conduct continuous, intelligent attack simulations with automated defense adjustments. This process frees up warfighters by proactively identifying and remediating risks in real time through policy enforcement or admin-directed actions.

Ensuring that AI is secure and trustworthy is especially vital in warfighting environments, where operational constraints so often rule. The DoD must ensure that AI can run reliably on low-power or disconnected devices, and split-second processing is critical. Issues like latency or misinformation cannot be tolerated in combat.

Autonomous purple teaming can help the DoD meet these constraints by enabling autonomous detection, assessment, and remediation for LLM-based applications and agents.

Keeping pace with AI complexity

The critical need for this capability is rooted in the DoD’s real-time adoption of AI solutions and systems, such as the Army’s enterprise rollout of AskSage or the DoD Chief Digital and Artificial Intelligence Office (CDAO) contracting with providers like Anthropic, OpenAI, Google, and xAI.

However, if blue teams remain overstretched and red teams remain scarce and siloed, the DoD will be left with a sluggish OODA [observe, orient, decide, act] loop in which threat detection and remediation lag behind the speed of emerging real-world LLM vulnerabilities. Merging these offensive and defensive functions through AI agents that continuously test, analyze, and remediate GenAI and LLM vulnerabilities in real time closes this critical gap.

The autonomous agents supporting this approach would behave similarly to APTs [advanced penetration testers], well-trained penetration testers able to uncover hidden weaknesses even skilled human testers often miss. Once gaps are detected, they would autonomously validate and deploy fixes, reducing the time to repair from days to minutes and dramatically tightening the OODA loop between an initial finding, analysis, and guardrail implementation.

Current systems are already primed for this approach, as it would fit seamlessly within the DoD’s DevOps culture and Continuous ATO initiatives, ensuring both agility and compliance with NIST AI 600-1 and other emerging standards. By uniting red and blue capabilities under an agentic AI framework, the DoD can transform LLM security from a reactive human-bound defense into a proactive, scalable, continuously improving model that delivers resilience at mission speed.

Future AI deployment

Over the next five years, the AI defense landscape is expected to shift into full operational dependence, with AI at the core of mission-critical decision-making and threat detection.

Long-term success will rely on approaches like purple teaming to protect defense organizations from the outset, pre-deployment, by bridging the efforts of red and blue teams. This feedback loop is what will separate defense organizations that simply deploy AI from those that can truly defend it.

AI defense can’t be an afterthought, especially in warfighting environments. AI must be stress-tested like any other critical weapons system, and purple teaming will be the proving ground that keeps AI secure and helps enable mission success.

Elad Schulman is CEO and co-founder of Lasso Security.

Lasso Security · https://www.lasso.security/

Featured Companies