← All articles
Company

Triage Raises $1.5M to Secure AI-Native Applications

Triage Raises $1.5M
Nikhil SrivastavaNikhil Srivastava
Dec 27, 20254 min read

Triage has raised $1.5M in pre-seed funding to build runtime security infrastructure for teams shipping LLM-powered products. The round was led by BoxGroup, with participation from Precursor Ventures and notable angels including Zach Lloyd (CEO, Warp.dev), Michael Fertik (Verdict Capital), Bill Shope (Tidal Partners), Niklas de la Motte, and Cory Levy (Z Fellows).

This capital accelerates development of Triage's core runtime sentry architecture: a multi-layer behavioral divergence detection system designed to secure AI applications at inference time, where traditional security tooling has no visibility and model-level safety has known failure modes.

Model-Level Safety Is Not a Security Architecture

There is a widespread assumption that model providers have solved AI safety at the model layer. They have not. Alignment techniques like RLHF and instruction tuning produce refusal behaviors that are statistically likely, not architecturally guaranteed. These mechanisms operate as learned preferences, not enforcement boundaries. The distinction matters.

Refusal can be bypassed. Multi-turn conversations that incrementally shift context windows erode safety boundaries that hold under single-turn evaluation. Abstraction-layer reframing, where a request is restructured into a hypothetical, translation, or encoding task, routes around refusal classifiers that pattern-match on surface-level intent. Role injection through system prompt manipulation overrides behavioral constraints the model was fine-tuned to maintain. These are not edge cases. They are reproducible, generalizable techniques that work across frontier models.

The implication for teams building on top of these models is straightforward: you cannot delegate security to the model provider's alignment work. If your product exposes a model to untrusted input, retrieval context, or tool access, you need runtime infrastructure that detects behavioral divergence independent of the model's own safety training.

The Threat Surface Is Broader Than Prompt Injection

The industry conversation around AI security has narrowed prematurely around prompt injection. Prompt injection is real and common, but it is one vector among many. The more sophisticated attacks target the system around the model, not just the model's input buffer.

Supply-chain prompt injection.

When agents ingest configuration files, project rules, retrieved documents, or upstream data, any of those sources can carry adversarial payloads. The model does not distinguish between trusted instructions and injected ones embedded in data it was told to process. The attack surface is not the prompt. It is every data source the model treats as context.

Knowledge distillation and surrogate model extraction.

Production models can be systematically queried to extract decision boundaries, classification logic, and behavioral policies into smaller surrogate models. An attacker does not need access to your weights. They need access to your API and enough structured queries to replicate your model's behavior at high fidelity on the target task. The surrogate then becomes an offline laboratory for developing attacks that transfer back to the production system. This is not theoretical. Surrogate extraction attacks achieve high class agreement with the target model using straightforward query strategies.

Retrieval poisoning.

RAG systems trust their retrieval pipeline to surface relevant, benign context. An attacker who can influence the document corpus, whether through direct injection, SEO-style manipulation of indexed sources, or poisoning shared knowledge bases, controls the context window. The model follows instructions it finds in retrieved documents with the same compliance it shows to system prompts.

Tool-chain exploitation.

Agentic systems that call external tools introduce a full execution layer beneath the model. An agent that can write files, execute code, make API calls, or query databases can be manipulated into performing actions the application developer never intended. The attack does not require the model to “want” to do something harmful. It requires the model to be convinced, through any input channel, that the action is consistent with its instructions.

Chain-of-thought manipulation.

Models that expose or rely on intermediate reasoning are vulnerable to attacks that target the reasoning process itself, steering the model toward conclusions that appear internally consistent but serve the attacker's objective. When chain-of-thought is used for tool selection or action planning, compromising the reasoning compromises the action.

These vectors compound. A retrieval poisoning attack that injects tool-calling instructions into retrieved context, which the model executes through an agent framework, is not a single vulnerability. It is a chain that crosses every layer of the stack.

Triage: Runtime Security at Every Layer of Inference

Triage operates as a multi-layer behavioral divergence detection system that monitors AI applications across four stages of execution.

Input analysis.

Detect prompt injection, adversarial payloads, and context manipulation before they reach the model. This covers direct user inputs, retrieved documents, configuration files, and any upstream data source the model will process as context. Triage identifies adversarial intent embedded in data that the model itself would treat as legitimate instruction.

Tool call monitoring.

Analyze tool invocations in real time for unauthorized actions, unexpected argument patterns, and data exfiltration attempts. When an agent calls a tool it should not call, passes arguments outside expected boundaries, or chains tool calls in sequences that indicate compromise, Triage flags and intervenes before execution completes.

Chain-of-thought inspection.

Monitor the model's reasoning trajectory for behavioral divergence. Compromised models frequently reveal manipulation in their reasoning before it surfaces in outputs. A model that has been steered by injected context will produce reasoning that deviates from its established behavioral baseline. Triage detects that deviation.

Output enforcement.

Apply dynamic guardrails to model responses, blocking data leakage, policy violations, and outputs that diverge from expected behavioral patterns. Unlike static output filters, Triage's enforcement adapts based on the full execution context: what the model was asked, what it retrieved, what tools it called, and how its reasoning evolved.

Each layer feeds into the next. The system builds behavioral baselines for a given application and identifies deviations across the full execution path, catching compound attacks that no single detection surface would flag in isolation.

Research-Driven Detection

Triage's detection capabilities are not built from taxonomies. They are built from internal adversarial research against production AI systems. The team conducts ongoing red team operations that identify real, exploitable attack chains in deployed products: supply-chain prompt injection through project configuration files, system prompt extraction via retrieval and plugin poisoning, data exfiltration through formula injection in agent-generated outputs, surrogate model extraction through structured query campaigns, and multi-session manipulation of safety boundaries in frontier models.

This research feeds directly into the detection pipeline. Every novel attack technique discovered internally becomes a detection signature, a behavioral heuristic, or a baseline anomaly that the runtime sentry can identify in customer deployments. The gap between offensive research and defensive product is measured in days, not quarters.

What Comes Next

The new funding expands:

  • The runtime sentry architecture across more providers and agent frameworks
  • Automated behavioral baseline generation and divergence detection at scale
  • Adversarial evaluation tooling that lets teams stress-test their own systems against the attack chains Triage discovers internally
  • Early customer deployments demonstrating measurable reductions in exploitable AI attack surface

Work With Triage

Triage partners with teams building LLM-powered products that need runtime security infrastructure, not another dashboard. If your product puts a model in front of untrusted inputs, retrieval context, or tool access, and you need to know when that model's behavior diverges from what you intended, we should talk.

For pilots, partnerships, or roles, the fastest path is a direct introduction through the site's contact channel.

Ready to secure your AI systems?

Get in touch to learn how Triage can help your team ship secure AI products faster.

Contact Us