How It Works
Criteria you define are injected into the agent’s system prompt so the agent actively tries to satisfy them. After every run, the Cotool Eval Harness independently grades each criterion as met or not met and explains why. Failed criteria can optionally trigger notifications to Slack, webhooks, or PagerDuty.
Adding Acceptance Criteria
- Navigate to your agent’s details page
- Click Edit
- Scroll to the Acceptance Criteria section
- Add each criterion as a clear, evaluable statement
- Click Save Changes
Good vs. Poor Criteria
| Good | Poor |
|---|---|
| "Check EDR telemetry before closing an alert as false positive" | "Investigate alerts properly" |
| "Never disable a detection rule without documenting justification" | "Be careful with rule changes" |
| "Correlate across at least two log sources before determining scope" | "Use tools correctly" |
| "Escalate to Tier 2 if lateral movement indicators are present" | "Don’t miss threats” |
Viewing Results
After each run, acceptance criteria results appear in two places:
- Eval score popover — click the evaluation score badge on any run to see which criteria passed or failed, along with the judge’s explanation for each
- Run issues warning — a warning icon appears next to runs that have failed criteria, combined with any critical issues the judge identified
Notifications
Acceptance criteria failures can trigger notifications through the same output destinations used for agent outputs (Slack, webhooks, PagerDuty). Notification configuration is independent from regular output delivery — you can enable one without the other.
Setting Up Notifications
- Click Edit on the agent details page
- In the Acceptance Criteria section, toggle Notify on failure
- Select one or more destinations from the dropdown (or create a new one)
- Click Save Changes
Destinations are shared across your organization. A Slack channel or webhook created for output delivery can also be used for acceptance criteria notifications.
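To give a feel for how a webhook destination might be consumed, here is a minimal sketch of parsing a failure notification and pulling out the failed criteria. The payload shape, field names, and values below are illustrative assumptions, not a documented schema — inspect the payload your webhook actually receives before depending on any field.

```python
import json

# Hypothetical payload for an acceptance-criteria failure notification.
# All field names here are assumptions for illustration only.
EXAMPLE_PAYLOAD = json.loads("""
{
  "agent": "phishing-triage",
  "run_url": "https://example.invalid/runs/123",
  "timestamp": "2024-01-01T00:00:00Z",
  "criteria": [
    {"text": "Check EDR telemetry before closing an alert as false positive",
     "met": false,
     "explanation": "The agent closed the alert without querying EDR."},
    {"text": "Escalate to Tier 2 if lateral movement indicators are present",
     "met": true,
     "explanation": "No lateral movement indicators were found."}
  ]
}
""")

def failed_criteria(payload):
    """Return (criterion text, judge explanation) pairs for every failed criterion."""
    return [(c["text"], c["explanation"])
            for c in payload.get("criteria", []) if not c.get("met", True)]

for text, why in failed_criteria(EXAMPLE_PAYLOAD):
    print(f"FAILED: {text}\n  judge: {why}")
```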
Notification Format
Slack messages include the agent name (linked to the run), a timestamp, a pass/fail summary, and each failed criterion with the judge’s explanation. Webhooks receive a JSON payload with the failure details.
Best Practices
Start with 2–3 criteria
Begin with the most important conditions and expand over time. Too many criteria can dilute signal and slow down the feedback loop.
Write falsifiable statements
Each criterion should be something the judge can clearly confirm or deny from the run transcript. “Always include a confidence score” is falsifiable; “produce good output” is not.
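To make the distinction concrete: a falsifiable criterion can, in principle, be confirmed or denied from the transcript by a yes/no check. The toy sketch below is purely illustrative (the harness actually uses an LLM judge, and the transcript format shown is an assumption):

```python
import re

def includes_confidence_score(transcript: str) -> bool:
    """Toy check: does the transcript mention 'confidence: <number>' anywhere?"""
    return re.search(r"confidence:\s*\d+(\.\d+)?", transcript, re.IGNORECASE) is not None

# "Always include a confidence score" yields a clear yes or no:
assert includes_confidence_score("Verdict: benign. Confidence: 0.92")
assert not includes_confidence_score("Verdict: benign. Looks fine to me.")
# By contrast, "produce good output" admits no such yes/no check.
```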
Use criteria to catch regressions
If a prompt change fixes a recurring issue, add a criterion that encodes the fix (e.g. “Never close an alert without checking log ingestion status”). This prevents the behavior from regressing silently.
Pair with suggested improvements
When criteria consistently fail, check the Suggested Improvements panel. The system may already have a prompt fix ready for you.
Limits
- Maximum 20 criteria per agent
- Each criterion can be up to 500 characters
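If you manage criteria programmatically, these limits can be checked before saving. A minimal sketch (the function name and messages are my own; only the two numeric limits come from the list above):

```python
MAX_CRITERIA = 20           # maximum criteria per agent
MAX_CRITERION_LENGTH = 500  # maximum characters per criterion

def validate_criteria(criteria: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the criteria fit the limits."""
    problems = []
    if len(criteria) > MAX_CRITERIA:
        problems.append(f"too many criteria: {len(criteria)} > {MAX_CRITERIA}")
    for i, text in enumerate(criteria):
        if len(text) > MAX_CRITERION_LENGTH:
            problems.append(f"criterion {i + 1} is {len(text)} characters "
                            f"(limit {MAX_CRITERION_LENGTH})")
    return problems

assert validate_criteria(["Never disable a detection rule without documenting justification"]) == []
assert validate_criteria(["x" * 501])[0].startswith("criterion 1")
```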