Case study: Tachi Security, built on AOD

Recent merged features

F-282 pre-commit secret-scanning defaults. Default-secure gitleaks pre-commit hook shipped through the standard Triad workflow; adopter opt-in via pre-commit install; BLP-02 Wave 5 governance lineage with full ADR cross-references. PR #283
F-277 Claude permissions baseline. Replaces a 26-rule allow-only permissions config with a four-category curated baseline (read-only, local-state, destructive deny/ask, network host-allowlist). ADR-041 accepted; CLAUDE_PERMISSIONS.md operator handbook landed. PR #278
F-272 SECURITY.md and private disclosure channel. GitHub-canonical SECURITY.md sections with 5-business-day acknowledgment SLA; private vulnerability reporting enabled so the Security tab "Report a vulnerability" button surfaces. PR #273
F-241 Web/API coverage attestation (Tier 3). Adds Web/API coverage attestation across OWASP Web Top 10 and API Top 10 frameworks; populator wiring for F-A3 source-attribution schema extension; Tier 3 coverage bundle. PR #242

How AOD built one Tachi feature

Idea

Close three pattern-catalog gaps in the output-integrity agent and add targeted cross-links into the tool-abuse and data-poisoning agents. This is a community-surfaced refinement from discussion #179, where an external contributor identified four sink classes (render, query, execution, agent-tool) and proposed mapping them against the agent's shipped Cat 1 through Cat 5 coverage. Three gaps surfaced: (1) vector-filter and search-DSL injection on multi-tenant RAG, which neither Cat 2 SQL nor Cat 4 templates names explicitly; (2) package-manager and CI-workflow execution sinks such as npm install or GitHub Actions YAML, which are missing from the Cat 2 keyword list; and (3) the cross-agent handoff boundary, where the same LLM output can flow into a tool argument or a durable memory write, handled today across three agents with no unified framing. ICE score: Impact 8 + Confidence 7 + Effort 7 = 22. Priority: P1. Goal: one feature branch, one to two days, no schema bump.

See Issue #292 on GitHub

Discover

Gap analysis on 2026-05-14 mapped the contributor's four sink classes against the F-1 output-integrity catalog shipped in PR #202. Render sinks are covered by Cat 1 Client-Side Execution Sinks at full fidelity. Query sinks are partially covered (Cat 2 SQL plus Cat 4 templates) but vector filters and search DSLs are not explicitly named, so multi-tenant RAG misses the signal. Execution sinks are partially covered (Cat 2 shell plus Cat 5 path traversal) but package managers and CI workflows are not in the trigger-keyword list. Agent and tool sinks are scattered across three agents (tool-abuse for param injection, data-poisoning for memory writes, prompt-injection for next-turn) with no unified handoff-boundary framing. Three precedents anchor the work: ADR-030 D-2 establishes the Heuristic A enrichment pattern, F-260 establishes the comment-first-give-choice community-merge playbook with PR #262 as the canonical reference, and BLP-01 lineage shows six prior single-agent refinements at the same scope.
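The gap analysis above can be read as a small coverage table. The sketch below is only an illustration of that table: the SinkCoverage type and the open_gaps helper are hypothetical names, not part of Tachi, and the values are transcribed from the paragraph above.

```python
from typing import NamedTuple, Optional, Tuple


class SinkCoverage(NamedTuple):
    """One row of the gap analysis (hypothetical structure, not a Tachi type)."""
    sink_class: str
    covered_by: Tuple[str, ...]
    gap: Optional[str]  # None means covered at full fidelity


# Values transcribed from the gap analysis text.
COVERAGE = (
    SinkCoverage("render", ("Cat 1 Client-Side Execution Sinks",), None),
    SinkCoverage("query", ("Cat 2 SQL", "Cat 4 templates"),
                 "vector filters and search DSLs not explicitly named"),
    SinkCoverage("execution", ("Cat 2 shell", "Cat 5 path traversal"),
                 "package managers and CI workflows not in the trigger-keyword list"),
    SinkCoverage("agent-tool", ("tool-abuse", "data-poisoning", "prompt-injection"),
                 "no unified handoff-boundary framing"),
)


def open_gaps(coverage):
    """Return the sink classes that still have a recorded gap, in input order."""
    return [row.sink_class for row in coverage if row.gap is not None]


print(open_gaps(COVERAGE))  # ['query', 'execution', 'agent-tool']
```

Three of the four sink classes carry a gap, which lines up with the three gaps named in the Idea stage.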

See Issue #292 detail

Spec

User Story 1 (P1): a security analyst running tachi.threat-model against a multi-tenant RAG application using Qdrant or Pinecone wants the output-integrity agent to flag when an LLM-synthesized metadata filter could omit the tenant-scoping clause. Today the agent emits no finding on that surface, leaving the analyst with no signal about what is functionally a vector-DB equivalent of SQL injection across tenant boundaries. Acceptance: at least one finding emits with category referencing vector-filter injection, CWE.primary set to CWE-943 (Improper Neutralization of Special Elements in Data Query Logic), OWASP.primary referencing LLM08:2025 or LLM05:2025, and a mitigation naming at least one of pre-retrieval filtering, base filter that cannot be overridden, namespace-per-tenant, or allowlisted clause keys. Independent test: run tachi.threat-model against an architecture description containing an LLM Process emitting a Pinecone or Qdrant metadata filter into a multi-tenant query interface.
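The acceptance criteria above could be checked mechanically along these lines. This is a minimal sketch that assumes a finding is a plain dictionary with category, cwe, owasp, and mitigation fields; that shape and the function name are hypothetical, and the real Tachi finding schema may differ.

```python
# Accepted values are taken from the User Story 1 acceptance text.
ACCEPTED_OWASP = ("LLM08:2025", "LLM05:2025")
ACCEPTED_MITIGATIONS = (
    "pre-retrieval filtering",
    "base filter that cannot be overridden",
    "namespace-per-tenant",
    "allowlisted clause keys",
)


def meets_us1_acceptance(finding: dict) -> bool:
    """Check one finding (hypothetical dict shape) against the US1 criteria."""
    category = finding.get("category", "").lower()
    cwe_primary = finding.get("cwe", {}).get("primary", "")
    owasp_primary = finding.get("owasp", {}).get("primary", "")
    mitigation = finding.get("mitigation", "").lower()
    return (
        "vector-filter injection" in category
        and cwe_primary == "CWE-943"
        and any(code in owasp_primary for code in ACCEPTED_OWASP)
        and any(name in mitigation for name in ACCEPTED_MITIGATIONS)
    )


example = {
    "category": "Vector-filter injection (Cat 6)",
    "cwe": {"primary": "CWE-943"},
    "owasp": {"primary": "LLM08:2025"},
    "mitigation": "Apply a base filter that cannot be overridden.",
}
print(meets_us1_acceptance(example))  # True
```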

See spec.md at SHA f62a1b4

Plan

This is a Heuristic A enrichment branch at single-agent scope. The implementation appends new pattern content to the output-integrity detection-patterns reference, adds ten lines or fewer of navigational cross-link prose to the agent file Purpose section, optionally introduces one new example baseline at examples/multi-tenant-rag-app/, authors ADR-045 documenting the architectural decisions, and ships a CHANGELOG entry preserving the contributor's authorship per the F-260 community-merge precedent. Technical approach: additive-only markdown edits, no schema changes, no new agent files. The agent's existing both-signal detection logic (trigger-keyword AND downstream-sink-indicator) is reused, so new pattern surfaces inherit the enforcement. Byte-identical regression protection holds on 5 non-qualifying baselines under SOURCE_DATE_EPOCH=1700000000. ADR-045 follows the ADR-032 seven-decision structure with Proposed to Accepted dual-commit governance.
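The both-signal rule mentioned in the plan (trigger-keyword AND downstream-sink-indicator) can be pictured as a simple conjunction. The sketch below is an assumption about the shape of that logic, not the agent's actual implementation; the function name and the sink-indicator strings are hypothetical.

```python
def both_signal_match(text, trigger_keywords, sink_indicators):
    """Return True only when a trigger keyword AND a sink indicator both appear.

    Plain case-insensitive substring matching is an assumption made for
    illustration; the agent's real matching rules are not described here.
    """
    lowered = text.lower()
    has_trigger = any(keyword.lower() in lowered for keyword in trigger_keywords)
    has_sink = any(indicator.lower() in lowered for indicator in sink_indicators)
    return has_trigger and has_sink


TRIGGERS = ("qdrant", "pinecone", "metadata filter")  # subset of the Cat 6 keywords
SINKS = ("query interface",)                          # hypothetical sink indicator

description = "LLM Process emits a Pinecone metadata filter into a multi-tenant query interface"
print(both_signal_match(description, TRIGGERS, SINKS))  # True
print(both_signal_match("Team evaluated Pinecone pricing", TRIGGERS, SINKS))  # False
```

Because new pattern categories only add keywords, a rule of this shape would apply to them without any change to the enforcement itself, which is the reuse the plan relies on.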

See plan.md at SHA f62a1b4

Tasks

Triple sign-off on the task plan.

PM: tasks.md faithfully maps all 5 user stories (US1 to US5) to Phases 3 through 7 at the correct priorities (P1, P1, P2, P3, P3); all 16 FRs, 14 SCs, and the conditional SC-015 are covered; F-260 precedent fidelity is high.

Architect: tasks.md is technically sound, with 10 of 10 evaluation criteria passing; the ADR Proposed to Accepted dual-commit governance in T006, T028, and T031 is correctly codified per ADR-027 lineage; the T007 then T013 dependency and the T017 and T018 byte-identity gates are correctly scoped (the OI-scoped vs whole-pipeline split avoids the F-248 over-scoped trap); the F-A2 source_attribution populator contract is preserved end-to-end.

Team-Lead: independent calendar verification (cal 5 2026 + date -j) confirms the weekday-anchored cadence: Day 0 Thu 2026-05-14, Day 1 Fri 2026-05-15, Day 2 Mon 2026-05-18, Day 3 Tue 2026-05-19, Buffer-1 Wed 2026-05-20, Buffer-2 Thu 2026-05-21. The critical path matches the expected sequence (T006 → T007 → T013 → T017 → T019 → squash-merge → T020 → T022 → T031).
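The calendar check above was done with cal and date; the same weekday verification can be reproduced with the Python standard library. This is an equivalent sketch, not the command the Team-Lead ran, and the SCHEDULE and mismatches names are hypothetical.

```python
from datetime import date

# Weekday names indexed by date.weekday() (Monday is 0), so the check
# does not depend on the system locale.
WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Cadence transcribed from the Team-Lead sign-off.
SCHEDULE = {
    "Day 0": ("Thu", date(2026, 5, 14)),
    "Day 1": ("Fri", date(2026, 5, 15)),
    "Day 2": ("Mon", date(2026, 5, 18)),
    "Day 3": ("Tue", date(2026, 5, 19)),
    "Buffer-1": ("Wed", date(2026, 5, 20)),
    "Buffer-2": ("Thu", date(2026, 5, 21)),
}


def mismatches(schedule):
    """Return the labels whose claimed weekday disagrees with the calendar."""
    return [
        label
        for label, (claimed, day) in schedule.items()
        if WEEKDAYS[day.weekday()] != claimed
    ]


print(mismatches(SCHEDULE))  # []
```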

See tasks.md at SHA f62a1b4

Build

Phase 3 US1, Multi-Tenant RAG Tenant-Scoping Signal. T007 [US1] appends Cat 6 (Vector / Search-DSL Injection) to the output-integrity detection-patterns reference after the existing Cat 5 anchor. Content includes intro paragraph, primary OWASP citations (LLM08:2025 primary, LLM05:2025 cross-anchor), CWE citations (CWE-943 primary, CWE-89 plus CWE-94 related), and trigger keywords: qdrant, pinecone, metadata filter, must_not, must, tenant_id, namespace, embedding query, hybrid search, elasticsearch DSL, vector index, RAG retrieval filter. Approximately 50 lines. T008 [US1] adds the Cat 6 worked example: OI-{N} Multi-tenant RAG metadata filter omits tenant_id clause. References Pinecone metadata filter as the failure mode; includes four mitigation alternatives (pre-retrieval filtering, base filter that cannot be overridden, namespace-per-tenant Silo model, allowlisted clause keys) with a defense-in-depth note. Surfaces the Silo-vs-Pool trade-off per web research recommendation.
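Two of the mitigations named in the worked example, a base filter that cannot be overridden and allowlisted clause keys, can be combined in a few lines. The sketch below assumes the MongoDB-style operator syntax ($and, $eq) that Pinecone documents for metadata filters; the function name, the allowlist contents, and the flat top-level filter shape are hypothetical and are not taken from the Tachi example baseline.

```python
# Hypothetical allowlist: the only metadata keys an LLM-synthesized filter may use.
ALLOWED_KEYS = frozenset({"doc_type", "language"})


def scoped_filter(tenant_id: str, llm_filter: dict) -> dict:
    """Wrap an LLM-synthesized filter in a tenant clause it cannot override.

    Any top-level key outside the allowlist is rejected, which covers both
    tenant_id itself and logical operators such as $or.
    """
    unexpected = sorted(set(llm_filter) - ALLOWED_KEYS)
    if unexpected:
        raise ValueError(f"filter keys not allowlisted: {unexpected}")
    base = {"tenant_id": {"$eq": tenant_id}}
    if not llm_filter:
        return base
    # The tenant clause is always ANDed in by trusted code, never by the LLM.
    return {"$and": [base, llm_filter]}


print(scoped_filter("tenant-a", {"doc_type": {"$eq": "invoice"}}))
```

A filter that tries to set tenant_id itself, for example {"tenant_id": {"$eq": "tenant-b"}}, raises ValueError instead of reaching the vector store. As the defense-in-depth note suggests, this kind of wrapper complements namespace-per-tenant isolation and does not replace it.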

See tasks.md (Phase 3 US1)

Deliver

feat(292): output-integrity cross-sink refinement. Merged 2026-05-14 at 17:02 UTC, squash SHA 0629fa2, 25 files changed, +3411 / -9. Released as v4.36.0 (release-please PR #294). Verification: T029 security re-scan PASSED (SAST and SCA both SKIPPED on the docs-only surface); T018 5-baseline byte-identical regression PASSED (13 passed / 1 documented skip); T035 zero-edit invariant test updated for the F-292 carve-out; T030 quickstart sections 1 through 9 static grep PASS. Architecture summary: 8th Heuristic A enrichment at single-agent same-agent scope (no new agent, no schema bump, no orchestrator edit); ADR-045 governance with 8-ADR cross-reference matrix; multi-tenant RAG baseline at examples/multi-tenant-rag-app/ exercising the Cat 6 sink; test_backward_compatibility.py F-292 carve-out per the F-241 precedent. Community provenance preserved via discussion #179 attribution.
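A byte-identical regression gate like T018 comes down to comparing digests of regenerated output against recorded ones. The sketch below shows one way to do that with SHA-256; it is an assumption about the general technique, not the project's test code, and the function names are hypothetical. Pinning SOURCE_DATE_EPOCH is assumed to happen in the pipeline that regenerates the files, before this comparison runs.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's exact bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def snapshot(root: Path, names) -> dict:
    """Record a digest for each named baseline file under root."""
    return {name: sha256_of(root / name) for name in names}


def changed_baselines(expected: dict, root: Path) -> list:
    """Return the names whose current bytes no longer match the recorded digest."""
    return sorted(
        name for name, digest in expected.items()
        if sha256_of(root / name) != digest
    )
```

A gate built this way passes when changed_baselines returns an empty list for the non-qualifying baselines. A single changed byte, including a timestamp that leaked into the output, is enough to fail it.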

See PR #293 (merged 2026-05-14)

Retro

Closeout state captured on the triple-signed tasks.md: 0 High / 2 Medium / 3 Low concerns across the three reviewers, with all carry-forwards documented and tracked into the next build cycle. Estimated duration was 1 to 2 days; actual delivery was same-day (spec to plan to build to deliver). The variance is on target for a docs-heavy enrichment with no schema or orchestrator surface change. One lesson lands as institutional knowledge: post-merge community-engagement and SLA-driven tasks (the discussion #179 closure comment, the release-please loop, the ADR Accepted-commit-SHA fill) belong in a dedicated follow-up issue at deliver time, so that the closing feature's Issue transitions cleanly to stage:done and the SLA-driven actions get real accountability against a stable Issue number. A companion build-time lesson was captured separately: enrichment-branch features that modify detection-tier files must update the zero-edit invariant test in the same change.

See tasks.md sign-off + Issue #292

What to do next

Star Tachi on GitHub
Read how AOD works

Have a use case to discuss? Discuss your use case