ARGUS — the trust layer for AI-generated code

The numbers

Measured on the live benchmark. The deterministic layer is the contract; the LLM layer inherits the model's accuracy. Honest posture: high confidence on the deterministic layer, semantically strong on the LLM layer, never 100%.

1.000 · Precision (det.) · 0 false positives on the 194-test corpus

0.818 · Recall (det.) · F1 = 0.900 on the 40-PR benchmark

12ms · Deterministic pass · 5 SLOP rules + CWE-798 regex, no LLM call

4,830ms · End-to-end · 12ms det + 4,818ms LLM, ~1.2k tokens
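The F1 figure can be reproduced from the precision and recall shown above, assuming the standard harmonic-mean definition. This is a minimal sketch, not ARGUS code; the function name `f1_score` is illustrative. Note that precision and F1 are quoted on different corpora, so the check is arithmetic only.

```rust
/// Harmonic mean of precision and recall (the standard F1 definition).
/// Returns 0.0 when both inputs are zero, to avoid dividing by zero.
pub fn f1_score(precision: f64, recall: f64) -> f64 {
    let sum = precision + recall;
    if sum == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / sum
    }
}
```

Calling `f1_score(1.000, 0.818)` gives roughly 0.8999, which rounds to the 0.900 quoted above.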

Ladybird closed external PRs in June 2026 after the maintainer corps was overrun. tldraw has auto-closed AI PRs since January 2026. RPCS3 reverted multiple AI PRs that caused production regressions (May 2026). cURL canceled its bug bounty (January 2026) after 19 of 20 recent reports turned out to be synthetic hallucinations.

All four projects closed the public door on AI slop. ARGUS is the quantitative answer: hybrid regex + LLM, a signed certificate per analysis, and an audit chain ready for EU AI Act Art. 12 Level 2.
Sources: Stenberg (cURL), Ruiz (tldraw), Ladybird community, RPCS3 maintainers · 2026

The 4 specialists · in parallel

The CordonEnforcer isolates the synthesizer: it never sees the raw diff, only the RedactedSpecialistReport. The isolation is enforced at the type level, not by runtime checks. To our knowledge, no competitor (CodeRabbit, Greptile, Qodo) imposes this constraint.
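What type-level isolation can look like in Rust, as a hedged sketch. Only the names `RedactedSpecialistReport` and the idea of a raw diff come from the text above; `RawDiff`, the struct fields, `run_specialist`, `synthesize`, and the toy scoring rule are assumptions made for illustration.

```rust
/// Raw PR diff. In a real crate this would live in its own module so the
/// inner String stays private to the specialists.
pub struct RawDiff(String);

impl RawDiff {
    pub fn new(text: &str) -> Self {
        RawDiff(text.to_string())
    }
}

/// What a specialist hands onward. Hypothetical fields: the source only
/// names the type, not its layout.
#[derive(Debug, Clone, PartialEq)]
pub struct RedactedSpecialistReport {
    pub specialist: &'static str,
    pub risk: f64, // assumed scale: 0.0 (clean) to 1.0 (halt)
}

/// A specialist may read the diff, but only a redacted report leaves it.
pub fn run_specialist(name: &'static str, diff: &RawDiff) -> RedactedSpecialistReport {
    // Toy scoring rule, for illustration only.
    let risk = if diff.0.contains(".unwrap()") { 0.5 } else { 0.0 };
    RedactedSpecialistReport { specialist: name, risk }
}

/// The cordon is this signature: no parameter can carry a RawDiff, so
/// passing one is a compile error rather than a runtime check.
pub fn synthesize(reports: &[RedactedSpecialistReport]) -> f64 {
    reports.iter().map(|r| r.risk).fold(0.0_f64, f64::max)
}
```

The design point is that the constraint is enforced by the compiler: a call such as `synthesize(&diff)` does not type-check, so there is no code path to audit at runtime.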

S · 001 / slop

Aegis Slop

Narrative comments, swallowed errors, oversized functions (> 80 LOC), .unwrap() outside tests, TODO stubs, unused pub fn. Hybrid: a regex pass (< 1ms) followed by an LLM pass (2-4s). The regex pass catches 40-60% of slop before the LLM runs.
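A rough sketch of a deterministic pass over three of the rules listed above (.unwrap() outside tests, TODO stubs, functions over 80 LOC). It swaps regex for plain string matching to stay dependency-free, and it is not the ARGUS rule set: `SlopSignal`, `scan_slop`, and the simplifications noted in the comments are assumptions.

```rust
#[derive(Debug, PartialEq)]
pub enum SlopSignal {
    UnwrapOutsideTests { line: usize },
    TodoStub { line: usize },
    OversizedFunction { start_line: usize, loc: usize },
}

const MAX_FN_LOC: usize = 80;

/// Line-by-line scan. Simplifications: everything after `#[cfg(test)]`
/// counts as test code, and braces inside strings or comments are not
/// distinguished from real ones.
pub fn scan_slop(source: &str) -> Vec<SlopSignal> {
    let mut signals = Vec::new();
    let mut in_tests = false;
    // (start line, brace depth) of the function currently being measured.
    let mut current_fn: Option<(usize, i32)> = None;

    for (idx, raw) in source.lines().enumerate() {
        let line_no = idx + 1;
        let line = raw.trim();

        if line.starts_with("#[cfg(test)]") {
            in_tests = true;
        }
        if !in_tests && line.contains(".unwrap()") {
            signals.push(SlopSignal::UnwrapOutsideTests { line: line_no });
        }
        if line.starts_with("// TODO") || line.contains("todo!(") {
            signals.push(SlopSignal::TodoStub { line: line_no });
        }

        // Function length: count lines from `fn` to the matching close brace.
        if current_fn.is_none() && (line.starts_with("fn ") || line.starts_with("pub fn ")) {
            current_fn = Some((line_no, 0));
        }
        if let Some((start, depth)) = current_fn {
            let opens = line.matches('{').count() as i32;
            let closes = line.matches('}').count() as i32;
            let depth = depth + opens - closes;
            if depth <= 0 && closes > 0 {
                let loc = line_no - start + 1;
                if loc > MAX_FN_LOC {
                    signals.push(SlopSignal::OversizedFunction { start_line: start, loc });
                }
                current_fn = None;
            } else {
                current_fn = Some((start, depth));
            }
        }
    }
    signals
}
```

Because the scan is pure string work over the input, the same snippet always yields the same signals, which is the property the deterministic layer relies on.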

S · 002 / security

Aegis Security

Hardcoded credentials, injection, unsafe panic, unhandled errors, OWASP Top 10. LLM (redteam-security prompt). The CWE-798 hardcoded-secret scan runs deterministically first, before the LLM call.
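One way a deterministic CWE-798 pre-check could work, sketched with string matching instead of the regex the page mentions. The name list, the function `scan_hardcoded_secrets`, and the "assignment of a non-empty string literal" heuristic are all assumptions; a production scanner would need many more patterns.

```rust
// Hypothetical list of identifier fragments that suggest a credential.
const SECRET_NAMES: [&str; 4] = ["password", "secret", "api_key", "token"];

/// Returns the 1-based line numbers where a secret-looking name is assigned
/// a non-empty string literal, e.g. `let api_key = "abc";`.
pub fn scan_hardcoded_secrets(source: &str) -> Vec<usize> {
    let mut hits = Vec::new();
    for (idx, line) in source.lines().enumerate() {
        // Split at the first '='; lines without an assignment are skipped.
        if let Some((lhs, rhs)) = line.split_once('=') {
            let lhs = lhs.to_lowercase();
            if !SECRET_NAMES.iter().any(|name| lhs.contains(name)) {
                continue;
            }
            // The value must start with a quote and contain at least one
            // character before the closing quote.
            if let Some(rest) = rhs.trim_start().strip_prefix('"') {
                if let Some(end) = rest.find('"') {
                    if end > 0 {
                        hits.push(idx + 1);
                    }
                }
            }
        }
    }
    hits
}
```

Values read from the environment (`std::env::var(...)`) and empty literals are not flagged, since neither embeds a credential in the source.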

S · 003 / arch

Aegis Arch

Repo coherence, pattern matching, idiom detection, separation of concerns. LLM (architecture-fit prompt). Catches the patterns the deterministic regex can't — defensive .clone() chains, narrative boilerplate, off-pattern style.

S · 004 / verdict

Aegis Verdict

Synthesizes the 3 reports above into one verdict (Approved · ReviewRequired · Halted) plus a fix_plan.json for downstream coding agents. CordonEnforcer: the synthesizer receives RedactedSpecialistReport, not the raw diff.
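A sketch of how three redacted reports might collapse into one verdict. The verdict names come from the text above; the thresholds, the `risk` field, and the fix_plan.json shape are assumptions (the page only shows a risk of 0.92 ending in Halted).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Verdict {
    Approved,
    ReviewRequired,
    Halted,
}

/// Hypothetical report layout; the source names the type but not its fields.
pub struct RedactedSpecialistReport {
    pub specialist: &'static str,
    pub risk: f64, // assumed scale: 0.0 to 1.0
}

// Assumed cut-offs, chosen only so that a 0.92 risk halts.
const REVIEW_THRESHOLD: f64 = 0.30;
const HALT_THRESHOLD: f64 = 0.80;

/// The worst specialist risk decides the verdict.
pub fn decide(reports: &[RedactedSpecialistReport]) -> Verdict {
    let worst = reports.iter().map(|r| r.risk).fold(0.0_f64, f64::max);
    if worst >= HALT_THRESHOLD {
        Verdict::Halted
    } else if worst >= REVIEW_THRESHOLD {
        Verdict::ReviewRequired
    } else {
        Verdict::Approved
    }
}

/// Renders a minimal fix plan listing the specialists that need follow-up.
/// No JSON escaping is done; the names are trusted static strings.
pub fn fix_plan_json(reports: &[RedactedSpecialistReport]) -> String {
    let fixes: Vec<String> = reports
        .iter()
        .filter(|r| r.risk >= REVIEW_THRESHOLD)
        .map(|r| format!("{{\"specialist\":\"{}\",\"risk\":{:.2}}}", r.specialist, r.risk))
        .collect();
    format!("{{\"verdict\":\"{:?}\",\"fixes\":[{}]}}", decide(reports), fixes.join(","))
}
```

Taking the maximum rather than an average means a single critical finding cannot be diluted by two clean specialists.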

Analyzer · run it in your browser

Paste a code snippet and hit Analyze. The 4 specialists run locally (deterministic regex + mock NIM) and return the 4-cohort verdict + risk score. No signup, no API key, no waiting. Same input always returns the same output.

NVIDIA NIM API key (optional · BYOK)

Get a free key at build.nvidia.com. Without a key, the analyzer runs the local mock (5 SLOP rules + CWE-798) — no network, no signup, ~12ms.

The audit chain · in your browser

Every ARGUS verdict is written as an AuditEvent. Each event links to the previous one via its SHA-256 hash (browser-portable; BLAKE3 is not in Web Crypto) and is signed with Ed25519. EU AI Act Art. 12 Level 2 ready.

Browser-portable hash chain (SHA-256) — 3 events

audit-001-2026-06-13-001
2026-06-13T14:00:00Z
verdict: Halted · Hallucinated vuln in Curl_urldecode
model: meta/llama-3.1-70b-instruct
policy: verify-worker-v1-policy
prev_hash: 0000…0000
current_hash: de0d8682…8a76
signature: ed25519:432328bf…f91a

audit-002-2026-06-13-002
2026-06-13T14:00:04Z
verdict: ReviewRequired · 3 SLOP signals + 1 arch
model: meta/llama-3.1-70b-instruct
policy: verify-worker-v1-policy
prev_hash: de0d8682…8a76
current_hash: 5b65f842…28bb
signature: ed25519:9a0c31…1d7b

audit-003-2026-06-13-003
2026-06-13T14:00:09Z
verdict: Approved · 0 SLOP signals, type-safe PR
model: meta/llama-3.1-70b-instruct
policy: verify-worker-v1-policy
prev_hash: 5b65f842…28bb
current_hash: d26157b5…80c0
signature: ed25519:7e1f88…8c2d

3 events, real chain. Re-verify the chain links in the browser with a single click.

Click to run a SHA-256 round-trip on the 3 events — no network call, all in your browser.
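The chain-walking logic behind a re-verify step can be sketched independently of the digest. Everything here is an assumption except the `prev_hash` / `current_hash` linkage shown above: the `AuditEvent` fields, the `id|verdict|prev_hash` payload layout, and the helper names are hypothetical. The digest is passed in as a function; the stand-in used below is 64-bit FNV-1a, which is NOT SHA-256 and not collision-resistant, and Ed25519 signature checking is left out entirely.

```rust
/// One audit record. The payload layout is assumed, not taken from ARGUS.
pub struct AuditEvent {
    pub id: String,
    pub verdict: String,
    pub prev_hash: String,
    pub current_hash: String,
}

/// Stand-in digest for the sketch: 64-bit FNV-1a, hex-encoded.
/// Swap in a real SHA-256 implementation for anything beyond a demo.
pub fn fnv1a_hex(data: &[u8]) -> String {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in data {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    format!("{:016x}", h)
}

const GENESIS: &str = "0000000000000000";

fn canonical(id: &str, verdict: &str, prev_hash: &str) -> String {
    format!("{}|{}|{}", id, verdict, prev_hash)
}

/// Appends an event whose prev_hash is the previous event's current_hash.
pub fn append(chain: &mut Vec<AuditEvent>, id: &str, verdict: &str, digest: fn(&[u8]) -> String) {
    let prev_hash = chain
        .last()
        .map(|e| e.current_hash.clone())
        .unwrap_or_else(|| GENESIS.to_string());
    let current_hash = digest(canonical(id, verdict, &prev_hash).as_bytes());
    chain.push(AuditEvent {
        id: id.to_string(),
        verdict: verdict.to_string(),
        prev_hash,
        current_hash,
    });
}

/// Returns the index of the first event whose link or digest does not
/// check out, or None when the whole chain is intact.
pub fn first_broken_link(chain: &[AuditEvent], digest: fn(&[u8]) -> String) -> Option<usize> {
    let mut expected_prev = GENESIS.to_string();
    for (i, e) in chain.iter().enumerate() {
        let recomputed = digest(canonical(&e.id, &e.verdict, &e.prev_hash).as_bytes());
        if e.prev_hash != expected_prev || e.current_hash != recomputed {
            return Some(i);
        }
        expected_prev = e.current_hash.clone();
    }
    None
}
```

Editing any stored field of an earlier event changes its recomputed digest, so the walk stops at that index; this is what makes the log tamper-evident rather than merely append-only.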

Submit a PR · BYOK analysis

Paste a public GitHub PR URL and (optionally) your NVIDIA NIM API key. Without a key, the form runs the local mock. With a key, the request is POSTed to /api/submit (server-side proxy). Your key is never logged or stored.

Weekly briefing · OSS slop signals

Curated signal digest: which OSS projects closed public PRs, which maintainers publicly flagged AI slop, which CVE claims were hallucinated. Source: docs/briefings/.

Weekly briefing — Week 24, 2026-06-16

Top signal of the week. The cURL bug bounty cancellation continues to dominate the conversation. Stenberg's "Death by a thousand slops" post (July 2025) crossed 200k reads this week and was cited in two academic preprints on AI-assisted vulnerability disclosure.

Maintainer pushback, by the numbers.

Ladybird (Andreas Kling): closed external PRs indefinitely on 2026-06-04. Reasoning: the maintainer corps was overrun with AI-slop PRs that took longer to triage than the underlying bugs would have taken to fix.
tldraw (Steve Ruiz): has auto-closed any PR matching the "defensive .clone() chain + narrative comments" pattern since 2026-01-15. The bot's closing comment quotes the "stay away from my trash" post verbatim.
RPCS3: reverted 3 AI-generated PRs in May 2026 that caused regressions in the PS3 emulator's SPU scheduler. Two of the three were defended as "improvements" by their authors despite the regressions.
cURL (Daniel Stenberg): canceled the public bug bounty on 2026-01-20. Stated reason: 19 of 20 recent reports were synthetic hallucinations of non-existent functions (Curl_urldecode(), curl_easy_perform_v2(), etc.).

Hallucinated CVE patterns we've seen this week.

"Buffer overflow in Curl_urldecode()": the function does not exist in cURL. Detected by Aegis Security and halted by Aegis Verdict (risk 0.92).
"Race condition in tokio::sync::RwLock::read()": false claim; concurrent readers only get shared access, and safe Rust rules out a data race on the guarded value. Detected by Aegis Arch (off-pattern for tokio style).
"SQL injection in prisma.user.findMany({ where: { id: req.query.id }})": findMany builds a parameterized query, so this call is not an injection vector (Prisma's raw-query APIs are a separate matter). Detected by Aegis Security (1 info, no critical).

ARGUS in the wild.

194+ tests passing across 15 Rust crates.
EU AI Act Art. 12 Level 2 conformance maintained by default — no opt-in required for the SHA-256 + Ed25519 chain.
4 specialists on every analyze call: Aegis Slop (DET + LLM), Aegis Security (DET CWE-798 scan + LLM), and Aegis Arch (LLM) run in parallel; Aegis Verdict (synthesizer, isolated by CordonEnforcer) then combines their reports.

Next week's lens targets. Linux kernel ML subsystem maintainers (post 6.10), the Rust standard library team (post-1.88 MSRV churn), and the new OpenSSF "AI-assisted contribution" working group output (expected Friday).

— Generated 2026-06-16 09:00 UTC by argus-lens v0.1.0 · MIT licensed · source on github.com/SuarezPM/apohara-argus

Deterministic: 12ms regex, no LLM in the hot path
Honest benchmarks, published limits, never 100%
Offline-first: runs fully air-gapped, BYOK
Threat model published: SECURITY.md is the contract

No magic. No marketing. A signed certificate per analysis — that's the product.

AI generated 42% of the code committed in 2025. Reviewers didn't get faster. Maintainers closed bug bounties. The bottleneck is no longer generation — it's verification. ARGUS ships the regulator-facing artifact: a SHA-256-hash-chained, Ed25519-signed certificate per analysis. Same shape the EU AI Act Art. 12 Level 2 wants. Same shape your CISO wants. BYOK, MIT, no SaaS lock-in.