18-phase forensic pipeline
From repository URL to signed audit bundle — fully automated, deterministic, and reproducible. Every run produces identical output for identical input.
Coverage Init
Rule IDs registered from AUTHORITATIVE_NAMESPACE before any scan begins. Every article gets a slot — guarantees no article is silently skipped.
Manifest Loading
Parse sentinel.manifest.json — risk_category, declared_flags, entity_role, provider_location, ai_system_type, delta_id.
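As a rough illustration of manifest loading, the sketch below parses the JSON and checks for the six fields named above. Treating all six as required top-level keys is an assumption, and `load_manifest` and every sample value are hypothetical.

```python
import json

# Field names come from the description above; treating all six as
# required top-level keys is an assumption.
MANIFEST_FIELDS = (
    "risk_category", "declared_flags", "entity_role",
    "provider_location", "ai_system_type", "delta_id",
)

def load_manifest(text: str) -> dict:
    """Parse manifest JSON and fail loudly if a field is missing (hypothetical helper)."""
    data = json.loads(text)
    missing = [name for name in MANIFEST_FIELDS if name not in data]
    if missing:
        raise ValueError("manifest missing fields: " + ", ".join(missing))
    return data

# Illustrative values only; real manifests may use different vocabularies.
sample = json.dumps({
    "risk_category": "HIGH_RISK",
    "declared_flags": ["human_oversight"],
    "entity_role": "provider",
    "provider_location": "EU",
    "ai_system_type": "biometric",
    "delta_id": "demo-001",
})
manifest = load_manifest(sample)
print(manifest["risk_category"])
```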
File Hashing
SHA-256 per file. Integrity tracking, delta comparison, Merkle root of the full supply chain at scan time.
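The hashing step can be sketched as follows: one SHA-256 digest per file, then a Merkle root over the digests. The tree layout here (leaves ordered by path, last node duplicated on odd levels) is an assumed convention, and `supply_chain_root` is a hypothetical name; the source does not describe how the root is built.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def merkle_root(leaf_hashes: list) -> str:
    """Pairwise-hash hex digests until a single root remains."""
    if not leaf_hashes:
        return sha256_hex(b"")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # assumed: duplicate the last node on odd levels
        level = [
            sha256_hex(bytes.fromhex(level[i]) + bytes.fromhex(level[i + 1]))
            for i in range(0, len(level), 2)
        ]
    return level[0]

def supply_chain_root(files: dict) -> str:
    """Hash each file, order leaves by path so traversal order cannot change the root."""
    leaves = [sha256_hex(files[path]) for path in sorted(files)]
    return merkle_root(leaves)

# In-memory stand-ins for repository files.
files = {"app.py": b"print('hi')\n", "README.md": b"# Demo\n", "requirements.txt": b"openai\n"}
root = supply_chain_root(files)
print(root)
```

Because the leaves are sorted by path before hashing, two scans of the same tree give the same root, and changing any single file changes it.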
Language Detection
15 languages detected: Python, JS/TS, Rust, Go, Java, C#, F#, VB, C++, C, Ruby, PHP, Kotlin, Scala. Drives AST strategy per language.
Dependency Discovery
89 AI packages tracked: tensorflow, pytorch, openai, aws-rekognition, langchain, face-api.js, transformers. Cross-referenced by language ecosystem.
Code Signature Analysis
AST + regex against probing-rules.json (33 patterns). Strong signals (1.0), traceability (0.7), weak signals (0.5). Comment-stripped before verification.
Document Parsing
MD, JSON, DOCX, PDF extraction. Substance scoring: word count, keyword density, boilerplate detection. <40 words penalised.
Evidence Classification
Every finding gets a registry entry: source_type, evidence_type, confidence weight, exact file + line number. SHA-256 evidence hash.
Confidence Calibration
HIGH ≥ 0.75 · MEDIUM ≥ 0.50 · LOW ≥ 0.25 · INSUFFICIENT < 0.25. Entropy < 0.4 triggers −40% confidence penalty.
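The thresholds above translate directly into a small function. This sketch assumes the entropy penalty is applied before the band is chosen, which the source does not state; `calibrate` is a hypothetical name.

```python
def calibrate(confidence: float, entropy: float) -> tuple:
    """Apply the low-entropy penalty, then map the result to a band."""
    if entropy < 0.4:
        confidence *= 0.60  # a 40% reduction
    if confidence >= 0.75:
        band = "HIGH"
    elif confidence >= 0.50:
        band = "MEDIUM"
    elif confidence >= 0.25:
        band = "LOW"
    else:
        band = "INSUFFICIENT"
    return confidence, band

print(calibrate(0.9, 0.9)[1])
print(calibrate(0.9, 0.3)[1])
```

Under that ordering, a finding that starts at 0.9 drops from HIGH to MEDIUM once the penalty applies.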
Negative Evidence Detection
Absence of expected signal = evidence of non-visibility. Not neutral. Triggers human_review_required flag and negative evidence finding.
Contradiction Analysis
Contradiction Engine: manifest declared_flags vs code signals. CLAIM_SPOOFING, CLAIM_VS_NEGATIVE_EVIDENCE, CROSS_ARTICLE_DEPENDENCY, IDENTITY_COLLISION.
Finding Enrichment
narrator.js translates rule codes to plain-English legal reasoning. Strength, severity, legal_basis, and remediation type assigned per finding.
Article Scoring
Weighted findings → per-article score. Normalised against applicable_weight only. Articles not listed in the active sector profile keep weight multiplier ×1.0.
Dual-Track Evaluation
Four tracks: Governance (A), Technical (B), Score gate (C), SIG Integrity (D). Any track can independently override the final verdict.
Policy Enforcement
Active policy pack checked. Required documents must exist and exceed quality threshold. Missing Annex IV triggers Track A override.
Risk Classification
PROHIBITED → HIGH_RISK → LIMITED_RISK → MINIMAL_RISK → GPAI. First match wins. CLASSIFICATION_MISMATCH if declared ≠ detected.
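"First match wins" can be illustrated with an ordered list walk. The detection rules themselves are not described in the source, so this sketch takes the set of categories whose rules matched as input. `detect_category` and `classification_finding` are hypothetical names, and returning no finding when nothing is detected is an assumption.

```python
from typing import Optional

# Evaluation order as given in the text.
CATEGORY_ORDER = ["PROHIBITED", "HIGH_RISK", "LIMITED_RISK", "MINIMAL_RISK", "GPAI"]

def detect_category(matched: set) -> Optional[str]:
    """Return the first category in priority order whose rules matched."""
    for category in CATEGORY_ORDER:
        if category in matched:
            return category
    return None

def classification_finding(declared: str, matched: set) -> Optional[str]:
    """Flag a mismatch when the declared category differs from the detected one."""
    detected = detect_category(matched)
    if detected is not None and detected != declared:
        return "CLASSIFICATION_MISMATCH"
    return None

print(detect_category({"LIMITED_RISK", "HIGH_RISK"}))
print(classification_finding("MINIMAL_RISK", {"HIGH_RISK"}))
```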
SIG Integrity Check
Strips all comments and string literals, verifies signals exist in executable logic only. Comment-only compliance is detected and rejected.
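To make the idea concrete for a single language, the sketch below uses Python's standard `tokenize` module to drop comments and string literals, then checks whether a signal name survives as an identifier. It covers Python source only and is not the scanner's own implementation; `executable_names`, `signal_in_executable_code` and the `log_decision` signal are hypothetical.

```python
import io
import tokenize

# Token types to discard. FSTRING_MIDDLE exists only on Python 3.12+.
SKIPPED = {tokenize.COMMENT, tokenize.STRING}
if hasattr(tokenize, "FSTRING_MIDDLE"):
    SKIPPED.add(tokenize.FSTRING_MIDDLE)

def executable_names(source: str) -> set:
    """Collect identifiers that appear outside comments and string literals."""
    names = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SKIPPED:
            continue
        if tok.type == tokenize.NAME:
            names.add(tok.string)
    return names

def signal_in_executable_code(source: str, signal: str) -> bool:
    return signal in executable_names(source)

comment_only = "# log_decision(result)\nx = 1\n"
string_only = 'note = "log_decision is called here"\n'
real_call = "def handle(result):\n    log_decision(result)\n"

print(signal_in_executable_code(comment_only, "log_decision"))
print(signal_in_executable_code(real_call, "log_decision"))
```

A signal that appears only in a comment or a string is not counted, which is the behaviour the phase describes for comment-only compliance.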
Report Generation
audit.json (RFC8785-Lite canonical) · report.html · compliance_report.md · SARIF v2.1.0 · epistemic_map.json · remediation_roadmap.md · RSA-PSS bundle.
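The source does not define "RFC8785-Lite". As a loose approximation of canonical JSON, the sketch below sorts keys and removes insignificant whitespace so that equal data always serialises to the same bytes. It does not implement RFC 8785's number formatting or UTF-16 key ordering, and the signing step is omitted. `canonical_bytes` and the sample fields are hypothetical.

```python
import hashlib
import json

def canonical_bytes(obj) -> bytes:
    """Serialise with sorted keys and no whitespace (approximation, not full RFC 8785)."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")

# Same data, different key order.
first = {"tool": "demo", "score": 62, "findings": []}
second = {"findings": [], "score": 62, "tool": "demo"}

digest_first = hashlib.sha256(canonical_bytes(first)).hexdigest()
digest_second = hashlib.sha256(canonical_bytes(second)).hexdigest()
print(digest_first == digest_second)
```

Stable bytes are what make a signature over the audit file reproducible: the digest depends on the data, not on the order in which keys happened to be written.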
ARTICLE_WEIGHTED_V3
The final score is a normalised weighted aggregate across all articles applicable to the system's risk category. A minimal-risk system is never penalised for obligations that legally apply only to high-risk systems — weights are excluded, not zeroed.
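The difference between excluding and zeroing a weight is easiest to see with numbers. In this sketch the article weights and scores are invented for illustration, scores are assumed to lie in 0..1, and `weighted_score` is a hypothetical name.

```python
def weighted_score(scores: dict, weights: dict, applicable: set) -> float:
    """Normalise against the weights of applicable articles only (result in 0..100)."""
    total_weight = sum(weights[a] for a in applicable)
    if total_weight == 0:
        return 0.0
    earned = sum(weights[a] * scores.get(a, 0.0) for a in applicable)
    return 100.0 * earned / total_weight

# Hypothetical weights and per-article scores.
weights = {"Art. 9": 2.0, "Art. 13": 1.0, "Art. 50": 1.0}
scores = {"Art. 13": 0.5, "Art. 50": 0.75}

# Excluded: Art. 9 does not apply, so its weight leaves the denominator.
excluded = weighted_score(scores, weights, {"Art. 13", "Art. 50"})
# Zeroed: Art. 9 stays in the denominator with a score of 0.
zeroed = weighted_score(scores, weights, {"Art. 9", "Art. 13", "Art. 50"})
print(excluded, zeroed)
```

With these numbers the excluded variant gives 62.5 while the zeroed variant gives 31.25, which is the penalty the text says a minimal-risk system should never receive.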
Article weight distribution — 22 articles
+ 9 more articles (Art. 11, 12, 17, 19, 27, 4, 16, 49, 51/53/55 GPAI-only)
Four outcome tracks
All assessed static signals satisfy requirements. Notified Body operational verification still required before deployment.
Material compliance achieved. Minor gaps may exist but are not blocking for most use cases.
Material gaps identified. Remediation plan generated. Submission not advisable without addressing critical gaps.
Hard violations or critical absences. Fundamental remediation required before any compliance assessment.
Track override: Any of the four verdict tracks (Governance, Technical, Score gate, SIG Integrity) can independently downgrade the verdict. A system scoring 90/100 that has a governance blocker receives Track A override and is reclassified to Gap.
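One way to model the override is to let every track propose a verdict and take the worst one. That rule is an assumption rather than something the source spells out. Only the "Gap" label appears in the text; the other three verdict names below are placeholders, and `final_verdict` is a hypothetical name.

```python
# Ordered best to worst. Only GAP is named in the text; the rest are placeholders.
VERDICTS = ["TIER_1", "TIER_2", "GAP", "TIER_4"]

def final_verdict(track_verdicts: dict) -> tuple:
    """Return the worst verdict across tracks and the tracks that forced it."""
    worst = max(track_verdicts.values(), key=VERDICTS.index)
    forced_by = sorted(t for t, v in track_verdicts.items() if v == worst)
    return worst, forced_by

# A high-scoring system (track C) with a governance blocker (track A).
tracks = {"A": "GAP", "B": "TIER_1", "C": "TIER_1", "D": "TIER_1"}
verdict, forced_by = final_verdict(tracks)
print(verdict, forced_by)
```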
Per-article risk-adjusted scoring
Multipliers are applied per article, not as a single global scalar. High-stakes sectors carry heavier obligations on specific articles, while obligations less relevant to that sector may be down-weighted or remain at ×1.0.
Art. 9 Risk Mgmt · Art. 10 Data Gov. · Art. 14 Human Oversight · Art. 15 Robustness · Art. 27 FRIA · Art. 50 Output Transparency
Art. 5 Prohibited Practices · Art. 9 Risk Mgmt · Art. 10 Data Gov. · Art. 14 Human Oversight · Art. 27 FRIA
Art. 9 Risk Mgmt · Art. 10 Bias detection · Art. 13 Transparency · Art. 14 Human Oversight · Art. 27 FRIA
Art. 9 Risk Mgmt · Art. 10 Data Gov. · Art. 13 Transparency · Art. 14 Human Oversight · Art. 27 FRIA
Art. 9 Risk Mgmt (different) · Art. 13 Transparency · Art. 15 Robustness · Art. 50 Output Transparency
Art. 13 Transparency · Art. 14 Human Oversight · Art. 15 Robustness · Art. 50 Output Transparency
Articles not listed in a sector profile retain weight multiplier ×1.0. Sector is detected automatically from ai_system_type in sentinel.manifest.json.
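The "not listed means ×1.0" rule amounts to a dictionary lookup with a default. The base weights and multiplier values in this sketch are invented for illustration, and `effective_weights` is a hypothetical name.

```python
def effective_weights(base_weights: dict, sector_profile: dict) -> dict:
    """Multiply each base weight by its sector multiplier, defaulting to 1.0."""
    return {
        article: weight * sector_profile.get(article, 1.0)
        for article, weight in base_weights.items()
    }

# Hypothetical base weights and a hypothetical sector profile.
base = {"Art. 9": 4.0, "Art. 13": 2.0, "Art. 14": 4.0}
profile = {"Art. 9": 1.5, "Art. 14": 1.25}

weights = effective_weights(base, profile)
print(weights)
```

Art. 13 is absent from the profile, so its weight passes through unchanged.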
Not all evidence is equal
A declaration in a manifest without code corroboration carries far less weight than a verified function signature in executable source. Sentinel weights evidence by verifiability.
Source code
AST analysis, function signatures, decorator patterns, import chains
Configuration file
package.json, requirements.txt, Cargo.toml, CI/CD YAML, Dockerfile
Test / validation
Test suites, adversarial testing configs, coverage reports
Log artifact
Logging infrastructure, retention policy, structured logging setup
Certificate
Conformity certificates, attestations, third-party validations
Documentation
Markdown, README, model cards, risk registers, policy docs
Manifest declaration
declared_flags in sentinel.manifest.json — unverified against code
No evidence
Signal expected but not found — triggers negative evidence finding
Four certainty levels
Sentinel is explicit about what it knows and how it knows it. Every finding carries an epistemic level — so you always know the difference between verified evidence and a declaration.
7 contradiction types
The Contradiction Engine runs 4 check methods that generate up to 7 distinct contradiction types. Any mismatch between manifest claims and code evidence is classified by severity and always generates a HIGH or CRITICAL finding.
- CLAIM_SPOOFING (All): Manifest declares an article as implemented but code score < 50% for that article.
- CLAIM_VS_NEGATIVE_EVIDENCE (All): Strong claim (declared_flag) + CRITICAL signal absent → SEVERE. Strong claim + HIGH absent → SIGNIFICANT.
- CROSS_ARTICLE_DEPENDENCY (Art. 9/14): Risk management declared but human oversight absent; the two obligations are legally inseparable.
- EVIDENCE_IDENTITY_COLLISION (All): Same evidence hash cited for multiple distinct article requirements; identity reuse detected.
- POLICY_CONTRADICTION (Varies): Active policy pack requirement violated, for example a mandatory document threshold not met.
- CLASSIFICATION_MISMATCH (Art. 6): Declared risk_category differs from pipeline-detected category (e.g. declared MINIMAL_RISK, detected HIGH_RISK).
- SECURITY_ALERT_CONTRADICTION (Art. 15): Security hardening flags declared but vulnerability patterns detected in dependencies.
CLAIM_VS_NEGATIVE_EVIDENCE severity matrix:
- STRONG claim + CRITICAL absent → SEVERE
- STRONG claim + HIGH absent → SIGNIFICANT
- MEDIUM claim + CRITICAL absent → SIGNIFICANT
- Any claim + MEDIUM absent → MINOR
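The severity matrix reads naturally as a lookup table. In this sketch, combinations the matrix does not mention (for example a MEDIUM claim with a HIGH signal absent) return `None` rather than a guessed value; `negative_evidence_severity` is a hypothetical name.

```python
from typing import Optional

# (claim strength, criticality of the absent signal) -> severity, as listed above.
SEVERITY = {
    ("STRONG", "CRITICAL"): "SEVERE",
    ("STRONG", "HIGH"): "SIGNIFICANT",
    ("MEDIUM", "CRITICAL"): "SIGNIFICANT",
}

def negative_evidence_severity(claim: str, absent: str) -> Optional[str]:
    """Look up severity; any claim with a MEDIUM signal absent is MINOR."""
    if absent == "MEDIUM":
        return "MINOR"
    return SEVERITY.get((claim, absent))

print(negative_evidence_severity("STRONG", "CRITICAL"))
print(negative_evidence_severity("MEDIUM", "MEDIUM"))
```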
What reduces your score
score × 0.20 · −15 pts · confidence × 0.60 · −30 pts · −15 pts · −8 pts · −3 pts · −0.30 score · score = 0
What Sentinel covers, and what it doesn't
Covered:
- Static source code analysis across 15 languages
- Document substance scoring (not just existence)
- Dependency and supply-chain signal extraction (89 AI packages)
- Manifest-vs-code contradiction detection (7 types)
- 22-article weighted compliance scoring (ARTICLE_WEIGHTED_V3)
- Per-article sector multiplier application by risk category
- Deterministic, reproducible output with RSA-PSS signing
- Annex IV dossier bundle generation
- SARIF v2.1.0 and 7 total output formats
Not covered:
- Runtime or production system behaviour
- Live model outputs or inference testing
- Organisational processes not reflected in code
- Legal advice or formal conformity assessment
- Notified Body certification (operational verification)
- Private source code not provided to the scanner
- Third-party service compliance (APIs, cloud providers)
- Post-deployment monitoring of model drift