Technical Specification

Sentinel Methodology

Complete technical specification — scoring formulas, evidence weights, pipeline phases, contradiction detection, and audit boundaries. All values sourced directly from production code.

Model: ARTICLE_WEIGHTED_V3 · Engine: v2.9.0-SEVERITY · Regulation: EU 2024/1689

18 pipeline phases · 22 scored articles · 33 signal patterns · 7 contradiction types


Analysis pipeline

18-phase forensic pipeline

From repository URL to signed audit bundle — fully automated, deterministic, and reproducible. Every run produces identical output for identical input.

Initialisation · Phases 01–04

Phase 01 · Coverage Init

Rule IDs registered from AUTHORITATIVE_NAMESPACE before any scan begins. Every article gets a slot — guarantees no article is silently skipped.

Phase 02 · Manifest Loading

Parse sentinel.manifest.json — risk_category, declared_flags, entity_role, provider_location, ai_system_type, delta_id.

Phase 03 · File Hashing

SHA-256 per file. Integrity tracking, delta comparison, Merkle root of the full supply chain at scan time.

Phase 04 · Language Detection

15 languages detected: Python, JS/TS, Rust, Go, Java, C#, F#, VB, C++, C, Ruby, PHP, Kotlin, Scala. Drives AST strategy per language.
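The File Hashing phase above can be illustrated with a minimal sketch using Node's built-in crypto module. The function names and the Merkle construction (pairwise SHA-256 over hex digests, with an odd node paired with itself) are assumptions made for illustration; the source does not state how Sentinel builds its tree.

```javascript
const nodeCrypto = require('node:crypto');

// SHA-256 of a string or Buffer, returned as lowercase hex.
function sha256Hex(data) {
  return nodeCrypto.createHash('sha256').update(data).digest('hex');
}

// Hash each file. `files` maps a relative path to its content.
// Paths are sorted so the result does not depend on insertion order.
function hashFiles(files) {
  return Object.keys(files)
    .sort()
    .map((path) => ({ path, hash: sha256Hex(files[path]) }));
}

// Assumed Merkle construction: leaves are the per-file hashes in path order,
// each parent is SHA-256(left + right), and an odd node is paired with itself.
function merkleRoot(leafHashes) {
  if (leafHashes.length === 0) return sha256Hex('');
  let level = leafHashes.slice();
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      const right = i + 1 < level.length ? level[i + 1] : left;
      next.push(sha256Hex(left + right));
    }
    level = next;
  }
  return level[0];
}
```

Because paths are sorted before hashing, two scans of the same file set yield the same root, which is the property that delta comparison relies on.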

Discovery · Phases 05–07

Phase 05 · Dependency Discovery

89 AI packages tracked: tensorflow, pytorch, openai, aws-rekognition, langchain, face-api.js, transformers. Cross-referenced by language ecosystem.

Phase 06 · Code Signature Analysis

AST + regex against probing-rules.json (33 patterns). Strong signals (1.0), traceability (0.7), weak signals (0.5). Comment-stripped before verification.

Phase 07 · Document Parsing

MD, JSON, DOCX, PDF extraction. Substance scoring: word count, keyword density, boilerplate detection. <40 words penalised.
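The comment stripping mentioned under Code Signature Analysis can be sketched roughly as follows. This is an illustration for JavaScript-style sources only: the function names are hypothetical, the regex does not handle regex literals or template-literal interpolation, and Sentinel's real per-language AST strategy is not described in enough detail here to reproduce.

```javascript
// One pass over the source: whichever token starts first wins, so a "//"
// inside a string is not mistaken for a comment, and a quote inside a
// comment is not mistaken for a string.
const NON_EXECUTABLE =
  /\/\*[\s\S]*?\*\/|\/\/[^\n]*|"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*'|`(?:\\[\s\S]|[^`\\])*`/g;

// Replace comments with a space and string literals with an empty string.
function stripNonExecutable(source) {
  return source.replace(NON_EXECUTABLE, (token) =>
    token.startsWith('/') ? ' ' : '""'
  );
}

// True only if the pattern survives stripping, i.e. it occurs in executable code.
// Pass a regex without the g flag so no lastIndex state is carried between calls.
function hasExecutableSignal(source, pattern) {
  return pattern.test(stripNonExecutable(source));
}
```

A file whose only mention of a control sits in a comment or a string literal produces no match, which is the comment-only compliance case the pipeline is meant to reject.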

Analysis · Phases 08–12

Phase 08 · Evidence Classification

Every finding gets a registry entry: source_type, evidence_type, confidence weight, exact file + line number. SHA-256 evidence hash.

Phase 09 · Confidence Calibration

HIGH ≥ 0.75 · MEDIUM ≥ 0.50 · LOW ≥ 0.25 · INSUFFICIENT < 0.25. Entropy < 0.4 triggers −40% confidence penalty.

Phase 10 · Negative Evidence Detection

The absence of an expected signal is treated as evidence of non-visibility, not as a neutral result. It sets the human_review_required flag and raises a negative evidence finding.

Phase 11 · Contradiction Analysis

The Contradiction Engine compares manifest declared_flags against code signals. Types raised include CLAIM_SPOOFING, CLAIM_VS_NEGATIVE_EVIDENCE, CROSS_ARTICLE_DEPENDENCY, and EVIDENCE_IDENTITY_COLLISION.

Phase 12 · Finding Enrichment

narrator.js translates rule codes to plain-English legal reasoning. Strength, severity, legal_basis, and remediation type assigned per finding.
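The bands from the Confidence Calibration phase above translate into a small function. This sketch assumes the low-entropy penalty is applied before the band is chosen; the source gives the thresholds and the penalty but not the order, and the function name is hypothetical.

```javascript
const LOW_ENTROPY_THRESHOLD = 0.4;
const LOW_ENTROPY_FACTOR = 0.6; // a 40% reduction in confidence

// Returns the adjusted confidence and its band.
// Assumption: the entropy penalty is applied first, then the band is derived.
function calibrate(confidence, entropy) {
  const adjusted =
    entropy < LOW_ENTROPY_THRESHOLD ? confidence * LOW_ENTROPY_FACTOR : confidence;

  let band;
  if (adjusted >= 0.75) band = 'HIGH';
  else if (adjusted >= 0.5) band = 'MEDIUM';
  else if (adjusted >= 0.25) band = 'LOW';
  else band = 'INSUFFICIENT';

  return { confidence: adjusted, band };
}
```

Under this ordering, a finding with raw confidence 0.85 but entropy 0.3 drops from HIGH to MEDIUM.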

Verdict · Phases 13–18

Phase 13 · Article Scoring

Weighted findings → per-article score. Normalised against applicable_weight only. Unlisted sector articles retain weight multiplier 1.0.

Phase 14 · Dual-Track Evaluation

Four tracks: Governance (A), Technical (B), Score gate (C), SIG Integrity (D). Any track can independently override the final verdict.

Phase 15 · Policy Enforcement

Active policy pack checked. Required documents must exist and exceed quality threshold. Missing Annex IV triggers Track A override.

Phase 16 · Risk Classification

PROHIBITED → HIGH_RISK → LIMITED_RISK → MINIMAL_RISK → GPAI. First match wins. CLASSIFICATION_MISMATCH if declared ≠ detected.

Phase 17 · SIG Integrity Check

Strips all comments and string literals, verifies signals exist in executable logic only. Comment-only compliance is detected and rejected.

Phase 18 · Report Generation

audit.json (RFC8785-Lite canonical) · report.html · compliance_report.md · SARIF v2.1.0 · epistemic_map.json · remediation_roadmap.md · RSA-PSS bundle.
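To make the canonical audit.json and RSA-PSS bundle more concrete, here is a minimal sketch using Node's crypto module. The canonicalization below only sorts object keys recursively and removes whitespace; it is an assumed reading of "RFC8785-Lite" and not a full RFC 8785 implementation. The function names, the SHA-256 digest, and the salt length are likewise assumptions.

```javascript
const nodeCrypto = require('node:crypto');

// Simplified canonical form: keys sorted recursively, no whitespace.
// Assumes JSON-safe input (no undefined values, functions, or cycles).
function canonicalize(value) {
  if (Array.isArray(value)) {
    return '[' + value.map(canonicalize).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    const members = Object.keys(value)
      .sort()
      .map((key) => JSON.stringify(key) + ':' + canonicalize(value[key]));
    return '{' + members.join(',') + '}';
  }
  return JSON.stringify(value);
}

// RSA-PSS options; salt length equal to the digest size is an assumption.
const PSS = {
  padding: nodeCrypto.constants.RSA_PKCS1_PSS_PADDING,
  saltLength: nodeCrypto.constants.RSA_PSS_SALTLEN_DIGEST,
};

// Sign the canonical bytes of an audit object with a PEM private key.
function signAudit(audit, privateKeyPem) {
  const data = Buffer.from(canonicalize(audit));
  return nodeCrypto.sign('sha256', data, { key: privateKeyPem, ...PSS });
}

// Verify a signature against the canonical bytes with a PEM public key.
function verifyAudit(audit, signature, publicKeyPem) {
  const data = Buffer.from(canonicalize(audit));
  return nodeCrypto.verify('sha256', data, { key: publicKeyPem, ...PSS }, signature);
}
```

Signing the canonical form rather than the raw file means two audit objects with the same content but different key order verify against the same signature.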

Scoring engine

ARTICLE_WEIGHTED_V3

The final score is a normalised weighted aggregate across all articles applicable to the system's risk category. A minimal-risk system is never penalised for obligations that legally apply only to high-risk systems — weights are excluded, not zeroed.

audit-logic.js — ARTICLE_WEIGHTED_V3

normalized_score = (total_earned_pts / applicable_weight) × 100    // core formula

article_score[a] = base_score[a] × sector_profile[a]    // per-article multiplier

normalized_score = (Σ article_score[a] / applicable_weight) × 100    // normalized total

Sector multipliers are per-article

Inapplicable articles excluded from denominator

Score floored at 0
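The formula and the three rules above can be sketched as a single function. Two assumptions are made here and are not confirmed by the source: base_score[a] is the number of points earned out of the article's weight, and applicable_weight is the sum of weight × multiplier over applicable articles, so that a fully compliant system scores exactly 100. Article keys such as art9 are hypothetical.

```javascript
// baseScores: earned points per applicable article (omit inapplicable articles).
// weights: maximum points per article.
// sectorProfile: per-article multipliers; unlisted articles default to 1.0.
function normalizedScore(baseScores, weights, sectorProfile = {}) {
  let earned = 0;
  let applicableWeight = 0;

  for (const [article, base] of Object.entries(baseScores)) {
    const multiplier = sectorProfile[article] ?? 1.0;
    earned += base * multiplier; // article_score[a]
    applicableWeight += weights[article] * multiplier; // assumed denominator
  }

  if (applicableWeight === 0) return 0; // nothing applicable to score
  return Math.max(0, (earned / applicableWeight) * 100); // floored at 0
}
```

Articles that do not apply are simply left out of baseScores, so they contribute to neither the numerator nor the denominator, which matches the "excluded, not zeroed" rule.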

Article weight distribution — 22 articles

Art. 9 · Risk Management · 11 pt
Art. 14 · Human Oversight · 11 pt
Art. 10 · Data Governance · 8 pt
Art. 13 · Transparency · 8 pt
Art. 20 · Logging & Traceability · 8 pt
Art. 5 · Prohibited Practices · 7 pt
Art. 15 · Robustness & Security · 5 pt
Art. 50 · AI Output Transparency · 4 pt
Art. 47 · EU Declaration (DoC) · 4 pt
Art. 72 · Post-Market Monitoring · 4 pt
Art. 51 · GPAI Classification · 4 pt
Art. 55 · GPAI Systemic Risk · 5 pt

+ 9 more articles (Art. 11, 12, 17, 19, 27, 4, 16, 49, 51/53/55 GPAI-only)

Verdict thresholds

Four outcome tracks

Statically Aligned (≥ 85): All assessed static signals satisfy requirements. Notified Body operational verification still required before deployment.

Aligned (≥ 65): Material compliance achieved. Minor gaps may exist but are not blocking for most use cases.

Gap (≥ 40): Material gaps identified. Remediation plan generated. Submission not advisable without addressing critical gaps.

Fail (< 40): Hard violations or critical absences. Fundamental remediation required before any compliance assessment.

Track override: Any of the four verdict tracks (Governance, Technical, Score gate, SIG Integrity) can independently downgrade the verdict. A system scoring 90/100 that has a governance blocker receives Track A override and is reclassified to Gap.
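The thresholds and the track override can be combined in a short sketch. It assumes that each track reports, at most, a cap on the verdict (for example, a governance blocker caps the result at Gap) and that overrides can only lower the verdict, never raise it. The function names and the shape of trackCaps are hypothetical.

```javascript
// Verdicts ordered from worst to best.
const VERDICTS = ['Fail', 'Gap', 'Aligned', 'Statically Aligned'];

function verdictFromScore(score) {
  if (score >= 85) return 'Statically Aligned';
  if (score >= 65) return 'Aligned';
  if (score >= 40) return 'Gap';
  return 'Fail';
}

// trackCaps maps a track name to the best verdict it allows,
// e.g. { governance: 'Gap' }. Unknown verdict names are ignored.
function finalVerdict(score, trackCaps = {}) {
  let rank = VERDICTS.indexOf(verdictFromScore(score));
  for (const cap of Object.values(trackCaps)) {
    const capRank = VERDICTS.indexOf(cap);
    if (capRank >= 0) rank = Math.min(rank, capRank); // downgrade only
  }
  return VERDICTS[rank];
}
```

With these assumptions, finalVerdict(90, { governance: 'Gap' }) yields Gap, mirroring the 90/100 example above, while a score of 30 stays at Fail regardless of the cap.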

Sector multipliers

Per-article risk-adjusted scoring

Multipliers are applied per article, not as a single global scalar. High-stakes sectors carry heavier obligations on specific articles, while obligations less relevant to that sector may be down-weighted or remain at ×1.0.

Medical / Healthcare
Art. 9 · Risk Mgmt · 1.5×
Art. 10 · Data Gov. · 1.4×
Art. 14 · Human Oversight · 1.5×
Art. 15 · Robustness · 1.3×
Art. 27 · FRIA · 1.5×
Art. 50 · Output Transparency · 0.5×

Law Enforcement
Art. 5 · Prohibited Practices · 1.5×
Art. 9 · Risk Mgmt · 1.5×
Art. 10 · Data Gov. · 1.4×
Art. 14 · Human Oversight · 2.0×
Art. 27 · FRIA · 1.5×

HR / Employment
Art. 9 · Risk Mgmt · 1.2×
Art. 10 · Bias detection · 1.4×
Art. 13 · Transparency · 1.3×
Art. 14 · Human Oversight · 1.2×
Art. 27 · FRIA · 1.4×

Credit / Insurance
Art. 9 · Risk Mgmt · 1.3×
Art. 10 · Data Gov. · 1.4×
Art. 13 · Transparency · 1.3×
Art. 14 · Human Oversight · 1.2×
Art. 27 · FRIA · 1.3×

GPAI / Foundation Models
Art. 9 · Risk Mgmt (different) · 0.7×
Art. 13 · Transparency · 1.3×
Art. 15 · Robustness · 1.2×
Art. 50 · Output Transparency · 2.0×

Chatbot / Conversational
Art. 13 · Transparency · 1.5×
Art. 14 · Human Oversight · 0.8×
Art. 15 · Robustness · 0.8×
Art. 50 · Output Transparency · 2.0×

Articles not listed in a sector profile retain weight multiplier ×1.0. Sector is detected automatically from ai_system_type in sentinel.manifest.json.
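The default-to-×1.0 rule amounts to a two-level lookup. The sketch below encodes two of the profiles above; the sector keys (medical, chatbot) and article keys (art9, art50, and so on) are hypothetical, since the source does not list the actual ai_system_type values.

```javascript
// Two sector profiles copied from the tables above; keys are illustrative.
const SECTOR_PROFILES = {
  medical: { art9: 1.5, art10: 1.4, art14: 1.5, art15: 1.3, art27: 1.5, art50: 0.5 },
  chatbot: { art13: 1.5, art14: 0.8, art15: 0.8, art50: 2.0 },
};

// Returns the multiplier for an article in a sector.
// Unknown sectors and unlisted articles both fall back to 1.0.
function multiplierFor(aiSystemType, article) {
  const profile = SECTOR_PROFILES[aiSystemType] ?? {};
  return profile[article] ?? 1.0;
}
```

For example, multiplierFor('medical', 'art50') returns 0.5, while multiplierFor('medical', 'art13') returns 1 because Art. 13 is not listed in the medical profile.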

Evidence weights

Not all evidence is equal

A declaration in a manifest without code corroboration carries far less weight than a verified function signature in executable source. Sentinel weights evidence by verifiability.

Source code · 0.85 · AST analysis, function signatures, decorator patterns, import chains
Configuration file · 0.80 · package.json, requirements.txt, Cargo.toml, CI/CD YAML, Dockerfile
Test / validation · 0.75 · Test suites, adversarial testing configs, coverage reports
Log artifact · 0.70 · Logging infrastructure, retention policy, structured logging setup
Certificate · 0.70 · Conformity certificates, attestations, third-party validations
Documentation · 0.60 · Markdown, README, model cards, risk registers, policy docs
Manifest declaration · 0.40 · declared_flags in sentinel.manifest.json, unverified against code
No evidence · 0.00 · Signal expected but not found; triggers negative evidence finding
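As a rough illustration, the weights above can be held in a lookup table and used to pick the best-supported finding for a requirement. How Sentinel actually combines several pieces of evidence is not stated in the source; taking the single strongest item is an assumption, and the source_type keys are hypothetical names.

```javascript
// Weights from the list above; key names are illustrative.
const EVIDENCE_WEIGHTS = {
  source_code: 0.85,
  configuration_file: 0.8,
  test_validation: 0.75,
  log_artifact: 0.7,
  certificate: 0.7,
  documentation: 0.6,
  manifest_declaration: 0.4,
  no_evidence: 0.0,
};

// Unknown source types are treated as carrying no weight.
function evidenceWeight(sourceType) {
  return EVIDENCE_WEIGHTS[sourceType] ?? 0;
}

// Returns the finding with the highest weight, or null for an empty list.
// Assumption: the strongest single item represents the requirement.
function strongestEvidence(findings) {
  let best = null;
  for (const finding of findings) {
    if (best === null || evidenceWeight(finding.sourceType) > evidenceWeight(best.sourceType)) {
      best = finding;
    }
  }
  return best;
}
```

Under this reading, a manifest declaration (0.40) backed by a matching function in source code (0.85) is represented by the source-code finding.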

Epistemic model

Four certainty levels

Sentinel is explicit about what it knows and how it knows it. Every finding carries an epistemic level — so you always know the difference between verified evidence and a declaration.

Direct Verified: Extracted directly from source code or configuration. Full weight applied.
Example: winston imported in src/logger.js:14

Inferred: Dependency in manifest but usage not confirmed in executable code. Reduced weight.
Example: openai in package.json; GPAI inferred, no API call site confirmed

Declared Only: Claimed in manifest declared_flags but zero corroborating code signal. Contradiction check triggered.
Example: human_oversight_enabled: true declared, no kill-switch pattern found

Absent: Expected signal not found anywhere. Negative evidence finding. human_review_required = true.
Example: No logging library in dependencies, no logger instantiation found
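The four levels can be read as a priority-ordered classification. The sketch below assumes three boolean inputs per signal (found in code or configuration, found only as a dependency, declared in the manifest); these input names and the level identifiers are hypothetical, not Sentinel's actual schema.

```javascript
// Classify one expected signal into an epistemic level.
// The strongest available source of knowledge wins.
function epistemicLevel({
  foundInCode = false,
  foundInDependencies = false,
  declaredInManifest = false,
} = {}) {
  if (foundInCode) {
    return { level: 'DIRECT_VERIFIED', contradictionCheck: false, humanReviewRequired: false };
  }
  if (foundInDependencies) {
    return { level: 'INFERRED', contradictionCheck: false, humanReviewRequired: false };
  }
  if (declaredInManifest) {
    // A claim with no corroborating code signal triggers the contradiction check.
    return { level: 'DECLARED_ONLY', contradictionCheck: true, humanReviewRequired: false };
  }
  // Nothing found anywhere: negative evidence, flagged for human review.
  return { level: 'ABSENT', contradictionCheck: false, humanReviewRequired: true };
}
```

A signal that is both declared in the manifest and found in code is classified as directly verified, because the code evidence takes precedence over the declaration.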

Contradiction detection

7 contradiction types

The Contradiction Engine runs 4 check methods that generate up to 7 distinct contradiction types. Any mismatch between manifest claims and code evidence is classified by severity and always generates a HIGH or CRITICAL finding.

CLAIM_SPOOFING · Severity: MODERATE · Articles: All
Manifest declares an article as implemented but code score < 50% for that article.

CLAIM_VS_NEGATIVE_EVIDENCE · Severity: SEVERE · Articles: All
Strong claim (declared_flag) + CRITICAL signal absent → SEVERE. Strong claim + HIGH absent → SIGNIFICANT.

CROSS_ARTICLE_DEPENDENCY · Severity: MODERATE · Articles: Art. 9/14
Risk management declared but human oversight absent; these are legally inseparable obligations.

EVIDENCE_IDENTITY_COLLISION · Severity: MODERATE · Articles: All
Same evidence hash cited for multiple distinct article requirements (identity reuse detected).

POLICY_CONTRADICTION · Severity: MODERATE · Articles: Varies
Active policy pack requirement violated: a mandatory document threshold not met.

CLASSIFICATION_MISMATCH · Severity: HIGH · Articles: Art. 6
Declared risk_category differs from pipeline-detected category (e.g. declared MINIMAL_RISK, detected HIGH_RISK).

SECURITY_ALERT_CONTRADICTION · Severity: MODERATE · Articles: Art. 15
Security hardening flags declared but vulnerability patterns detected in dependencies.

CLAIM_VS_NEGATIVE_EVIDENCE severity matrix: STRONG claim + CRITICAL absent → SEVERE. STRONG claim + HIGH absent → SIGNIFICANT. MEDIUM claim + CRITICAL absent → SIGNIFICANT. Any claim + MEDIUM absent → MINOR.
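The severity matrix above maps directly onto a small decision function. Combinations the matrix does not mention (for example a MEDIUM claim with a HIGH signal absent) return null here instead of a guessed value; the function name is hypothetical.

```javascript
// claimStrength: 'STRONG' | 'MEDIUM' | ...
// absentSeverity: severity of the missing signal, 'CRITICAL' | 'HIGH' | 'MEDIUM' | ...
function claimVsNegativeEvidenceSeverity(claimStrength, absentSeverity) {
  // "Any claim + MEDIUM absent" applies regardless of claim strength.
  if (absentSeverity === 'MEDIUM') return 'MINOR';

  if (claimStrength === 'STRONG' && absentSeverity === 'CRITICAL') return 'SEVERE';
  if (claimStrength === 'STRONG' && absentSeverity === 'HIGH') return 'SIGNIFICANT';
  if (claimStrength === 'MEDIUM' && absentSeverity === 'CRITICAL') return 'SIGNIFICANT';

  return null; // combination not covered by the documented matrix
}
```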

Score penalties

What reduces your score

critical · Boilerplate content detected · score × 0.20
high · Manifest ↔ code contradiction · −15 pts
medium · Entropy < 0.4 (low signal quality) · confidence × 0.60
critical · CRITICAL signal absent · −30 pts
high · HIGH signal absent · −15 pts
medium · MEDIUM signal absent · −8 pts
low · LOW signal absent · −3 pts
medium · Word count < 40 in document · −0.30 score
critical · Word count < 15 in document · score = 0
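A sketch of how these penalties might be applied is shown below. Several points are assumptions, since the source lists the penalty values but not their scope or order: point deductions are taken from a 0 to 100 score, document substance is scored on a 0 to 1 scale (suggested by the −0.30 deduction), the word-count deduction is applied before the boilerplate multiplier, and both results are floored at 0. All function names are hypothetical.

```javascript
const ABSENT_SIGNAL_PENALTY = { CRITICAL: 30, HIGH: 15, MEDIUM: 8, LOW: 3 };
const CONTRADICTION_PENALTY = 15;

// absentSignals: list of severities, one per missing expected signal.
function applyPointPenalties(score, absentSignals, contradictionCount = 0) {
  let result = score;
  for (const severity of absentSignals) {
    result -= ABSENT_SIGNAL_PENALTY[severity] ?? 0;
  }
  result -= contradictionCount * CONTRADICTION_PENALTY;
  return Math.max(0, result); // assumed floor at 0
}

// Document substance score on an assumed 0..1 scale.
function documentSubstance(score, wordCount, isBoilerplate) {
  if (wordCount < 15) return 0; // too short to count at all
  let result = score;
  if (wordCount < 40) result -= 0.3; // short-document deduction
  if (isBoilerplate) result *= 0.2; // boilerplate multiplier
  return Math.max(0, result);
}
```

For instance, a score of 80 with one CRITICAL and one LOW signal absent becomes 47 under these assumptions.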

Scope & limits

What Sentinel covers — and what it doesn't

Sentinel covers

Static source code analysis across 15 languages
Document substance scoring (not just existence)
Dependency and supply-chain signal extraction (89 AI packages)
Manifest-vs-code contradiction detection (7 types)
22-article weighted compliance scoring (ARTICLE_WEIGHTED_V3)
Per-article sector multiplier application (sector detected from ai_system_type)
Deterministic, reproducible output with RSA-PSS signing
Annex IV dossier bundle generation
SARIF v2.1.0 and 7 total output formats

Sentinel does not cover

Runtime or production system behaviour
Live model outputs or inference testing
Organisational processes not reflected in code
Legal advice or formal conformity assessment
Notified Body certification (operational verification)
Private source code not provided to the scanner
Third-party service compliance (APIs, cloud providers)
Post-deployment monitoring of model drift
