2026-06-09·5 min read·sota.io Team

EU AI Act Art.73 Incident Escalation: When Your PMS Monitoring Findings Become Mandatory Reports

Post #1610 in the sota.io EU AI Act Post-Market Monitoring Operations Series

[Figure: EU AI Act Art.73 incident escalation pipeline, showing the decision tree from PMS monitoring to mandatory serious incident reporting]

Your Art.72 post-market monitoring system is running. Drift alerts fire. Performance metrics trend downward. Bias indicators move outside acceptable ranges. Now comes the question that keeps compliance teams up at night: is this a PMS finding you track and mitigate internally, or a serious incident you must report to national market surveillance authorities under Art.73?

This post builds the decision pipeline. We cover exactly what the EU AI Act defines as a "serious incident," the legal threshold between internal PMS action and mandatory external reporting, the three-tier timeline structure (2, 10, and 15 days), what a compliant report must contain, and how to wire your MLOps alerting stack so the escalation decision is made consistently — not case-by-case in a crisis.

The Two-Layer Architecture: Art.72 Monitors, Art.73 Reports

Understanding the relationship between these two articles prevents both over-reporting (treating every drift alert as an incident) and under-reporting (treating actual incidents as internal optimization issues).

Art.72 creates the continuous monitoring obligation. Providers of high-risk AI systems must implement a PMS system that proactively collects and reviews data about system performance throughout the operational lifetime. This is your internal radar: drift detection, fairness metrics, performance degradation tracking, user feedback analysis. Most of what your PMS catches will be handled through your internal corrective action process — adjust thresholds, retrain models, update documentation.

Art.73 creates the external reporting obligation. When the PMS system — or any other source — reveals that your high-risk AI system has caused or contributed to a "serious incident" as defined in Art.3(49), you must notify the market surveillance authority of the affected Member State. This is mandatory. There is no internal-only option for qualifying incidents.

The practical challenge: Art.73 doesn't give you a performance metric threshold. It doesn't say "if accuracy drops below X%, report it." Instead, it defines serious incidents in terms of harm outcomes. Your PMS generates performance data; your escalation decision logic must translate that data into harm outcome assessment.

What Art.3(49) Defines as a Serious Incident

The EU AI Act Art.3(49) definition of "serious incident" covers any incident or malfunctioning of an AI system that directly or indirectly leads to:

(a) The death of a person, or serious harm to a person's health. The definition does not enumerate harm types; in practice, treat physical injury and serious psychological harm as in scope. This category needs the most attention for healthcare AI and for any system whose output feeds downstream decisions that affect a person's physical welfare.

(b) A serious and irreversible disruption of the management or operation of critical infrastructure. This is relevant for AI systems deployed in energy, transport, water, financial, health, or digital infrastructure sectors. Operators in these sectors may also have incident reporting duties under NIS2, so a single event can trigger parallel reports to different authorities.

(c) Infringement of obligations under Union law intended to protect fundamental rights — this is the category most commonly triggered by bias and discrimination findings from your PMS. If your production AI demonstrably discriminates against persons based on protected characteristics (race, ethnicity, religion, disability, age, sex, sexual orientation, nationality) and that discrimination constitutes a violation of applicable anti-discrimination law, you have a serious incident under Art.3(49)(c).

(d) Serious harm to property or the environment. This is relevant for autonomous systems, physical AI applications, or AI-controlled equipment where operational failure could cause material damage.

What is NOT a serious incident under this definition:

Accuracy degradation that doesn't causally connect to any of the four harm categories above
Drift detected early enough that no real-world decisions were affected
Performance issues in test/staging environments
Near-misses where harm was possible but didn't occur (though near-misses are still worth recording internally)

The Escalation Decision Tree

Building a reliable escalation process means making the decision logic explicit before an incident occurs. The following pipeline should be formalized in your incident response procedures, not improvised when alerts fire.

Stage 1: PMS Alert → Harm Assessment

Every PMS alert that crosses a configured threshold triggers harm assessment. The assessment asks a single question: has this system malfunction or performance degradation caused or contributed to a harm outcome in categories (a)-(d) above?

Input: your PMS finding (drift magnitude, bias metric value, performance metric, user complaint)
Required: mapping from PMS metric to potential harm pathway

For each high-risk use case in your system, you should have a documented harm pathway map created during risk assessment under Art.9. This map connects performance degradation types to downstream harm scenarios. During harm assessment, you check whether your current PMS finding falls within a documented harm pathway and whether any actual harm has occurred or been reported.
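A harm pathway map can be as simple as a lookup table keyed by use case and alert type. The sketch below is illustrative only: the use-case names, alert types, and category labels are assumptions for the example, not terms from the Act, and real entries should come from your Art.9 risk assessment.

```python
from typing import Optional

# Hypothetical harm pathway map: (use_case, alert_type) -> Art.3(49) category.
# Every entry here is a placeholder for illustration.
HARM_PATHWAYS: dict[tuple[str, str], str] = {
    ("credit_scoring", "bias"): "category_c",
    ("medical_imaging", "performance"): "category_a",
    ("grid_load_forecasting", "drift"): "category_b",
}


def lookup_harm_pathway(use_case: str, alert_type: str) -> Optional[str]:
    """Return the documented harm category, or None if no pathway is documented."""
    return HARM_PATHWAYS.get((use_case, alert_type))
```

A `None` result is itself a finding: the alert falls outside every documented pathway, which may mean the risk assessment needs a new entry.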

Decision: NO — PMS finding is contained, no evidence of real-world harm, no qualifying pathway. Log internally, escalate through corrective action process. No Art.73 notification required.

Decision: POSSIBLE — PMS finding may have affected real-world decisions during the degraded period. Proceed to Stage 2.

Decision: YES — PMS finding has caused or contributed to documented harm in categories (a)-(d). Proceed to Stage 3 (immediate).
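The three outcomes above can be encoded so that the same answers always produce the same decision. This is a minimal sketch under stated assumptions: the function name, the enum values, and the rule that a "low_risk" pathway never reaches POSSIBLE are choices made for the example, not requirements of the Act.

```python
from enum import Enum


class Stage1Decision(Enum):
    NO = "log_and_corrective_action"
    POSSIBLE = "stage_2_investigation"
    YES = "stage_3_reporting"


def stage1_decision(
    harm_pathway: str,
    reached_real_world_decisions: bool,
    harm_documented: bool,
) -> Stage1Decision:
    """Map the harm-assessment answers to a Stage 1 outcome."""
    # Documented harm always wins, regardless of the pathway label.
    if harm_documented:
        return Stage1Decision.YES
    # Degraded outputs reached real decisions on a non-trivial pathway.
    if reached_real_world_decisions and harm_pathway != "low_risk":
        return Stage1Decision.POSSIBLE
    # Contained finding: no real-world exposure or a low-risk pathway.
    return Stage1Decision.NO
```

Keeping the rule in code makes the outcome auditable: the inputs recorded at assessment time fully explain why a finding was or was not escalated.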

Stage 2: Impact Scope Investigation

For POSSIBLE findings, the goal is to determine whether actual harm occurred during the period of degraded performance. This investigation should have a defined time limit (24-48 hours maximum) because Art.73 reporting timelines run from when you "became aware" of the serious incident — and deliberately slow investigations don't reset the clock.

Scope investigation steps:

Identify the affected decision window: what time period was the system operating in degraded state?
Identify affected users/decisions: how many real-world decisions were output by the degraded system?
Query feedback channels for harm indicators: support tickets, complaints, third-party reports
For algorithmic bias incidents: sample affected decisions to determine whether discriminatory outcomes actually occurred in deployment

Investigation outcome: NO harm confirmed — proceed to Corrective Action, document investigation results, no Art.73 notification.

Investigation outcome: HARM CONFIRMED — proceed to Stage 3.
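The first two investigation steps reduce to filtering a decision log by the degraded window. The sketch below assumes a hypothetical in-memory log with a `made_at` timestamp and an `automated` flag per decision; a real system would query its own audit store instead.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Decision:
    decision_id: str
    made_at: datetime
    automated: bool  # True if no human reviewed the output


def affected_decisions(
    log: list[Decision], window_start: datetime, window_end: datetime
) -> list[Decision]:
    """Return decisions made inside the degraded window (bounds inclusive)."""
    return [d for d in log if window_start <= d.made_at <= window_end]


def automated_only(decisions: list[Decision]) -> list[Decision]:
    """Narrow to decisions that had no human review, the highest-exposure subset."""
    return [d for d in decisions if d.automated]
```

The resulting lists give the affected count for the incident record and the population from which to sample when checking for discriminatory outcomes.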

Stage 3: Serious Incident Determination and Reporting Initiation

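Once you have confirmed that a qualifying harm occurred, you have a serious incident to report. Treat the reporting deadline as already running from the moment you became aware of the incident, not from the moment the investigation finished. At this point, two things happen simultaneously: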

Legal: the Art.73 notification process begins
Technical: the system should be flagged for corrective action (which may include suspension pending investigation)

The Three-Tier Reporting Timeline

Art.73 sets outer reporting limits that depend on the type of incident. In every case the report is due immediately once you have established a causal link between the AI system and the incident, or the reasonable likelihood of one. The day counts below are the latest permissible dates, not targets.

2 days: for a widespread infringement, or for a serious incident under Art.3(49)(b), meaning a serious and irreversible disruption of critical infrastructure (Art.73(3)). This is the emergency track.

10 days: where a person has died (Art.73(4)). Report as soon as you establish, or suspect, a causal relationship between the system and the death.

15 days: the general limit for every other serious incident, including serious harm to health, infringement of fundamental rights obligations, and serious harm to property or the environment (Art.73(2)). This is the primary track for most high-risk AI incident reports.

Where necessary to report on time, Art.73(5) allows an initial report that is incomplete, followed by a complete report as the investigation progresses. The initial report must contain the information available at that point.

These timelines run from the moment you became aware of the serious incident, not from when harm occurred and not from when the investigation is complete. If your PMS system detects a drift event and harm assessment determines that a serious incident likely occurred, the clock starts at detection. This creates a direct incentive for fast harm assessment: the slower your Stage 2 investigation, the less time you have to prepare a complete notification.

Calendar days: Art.73 states these limits in days and does not say working days. Plan on calendar days, including weekends and public holidays, unless the competent authority's guidance says otherwise.

What Your Art.73 Notification Must Contain

The notification to the national market surveillance authority must include:

System identification: name and version of the AI system, provider details, EU AI Database registration number (required for high-risk systems), intended purpose and deployment context.

Incident description: factual account of what happened, when it happened, what harm occurred or may have occurred, how many persons were affected, and the causal pathway from system malfunction to harm.

PMS data supporting the incident: the monitoring data, drift metrics, or performance indicators that correspond to the incident period. This is where your PMS infrastructure pays off — you need structured, timestamped records of system performance leading up to and during the incident.

Immediate corrective actions taken: what you did when you identified the incident (suspend the system, apply a hotfix, notify affected users). Demonstrating that you responded promptly and proportionately is a material factor in how regulators assess the incident.

Root cause (preliminary or confirmed): what caused the malfunction. At the initial notification stage, this may be preliminary. Updated root cause analysis can follow in a supplementary notification.

Preventive measures: what you will do or have done to prevent recurrence. For serious incidents, regulators will expect to see systematic preventive measures, not just reactive fixes.
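A simple completeness check against the six content areas above helps avoid sending a notification with a blank section. This is a sketch: the section keys are hypothetical names chosen for the example, and the actual reporting format is whatever the receiving authority prescribes.

```python
# Hypothetical keys, one per content area listed above.
REQUIRED_SECTIONS = (
    "system_identification",
    "incident_description",
    "pms_data",
    "immediate_corrective_actions",
    "root_cause",
    "preventive_measures",
)


def missing_sections(draft: dict[str, str]) -> list[str]:
    """List required sections that are absent or blank in a notification draft."""
    return [s for s in REQUIRED_SECTIONS if not draft.get(s, "").strip()]
```

For an initial notification, a non-empty result does not have to block sending; it tells you which sections to mark as preliminary and follow up on.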

Wiring Your MLOps Stack to the Escalation Pipeline

The escalation decision process described above should be encoded in your incident response system, not kept in a document that nobody reads during a crisis. Practical implementation approaches:

Alert Classification at Source

Configure your PMS monitoring system to classify alerts by potential harm pathway before they reach your operations team. Add a harm_pathway field to every alert:

from dataclasses import dataclass


@dataclass
class PMSAlert:
    alert_type: str  # "drift" | "bias" | "performance" | "fairness"
    metric: str
    current_value: float
    threshold: float
    harm_pathway: str  # "category_a" | "category_c" | "low_risk" | "unknown"
    requires_immediate_harm_assessment: bool
    auto_escalate_after_hours: int  # if assessment not completed, escalate

Alerts on harm_pathway = "low_risk" go to standard corrective action queue. Alerts on category_a or category_c pathways go directly to the harm assessment workflow with a timer.
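The routing rule can be sketched as a small function. The queue names and the 24-hour default are assumptions for the example, and sending "unknown" pathways to harm assessment is a conservative choice rather than something the text above prescribes.

```python
from datetime import datetime, timedelta
from typing import Optional


def route_alert(
    harm_pathway: str, fired_at: datetime, assessment_hours: int = 24
) -> tuple[str, Optional[datetime]]:
    """Return (queue name, harm-assessment deadline or None)."""
    if harm_pathway == "low_risk":
        # Standard corrective action, no assessment timer.
        return "corrective_action_queue", None
    # Any other pathway, including "unknown", is assessed against a timer.
    return "harm_assessment_workflow", fired_at + timedelta(hours=assessment_hours)
```

The returned deadline is what an on-call tool would watch: if the assessment is still open when it passes, the alert escalates automatically.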

Incident Record Structure

Create a formal incident record the moment a potential Art.73 event is identified. This record drives the investigation and becomes the basis for the notification:

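from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class IncidentRecord:
    incident_id: str
    detection_timestamp: datetime  # when PMS alert fired; clock starts here
    pms_alert_ref: str
    harm_assessment_due: datetime  # detection_timestamp + 24h max
    initial_report_due: Optional[datetime] = None  # populated once harm confirmed
    report_tier: Optional[str] = None              # "2_day" | "10_day" | "15_day"

    # Investigation outputs (None until the investigation fills them in)
    harm_confirmed: Optional[bool] = None
    harm_category: Optional[str] = None
    affected_users_count: Optional[int] = None
    affected_decision_window_start: Optional[datetime] = None
    affected_decision_window_end: Optional[datetime] = None

    # Report tracking
    initial_notification_sent: Optional[datetime] = None
    supplementary_notifications: list[datetime] = field(default_factory=list)
    msa_contact: Optional[str] = None  # national market surveillance authority
    eu_ai_database_submission_id: Optional[str] = None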

Notification Routing by Jurisdiction

Art.73 requires notification to the market surveillance authority of the Member State where the incident occurred — not your provider jurisdiction. For systems deployed across multiple EU Member States, your escalation pipeline must identify which national authority to notify based on where affected users are located.

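Maintain a routing table. The authority names below are illustrative; confirm each Member State's designated market surveillance authority before relying on them: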

Member State	Market Surveillance Authority	Contact
Germany	BNetzA / BSI	[authority contact]
France	ANSSI / DDPP	[authority contact]
Netherlands	ACM	[authority contact]
...	...	...

For incidents affecting users in multiple Member States, notify all relevant authorities and indicate the cross-border nature of the incident. The lead authority will typically be where the most significant harm occurred.
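Selecting the authorities to notify is a lookup over the distinct Member States of affected users. The sketch below assumes a hypothetical contacts table keyed by ISO country code and takes it as a parameter; the contact values are placeholders, not real addresses.

```python
def authorities_to_notify(
    affected_countries: list[str], contacts: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Return (contact by country, countries with no contact on file)."""
    found: dict[str, str] = {}
    missing: list[str] = []
    # De-duplicate and sort so the result is stable across runs.
    for country in sorted(set(affected_countries)):
        if country in contacts:
            found[country] = contacts[country]
        else:
            missing.append(country)
    return found, missing
```

A non-empty `missing` list should page someone immediately: it means users were affected in a jurisdiction for which no authority contact has been prepared.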

Automated Deadline Tracking

Once an incident record is created with a confirmed harm finding, the reporting deadlines are deterministic. Automate deadline calculation and send Slack/PagerDuty alerts to responsible parties:

from datetime import datetime, timedelta


def calculate_report_deadlines(incident: IncidentRecord) -> dict:
    # Conservative choice: treat PMS detection as the moment of awareness
    aware_timestamp = incident.detection_timestamp

    # Outer limits under Art.73(2)-(4), counted in calendar days
    if incident.harm_category in ["category_b_infrastructure", "widespread_infringement"]:
        limit_days = 2
    elif incident.harm_category == "category_a_death":
        limit_days = 10
    else:
        limit_days = 15  # general limit for all other serious incidents
    initial_due = aware_timestamp + timedelta(days=limit_days)

    # Art.73(5) permits an incomplete initial report followed by a complete one;
    # the Act sets no separate day count for the complete report.
    return {
        "initial_notification_due": initial_due,
        "days_remaining": (initial_due - datetime.now(tz=initial_due.tzinfo)).days,
    }

Practical Escalation Scenarios

Scenario 1: Credit Scoring Bias → Art.3(49)(c) Incident

Your Art.72 PMS detects that the demographic parity ratio for loan approval recommendations has dropped from 0.92 to 0.74 over the past 3 weeks. Your monitoring finds that this coincides with a feature drift in employment history categorization.

Is this a serious incident? Depends on deployment reality:

If the model output was used in automated decisions affecting loan approvals, and the bias affected protected group members, this likely constitutes an infringement of anti-discrimination obligations under Art.3(49)(c).
If the model was advisory only and human reviewers consistently overrode biased recommendations, harm may not have materialized.

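Action: Stage 2 investigation immediately. Query affected decision records. If automated decisions were influenced by biased outputs, report as soon as the causal link is established and no later than the 15-day general limit (2 days if the case amounts to a widespread infringement).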
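For reference, one common way to compute a demographic parity ratio is the lowest group approval rate divided by the highest. The post does not say which definition its 0.92 and 0.74 figures use, so treat the formula below as an assumption; the group labels are placeholders.

```python
def demographic_parity_ratio(outcomes: dict[str, tuple[int, int]]) -> float:
    """Lowest group approval rate divided by the highest.

    outcomes maps a group label to (approved_count, total_count).
    """
    rates = []
    for group, (approved, total) in outcomes.items():
        if total <= 0:
            raise ValueError(f"group {group!r} has no decisions")
        rates.append(approved / total)
    if not rates or max(rates) == 0:
        raise ValueError("ratio undefined: no groups or no approvals")
    return min(rates) / max(rates)
```

Running this over the decisions inside the affected window, rather than over the whole history, shows whether the bias actually reached deployed decisions.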

Scenario 2: Medical AI Diagnostic Drift → Potential Art.3(49)(a) Incident

Your imaging AI system shows 8% performance degradation on a specific lesion type. You discover this is because a scanner firmware update at one hospital changed image preprocessing in a way your model wasn't trained for.

Is this a serious incident? Depends on clinical impact:

If no missed diagnoses occurred during the degraded period (clinical review shows no adverse outcomes), this may not constitute a serious incident — though it requires immediate corrective action and documentation.
If clinical investigation reveals that the degradation contributed to delayed or missed diagnoses, you likely have a category (a) serious incident.

Action: Immediately notify clinical partners to review cases from the affected period. Suspend model use at the affected site. If clinical harm is confirmed, report immediately and no later than 15 days after becoming aware, or 10 days if a patient has died.

Scenario 3: Employment Screening Accuracy Drop → Investigation Required

Your HR screening AI accuracy drops from 89% to 81% after a retraining cycle. No bias indicators are elevated, but overall performance is lower.

Is this a serious incident? Probably not, if:

The accuracy drop doesn't disproportionately affect candidates with protected characteristics
Human reviewers are in the loop for all final decisions
No candidates have been demonstrably disadvantaged relative to the pre-degradation baseline

Action: Standard corrective action. Retrain or rollback. Document in PMS records. No Art.73 notification unless subsequent investigation reveals discriminatory impact.

Integration with Your Technical Documentation Requirements

Under Art.11 and Annex IV, your technical documentation must cover the post-market monitoring system design. Following an Art.73 incident, update your documentation to reflect:

The incident details and confirmed root cause
Changes made to the PMS system as a result (new monitoring metrics, adjusted thresholds)
Changes to risk management documentation (new risk scenarios identified)
Updates to the intended purpose or conditions of use if the incident revealed deployment context assumptions were incorrect

Regulators reviewing your documentation after an incident will check whether your PMS design anticipated the category of risk that materialized. If it didn't, that's a documentation gap under Art.9/Art.11. If it did but your alert thresholds weren't calibrated to catch the actual failure, that's a PMS design gap.

The Near-Miss Track

Art.3(49) defines a serious incident by what it led to. An event that could have caused serious harm but didn't (for example, because a clinician happened to double-check a recommendation that would otherwise have caused harm) does not by itself meet that definition, and Art.73 sets no separate near-miss deadline. Check whether sector-specific rules that apply to your product impose their own near-miss reporting.

Log near-misses internally with the same information structure as actual harm incidents, and treat them as a learning input: if your system nearly caused harm that you only avoided due to luck or non-systematic human intervention, that's a signal that your Art.9 risk management and Art.14 human oversight design need revision.

Checklist: PMS to Art.73 Escalation Readiness

Before August 2026, verify your escalation pipeline covers:

Decision criteria:

Documented harm pathway map for each high-risk AI use case (Art.9 risk assessment)
Written escalation procedures mapping PMS alert types to harm assessment triggers
Defined time limit for Stage 2 harm investigation (max 24-48 hours)
Clear definition of what evidence confirms "harm occurred" for each use case

Timeline management:

Automated deadline calculation from detection timestamp
Escalation alerts at 50% and 75% of deadline consumed
Template for initial notification (incomplete but timely vs. complete but late)
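The 50% and 75% escalation points follow directly from the awareness timestamp and the deadline. A minimal sketch, with the function name and default fractions chosen for the example:

```python
from datetime import datetime


def escalation_checkpoints(
    aware_at: datetime, due_at: datetime, fractions: tuple[float, ...] = (0.5, 0.75)
) -> list[datetime]:
    """Return the timestamps at which each fraction of the deadline is consumed."""
    if due_at <= aware_at:
        raise ValueError("deadline must be after the awareness timestamp")
    total = due_at - aware_at
    # timedelta supports multiplication by a float.
    return [aware_at + total * f for f in fractions]
```

Scheduling these timestamps when the incident record is created means the reminders do not depend on anyone remembering to set them during the incident.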

Reporting infrastructure:

EU AI Database account and system registration complete
National market surveillance authority contacts for each deployment jurisdiction
Incident record structure captures all required notification fields
PMS data export function for attaching monitoring evidence to report

Documentation:

Technical documentation includes Art.73 incident history (if applicable)
Post-incident review process for PMS threshold adjustment
Near-miss logging process separate from serious incident track

Next in the series: Post #5/5 — The Complete PMS Developer Checklist: 40-Point Production Readiness for August 2026. Consolidates all five posts into a single pre-deployment checklist.

Previous posts: Post #1: Art.72 PMS Plan Requirements — Post #2: MLOps Drift Detection Implementation — Post #3: Bias Monitoring in Production
