On 10 April 2026, the European Commission published the first monitoring results under the revised Code of Conduct on Countering Illegal Hate Speech Online — the so-called "Code of Conduct+" that was folded into the Digital Services Act co-regulatory architecture on 20 January 2025. The headline is reassuring: of twelve signatory platforms, the five that received enough notices to be measured (Facebook, Instagram, TikTok, X and YouTube) largely met their commitment to review the majority of flagged content within 24 hours. The interesting result is buried lower in the release: every assessed platform except X classified a "significant share" of flagged cases as disputed or as outright errors.
The strongest case for the Code+
Before picking at the regime, it is worth stating fairly why Brussels built it. Illegal hate speech in Europe is not a rhetorical category: it is criminalised under Council Framework Decision 2008/913/JHA, which obliges Member States to punish public incitement to violence or hatred against groups defined by race, colour, religion, descent, or national or ethnic origin. Trusted reporters had spent years complaining that platforms responded slowly, inconsistently, and without feedback. Embedding the 2016 voluntary code into the DSA gives regulators a metric (the 24-hour clock), a public scoreboard, and a way for the Commission to point to compliance evidence when it audits very large online platforms (VLOPs). That is a real upgrade in process discipline, and the platforms' decision to sign, X's included, is itself a vote of confidence.
What the monitoring actually showed
The "mystery shopping" exercise ran from early November to mid-December 2025. Monitoring reporters — vetted civil-society and public-sector entities — submitted notifications and the Commission measured both response time and outcome. Per the Commission's announcement, the assessed platforms hit the 24-hour review target and provided systematic feedback. But four of the five — every platform other than X — flagged a substantial portion of submissions as disputed or as procedural errors. The Commission attributes some of the "error" volume to reporters using the wrong notification channel; it offers no equivalent explanation for the "disputed" share.
That divergence is the story. Either X is being unusually deferential in accepting the reporters' legal characterisation, or, more plausibly given X's well-publicised stance, the other four platforms are exercising more independent judgement over what national hate-speech law actually requires them to take down. Neither option flatters the framework.
The definitional problem the Code+ does not solve
The Code's whole architecture rests on the assumption that "illegal hate speech" is a stable, knowable category that a platform can adjudicate in 24 hours. It is not. Framework Decision 2008/913/JHA harmonises only a narrow core, and Member States have transposed it inconsistently: some add sexual orientation, gender identity, or disability as protected grounds; some criminalise genocide denial broadly, others narrowly. A post that is plainly criminal in Germany may be lawful, even constitutionally protected political speech, in Ireland or Sweden. A trusted-flagger network operating across all 27 jurisdictions is, by construction, going to disagree with platforms about edge cases. That four out of five platforms pushed back on "significant shares" of notices is not a compliance failure; it is the system surfacing exactly the contested judgements that courts, not 24-hour content reviewers, should be making.
Why the 24-hour metric pulls the wrong way
DSA Article 16 already obliges hosting providers to act on notices of illegal content, and Articles 34–35 require VLOPs to mitigate "systemic risks", a category whose breadth has long worried free-expression analysts. The CSIS Europe programme has noted that the DSA's hate-speech provisions risk "enforcement shaped by political or cultural biases rather than clear legal standards," and that the threat of fines of up to 6% of global annual turnover creates an incentive structure that tilts toward over-removal. Layering a 24-hour stopwatch on top of that incentive does not improve accuracy; it reduces the time available for the kind of contextual review (irony, news reporting, counter-speech, political satire) that distinguishes a slur from a quotation of one.
The EFF, hardly an apologist for Big Tech, has been making the parallel case in its recommendations on the Digital Fairness Act: EU digital rule-making increasingly piles structural compliance obligations on intermediaries without specifying how to protect the speech being moderated in the process.
A proportionate path forward
The Commission deserves credit for publishing the numbers honestly — including the X anomaly — rather than claiming a uniform win. But the 2026 monitoring cycle should pivot. Three adjustments would keep the discipline while restoring proportion. First, treat dispute rates as a feature of healthy adversarial review, not a defect; the metric of interest is whether disputes are reasoned and consistently applied, not whether they happen. Second, publish per-Member-State breakdowns so it is visible when a notice was rejected because the alleged offence is not criminal in the user's jurisdiction. Third, separate the procedural commitment (timeliness, feedback, transparency) from the substantive question (was this actually illegal), and reserve the latter for courts and national regulators rather than co-regulatory pressure on private moderation queues.
The Code+ has done the easy part — it has made platforms answer the phone within a day. The harder part, which Brussels has not yet attempted, is admitting that what counts as illegal hate speech in Europe is not a question a content reviewer can answer in 24 hours, and building a regime that respects that limit instead of papering over it.