The Delhi High Court is now the unlikely epicentre of a question that courts in Washington, London and Brussels have struggled to answer: when an AI model learns from copyrighted text, music or news, is that learning an infringement, a transformation, or simply how knowledge has always travelled? The suit brought by Asian News International against OpenAI — now joined by the Federation of Indian Publishers and music majors including T-Series and Saregama — has rapidly evolved from a single news agency's grievance into India's most consequential test of the Copyright Act, 1957 in the age of generative AI.
For a country that aspires to be both a global AI hub and the world's largest cultural exporter, the stakes could hardly be higher. The temptation in New Delhi will be to follow the loudest international examples — aggressive class actions in the United States, restrictive opt-out regimes in the European Union, or the United Kingdom's stalled text-and-data-mining (TDM) reforms. India should resist that temptation. The smarter path is a proportionate, statutory framework that protects creators without strangling a domestic AI industry that has barely learned to walk.
What the case actually asks
ANI's pleadings, as reported by LiveLaw, Bar & Bench and Reuters, allege that OpenAI scraped and stored ANI's news output to train ChatGPT, and that the model occasionally reproduces ANI material verbatim or attributes fabricated quotes to the agency. OpenAI's defence, in line with positions it has taken globally, is twofold: training data is not stored as expressive copy, and Indian courts lack jurisdiction because no model training occurred on Indian soil.
The court has not yet ruled on the merits. But the interventions by publishers and music labels signal that Indian rights-holders intend to convert a single dispute into a doctrinal precedent. If the bench accepts the broadest version of their arguments, the consequences will reach far beyond OpenAI.
Why a maximalist ruling would hurt India
India does not yet have a domestic foundation model at the scale of GPT-4 or Gemini. Its strength lies in fine-tuning, deployment, vernacular models and downstream applications — categories that depend on lawful, affordable access to training corpora. A judgment that treats every act of ingestion as prima facie infringement would do three things at once:
- Entrench incumbents. Only the largest US labs can afford global licensing deals like the ones OpenAI has signed with the Financial Times, Axel Springer and News Corp. Indian start-ups cannot.
- Push training offshore. Models will simply be trained in jurisdictions with clearer safe harbours, and Indian users will consume them as imported services — a worse outcome for Indian creators, who lose both leverage and tax base.
- Chill vernacular AI. Indian-language ('bhasha') models trained on Hindi, Tamil, Bengali and Marathi news and literature are exactly the kind of public-interest projects that an overbroad rule would kill first.
The proportionate path
India's Copyright Act already contains the raw material for a sensible settlement. Section 52's fair-dealing provisions, the compulsory- and statutory-licensing architecture of Sections 31 and 31D, and the tariff-setting powers once exercised by the Copyright Board (and now, after the IPAB's abolition in 2021, by the commercial courts) offer a uniquely flexible toolkit — one the EU's rigid 2019 DSM Directive lacks. Three moves would put India ahead of the global pack:
1. A narrow TDM exception with transparency obligations
Parliament should codify a text-and-data-mining exception for AI training, modelled loosely on Japan's 2018 reform and Singapore's 2021 amendment, but conditioned on disclosure of training-data categories and a workable, machine-readable opt-out for rights-holders who prefer to license commercially. Japan and Singapore — both serious AI economies — have shown this is compatible with strong creator protection.
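In practice, an opt-out need not be invented from scratch: it can build on signals publishers already deploy against AI crawlers. As an illustrative sketch only — GPTBot and Google-Extended are the real robots.txt tokens OpenAI and Google publish for their training crawlers, but the policy choices shown are hypothetical — a rights-holder reserving commercial licensing might declare:

```
# robots.txt — illustrative reservation against AI-training crawlers
# Exclude OpenAI's training crawler site-wide
User-agent: GPTBot
Disallow: /

# Exclude Google's AI-training product token
User-agent: Google-Extended
Disallow: /

# Ordinary search indexing remains permitted
User-agent: *
Allow: /
```

A statutory regime could treat such a machine-readable reservation as effective notice that a work is withheld from the TDM exception, broadly as Article 4(3) of the EU's DSM Directive does for rights reservations.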
2. Statutory licensing for news and music corpora
Section 31D already permits statutory licensing for broadcasting. Extending an analogous mechanism to AI training, with rates fixed by a published tariff, would give publishers and labels a predictable revenue stream without the transaction costs of bilateral deals. It would also avoid the perverse outcome in which ANI is paid by OpenAI while smaller Indian newsrooms get nothing.
3. A 'memorisation' standard, not an 'ingestion' standard
The real harm from generative AI is not learning — it is verbatim regurgitation and unattributed substitution. Courts and regulators should focus liability on outputs that materially reproduce protected expression, as the US Second Circuit gestured at in Authors Guild v. Google (2d Cir. 2015). This is the doctrinal pivot that allows AI to flourish while still punishing genuine free-riding.
The court's real choice
The Delhi High Court is not, of course, a legislature. But its framing will shape the policy debate that follows. A measured interim order — one that distinguishes training from output, asks for transparency rather than injunctions, and signals openness to statutory solutions — would buy India time to legislate properly. A sweeping injunction would lock in litigation as the default mode of AI governance, a path the US has already shown to be slow, expensive and innovation-hostile.
India's policy reputation in the past decade has been built on pragmatic, India-first frameworks: UPI, DPDP, ONDC. AI copyright should be the next entry on that list. The ANI case is not merely a fight between a news agency and a foreign tech company. It is the moment when India decides whether it will be a maker of AI rules or merely a taker of them.