Fair Use & AI Training
AI companies say "fair use" covers what they did. Courts are split on whether it does. Here's what that means for you.
Is AI Training on Copyrighted Works Fair Use?
According to the U.S. Copyright Office's May 2025 report: "Some uses of copyrighted works for generative AI training will qualify as fair use, and some will not." It depends on the specific circumstances of each case.
What Is Fair Use?
Fair use is a legal carve-out — a set of situations where someone can use copyrighted material without asking first. It covers things like critiquing a book, parodying a song, or quoting an article in a news story. AI companies want it to cover training models on billions of creative works. Courts are still deciding if that's a stretch too far.
Traditional Fair Use Examples
- Criticism and commentary
- News reporting
- Teaching and scholarship
- Research
- Parody
Important: Fair use is not a simple checklist—it requires balancing multiple factors and considering the specific context of each use.
The Four Fair Use Factors Applied to AI Training
Courts analyze four statutory factors when determining fair use. Here's how they apply to AI training:
Purpose and Character of the Use
How This Applies to AI Training:
Favoring Fair Use:
- AI companies argue training is "highly transformative"—it extracts statistical patterns rather than copying expressive content for consumption
- Two district courts (Kadrey v. Meta and Bartz v. Anthropic, both June 2025) found AI training "highly" and "spectacularly" transformative, respectively
- Training involves analyzing patterns across massive datasets rather than replacing the original works
Against Fair Use:
- AI training is conducted by for-profit companies for commercial purposes
- The Copyright Office stated that training to produce content that competes with original works is "at best, modestly transformative"
- AI companies profit from models trained on creators' works without compensation
Nature of the Copyrighted Work
How This Applies to AI Training:
Generally Against Fair Use:
- Most AI training involves highly creative works—novels, artwork, music, photographs, articles
- Creative works receive stronger copyright protection than factual works
- Courts typically weigh this factor against fair use when creative expression is involved
Nuance:
- Many training datasets consist of published works available online, and prior publication somewhat favors fair use
- However, availability doesn't equal permission to copy
Amount and Substantiality of the Portion Used
How This Applies to AI Training:
Against Fair Use:
- AI training typically involves copying entire works—complete books, full images, entire articles
- Companies copy millions of complete works into training datasets
- This is the maximum amount possible, which typically weighs against fair use
AI Company Argument:
- Copying the entire work may be necessary for the transformative purpose of pattern analysis
- Courts have sometimes found that copying entire works can be fair use when necessary for a transformative purpose
Effect on the Market
How This Applies to AI Training:
This factor is proving critical in AI cases, and courts are split on the analysis:
Favoring Fair Use (Meta Case):
- In Kadrey v. Meta (June 2025), the court found fair use partly because authors failed to prove Meta's AI harmed sales of their specific books
- AI companies argue their models don't replace the market for individual works
Against Fair Use:
- The Copyright Office concluded that AI training to produce competing content likely harms the market for original works
- AI outputs can "significantly dilute the market" for creators' works (acknowledged even by Judge Chhabria in the Meta case)
- The New York Times argues AI creates a "market substitute" for news content
- Creators lose potential licensing revenue when AI companies train without permission
- Thomson Reuters v. ROSS (Feb. 2025, now on appeal to the Third Circuit) found market harm when AI competed with the original Westlaw product
U.S. Copyright Office Guidance (May 2025)
Key Conclusions from the 108-Page Report
Transformative Nature
GenAI training on large, diverse datasets "will often be transformative," but transformativeness alone is insufficient to justify fair use.
Commercial Competition
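Making commercial use of vast troves of copyrighted works to produce expressive content that competes with them in existing markets "goes beyond established fair use boundaries," especially where the works were obtained through illegal access.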
Case-by-Case Analysis
Fair use determinations must be made on a case-by-case basis considering all four statutory factors and the specific circumstances.
Lawful Acquisition
Lawful sourcing matters: according to the report, knowingly using pirated or illegally accessed works should weigh against fair use, though it is not by itself determinative.
The Copyright Office's Position:
While some AI training uses may qualify as fair use (particularly for non-commercial research or training that doesn't produce competing outputs), commercial AI companies training models to generate content that competes with and potentially displaces original copyrighted works are unlikely to succeed with fair use defenses.
How Courts Are Ruling on Fair Use
Federal courts have reached different conclusions, showing this is an unsettled area of law.
Cases Finding Fair Use
Bartz v. Anthropic (June 2025)
Judge Alsup found training "spectacularly transformative" and held it to be fair use, but drew a critical line: downloading pirated books and keeping them in a central library was not fair use. This distinction, that the source of training data matters, is the most important fair use development to date. A $1.5B settlement is pending final approval (fairness hearing April 2026).
Kadrey v. Meta (June 2025)
Judge Chhabria found training "highly transformative" and held that the plaintiffs failed to prove market harm, while acknowledging that AI could "dilute" markets. Unlike Judge Alsup, Judge Chhabria held fair use applied regardless of whether training data was pirated, which puts the two rulings in direct conflict.
Cases Rejecting Fair Use
Thomson Reuters v. ROSS (Feb. 2025, now on appeal)
Delaware federal court rejected fair use defense where AI training created a competing product. Now on interlocutory appeal to the Third Circuit—the first AI fair use case to reach a federal appeals court.
NYT v. OpenAI (Summary Judgment April 2026)
The court has not yet ruled on fair use. It denied OpenAI's motion to dismiss, allowing the copyright claims to proceed. The case is heading to summary judgment on April 2, 2026, following the judge's January 2026 order that OpenAI produce 20 million ChatGPT logs.
Key Insight: These district court decisions are not binding on other courts, and different judges are reaching directly conflicting conclusions on similar facts—particularly on whether the source of training data (lawful vs. pirated) matters. The Third Circuit is now considering the Thomson Reuters v. ROSS appeal, making it the first appellate court to weigh in on AI fair use. The Supreme Court denied cert in Thaler (March 2026, on AI authorship), but has not yet addressed AI training fair use. Appellate resolution of the Bartz/Kadrey conflict is essential.
What This Means for You
If You're a Creator
- Fair use is not a guaranteed defense for AI companies that used your work
- Recent guidance from the Copyright Office supports creators' rights, especially when AI outputs compete with original works
- Courts are split, but even a judge who found fair use (Judge Chhabria in the Meta case) acknowledged that AI outputs can dilute the market for creators' works
- Even when fair use is found, massive settlements (like Anthropic's $1.5B, pending final approval with a fairness hearing in April 2026) show creators can recover compensation
- You should consult with an attorney to evaluate whether you have claims worth pursuing
If You're Using AI
- Don't assume AI company training was legal just because they claim fair use
- If you're using AI outputs commercially, be aware that the underlying training may be subject to legal challenge
- Consider whether AI-generated content might infringe on specific creators' copyrights
- Monitor developments in AI copyright law as this area evolves rapidly
Fair Use Law Is Actively Developing
The application of fair use to AI training is one of the most hotly contested questions in copyright law today. Courts have reached conflicting conclusions, particularly on whether pirated training data defeats fair use. The Third Circuit is now reviewing the first AI fair use appeal, and the NYT v. OpenAI summary judgment ruling is approaching in April 2026.
Stay Informed: The balance between transformative use and market harm is being actively litigated. The Bartz/Kadrey conflict on pirated sources, upcoming appellate rulings, and the massive scale of new cases (including the $3.1B Concord II lawsuit) could shift the landscape substantially.
Not Sure How Fair Use Affects Your Situation?
Your situation is specific. A 15-minute conversation with the right attorney can tell you more than an hour of reading.