Fair Use & AI Training
Understanding whether AI training on copyrighted works qualifies as fair use under U.S. copyright law
Is AI Training on Copyrighted Works Fair Use?
According to the U.S. Copyright Office's May 2025 report: "Some uses of copyrighted works for generative AI training will qualify as fair use, and some will not." The answer depends on the specific circumstances of each case.
What Is Fair Use?
Fair use is a legal doctrine (17 U.S.C. § 107) that permits certain uses of copyrighted material without obtaining permission from the copyright owner. It's a limited exception to copyright protection, not a blanket right.
Traditional Fair Use Examples
- Criticism and commentary
- News reporting
- Teaching and scholarship
- Research
- Parody
Important: Fair use is not a simple checklist—it requires balancing multiple factors and considering the specific context of each use.
The Four Fair Use Factors Applied to AI Training
Courts analyze four statutory factors when determining fair use. Here's how they apply to AI training:
Purpose and Character of the Use
How This Applies to AI Training:
Favoring Fair Use:
- AI companies argue training is "highly transformative"—it extracts statistical patterns rather than copying expressive content for consumption
- Two June 2025 decisions found AI training transformative: "spectacularly" so in Bartz v. Anthropic and "highly" so in Kadrey v. Meta
- Training involves analyzing patterns across massive datasets rather than replacing the original works
Against Fair Use:
- AI training is conducted by for-profit companies for commercial purposes
- The Copyright Office stated that training to produce content that competes with original works is "at best, modestly transformative"
- AI companies profit from models trained on creators' works without compensation
Nature of the Copyrighted Work
How This Applies to AI Training:
Generally Against Fair Use:
- Most AI training involves highly creative works—novels, artwork, music, photographs, articles
- Creative works receive stronger copyright protection than factual works
- Courts typically weigh this factor against fair use when creative expression is involved
Nuance:
- Many training datasets include published works available online, which somewhat favors fair use
- However, availability doesn't equal permission to copy
Amount and Substantiality of the Portion Used
How This Applies to AI Training:
Against Fair Use:
- AI training typically involves copying entire works—complete books, full images, entire articles
- Companies copy millions of complete works into training datasets
- This is the maximum amount possible, which typically weighs against fair use
AI Company Argument:
- Copying the entire work may be necessary for the transformative purpose of pattern analysis
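- Courts have sometimes found that copying entire works can be fair use when necessary for a transformative purpose, as in Authors Guild v. Google, where scanning whole books to build a search index was held to be fair use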
Effect on the Market
How This Applies to AI Training:
This factor is proving critical in AI cases, and courts are split on the analysis:
Favoring Fair Use (Meta Case):
- In Kadrey v. Meta (June 2025), the court found fair use partly because authors failed to prove Meta's AI harmed sales of their specific books
- AI companies argue their models don't replace the market for individual works
Against Fair Use:
- The Copyright Office concluded that AI training to produce competing content likely harms the market for original works
- AI outputs can "significantly dilute the market" for creators' works (acknowledged even by Judge Chhabria in the Meta case)
- The New York Times argues AI creates a "market substitute" for news content
- Creators lose potential licensing revenue when AI companies train without permission
- In Thomson Reuters v. ROSS (Feb. 2025), the court found market harm where an AI tool competed with the original Westlaw product
U.S. Copyright Office Guidance (May 2025)
Key Conclusions from the 108-Page Report
Transformative Nature
GenAI training on large, diverse datasets "will often be transformative," but transformativeness alone is insufficient to justify fair use.
Commercial Competition
Using copyrighted materials to train AI that generates content competing with original works goes "beyond the scope of the fair use doctrine."
Case-by-Case Analysis
Fair use determinations must be made on a case-by-case basis considering all four statutory factors and the specific circumstances.
Lawful Acquisition
How the training data was acquired matters: knowingly using pirated or illegally accessed works should weigh against fair use, although the report does not treat unlawful sourcing as automatically decisive.
The Copyright Office's Position:
Some AI training uses may qualify as fair use, particularly non-commercial research or training that does not produce competing outputs. Commercial AI companies that train models to generate content competing with, and potentially displacing, original copyrighted works are unlikely to succeed with a fair use defense.
How Courts Are Ruling on Fair Use
Federal courts have reached different conclusions, showing this is an unsettled area of law
Cases Finding Fair Use
Bartz v. Anthropic (June 2025)
Judge Alsup found training "spectacularly transformative" and therefore fair use, but held that acquiring and keeping pirated copies of books was not fair use.
Kadrey v. Meta (June 2025)
Judge Chhabria found training "highly transformative" and held that the plaintiffs failed to prove market harm, while acknowledging that AI could "dilute" markets for creators' works.
Cases Rejecting Fair Use
Thomson Reuters v. ROSS (Feb. 2025)
Delaware federal court rejected fair use defense where AI training created a competing product.
NYT v. OpenAI (Ongoing)
The court denied a motion to dismiss, allowing copyright claims to proceed despite fair use arguments. The fair use question itself has not yet been decided.
Key Insight: These district court decisions are not binding on other courts. Different judges are reaching different conclusions on similar facts. This inconsistency suggests the issue will require resolution by appellate courts or potentially the Supreme Court.
What This Means for You
If You're a Creator
- Fair use is not a guaranteed defense for AI companies that used your work
- Recent guidance from the Copyright Office supports creators' rights, especially when AI outputs compete with original works
- Courts are split, but the Copyright Office and several judges have recognized that AI training can cause market harm to creators
- Even where training is found to be fair use, large settlements (such as Anthropic's $1.5 billion settlement) show creators can still recover compensation
- You should consult with an attorney to evaluate whether you have claims worth pursuing
If You're Using AI
- Don't assume AI company training was legal just because they claim fair use
- If you're using AI outputs commercially, be aware that the underlying training may be subject to legal challenge
- Consider whether AI-generated content might infringe on specific creators' copyrights
- Monitor developments in AI copyright law as this area evolves rapidly
Fair Use Law Is Actively Developing
The application of fair use to AI training is one of the most hotly contested questions in copyright law today. With courts reaching different conclusions, Copyright Office guidance favoring creators in many scenarios, and multiple appeals expected, this area will continue to develop significantly over the next few years.
Stay Informed: The balance between transformative use and market harm is being actively litigated. New decisions could shift the landscape substantially.