Fair Use & AI Training

Understanding whether AI training on copyrighted works qualifies as fair use under U.S. copyright law

Is AI Training on Copyrighted Works Fair Use?

According to the U.S. Copyright Office's May 2025 report: "Some uses of copyrighted works for generative AI training will qualify as fair use, and some will not." It depends on the specific circumstances of each case.

What Is Fair Use?

Fair use is a legal doctrine (17 U.S.C. § 107) that permits certain uses of copyrighted material without obtaining permission from the copyright owner. It's a limited exception to copyright protection, not a blanket right.

Traditional Fair Use Examples

Criticism and commentary
News reporting
Teaching and scholarship
Research
Parody

Important: Fair use is not a simple checklist—it requires balancing multiple factors and considering the specific context of each use.

The Four Fair Use Factors Applied to AI Training

Courts analyze four statutory factors when determining fair use. Here's how they apply to AI training:

Purpose and Character of the Use

Is the use transformative? Is it commercial?

How This Applies to AI Training:

Favoring Fair Use:

AI companies argue training is "highly transformative"—it extracts statistical patterns rather than copying expressive content for consumption
Some courts have agreed: in June 2025, Kadrey v. Meta called AI training "highly" transformative, and Bartz v. Anthropic called it transformative, "spectacularly so"
Training involves analyzing patterns across massive datasets rather than replacing the original works

Against Fair Use:

AI training is conducted by for-profit companies for commercial purposes
The Copyright Office stated that training to produce content that competes with original works is "at best, modestly transformative"
AI companies profit from models trained on creators' works without compensation

Nature of the Copyrighted Work

Is the original work creative or factual? Published or unpublished?

How This Applies to AI Training:

Generally Against Fair Use:

Most AI training involves highly creative works—novels, artwork, music, photographs, articles
Creative works receive stronger copyright protection than factual works
Courts typically weigh this factor against fair use when creative expression is involved

Nuance:

Many training datasets consist of works that were already published, and prior publication weighs somewhat in favor of fair use
However, availability doesn't equal permission to copy

Amount and Substantiality of the Portion Used

How much of the original work was used?

How This Applies to AI Training:

Against Fair Use:

AI training typically involves copying entire works—complete books, full images, entire articles
Companies copy millions of complete works into training datasets
This is the maximum amount possible, which typically weighs against fair use

AI Company Argument:

Copying the entire work may be necessary for the transformative purpose of pattern analysis
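Courts have sometimes found that copying entire works can be fair use when necessary for a transformative purpose, as in Authors Guild v. Google (2d Cir. 2015), where scanning whole books to build a search index was held to be fair use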

Effect on the Market

Does the use harm the market for the original work?

How This Applies to AI Training:

This factor is proving critical in AI cases, and courts are split on the analysis:

Favoring Fair Use (Meta Case):

In Kadrey v. Meta (June 2025), the court found fair use partly because authors failed to prove Meta's AI harmed sales of their specific books
AI companies argue their models don't replace the market for individual works

Against Fair Use:

The Copyright Office concluded that AI training to produce competing content likely harms the market for original works
AI outputs can "significantly dilute the market" for creators' works (acknowledged even by Judge Chhabria in the Meta case)
The New York Times argues AI creates a "market substitute" for news content
Creators lose potential licensing revenue when AI companies train without permission
In Thomson Reuters v. ROSS (Feb. 2025), the court found market harm where the AI tool competed with the original Westlaw product

U.S. Copyright Office Guidance (May 2025)

Key Conclusions from the 108-Page Report

Transformative Nature

GenAI training on large, diverse datasets "will often be transformative," but transformativeness alone is insufficient to justify fair use.

Commercial Competition

Making commercial use of vast troves of copyrighted works to train AI that generates content competing with those works in existing markets, especially where the works were accessed illegally, "goes beyond established fair use boundaries."

Case-by-Case Analysis

Fair use determinations must be made on a case-by-case basis considering all four statutory factors and the specific circumstances.

Lawful Acquisition

How the training material was obtained matters. In the Office's view, knowingly using pirated or illegally accessed works should weigh against fair use, although it is not determinative on its own.

The Copyright Office's Position:

While some AI training uses may qualify as fair use (particularly for non-commercial research or training that doesn't produce competing outputs), commercial AI companies training models to generate content that competes with and potentially displaces original copyrighted works are unlikely to succeed with fair use defenses.

How Courts Are Ruling on Fair Use

Federal courts have reached different conclusions, which shows that this is an unsettled area of law.

Cases Finding Fair Use

Bartz v. Anthropic (June 2025)

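Judge Alsup held that using books to train the model was fair use, calling the use transformative, "spectacularly so," but ruled that downloading and retaining pirated copies for a central library was not fair use.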

Kadrey v. Meta (June 2025)

Judge Chhabria found training "highly transformative" and held that the plaintiffs had failed to prove market harm, while acknowledging that AI could "dilute" markets for original works.

Common Reasoning: Training doesn't copy for consumption but extracts patterns; insufficient evidence of specific market harm to individual works.

Cases Rejecting Fair Use

Thomson Reuters v. ROSS (Feb. 2025)

A Delaware federal court rejected the fair use defense where the AI training was used to create a competing product.

NYT v. OpenAI (Ongoing)

The court denied the motion to dismiss the core copyright claims, allowing them to proceed despite fair use arguments. It has not yet ruled on the fair use defense itself.

Common Themes: Commercial use to create competing products; actual or alleged market harm; wholesale copying of creative works.

Key Insight: These district court decisions are not binding on other courts. Different judges are reaching different conclusions on similar facts. This inconsistency suggests the issue will require resolution by appellate courts or potentially the Supreme Court.

What This Means for You

If You're a Creator

Fair use is not a guaranteed defense for AI companies that used your work
Recent guidance from the Copyright Office supports creators' rights, especially when AI outputs compete with original works
Courts are split, but momentum is shifting toward recognizing market harm to creators
Even where training itself is found to be fair use, creators can recover compensation on related claims: Anthropic's $1.5B settlement resolved claims over its use of pirated copies
You should consult with an attorney to evaluate whether you have claims worth pursuing

If You're Using AI

Don't assume AI company training was legal just because they claim fair use
If you're using AI outputs commercially, be aware that the underlying training may be subject to legal challenge
Consider whether AI-generated content might infringe on specific creators' copyrights
Monitor developments in AI copyright law as this area evolves rapidly

Fair Use Law Is Actively Developing

The application of fair use to AI training is one of the most hotly contested questions in copyright law today. With courts reaching different conclusions, Copyright Office guidance favoring creators in many scenarios, and multiple appeals expected, this area will continue to develop significantly over the next few years.

Stay Informed: The balance between transformative use and market harm is being actively litigated. New decisions could shift the landscape substantially.
