Conspiracies Conspiracy Theories & Facts
The final, tragic end of man (www.msn.com)
posted 3 days ago by Mrexreturns +3 / -0
Comments (5)
Mrexreturns [S] 2 points 3 days ago +2 / -0

Anthropic cut up millions of used books to train Claude — and downloaded over 7 million pirated ones too, a judge said

  • Anthropic bought, cut, and scanned millions of used books for its "research library."
  • The company also downloaded over 7 million pirated books, the judge found.
  • The judge wrote that training Claude on copyrighted books it had purchased was fair use, but piracy wasn't.

To build AI chatbot Claude, Anthropic "destructively scanned" millions of copyrighted books, wrote a judge on Monday.

Ruling in a closely-watched AI copyright case, Judge William Alsup of the Northern District of California analyzed how Anthropic sourced data for model training purposes, including from digital and physical books.

Companies like Anthropic require vast amounts of input to develop their large language models, so they've tapped sources from social media posts to videos to books. Authors, artists, publishers, and other groups contend that the use of their work for training amounts to theft.

Alsup detailed Anthropic's training process with books: The OpenAI rival spent "many millions of dollars" buying used print books, which the company or its vendors then stripped of their bindings, cut into pages, and scanned into digital files.

Alsup wrote that millions of original books were then discarded, and the digital versions stored in an internal "research library."

The judge also wrote that Anthropic, which is backed by Amazon and Alphabet, downloaded more than 7 million pirated books to train Claude.

Alsup wrote that Anthropic's cofounder, Ben Mann, downloaded "at least 5 million copies of books from Library Genesis" in 2021 — fully aware that the material was pirated. A year later, the company "downloaded at least 2 million copies of books from the Pirate Library Mirror" also knowing they were pirated.

Alsup wrote that Anthropic preferred to "steal" books to "avoid 'legal/practice/business slog,' as cofounder and CEO Dario Amodei put it."

Last year, a trio of authors sued Anthropic in a class-action lawsuit, saying that the company used pirated versions of their books without permission or compensation to train its large language models.

Judge says training Claude on books was fair use, but piracy wasn't

Alsup ruled that Anthropic's use of copyrighted books to train its AI models was "exceedingly transformative" and qualified as fair use, a legal doctrine that allows certain uses of copyrighted works without the copyright owner's permission.

"Like any reader aspiring to be a writer, Anthropic's LLMs trained upon works not to race ahead and replicate or supplant them — but to turn a hard corner and create something different," he wrote.

The company's decision to digitize millions of print books it had purchased fell under fair use, Alsup wrote.

"All Anthropic did was replace the print copies it had purchased for its central library with more convenient space-saving and searchable digital copies for its central library — without adding new copies, creating new works, or redistributing existing copies," he wrote.

An Anthropic spokesperson said that the company is pleased with Alsup's ruling on using books to train LLMs.

The spokesperson said in a statement that this approach is "consistent with copyright's purpose in enabling creativity and fostering scientific progress."

But Alsup drew a firm line when it came to piracy.

"Anthropic had no entitlement to use pirated copies for its central library," Alsup wrote. "Creating a permanent, general-purpose library was not itself a fair use excusing Anthropic's piracy."

Judge Alsup's ruling that training AI models on copyrighted books is fair use is one of the first of its kind.

His decision comes amid a wave of lawsuits from artists, filmmakers, authors, and news outlets against major AI players like OpenAI.

While creators say training AI models on their copyrighted work without permission infringes on their rights, AI execs argue they haven't violated copyright laws because the training falls under fair use.

Earlier this month, Disney sued AI image generator Midjourney, saying the tech company ripped off famous characters in properties ranging from "Star Wars" to "The Simpsons."

WeedleTLiar 3 points 3 days ago +3 / -0

Alsup ruled that Anthropic's use of copyrighted books to train its AI models was "exceedingly transformative" and qualified as fair use

I'm no fan of copyright law, but if you can simply ask the AI to recite the full text of any book that was scanned, does it matter that it's transformative?

