Anthropic undertook a large-scale internal effort, known as Project Panama, that involved buying, scanning and destroying up to two million physical books to train its artificial intelligence systems, according to court filings unsealed in a copyright lawsuit.

The project was driven by the belief within Anthropic that books represent one of the most concentrated and reliable forms of human knowledge, shaped by authors, editors and time, and that long-form texts could teach AI systems to reason and write more clearly than fragmented online material, The Washington Post reported.

According to court records, Anthropic purchased physical books in bulk, cut them apart and scanned them at high speed through commercial vendors. Once digitised, the paper copies were recycled, leaving no physical archive behind. Vendor proposals cited in the filings show the company sought capacity to scan between 500,000 and 2 million books over roughly six months, with spending on books, logistics and scanning services running into tens of millions of dollars.

The Washington Post reported that Anthropic chose this approach instead of negotiating large-scale licensing deals, with executives arguing that buying and digitising physical books internally was faster and more practical amid intensifying competition to build more capable AI systems. Internal documents suggest the strategy was shaped by lessons from earlier digitisation efforts such as Google Books, though Project Panama prioritised speed and exclusivity over preservation or public access.

Court records further indicate that Anthropic viewed destructive scanning as a lower-risk alternative to downloading large pirated digital libraries. A federal judge later ruled that training AI models on books can qualify as fair use when the use is transformative. However, the court also found that Anthropic’s earlier downloads of pirated books raised separate copyright concerns, underlining that how data is acquired remains legally significant.

Authors reacted strongly after details of Project Panama became public. Critics argued that AI companies were benefiting from creative work without consent or compensation. In 2025, Anthropic agreed to pay $1.5 billion to settle claims linked to its earlier use of pirated books, without admitting wrongdoing. Under the settlement, authors whose works were included can seek compensation estimated at around $3,000 per title, though actual payouts may vary.

The Washington Post noted that Project Panama reflects a wider industry pattern. Court filings in other cases show Meta employees discussed downloading large shadow libraries of books, while OpenAI has acknowledged downloading similar datasets in the past before deleting them. Google and Microsoft also face ongoing legal challenges related to AI training data.

Legal experts cited in the report said the unsealed records offer one of the clearest views yet into how modern AI systems are built, revealing how aggressively companies have pursued high-quality text as access to data becomes central to commercial advantage in the AI race.

First Published on February 3, 2026, 17:23:55 IST