The findings emerge as controversial AI music generators like Suno and Udio navigate a thicket of copyright lawsuits.
Four datasets containing more than 21 million copyrighted music recordings are being shared among artificial intelligence developers, according to an investigation published by The Atlantic.
The collections include music from Taylor Swift, Bad Bunny, Billie Eilish, Nirvana, Pearl Jam and the Beatles, alongside legions of independent music producers. Two of the datasets hold more than 100,000 recordings each while the remaining two are considerably larger, containing roughly 9 million and 12 million tracks, respectively. The Atlantic reported that Google and Stability AI have used tracks from the Free Music Archive, one of the smaller collections.
The largest of the four is LAION-DISCO-12M, a collection of more than 12 million tracks released in November 2024 by LAION, a German nonprofit that assembles open datasets for AI research. The organization is also behind the dataset used to train Stability AI’s Stable Diffusion image generator.
LAION describes the music collection as intended for academic use and explicitly warns against deploying it commercially or using it in its original form to create finished products. The dataset contains links to publicly available YouTube tracks and their associated metadata rather than the audio files themselves, and the organization says it does not distribute the music directly.
Each dataset has reportedly been downloaded several thousand times, though the AI industry’s practice of keeping training data confidential means it is largely unknown which companies have relied on which collections. Korean-American producer Kato On The Track, who has crafted hits for Tyga, Snoop Dogg and members of Wu-Tang Clan among other major artists, said 54 of his songs were swept up without consent or compensation.
“Tech companies are using 54 OF MY SONGS to train and promote their generative AI models without compensation or permission from me,” he wrote on X. “This list doesn’t include the 1,541 songs that I’m credited as a Producer on for other Artists.”
The explosive investigation arrives as AI music companies face mounting legal pressure. Suno and Udio, two of the most prominent and controversial AI music generators, are now contending with at least 12 lawsuits, per The Atlantic.
The litigation first erupted in June 2024, when the Recording Industry Association of America, representing Sony, Warner and UMG, sued both companies for what it described as mass copyright infringement. The suits alleged that Suno and Udio had copied recordings to train their AI models without obtaining permission from rightsholders.
Since then, the three major labels have pursued divergent strategies. UMG settled with Udio in October 2025, announcing a compensatory legal settlement alongside new recorded music and publishing licenses for a jointly developed AI platform expected to launch in 2026. Under that arrangement, Udio’s service will operate within what Universal described as a “walled garden” with audio fingerprinting and content filtering in place.
Warner reached its own settlement and licensing deal with Udio in November 2025 and, within days, became the first major label to reach a settlement with Suno as well. The Warner-Suno settlement, which the companies described as a first-of-its-kind partnership, also included Suno’s acquisition of Songkick, the concert-discovery platform, from Warner. Sony, by contrast, has remained in active litigation against both companies.


