AI industry horrified to face largest copyright class action ever certified

AI industry horrified to face largest copyright class action ever certified (arstechnica.com)
from Davriellelouna@lemmy.world to technology@lemmy.world on 09 Aug 09:11
https://lemmy.world/post/34179744

#technology

Lexam@lemmy.world on 09 Aug 09:46 next collapse

No it won’t. Just their companies. Which are the ones making slop. If your AI does something actually useful it will survive.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 15:27 collapse

You know, if they lose, their tech will probably become the property of copyright holders, which means your new AI Overlord has the first name Walt.

chaosCruiser@futurology.today on 09 Aug 09:50 next collapse

Oh no! Building a product with stolen data was a rotten idea after all. Well, at least the AI companies can use their fabulously genius PhD level LLMs to weasel their way out of all these lawsuits. Right?

Rooskie91@discuss.online on 09 Aug 10:42 next collapse

I propose that anyone defending themselves in court over AI stealing data must be represented exclusively by AI.

Regna@lemmy.world on 09 Aug 11:27 next collapse

Hilarious.

BakerBagel@midwest.social on 09 Aug 15:50 collapse

“ooh, so sorry, but your LLM was trained on proprietary documents stolen from several major law firms, and they are all suing you now”

chaosCruiser@futurology.today on 10 Aug 07:13 collapse

That would be glorious. If the future of your company depends on the LLM keeping track of hundreds of details and drawing the right conclusions, it’s game over during the first day.

thesohoriots@lemmy.world on 09 Aug 13:32 collapse

PhD level LLM = paying MAs $21/hr to write summaries of paragraphs for them to improve off of. Google Gemini outsourced their work like this, so I assume everyone else did too.

timuchan@lemmy.wtf on 09 Aug 10:37 next collapse

The bill comes due 🤷🏻

9point6@lemmy.world on 09 Aug 10:39 next collapse

Probably would have been cheaper to license everything you stole, eh, Anthropic?

nulluser@lemmy.world on 09 Aug 11:16 next collapse

threatens to “financially ruin” the entire AI industry

No. Just the LLM industry and AI slop image and video generation industries. All of the legitimate uses of AI (drug discovery, finding solar panel improvements, self driving vehicles, etc) are all completely immune from this lawsuit, because they’re not dependent on stealing other people’s work.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 15:26 collapse

But it would also mean that the Internet Archive is illegal. Even though they don’t profit, if scraping the internet is a copyright violation, then they are as guilty as Anthropic.

magikmw@piefed.social on 09 Aug 15:27 next collapse

IA doesn't make any money off the content. Not that LLM companies do, but that's what they'd want.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 15:32 next collapse

Do you think that would rescue the IA from the type of people who made the IA already pull 300k books?

magikmw@piefed.social on 09 Aug 16:28 collapse

No. But going after LLMs won't make the situation for IA any worse, not directly anyway.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 16:57 collapse

if the courts decide that scraping is illegal, IA can close up shop.

axmo@lemmy.ca on 09 Aug 15:36 next collapse

Profit (or even revenue) is not required for it to be considered an infringement, in the current legal framework.

CosmoNova@lemmy.world on 10 Aug 06:38 collapse

And this is exactly the reason why I think the IA will be forced to close down while AI companies that trained their models on it will not only stay but be praised for preserving information in an ironic twist. Because one side does participate in capitalism and the other doesn’t. They will claim AI is transformative enough even when it isn’t because the overly rich invested too much money into the grift.

JcbAzPx@lemmy.world on 10 Aug 18:04 collapse

Archival is a fair use.

umbrella@lemmy.ml on 09 Aug 18:19 next collapse

i say move it out of the us

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 20:31 collapse

they should have done that long ago, and if they haven’t already started a backup in both europe and china, it’s high time

omxxi@feddit.org on 11 Aug 06:44 collapse

Scraping the Internet is not illegal. All AI companies did much more beyond that: they accessed private writings, private code, copyrighted images. They scanned copyrighted books (and then destroyed them), downloaded terabytes of copyrighted torrents … etc

So, the message is like piracy is OK when it’s done massively by a big company. They’re claiming “fair use” and most judges are buying it (or being bought?)

halcyoncmdr@lemmy.world on 09 Aug 11:21 next collapse

As Anthropic argued, it now “faces hundreds of billions of dollars in potential damages liability at trial in four months” based on a class certification rushed at “warp speed” that involves “up to seven million potential claimants, whose works span a century of publishing history,” each possibly triggering a $150,000 fine.

So you knew what stealing the copyrighted works could result in, and your defense is that you stole too much? That’s not how that works.
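For a rough sense of scale, here is a back-of-the-envelope sketch using only the two figures in the quote above. It assumes every one of the "up to seven million" potential claimants were awarded the full $150,000, which the quote describes only as a possibility, so treat the result as a theoretical ceiling rather than a prediction. The variable names are purely illustrative.

```python
# Theoretical ceiling implied by the two numbers in Anthropic's quote.
# Assumption: every potential claimant receives the maximum amount,
# which is the worst case, not an expected outcome.
potential_claimants = 7_000_000   # "up to seven million potential claimants"
max_award_per_work = 150_000      # "each possibly triggering a $150,000 fine"

ceiling = potential_claimants * max_award_per_work
print(f"${ceiling:,}")
```

That worst case lands a bit over a trillion dollars, which is why the filing can talk about "hundreds of billions" even if only a fraction of it ever materialized.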

Rivalarrival@lemmy.today on 09 Aug 13:00 next collapse

The purpose of copyright is to drive works into the public domain. Works are only supposed to remain exclusive to the artist for a very limited time, not a “century of publishing history”.

The copyright industry should lose this battle. Copyright exclusivity should be shorter than patent exclusivity.

spankmonkey@lemmy.world on 09 Aug 13:36 next collapse

Rivalarrival@lemmy.today on 09 Aug 13:51 collapse

Their winning of the case reinforces a harmful precedent.

At the very least, the claims of those members of the class that are based on >20-year copyrights should be summarily rejected.

spankmonkey@lemmy.world on 09 Aug 14:03 collapse

The AI companies winning the case means anything leaked on the internet or even just hosted by a company can be used by anyone, including private photos and communication.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 20:27 collapse

Copyright owners are then the new AI companies, and compared to now where open source AI is a possibility, it will never be, because only they will have enough content to train models. And without any competition, enshittification will go full speed ahead, meaning the chatbots you don’t like will still be there, and now they will try to sell you stuff and you can’t even choose a chatbot that doesn’t want to upsell you.

zlatko@programming.dev on 09 Aug 13:34 next collapse

Actually that usually is how it works. Unfortunately.

"Too big to fail" was probably made up by the big ones.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 15:23 collapse

If scraping is illegal, so is the Internet Archive, and that would be an immense loss for the world.

Signtist@bookwyr.me on 09 Aug 20:40 collapse

This is the real concern. Copyright abuse has been rampant for a long time, and the only reason things like the Internet Archive are allowed to exist is because the copyright holders don't want to pick a fight they could potentially lose and lessen their hold on the IPs they're hoarding. The AI case is the perfect thing for them, because it's a very clear violation with a good amount of public support on their side, and winning will allow them to crack down even harder on all the things like the Internet Archive that should be fair use. AI is bad, but this fight won't benefit the public either way.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 21:23 collapse

I wouldn’t even say AI is bad, i currently have Qwen 3 running on my own GPU giving me a course in RegEx and how to use it. It sometimes makes mistakes in the examples (we all know that chatbots are shit when it comes to the r’s in strawberry), but i see it as “spot the error” type of training for me, and the instructions themselves have been error free for now. Since i do the lesson myself, i can easily spot if something goes wrong.

AI crammed into everything because venture capitalists try to see what sticks is probably the main reason public opinion of chatbots is bad, and i don’t condone that either, but the technology itself has uses and is an impressive accomplishment.

Same with image generation: i am shit at drawing, and i don’t have the money to commission art if i want something specific, but i can generate what i want for myself.

If the copyright side wins, we all might lose the option to run imagegen and llms on our own hardware, there will never be an open-source llm, and resources that are important to us all will come even more under fire than they are already. Copyright holders will be the new AI companies, and without competition the enshittification will instantly start.

Signtist@bookwyr.me on 09 Aug 23:42 collapse

What you see as "spot the error" type training, another person sees as absolute fact that they internalize and use to make decisions that impact the world. The internet gave rise to the golden age of conspiracy theories, which is having a major impact on the worsening political climate, and it's because the average user isn't able to differentiate information from disinformation. AI chatbots giving people the answer they're looking for rather than the truth is only going to compound the issue.

a_wild_mimic_appears@lemmy.dbzer0.com on 10 Aug 09:04 collapse

I agree that this has to become better in the future, but the technology is pretty young, and i am pretty sure that fixing this stuff has a high priority in those companies - it’s bad PR for them. But the people are already gorging themselves on faulty info per social media - i don’t see that chatbots are making this really worse than it already is.

keyhoh@piefed.social on 09 Aug 11:26 next collapse

I thought it was hilarious how there was a quote in the article that said

immense harm not only to a single AI company, but to the entire fledgling AI industry and to America’s global technological competitiveness

It will only do this because all these idiotic American companies fired all their employees to replace them with AI. Hire them back and the edge won't dull. But we all know that they won't do this and just cry and point fingers wondering how they ever lost a technology race.

Edited because it's my first time using quotes and I don't know how to use them properly haha

phonics@lemmy.world on 09 Aug 11:28 next collapse

With the amount of money pouring in you’d think they’d just pay for it

Deflated0ne@lemmy.world on 09 Aug 14:36 collapse

Now now. You know that’s not how capitalism works.

westingham@sh.itjust.works on 09 Aug 11:41 next collapse

I was reading the article and thinking “suck a dick, AI companies” but then it mentions the EFF and ALA filed against the class action. I have found those organizations to be generally reputable and on the right side of history, so now I’m wondering what the problem is.

pelya@lemmy.world on 09 Aug 12:19 next collapse

AI coding tools are using the exact same backends as AI fiction writing tools, so it would hurt the fledgling vibe coder profession (which according to proper software developers should not be allowed to exist at all).

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 15:22 collapse

The same goes for the Internet Archive - if scraping is illegal, then the Internet Archive is as well.

kibiz0r@midwest.social on 09 Aug 12:26 next collapse

They don’t want copyright power to expand further. And I agree with them, despite hating AI vendors with a passion.

For an understanding of the collateral damage, check out How To Think About Scraping by Cory Doctorow.

thesohoriots@lemmy.world on 09 Aug 13:30 next collapse

Let’s give them this one last win. For spite.

Jason2357@lemmy.ca on 09 Aug 13:59 next collapse

Take scraping. Companies like Clearview will tell you that scraping is legal under copyright law. They’ll tell you that training a model with scraped data is also not a copyright infringement. They’re right.

I love Cory’s writing, but while he does a masterful job of defending scraping, and makes a good argument that in most cases, it’s laws other than Copyright that should be the battleground, he does, kinda, trip over the main point.

That is that training models on creative works and then selling access to the derivative “creative” works that those models output very much falls within the domain of copyright - on either side of a grey line we usually call “fair use” that hasn’t been really tested in courts.

Let’s take two absurd extremes to make the point. Say I train an LLM directly on Marvel movies, and then sell movies (or maybe movie scripts) that are almost identical to existing Marvel movies (maybe with a few key names and features altered). I don’t think anyone would argue that is not a derivative work, or that falls under “fair use.” However, if I used literature to train my LLM to be able to read, and used that to read street signs for my self-driving car, well, yeah, that might be something you could argue is “fair use” to sell. It’s not producing copy-cat literature.

I agree with Cory that scraping, per se, is absolutely fine, and even re-distributing the results in some ways that are in the public interest or fall under “fair use”, but it’s hard to justify the slop machines as not a copyright problem.

In the end, yeah, fuck both sides anyway. Copyright was extended too far and used for far too much, and the AI companies are absolute thieves. I have no illusions this type of court case will do anything more than shift wealth from one robber baron to another, and won’t help artists and authors.

kibiz0r@midwest.social on 09 Aug 14:14 next collapse

I agree, and I think your points line up with Doctorow’s other writing on the subject. It’s just hard to cover everything in one short essay.

FauxLiving@lemmy.world on 09 Aug 17:50 collapse

Say I train an LLM directly on Marvel movies, and then sell movies (or maybe movie scripts) that are almost identical to existing Marvel movies (maybe with a few key names and features altered). I don’t think anyone would argue that is not a derivative work, or that falls under “fair use.”

I think you’re failing to differentiate between a work, which is protected by copyright, vs a tool which is not affected by copyright.

Say I use Photoshop and Adobe Premiere to create a script and movie which are almost identical to existing Marvel movies. I don’t think anyone would argue that is not a derivative work, or that falls under “fair use”.

The important part here is that the subject of this sentence is ‘a work which has been created which is substantially similar to an existing copyrighted work’. This situation is already covered by copyright law. If a person draws a Mickey Mouse and tries to sell it then Disney will sue them, not their pencil.

Specific works are copyrighted and copyright laws create a civil liability for a person who creates works that are substantially similar to a copyrighted work.

Copyright doesn’t allow publishers to go after Adobe because a person used Photoshop to make a fake Disney poster. This is why things like Bittorrent can legally exist despite being used primarily for copyright violation. Copyright laws apply to people and the works that they create.

A generated Marvel movie is substantially similar to a copyrighted Marvel movie and so copyright law protects it. A diffusion model is not substantially similar to any copyrighted work by Disney and so copyright laws don’t apply here.

glog78@digitalcourage.social on 09 Aug 18:05 collapse

@FauxLiving @Jason2357

I take a bold stand on the whole topic:

I think AI is a big Scam ( pattern matching has nothing to do with !!! intelligence !!! ).

And this Scam might end as the Dot-Com bubble did in the late 90s ( https://en.wikipedia.org/wiki/Dot-com_bubble ), including the huge economic impact, because too many people have invested in an "idea", not in a proven technology.

And as with the Dot-Com bubble, once the AI bubble has been cleaned up, Machine Learning and Vector Databases will stay forever ( maybe some other parts of the tech too ).

Both don't need copyright changes cause they will never try to be one solution for everything. Like a small model to transform text to speech ... like a small model to translate ... like a full text search using a vector db to index all local documents ...

Like a small tool to summarize text.

westingham@sh.itjust.works on 09 Aug 16:13 collapse

Ahhh, it makes more sense now. Thank you!

peoplebeproblems@midwest.social on 09 Aug 14:23 collapse

I disagree with the EFF and ALA on this one.

These were entire sets of writing consumed and reworked into poor data without respecting the license to them.

Honestly, I wouldn’t be surprised if copyright wasn’t the only thing to be the problem here, but intellectual property as well. In that case, EFF probably has an interest in that instead. Regardless, I really think it needs to be brought through court.

LLMs are harmful, full stop. Most other Machine Learning mechanisms use licensed data to train. In the case of software as a medical device, such as image analysis AI, that data is protected by HIPAA and special attention is already placed in order to utilize it.

vala@lemmy.dbzer0.com on 11 Aug 08:10 collapse

My guess is that the EFF is mostly concerned with the fact this is a class action and also worried about expanding copyright in general.

huquad@lemmy.ml on 09 Aug 12:15 next collapse

Let them fight!

Treczoks@lemmy.world on 09 Aug 13:32 next collapse

Well, theft has never been the best foundation for a business, has it?

While I completely agree that copyright terms are completely overblown, they are valid law that other people suffer under, so it is 100% fair to make them suffer the same. Or worse, as they all broke the law for commercial gain.

No1@aussie.zone on 09 Aug 21:44 collapse

Well, theft has never been the best foundation for a business, has it?

History would suggest otherwise.

PushButton@lemmy.world on 09 Aug 14:10 next collapse

Let’s go baby! The law is the law, and it applies to everybody

If the “genie doesn’t go back in the bottle”, make him pay for what he’s stealing.

SugarCatDestroyer@lemmy.world on 09 Aug 14:44 next collapse

I just remembered the movie where the genie was released from the bottle of a real genie, he turned the world into chaos by freeing his own kind, and if it weren’t for the power of the plot, I’m afraid people there would have become slaves or died out.

Although here it is already necessary to file a lawsuit for theft of the soul in the literal sense of the word.

HugeNerd@lemmy.ca on 09 Aug 16:06 collapse

I remember that X-Files episode!

SugarCatDestroyer@lemmy.world on 09 Aug 16:11 collapse

Damn, what did you watch those masterpieces on? What kind of smoke were you sitting on then? Although I don’t know what secret materials you’re talking about. Maybe I watched something wrong… And what an episode?

Zetta@mander.xyz on 09 Aug 14:45 next collapse

The law absolutely does not apply to everybody, and you are well aware of that.

AstralPath@lemmy.ca on 09 Aug 17:45 next collapse

Shouldn’t it?

jsomae@lemmy.ml on 09 Aug 18:02 collapse

The law applies to everybody, but the law-makers change the laws to benefit certain people. And then trump pardons the rest lol.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 15:17 next collapse

This would mean the copyright holders like Disney are now the AI companies, because they have the content to train them. That’s even worse, man.

BussyCat@lemmy.world on 09 Aug 15:29 collapse

It’s not because they would only train on things they own which is an absolute tiny fraction of everything that everyone owns. It’s like complaining that a rich person gets to enjoy their lavish estate when the alternative is they get to use everybody’s home in the world.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 15:34 collapse

do you know how much content disney has? go scrolling: en.wikipedia.org/…/List_of_assets_owned_by_the_Wa… e: that’s the tip of the iceberg, because if they band together with others from the MPAA & RIAA, they can suffocate the entire Movie, Book and Music world with it.

GreenKnight23@lemmy.world on 09 Aug 15:53 next collapse

good, then I can just ignore Disney instead of EVERYTHING else.

ShadowWalker@lemmy.world on 09 Aug 19:31 collapse

Until they charge people to use their AI.

It’ll be just like today except that it will be illegal for any new companies to try and challenge the biggest players.

GreenKnight23@lemmy.world on 09 Aug 20:04 collapse

why would I use their AI? on top of that, wouldn’t it be in their best interests to allow people to use their AI with as few restrictions as possible in order to maximize market saturation?

BussyCat@lemmy.world on 09 Aug 15:59 collapse

They have 0.2T in assets; the world has around 660T in assets, which, as I said before, makes theirs a tiny fraction. Obviously both hold a lot of assets that aren’t worthwhile for AI training, such as theme parks, but consider that a single movie worth millions or billions has the same benefit for AI training as another movie worth thousands. The amount of assets Disney owns is not nearly as relevant as you are making it out to be.

kameecoding@lemmy.world on 09 Aug 21:11 collapse

The law is not the law. I am the law.

insert awesome guitar riff here

Reference: youtu.be/Kl_sRb0uQ7A

Deflated0ne@lemmy.world on 09 Aug 14:25 next collapse

Good. Burn it down. Bankrupt them.

If it’s so “critical to national security” then nationalize it.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 15:16 collapse

the “burn it down” variant would only lead to the scenario where the copyright holders become the AI companies, since they have the content to train it. AI will not go away, it might change ownership to someone worse tho.

nationalizing sounds better; even better would be to put it under UNESCO stewardship.

Deflated0ne@lemmy.world on 09 Aug 15:18 collapse

Hard to imagine worse than the insane techno-feudalists who currently own it.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 15:30 collapse

believe me, Disney is fucking ruthless in comparison to Anthropic.

darkangelazuarl@lemmy.world on 09 Aug 14:34 next collapse

SugarCatDestroyer@lemmy.world on 09 Aug 14:49 next collapse

Unfortunately, this will probably lead to nothing: in our world, only the poor seem to be punished for stealing. Well, corporations always get away with everything, so we sit on the couch and shout “YES!!!” for the fact that they are trying to console us with this.

Modern_medicine_isnt@lemmy.world on 09 Aug 16:40 collapse

This issue is not so cut and dry. The AI companies are stealing from other companies more than from individual people. Publishing companies are owned by some very rich people. And they want their cut.

This case may have started out with authors, but it is mentioned that it could turn into publishing companies vs AI companies.

a_wild_mimic_appears@lemmy.dbzer0.com on 09 Aug 15:03 next collapse

So, the US now has a choice: rescue AI and fix their fucked up copyright system, or rescue the fucked up copyright system and fuck up AI companies. I’m interested in the decision.

I’d personally say that the copyright system needs to be fixed anyway, because it’s currently just a club for the RIAA&MPAA to wield against everyone (remember the lawsuits against single persons with alleged damages in the millions for downloading a few songs? or the current tries to fuck over the internet archive?). Should the copyright side win, then we can say goodbye to things like the internet archive or open source-AI; copyright holders will then be the AI-companies, since they have the content.

HugeNerd@lemmy.ca on 09 Aug 16:06 next collapse

Too late. The systems we are building as a species will soon become sentient. We’ll have aliens right here, no UFOs required. Where the music comes from will no longer be relevant.

explodicle@sh.itjust.works on 09 Aug 16:39 collapse

Ok perfect so since AGI is right around the corner and this is all irrelevant, then I’m sure the AI companies won’t mind paying up.

HugeNerd@lemmy.ca on 09 Aug 16:59 collapse

That’s not the way it works. Do you think the Roman Empire just picked a particular Tuesday to collapse? It’s a process and will take a while.

FauxLiving@lemmy.world on 09 Aug 17:33 next collapse

People cheering for this have no idea of the consequence of their copyright-maximalist position.

If using images, text, etc to train a model is copyright infringement then there will be NO open models because open source model creators could not possibly obtain all of the licensing for every piece of written or visual media in the Common Crawl dataset, which is what most of these things are trained on.

As it stands now, corporations don’t have a monopoly on AI specifically because copyright doesn’t apply to AI training. Everyone has access to Common Crawl and the other large, public, datasets made from crawling the public Internet and so anyone can train a model on their own without worrying about obtaining billions of different licenses from every single individual who has ever written a word or drawn a picture.

If there is a ruling that training violates copyright then the only entities that could possibly afford to train LLMs or diffusion models are companies that own a large amount of copyrighted materials. Sure, one company will lose a lot of money and/or be destroyed, but the legal precedent would be set so that it is impossible for anyone that doesn’t have billions of dollars to train AI.

People are shortsightedly seeing this as a victory for artists or some other nonsense. It’s not. This is a fight where large copyright holders (Disney and other large publishing companies) want to completely own the ability to train AI because they own most of the large stores of copyrighted material.

If the copyright holders win this then the open source training material, like Common Crawl, would be completely unusable to train models in the US/the West because any person who has ever posted anything to the Internet in the last 25 years could simply sue for copyright infringement.

JustARaccoon@lemmy.world on 09 Aug 18:11 next collapse

In theory sure, but in practice who has the resources to do large scale model training on huge datasets other than large corporations?

FauxLiving@lemmy.world on 09 Aug 18:30 collapse

Distributed computing projects, large non-profits, people in the near future with much more powerful and cheaper hardware, governments which are interested in providing public services to their citizens, etc.

Look at other large technology projects. The Human Genome Project spent $3 billion to sequence the first genome but now you can have it done for around $500. This cost reduction is due to the massive, combined effort of tens of thousands of independent scientists working on the same problem. It isn’t something that would have happened if Purdue Pharma owned the sequencing process and required every scientist to purchase a license from them in order to do research.

LLM and diffusion models are trained on the works of everyone who’s ever been online. This work, generated by billions of human-hours, is stored in the Common Crawl datasets and is freely available to anyone who wants it. This data is both priceless and owned by everyone. We should not be cheering for a world where it is illegal to use this dataset that we all created and, instead, we are forced to license massive datasets from publishing companies.

Progress on these types of models would immediately stop, and there would be 3-4 corporations that could afford the licenses. They would have a de facto monopoly on LLMs and could enshittify them without worry of competition.

JustARaccoon@lemmy.world on 09 Aug 20:48 collapse

The world you’re envisioning would only have paid licenses, who’s to say we can’t have a “free for non commercial purposes” license style for it all?

LustyArgonianMana@lemmy.world on 09 Aug 18:42 next collapse

Copyright is a leftover mechanism from slavery and it will be interesting to see how it gets challenged here, given that the wealthy view AI as an extension of themselves and not as a normal employee. Genuinely think the copyright cases from AI will be huge.

FauxLiving@lemmy.world on 09 Aug 20:22 collapse

My last comment was wrong, I’ve read through the filings of the case.

The judge has already ruled that training the LLMs using the books was so obviously fair use that it was dismissed in summary judgement (my bolds):

To summarize the analysis that now follows, the use of the books at issue to train Claude and its precursors was exceedingly transformative and was a fair use under Section 107 of the Copyright Act. The digitization of the books purchased in print form by Anthropic was also a fair use, but not for the same reason as applies to the training copies. Instead, it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library with more convenient, space-saving, and searchable digital copies without adding new copies, creating new works, or redistributing existing copies. However, Anthropic had no entitlement to use pirated copies for its central library, and creating a permanent, general-purpose library was not itself a fair use excusing Anthropic’s piracy.

The only issue remaining in this case is that they downloaded copyrighted material with bittorrent, the kind of lawsuits that have been going on since napster. They’ll probably be required to pay for all 196,640 books that they pirated and some other damages.

sunbytes@lemmy.world on 09 Aug 19:34 next collapse

Or it just happens overseas, where these laws don’t apply (or can’t be enforced).

But I don’t think it will happen. Too many countries are desperate to be “the AI country” that they’ll risk burning whole industries to the ground to get it.

barryamelton@lemmy.world on 09 Aug 20:47 collapse

Anybody can use copyrighted works under fair use for research, more so if your LLM model is open source (I would say this fair use should only actually apply if your model is open source…). You are wrong.

We don’t need to break copyright rights that protect us from corporations in this case, or also incidentally protect open source and libre software.

jsomae@lemmy.ml on 09 Aug 18:02 next collapse

Would really love to see IP law get taken down a notch out of this.

FauxLiving@lemmy.world on 09 Aug 20:14 next collapse

An important note here: the judge has already ruled in this case, in the summary judgement order, that using Plaintiffs’ works “to train specific LLMs [was] justified as a fair use” because “[t]he technology at issue was among the most transformative many of us will see in our lifetimes.”

The plaintiffs are not suing Anthropic for infringing on their copyright, the court has already ruled that it was so obvious that they could not succeed with that argument that it could be dismissed. Their only remaining claim is that Anthropic downloaded the books from piracy sites using bittorrent.

This isn’t about LLMs anymore, it’s a standard “You downloaded something on Bittorrent and made a company mad”-type case that has been going on since Napster.

Also, the headline is incredibly misleading. It’s ascribing feelings to an entire industry based on a common legal filing that is not by itself noteworthy. Unless you really care about legal technicalities, you can stop here.

The actual news, the new factual thing that happened, is that the Consumer Technology Association and the Computer and Communications Industry Association filed an amicus brief in an appeal of an issue that the court ruled against Anthropic on.

This is a pretty normal legal filing about legal technicalities. This isn’t really newsworthy outside of, maybe, some people in the legal profession who are bored.

The issue was class certification.

Three people sued Anthropic. Instead of just suing Anthropic on behalf of themselves, they moved to be certified as a class. That is to say that they wanted to sue on behalf of a larger group of people, in this case a “Pirated Books Class” of authors whose books Anthropic downloaded from the book piracy websites.

The judge ruled they can represent the class, and Anthropic appealed the ruling. During this appeal the industry groups filed an amicus brief with arguments supporting Anthropic’s position. This is not uncommon. The Onion famously filed an amicus brief with the Supreme Court when they were about to rule on issues of parody. Like everything The Onion writes, it’s a good piece of satire: link

Quibblekrust@thelemmy.club on 10 Aug 15:09 collapse

supremecourt.gov/…/20221003125252896_35295545_1-2…

ERROR: File or directory not found.

FauxLiving@lemmy.world on 10 Aug 15:28 collapse

The site formatting broke it. Maybe it’ll work as a link

Yup, seems to work

Quibblekrust@thelemmy.club on 10 Aug 15:46 collapse

Thanks! That was a good read.

crystalmerchant@lemmy.world on 10 Aug 02:17 next collapse

Ashley is a senior policy reporter for Ars Technica, dedicated to tracking social impacts of emerging policies and new technologies. She is a Chicago-based journalist with 20 years of experience.

And yet, despite 20 years of experience, the only side Ashley presents is the technologists’ side.

LucidLyes@lemmy.world on 10 Aug 02:36 next collapse

I hope LLMs and generative AI crash and burn.

vacuumflower@lemmy.sdf.org on 10 Aug 05:35 collapse

I’m thinking, honestly, what if that’s the planned purpose of this bubble.

I’m explaining - those “AI”s involve assembling large datasets and making them available, poisoning the Web, and creating demand for a specific kind of hardware.

When it bursts, not everything bursts.

Suddenly there will be plenty of no longer required hardware usable for normal ML applications like face recognition, voice recognition, text analysis to identify its author, combat drones with target selection, all kinds of stuff. It will be dirt cheap, compared to its current price, as it was with Sun hardware after the dotcom crash.

There still will be those datasets, that can be analyzed for plenty of purposes. Legal or not, they are already processed into usable and convenient state.

There will be the Web covered with a Great Wall of China-tall layer of AI slop.

There will likely be a bankrupt nation which will have a lot of things failing due to that.

And there will still be all the centralized services. Suppose on that day you go search something in Google, and there’s only the Google summary present, no results list (or maybe even a results list, whatever, but suddenly weighed differently), saying that you’ve been owned by domestic enemies yadda-yadda and the patriotic corporations are implementing a popular state of emergency or something like that. You go to Facebook, and when you write something there, your messages are premoderated by an AI so that you’d not be able to god forbid say something wrong. An LLM might not be able to support a decent enough conversation, but to edit out things you say, or PGP keys you send, in real time without anything appearing strange - easily. Or to change some real person’s style of speech to yours.

Suppose all not-degoogled Android installations start doing things like that, Amazon’s logistics suddenly start working to support a putsch, Facebook and WhatsApp do what I described or just fail, Apple makes a presentation of a new, magnificent, ingenious, miraculous, patriotic change to a better system of government, maybe even with Jony Ive as the speaker, and possibly does the same unnoticeable censorship, Microsoft pushes one malicious update 3 months earlier with a backdoor to all Windows installations doing the same, and commits its datacenters to the common effort, and let’s just say it’s possible that a similar thing is done by some Linux developer believing in an idea and some of the major distributions - don’t need it doing much, just to provide a backdoor usable remotely.

I don’t list Twitter because honestly it doesn’t seem to work well enough or have coverage good enough.

So - this seems a pretty possible apocalypse scenario which does lead to a sudden installation of a dictatorial regime with all the necessary surveillance, planning, censorship and enforcement already being functioning systems.

So - of course apocalypse scenarios were a normal thing in movies for many years and many times, but it’s funny how the more plausible such become, the less often they are described in art.

anarchy79@lemmy.world on 10 Aug 16:04 collapse

It’s so very, very, deeply, fucking bleak. I can’t sleep at night, because I see this clear as day. I feel like a Jew in 1938 Berlin, only unlike that guy I can’t get out, because this is global. There is literally nowhere to run.

Either society is going to crash and burn, or we will see global war, which will crash and burn society.

There is no escape, the writing is on the fucking wall.

Plurrbear@lemmy.world on 10 Aug 05:43 next collapse

Fucking good!! Let the AI industry BURN!

turtlesareneat@discuss.online on 10 Aug 13:00 collapse

What um, what court system do you think is going to make that happen? Cause the current one is owned by an extremely pro-AI administration. If anything gets appealed to SCOTUS they will rule for AI.

anarchy79@lemmy.world on 10 Aug 16:02 collapse

The people who literally own this planet have investigated the people who literally own this planet and found that they literally own this planet and what the FUCK are you going to do about it, bacteria of the planet?

Plurrbear@lemmy.world on 10 Aug 17:30 collapse

What in the absolute fuck are you talking about?! Your comment is asinine, “bacteria of the planet” the fuck?! Do you have the same “worm in the brain” that RFK claims to have because you sound just as stupid as him?

You claim people “own” this planet… um… what in the absolute fuck? Yes, people with money have always pushed an agenda, but “owning” it is beyond the dumbest statement.

[deleted] on 10 Aug 07:14 next collapse

a_person@piefed.social on 10 Aug 12:44 next collapse

Good fuck those fuckers

ZILtoid1991@lemmy.world on 10 Aug 12:54 next collapse

Now they’re in the “finding out” phase of the “fucking around and finding out”.

WereCat@lemmy.world on 10 Aug 12:54 next collapse

We just need to show that ChatGPT and the like can generate Nintendo-based content and let them fight it out between them.

anarchy79@lemmy.world on 10 Aug 16:00 collapse

They will probably just merge into another mega-golem controlled by one of the seven people who own the planet.

ivanafterall@lemmy.world on 10 Aug 18:27 next collapse

Mario, voiced by Chris Pratt, will become the new Siri, then the new persona for all AI.

In the future, all global affairs will be divided across the lines of Team Mario and Team Luigi. Then the final battle, then the end.

tetris11@feddit.uk on 10 Aug 18:37 collapse

*dabs, mournfully*

Korhaka@sopuli.xyz on 10 Aug 18:38 collapse

Only 80% of it, the other 7 billion of us own anything from nothing to a few hundred square metres each.

panda_abyss@lemmy.ca on 10 Aug 15:20 next collapse

Well, maybe they shouldn’t have done one of the largest violations of copyright and intellectual property ever.

Probably the largest single instance ever.

ivanafterall@lemmy.world on 10 Aug 18:24 collapse

I feel like it can’t even be close. What would even compete? I know I’ve gone a little overboard with my external hard drive, but I don’t think even I’m to that level.

betanumerus@lemmy.ca on 10 Aug 15:40 next collapse

I myself don’t allow my data to be used for AI, so if anyone did, they do owe me a boatload of gold coins. That’s just my price. Great tech though.

anarchy79@lemmy.world on 10 Aug 15:59 next collapse

I am holding my breath! Will they walk free, or get a $10 million fine and then keep doing what every other thieving, embezzling, looting, polluting, swindling, corrupting, tax evading mega-corporation has been doing for a century!

hansolo@sh.itjust.works on 10 Aug 18:44 next collapse

This is how corruption works - the fine is the cost of business. Being given only a fine of $10 million is such a win that they’ll raise $10 billion in new investment on its back.

cmeu@lemmy.world on 11 Aug 01:20 collapse

Would be better if the fee were nominal, but all their training data could never be used again. Start them over from scratch and make it illegal to use anything that it knows now. Kneecap these frivolous little toys.

RagingRobot@lemmy.world on 10 Aug 19:23 next collapse

Is this how Disney becomes the owner of all of the AI companies too? Lol

arararagi@ani.social on 10 Aug 19:25 next collapse

Meanwhile some Italian YouTuber was raided because some portable consoles already came with ROMs in their memory. They only go after individuals.

herseycokguzelolacak@lemmy.ml on 11 Aug 07:43 collapse

I love this. I hope big-tech/big-AI destroys big-copyright industry.