AI trained on photos from kids’ entire childhood without their consent (arstechnica.com)
from jeffw@lemmy.world to technology@lemmy.world on 11 Jun 03:30
https://lemmy.world/post/16403370

#technology


autotldr@lemmings.world on 11 Jun 03:35 next collapse

This is the best summary I could come up with:


Photos of Brazilian kids—sometimes spanning their entire childhood—have been used without their consent to power AI tools, including popular image generators like Stable Diffusion, Human Rights Watch (HRW) warned on Monday.

The dataset does not contain the actual photos but includes image-text pairs derived from 5.85 billion images and captions posted online since 2008.

HRW’s report warned that the removed links are “likely to be a significant undercount of the total amount of children’s personal data that exists in LAION-5B.”

Han told Wired that she fears that the dataset may still be referencing personal photos of kids “from all over the world.”

There is less risk that the Brazilian kids’ photos are currently powering AI tools since “all publicly available versions of LAION-5B were taken down” in December, Tyler told Ars.

That decision came out of an “abundance of caution” after a Stanford University report “found links in the dataset pointing to illegal content on the public web,” Tyler said, including 3,226 suspected instances of child sexual abuse material.


The original article contains 677 words, the summary contains 169 words. Saved 75%. I’m a bot and I’m open source!

Sanctus@lemmy.world on 11 Jun 03:59 next collapse

It’s all of us who ever had an online presence, I’d bet. The depth of what has been done will not come to light for a while.

overload@sopuli.xyz on 11 Jun 04:09 next collapse

Even if you’re not on social media, you’ll probably still have a shadow profile on Google’s/Meta’s servers. My 13-month-old baby has a library of images searchable in Google Photos and a profile photo in the app. It’s convenient, but incredibly creepy.

scrion@lemmy.world on 11 Jun 05:48 next collapse

Yeah, why would you allow this to happen though?

Rai@lemmy.dbzer0.com on 11 Jun 06:06 next collapse

I want to defend that poster but I can’t disagree with you… There is one person responsible and it’s definitely not the child….

Tagger@lemmy.world on 11 Jun 06:21 next collapse

I’m assuming all he means is that he uses Google Photos to store his pictures, so Google is the one hosting them.

Mkengine@feddit.de on 11 Jun 09:00 collapse

He said that it’s creepy but convenient, but digital privacy and laziness generally don’t go hand in hand. Every week I read about another alternative for Google Photos, so the solution is not far away (three posts down I found this for example). To each their own I guess, but with such simple solutions I can’t justify using Google’s spyware.

scrion@lemmy.world on 11 Jun 11:09 collapse

And that’s exactly why I commented the way I did. I’ll also comment with a personal story to the original comment to further elaborate.

overload@sopuli.xyz on 11 Jun 06:59 collapse

It’s not opt-in as far as I’m aware. Just using Google Photos makes it so. I suppose I’m deep enough in the Google ecosystem (well, let’s say my wife is not going to move away from it) to be desensitised to how messed up it kind of is.

I was more talking about how other people (i.e. your friends) will take photos of you and post them on social media or even just keep them in their Google Photos, and Meta/Google will build a shadow profile for you without your consent via facial recognition.

0x0@programming.dev on 11 Jun 08:11 next collapse

I was more talking about how other people (i.e. your friends) will take photos of you

Friends will oblige should you ask them not to post any media of your underaged infant.

treefrog@lemm.ee on 11 Jun 10:22 collapse

It’s not about posting; that’s the point.

Android phones back all photos up onto the Google cloud by default. Not everyone knows to turn this off.

scrion@lemmy.world on 11 Jun 11:23 collapse

No, but it’s opt-out, and it is your responsibility to ensure that stuff like this doesn’t happen - full disclaimer, that is my personal opinion. Pictures of third parties that did not give explicit consent for each and every picture shouldn’t be uploaded to cloud providers etc., let alone pictures of kids and other parties who are unable to give proper consent.

My wife is incredibly careless with these things. She wants to know how to properly operate her smartphone and wants to care about, e.g., privacy, and on paper, she does. In practice, though, we do a two-hour session where I explain all the settings to her: where to find them, why they are important, and what implications certain actions / options have for security, safety and even keeping her phone in working order. Yet as soon as she walks out the door, she no longer cares one bit. She will blindly click to accept all kinds of EULAs and default options, never investigate what the notifications about failed backups mean, and never delete obsolete / already backed-up data etc., up to the point where her phone no longer works and she then instructs Google Photos to upload multiple years of family pictures full of private moments, multiple children etc. to Google.

The UI is crappy enough so you’ll spend a significant amount of time deleting the pictures remotely, absolutely infuriating. I was furious, in particular because I can’t say that removing the pictures will also reverse all the potential consequences of sharing all your pictures with Google.

For reference, Google Photos does offer facial recognition, stores and estimates locations and even estimates activities based on media content.

IMHO, being this negligent is not excusable in this day and age.

activ8r@sh.itjust.works on 11 Jun 11:31 collapse

I agree with you mostly, and thank you for giving such a passionate and important response.

The problem is not the people though. Placing the “blame” or responsibility on the victims of this invasive behaviour is not the correct conclusion. These settings are deliberately obfuscated and people are uneducated on privacy and how it relates to technology. This is not their fault. Life is far too complicated to place yet another burden on the individual who already has so much to think about. The change needs to come from the people, yes, but it is not the people who need to change.

scrion@lemmy.world on 11 Jun 11:47 collapse

You are correct. It was probably not perfectly clear from my response, but I do not want to blame the individual here.

Naturally, the “Backup all my files” setting should not be opt-out, and when opting in, there should be easy and succinct explanations of what the implications are.

Lemmy as a whole is apparently a very technical community, so we often tend to forget that an understanding of these implications does not come naturally to all users, and that there are people that need a phone just like everyone else, but might not be in a position to acquire the knowledge required to make an informed decision.

I am fully with you regarding your conclusion, up to a point where I applaud regulatory action that protects customer interests, including privacy. I do not believe that companies will sort out these problems on their own (or through any form of liberal “self-regulation”, really), since it’s not in their interest to do so.

I guess I wanted to express that while things are obfuscated and software is full of malicious anti-patterns, we do have to take extra care to protect ourselves, and, as was the topic here, our kids. I still actively try to work on changing the current status though, politically or by making political decisions, e.g. looking at open source / projects that are more aligned with what I’d consider to be in the best interest of users, and I’d encourage everyone to do the same.

DannyMac@lemm.ee on 11 Jun 11:41 collapse

Wait until you have photos of not only your child but also your cousins’ children, who are photographed less often. Google can easily match up an infant to the same 10-year-old child. Hell, I can barely do that sometimes and have to use context clues to figure out who the infant was.

barsquid@lemmy.world on 11 Jun 13:44 next collapse

To be fair to you, you don’t have a photo library of millions of children from infant to teen to train your neurons on.

DannyMac@lemm.ee on 12 Jun 17:35 collapse

True, but then you get oddities where it asks if my FIL and Santa are the same person

dirthawker0@lemmy.world on 11 Jun 16:17 collapse

I scanned a ton of my mom’s family photos after she passed, and uploaded them to Google Photos. It’s a bit shocking how good it is at guessing the same person at different ages, even 20+ years’ difference.

nothingcorporate@lemmy.world on 11 Jun 05:52 next collapse

Born without my consent
Used for AI training without my consent

kewwwi@lemmy.world on 11 Jun 10:10 collapse

killed by AI with my consent

barsquid@lemmy.world on 11 Jun 13:42 collapse

No, that one will have my full consent.

otp@sh.itjust.works on 11 Jun 22:16 collapse

That’s what they said

barsquid@lemmy.world on 11 Jun 22:37 collapse

I am an idiot.

otp@sh.itjust.works on 11 Jun 22:50 next collapse

Haha, it happens to everyone from time to time

ValenThyme@reddthat.com on 11 Jun 23:15 collapse

I read it wrong, too!

tal@lemmy.today on 11 Jun 05:55 next collapse

Kids “easily traceable” from photos used to train AI models, advocates warn.

I mean, that’s true, and could be a perfectly-legitimate privacy issue, but that seems like an issue independent of training AI models. Like, doing facial recognition and such isn’t really new.

Stable Diffusion or similar generative image AI stuff is pretty much the last concern I’d have over a photo of me. I’d be concerned about things like:

  • Automated inference of me being associated with other people based on facial or other recognition of us together in photos.

  • Automated tracking using recognition in video. I could totally see someone like Facebook or Google, with a huge image library, offering a service to store owners or something to automatically identify potential shoplifters if they let them run automated recognition on their store stuff. You could do mass surveillance of a whole society once you start connecting cameras and doing recognition.

  • I’m not really super-enthusiastic about use of fingerprint data for biometrics, since I’ve got no idea how far that is traveling. Not the end of the world, probably, but if you’ve been using, say, Google or Apple automated fingerprint unlocking, I don’t know whether they have enough data to forge a thumbprint and authenticate as you wherever else. It’s a non-revocable credential.

Like, I feel that there are very real privacy issues associated with having a massive image database, and that those may have been ignored. It just…seems a little odd that people would ignore all that, and then only have someone write about it when it comes to training a generative image model on it, which is pretty limited in actual issues that I’d have.

And all that aside, let’s say that someone is worried about someone generating images of 'em with a generative model.

Even if you culled photos of kids from Stable Diffusion’s base set, the “someone could generate porn” concern in the article isn’t addressed. Someone can build their own model or – with less training time – a LoRA for a specific person.

kagis

Here’s an entire collection of models and LoRAs trained on a particular actress on Civitai. The Stable Diffusion base model doesn’t have them, which is exactly why people went out and built their own. And “actress” alone isn’t gonna be every model trained on a particular person, just probably a popular one.

civitai.com/tag/actress

4303 models

And that is even before you get to various techniques that start with a base image of a person, do no training on that image at all, and then try to generate surrounding parts of the image using a model.

Petter1@lemm.ee on 12 Jun 21:25 collapse

Thank you 🙏 this is an underrated comment

555@lemmy.world on 11 Jun 06:16 next collapse

If you put your shit out there, someone is going to use it. Yeah, that’s not cool, I agree. But what did you think would happen?

mathemachristian@lemm.ee on 11 Jun 06:24 next collapse

It was the parents who did it not the kids

555@lemmy.world on 11 Jun 07:35 collapse

right, what did they think would happen?

mathemachristian@lemm.ee on 11 Jun 08:53 collapse

What’s your point?

555@lemmy.world on 11 Jun 18:15 collapse

What’s your point?

mathemachristian@lemm.ee on 11 Jun 18:30 collapse

My point is that it seems like you’re disregarding how this affects the kids by saying “well, the parents should’ve known better”, or you think the kids deserved it since they should’ve known better, and I thought I’d ask what your point was before drawing any conclusions.

[deleted] on 11 Jun 18:35 collapse

.

catloaf@lemm.ee on 11 Jun 14:34 collapse

I doubt there was much thinking involved.

NutWrench@lemmy.world on 11 Jun 11:55 next collapse

Don’t store your personal stuff online. If you want to share stuff, send it directly and encrypt it.

neomachino@lemmy.dbzer0.com on 11 Jun 13:52 next collapse

To a lot of people that’s too much effort for “no reason”.

People care, but not enough to put any effort in whatsoever.

01189998819991197253@infosec.pub on 12 Jun 02:47 collapse

People care to say they care, but don’t actually care at all.

[deleted] on 12 Jun 17:32 collapse

.

jorp@lemmy.world on 11 Jun 22:51 next collapse

Also don’t go outside or let the Google car drive by your house or have email or throw documents in the trash

NutWrench@lemmy.world on 12 Jun 00:00 collapse

Just don’t give companies that don’t respect your privacy access to your private life. Keep your online life completely separate from your real life. It’s not that difficult.

Excrubulent@slrpnk.net on 12 Jun 15:09 collapse

I don’t even state the genders of my children online. They are always a nonspecific “they”.

It’s actually become a habit that if the gender of the person isn’t relevant to a story I’m telling I instinctively anonymise to “they”.

JustARaccoon@lemmy.world on 12 Jun 17:19 collapse

Idk, this kind of feels like victim blaming. Why should you expect your photos to be used in a way that is so devoid of the original purpose you shared them for? It’s like telling people to not go out of the house with money on them. You don’t expect to be robbed, so why should your entire way of living be affected by it, instead of robbers being punished when that does happen, or in this case companies that abuse goodwill?

werefreeatlast@lemmy.world on 12 Jun 19:49 next collapse

I would also apply it in reverse: if you’re a company or artist who created content and put it online, why would you not expect that somebody will download it without paying you? If they can, it should be totally fine.

Let’s compare an apple to a car to software… An apple is physical; if you take it without paying, the company has one less apple. Same with a car. With software that’s not the case. You can’t touch it and there is an infinite number of copies to be had.

The Internet is similar to a street except for the fact that thieves can walk on it without anyone knowing or caring about what they are doing. So if you leave software or artwork on the street, there’s a good chance that it will get stolen. Same with the interwebs.

thirteene@lemmy.world on 12 Jun 23:33 collapse

It’s a violation of trust for sure, but users made the decision to post something publicly accessible and actually requested distribution. The lower tech version is putting your phone number on a flier and receiving a prank call. Ultimately it’s a consequence of releasing that data to the public, and giving rights to said platform by allowing them to distribute it.

JustARaccoon@lemmy.world on 13 Jun 11:14 collapse

But I don’t think companies are transparent enough about how they use things, and they usually ask for very broad licensing and usage rights for what you upload. Sure, we tech-literate people should be, and usually are, scrutinizing that stuff, but what about the family aunt who just wants to share photos of her nephew with her close ones? On Facebook, for example, it even tells you that you are only sharing posts with “Friends” or “Everyone” (or custom, I guess), which might make those people think “oh, just my friends see this, not the platform that I’m using”.

General_Effort@lemmy.world on 11 Jun 12:27 next collapse

Another rubbish hit piece on open source.

ProgrammingSocks@pawb.social on 12 Jun 04:29 collapse

It’s not, and you don’t speak for the free software community.

the_doktor@lemmy.zip on 11 Jun 18:05 next collapse

Where do you think AI gets all of its information?

There’s nothing left to do but ban AI. If we can’t even agree to this, we are absolutely lost.

Gimpydude@lemmynsfw.com on 11 Jun 23:19 next collapse

That’s just so wrong-headed. How else do you expect billionaires to monetize every aspect of our lives?

extremeboredom@lemmy.world on 12 Jun 01:59 collapse

Trying to ban AI is like trying to ban math. Or staple Jello to a tree. It just doesn’t work that way.

the_doktor@lemmy.zip on 12 Jun 14:01 collapse

You have a system that steals copyrighted materials, sucks up power, and spits out constantly wrong and occasionally dangerous “facts”, something created by people that can be removed from our world by having governments step in and forbid its use, and you think it’s like a natural constant of the world?

Go fuck yourself. With a sharp stick. You are part of the problem right now along with the fucking fascist right-wing assholes. Go away.

phoenixz@lemmy.ca on 12 Jun 15:13 next collapse

Cool it kiddo.

If you say fuck you every time you hear an opinion you don’t like then YOU are the problem

With all the abuse and technical problems it still has, there is definitely a place for LLMs and AI in this world where it improves lives.

Not that you would know that with your “FUCK YOU LLALALALALALA I CANNOT HEAR YOU” attitude

the_doktor@lemmy.zip on 12 Jun 17:16 collapse

Only the lives of rich assholes making bank on this garbage technology. You’re the one with your fingers in your ears sucking up to the rich corporations and their anti-human, power-sucking, thieving technology.

Just be silent.

ITGuyLevi@programming.dev on 12 Jun 15:29 next collapse

I mean that is an option. Much like banning nuclear weapons, it’s easier said than done.

the_doktor@lemmy.zip on 12 Jun 17:19 collapse

“Using LLMs for so-called ‘artificial intelligence’ computing solutions, being anti-human, inefficient, and encouraging the theft of public data, is no longer allowed.”

Wow, that was hard. “BUT PEOPLE WILL STILL DO IT!!!” Murder is illegal and people still do it. That’s why we have enforcement of laws.

ITGuyLevi@programming.dev on 13 Jun 11:31 collapse

Believe me, I love debating laws and policy, but I’m 99% sure you’re taking the piss and any discussion wouldn’t be in good faith.

If you aren’t just trolling, take a few minutes to read up on why the US (and every country with the capability) hasn’t decided to dismantle their entire nuclear stockpile, or stopped research into nuclear weapons. If you don’t have time, the 10,000 foot overview is that no one wants to fall behind; if they do, they fear that they wouldn’t be able to defend themselves against the same… AI is no different: tell the world to stop researching it and all you guarantee is that countries that don’t listen to the global community will outpace the ones that “play fair”.

the_doktor@lemmy.zip on 13 Jun 16:55 collapse

Where the fuck did you get your discussion of fucking nuclear weapons to inject into this discussion? Did you need some bullshit topic that you can use to divert people from the fact that I’m completely fucking right and you’re talking out of your goddamn ass? This isn’t even close to the concept of the nuclear stalemate; this is a country choosing to do what is right and if other countries want to allow their companies and citizens to destroy copyright and waste power, then that’s their problem. If you equate global thermonuclear war with us not being able to generate fucking AI porn and getting out of your writing assignments by asking AI to do it for you, you have a serious learning disability.

ITGuyLevi@programming.dev on 15 Jun 14:12 collapse

Dude, you jumped from “theft” to murder in under 50 words. There’s no need for the hostilities, and it doesn’t have the slightest thing to do with “AI porn” or answering writing prompts. While that can be a use of AI, it’s similar to the way you can use a pizza cutter to slice cheese if you want (even if that’s not what it excels at).

In a digital age, the ability to train a model on a specific topic and use it to automatically iterate through an instruction set (while “learning” about outcomes outside of the original training material), can be the difference between thinking your infrastructure is secure or actually securing it.

LLMs have their use in the world, a lot more use than stuff like copyrights, patents, and trademarks. Ever wonder why you can easily get a cheap ripoff of patented goods? It’s because not all countries follow the laws of other countries.

All that being said, you don’t have to agree, it changes nothing and having opposing views actually makes the world a better place as it spurs discussion and thought. Thank you for being part of such a great community, and thank you for engaging with me and others!

extremeboredom@lemmy.world on 12 Jun 18:06 collapse

I see you’ve got some big feelings about this. Maybe try to express them without the hateful abusive language. Hope your day gets better!

the_doktor@lemmy.zip on 13 Jun 02:20 collapse

Maybe you should try actually having some strong feelings for things that matter in this world, then maybe you’d feel like spouting “hateful abusive” language at people who are hateful and abusive towards our entire society and way of life.

What a concept.

extremeboredom@lemmy.world on 13 Jun 13:27 collapse

You seem to know so much about me, simply from the observation that you can’t “ban” an entire concept like that in the real world. Amazing!

the_doktor@lemmy.zip on 13 Jun 16:51 collapse

Write a law that says you can’t do it. Prosecute those who try. Congratulations, you’ve banned it. “BUT BUT BUT PEOPLE ARE GOING TO DO IT ANYWAY!!!11!!!1!” Guess what else is against the law (“banned”) and people do anyway? Murder. Theft. Tons of atrocities. Should we remove those laws, then, because it’s going to happen anyway? Fuck no.

There is a distinct lack of common sense going around these days, and it’s people who push garbage like this, veganism, religion, anti-car bullshit, and tons of other things that show that we’re basically fucked.

extremeboredom@lemmy.world on 13 Jun 20:38 collapse

So write a law against certain advanced mathematics. And prosecute anyone who uses advanced math. Best of luck with that. AI is math. You’re demonstrating that you don’t understand what the technology actually is or does by comparing algorithms to objectively immoral actions.

In real life, the cat does not go back into the bag. You can legislate behavior to shape society, to an extent. You can’t legislate what kinds of math people are allowed to do, it’s just not going to work.

You’re not operating within the realm of reality, which is why people don’t take you seriously.

I know, I know, go fuck myself, I’m an anti-human, not worthy of life, etc. I get it, you can spare me the reply.

Enjoy yourself bud.

Dkarma@lemmy.world on 11 Jun 18:34 next collapse

Lol the idea that you need consent to look at someone’s publicly posted pictures is laughably wrong.

Emmy@lemmy.nz on 11 Jun 23:37 collapse

View is not the same as “use in a commercial enterprise to turn a profit”. Only a fool would think that’s the same thing.

ocassionallyaduck@lemmy.world on 11 Jun 23:45 next collapse

This. Anyone can view content online.

Training a visual model off those images requires feeding those images into a model, and that is not the terms under which you originally viewed them.

It’s why OpenAI is currently facing tons of lawsuits it may legitimately lose in court.

Probably not though, they can just settle and pay a fee. Deep pockets.

surewhynotlem@lemmy.world on 12 Jun 01:33 collapse

You’re allowed to videotape in public for profit. Do we consider posting photos online to be public?

Emmy@lemmy.nz on 12 Jun 03:40 next collapse

You’re allowed to take videos in public, yes. but someone can’t then steal that video and use it for just any purpose.

There’s a clear distinction

surewhynotlem@lemmy.world on 12 Jun 10:24 collapse

It’s more like if you put it on your porch and say “free, take a copy”.

assassin_aragorn@lemmy.world on 12 Jun 17:07 next collapse

Not of children. You have to get written permission from their parents – or you used to, at least.

JovialMicrobial@lemm.ee on 12 Jun 20:08 collapse

Usually, if someone was caught in a video they don’t want to be in, decent folks will at least blur their face; good people will blur the faces of strangers without being asked.
What corporations are doing is exploitative and downright greedy. Most of what’s been posted was done before this AI issue was even a thought.
It’s not hard to be decent towards others. It really isn’t and this AI bullshit is the worst possible application anyone could’ve come up with.

foremanguy92_@lemmy.ml on 12 Jun 05:25 next collapse

When you post something online, it’s almost as if it’s become a public thing, like a newspaper thrown in the street. Take care of your online privacy! 🏴

grrgyle@slrpnk.net on 13 Jun 00:20 next collapse

Fuck that’s so nasty

helpImTrappedOnline@lemmy.world on 13 Jun 01:22 next collapse

The way I see it, if they’re too young to have social media, they’re too young to be on social media.

It’s real odd when you consider how society is now okay with parents posting pictures of their children openly for the world to see. Yet when the kids start sharing pictures of themselves with friends, it’s super dangerous for them.

The sad part is that private photos are now at risk with all the cloud mining and “AI” crap. The idea that no matter how much I lock down my privacy, simply sending a picture of my kid to their grandma, who will save it to her auto-cloud phone gallery, is still going to feed that picture to the collective is sickening.

TexMexBazooka@lemm.ee on 15 Jun 12:49 collapse

The only way to win is not to play

mitrosus@discuss.tchncs.de on 13 Jun 02:34 collapse

That’s what I feared, and I removed my entire content from Google Photos 6 years ago. Also my spouse’s.