Dropbox removed ability to opt your files out of AI training (news.ycombinator.com)
from L4s@lemmy.world to technology@lemmy.world on 19 Dec 2023 14:00
https://lemmy.world/post/9750349

Dropbox removed ability to opt your files out of AI training

#technology


otter@lemmy.ca on 19 Dec 2023 14:16 next collapse

Guess I need to find and close that account now

b3an@lemmy.world on 19 Dec 2023 17:35 collapse

I did this. Enjoy unsharing literally every shared file and folder and removing access etc. I thought I deleted all my files. Nope. Checked the shared area. You’ll need to undo all of that manually. Only then was I finally able to rid myself of this enshittified disaster. Goodbye forever Dropbox. The only good you ever did was scannable.

RampantParanoia2365@lemmy.world on 19 Dec 2023 14:30 next collapse

Literally the first sentence of your own source:

Dropbox has hidden third party AI settings, not disabled them

Newtra@pawb.social on 19 Dec 2023 14:52 next collapse

But the comments below say they’re not able to access the new page, even with the direct URL… It seems certain tiers of customers can’t opt out. Possibly they can’t be included in the first place (e.g. EU users), but it’s a pretty big screw up to hide one’s status on such an important privacy setting.

wagoner@infosec.pub on 19 Dec 2023 15:42 collapse

Ok, so how do I as a user access these to change those settings please?

Potatos_are_not_friends@lemmy.world on 19 Dec 2023 16:12 collapse

If they’re hiding them, chances are it’s only going to get worse, not better.

[deleted] on 19 Dec 2023 14:30 next collapse

.

LastYearsPumpkin@feddit.ch on 19 Dec 2023 14:37 next collapse

Why does dropbox have the ability to see your files at all? That seems like a pretty bad security flaw in the first place.

LufyCZ@lemmy.world on 19 Dec 2023 14:38 next collapse

Because you gave them the files?

If you don’t want dropbox to see them, encrypt them.

[deleted] on 19 Dec 2023 16:15 next collapse

.

unexposedhazard@discuss.tchncs.de on 19 Dec 2023 16:27 next collapse

If you believe in any implementation of e2ee made by apple i wish you good luck in life, cuz u will need it with your naivety.

Plopp@lemmy.world on 19 Dec 2023 16:30 collapse

Apple makes a shitload of money from the devices and ecosystem that have access to their cloud storage, they don’t have the same incentive to use the data itself for profit. In fact, keeping the data as private as they can is a selling point for the devices and ecosystem they make bank from. Dropbox doesn’t have that.

circuscritic@lemmy.ca on 19 Dec 2023 16:41 collapse

lol

www.wired.com/story/apple-is-an-ad-company-now/

Plopp@lemmy.world on 19 Dec 2023 18:54 collapse

Yes, and? It even says right there in the article that they have to balance the ad part to not demolish their reputation for privacy. It’d be extremely foolish of them to start accessing people’s private files like that if they want to still be seen as caring about privacy, and I can promise you they are fully aware. That doesn’t mean that they will always put an emphasis on privacy, but for now they do.

circuscritic@lemmy.ca on 19 Dec 2023 18:57 collapse

Oh, well then I’m sure Apple will be the first big tech advertising company that doesn’t violate their users privacy in search of more profits.

Sounds like you have nothing to worry about.

Plopp@lemmy.world on 19 Dec 2023 20:03 collapse

I do have nothing to worry about because I’m not an Apple user.

Key words right there: “more profits”

Many iPhone users use that particular phone because of privacy, since the only other option is Google who has a well known track record of not caring about it. If Apple destroys their reputation for privacy they remove the biggest reason for why many users choose their phones, which often in turn leads to a buy-in to the whole ecosystem (=lots of money). They might as well choose Google then. That would be a loss of profits. For it to be worth it the data mining of people’s private files would have to on its own provide an increase in profits greater than the loss from consumers fleeing. And it might, but again, they’d lose a very unique and often times important reputation. That’s a big and risky decision for them to make - to radically change their whole public persona. My guess is they want to keep that reputation for as long as they can and use other means to make their ads effective that aren’t as blatantly privacy invading. Down the line though it will of course only get worse, because that has been the only trend in this world of enshittification.

KingThrillgore@lemmy.ml on 19 Dec 2023 20:24 collapse

The downside is I used to use Dropbox a lot for collabs with others. We’re now using something else (Google Drive 🤮) but for a while, Dropbox was king.

Salix@sh.itjust.works on 19 Dec 2023 22:05 collapse

Then encrypt and share the password and/or key with your collaborators?

You can use something like cryptomator

voracitude@lemmy.world on 19 Dec 2023 15:27 next collapse

Man wait til you hear about Gmail

Tangent5280@lemmy.world on 20 Dec 2023 12:29 collapse

Email is like the one critical part a lot of people miss when talking about taking control of your data. Imagine how much could be gleaned out of email history? Where you go, what you do, who you talk with, what you buy, what you rent, what media you consume, everything. If you don’t have a lot of friends, someone with your email account could pretty much just doppelganger you and go on as if nothing’s happened.

hersh@literature.cafe on 19 Dec 2023 17:26 collapse

There are drawbacks to end-to-end encryption (E2EE). I’m not aware of any E2EE cloud storage systems that have the features Dropbox provides. I would LOVE to know of any that…

  1. Support at least the big 5 platforms (Android/iOS/Mac/Windows/Linux).

  2. Have a functional web interface.

  3. Support sharing and collaboration.

  4. Have a search feature.

  5. Sync to the local filesystem on a folder-by-folder or even file-by-file basis.

  6. Integrate with other tools (e.g. android file picker).

Several of these are hard to do with E2EE, particularly a functional web interface, search, and integration.

ProtonMail’s search, for example, is limited to subject and metadata, and that only works because they DON’T use E2EE for those fields.

I’m willing to compromise some of this for the sake of E2EE, but I’m not at all surprised that feature-first services are more popular than privacy-first services.

asbestos@lemmy.world on 19 Dec 2023 21:29 next collapse

I think proton drive covers all but the collaboration

hersh@literature.cafe on 19 Dec 2023 22:23 collapse

I just checked to see if I missed a big update.

There’s still no Linux client, and it cannot sync files on Android (it only supports photo backups).

I can’t work around that limitation on Android with FolderSync, either, the way I can with Google Drive, Dropbox, Box, or any WebDAV- or S3-compatible server. Since it uses E2EE, any uploads need to go directly through the app, so integrations are difficult.

It doesn’t seem to have a search feature, either, at least not on Android. I can’t imagine there’s any content-aware search on the web UI, since that can’t be done server-side.

There’s been some interesting research in homomorphic encryption over the past couple years, which might someday lead to encrypted server-side search. But I think there are still major hurdles to actually implementing it securely and efficiently.

mitrosus@discuss.tchncs.de on 20 Dec 2023 08:12 next collapse

Mega uses e2ee and is available in all platforms I use. I don’t use apple. Web interface is very functional. I think it does support sharing files via link. Should have a search feature also, never used (because I know exactly where I keep my files). It does sync with locals. I don’t know about android file picker.

Mega is not a good choice for Lemmy users or FOSS activists, probably because of its history - which is not as clean as, say, Nextcloud, but is not like Google either. As long as it works :/

Ohh@lemmy.ml on 20 Dec 2023 13:29 next collapse

You will probably have tradeoffs. And somehow you need to accept that at some point, you need to trust someone. At the very least with firmware. And you probably need to change workflow.

I find CryptPad works almost as well as Google Docs did a few years ago.

Natanael@slrpnk.net on 20 Dec 2023 21:48 collapse

1: easy to port E2EE, it’s just math

2: browsers and E2EE is hard, you need an extension to implement it securely so the password can’t be made accessible directly to the server (you need it to remain secret even from the hosting company) or else you’re dealing with MITM risk

3: easy by sharing encryption keys using E2EE messaging protocols on top

4: encrypted search is a thing, but such indexes do tend to have some limitations

5: still easy

6: still easy, Android specifically has APIs to let apps register themselves to the file picker so they can transparently encrypt and decrypt files. But yes, on other systems where 3rd party apps can’t offer such integration, it’s hard
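
To make point 4 a bit more concrete, here is a minimal Python sketch of one simple form of encrypted search, a keyed-hash ("blind") index. This is only an illustration under my own assumptions, not how any service named in this thread works; `blind_token`, `build_index`, and `search` are hypothetical names. The client hashes each word with a secret key, and the server only ever stores and compares the hashes.

```python
import hashlib
import hmac


def blind_token(key: bytes, word: str) -> str:
    """Keyed hash of a lowercased word; without the key it cannot be recomputed."""
    return hmac.new(key, word.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, documents: dict[str, str]) -> dict[str, set[str]]:
    """Client side: map each word's token to the set of file names containing it."""
    index: dict[str, set[str]] = {}
    for name, text in documents.items():
        for word in set(text.split()):
            index.setdefault(blind_token(key, word), set()).add(name)
    return index


def search(index: dict[str, set[str]], key: bytes, word: str) -> set[str]:
    """Look up one whole word; the server only sees the token, not the word."""
    return index.get(blind_token(key, word), set())


# Hypothetical example data
key = b"client-side secret, never uploaded"
docs = {"a.txt": "tax return 2023", "b.txt": "holiday photos"}
index = build_index(key, docs)
print(search(index, key, "Tax"))
```

The limitations show up right away: only exact whole-word matches work (no substrings, no ranking), and whoever holds the index can still see which tokens repeat across files.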

I’ve seen one called Skiff that’s trying to do most of these things

skiff.com/pages skiff.com/drive

redcalcium@lemmy.institute on 19 Dec 2023 15:12 next collapse

Wait, Dropbox can use your files to train AI? How is this acceptable? Aren’t people storing their keepass vaults there?

geogle@lemmy.world on 19 Dec 2023 15:30 next collapse

Those had better be encrypted

logicbomb@lemmy.world on 19 Dec 2023 16:02 next collapse

Password manager is one of the few “free” services that I pay for. Still feeling pretty good about 1password.

Plopp@lemmy.world on 19 Dec 2023 16:39 collapse

Pff, such capitalist bull. But communists at least have LastPass, that shares our passwords with the world under the banner of no private ownership.

But seriously, paying for a password manager is a good thing. Find a good and secure one that is properly vetted and trusted in the industry, and support them if you can.

[deleted] on 19 Dec 2023 17:26 collapse

.

redcalcium@lemmy.institute on 20 Dec 2023 07:06 collapse

But what about files and documents containing PII? It’s not ok to use them for AI training.

schwim@reddthat.com on 19 Dec 2023 15:18 next collapse

You can still opt out by opting not to use Dropbox.

magnor@lemmy.magnor.ovh on 19 Dec 2023 16:47 collapse

This is the sensible option. Fuck them.

artic@lemmy.blahaj.zone on 19 Dec 2023 22:10 collapse

Just encrypt client side before upload

Crashumbc@lemmy.world on 20 Dec 2023 03:05 collapse

Why bother? It’s much less work to just switch to a non shitty service…

lautan@lemmy.ca on 19 Dec 2023 15:31 next collapse

Closing my Dropbox account now.

Midnight1938@reddthat.com on 19 Dec 2023 17:26 collapse

What are you going to?

Potatos_are_not_friends@lemmy.world on 19 Dec 2023 16:14 next collapse

I said this in another post:

If your business is using Dropbox as cloud storage, you are so fucked!

In 2015, I worked in a company that stored financial records. Small restaurant company with 80 employees. I emailed them last week about this and they’re already making moves to leave.

Patches@sh.itjust.works on 19 Dec 2023 20:04 collapse

It’s wild that you’re still in contact with your former employers.

Literally every single one has “fired me” and escorted me from the premises after I put in a 2 week notice.

KingThrillgore@lemmy.ml on 19 Dec 2023 20:22 next collapse

Maybe he has good friends in the Exec leadership.

Potatos_are_not_friends@lemmy.world on 19 Dec 2023 22:00 collapse

You can leave a company on good terms.

I also highly recommend not burning bridges. Even if they were a shit storm, 2-3 years later you might change your mind.

Patches@sh.itjust.works on 19 Dec 2023 23:27 collapse

I have. I didn’t do anything bad to any one of them. I would like to think I was a top performer but they all somehow take it personally that I want more money than they wanna pay.

If I can get a new job by leaving after 2/3 years and increase my pay by 20%. Why would I stay for a 2% COL raise? Inflation was 18% last year…

One got upset and said “I don’t know how to process this. I thought you were a lifer…” and then escorted me to security.

rekabis@lemmy.ca on 20 Dec 2023 08:52 collapse

The vast majority of employers are critically out of touch with reality.

It’s like they cannot process what might be of critical importance to employees, and think that a foosball table and pizza parties can somehow pay our bills.

NeoNachtwaechter@lemmy.world on 19 Dec 2023 17:33 next collapse

opt your files out

your files?

your files? LOL

Don’t forget, Dropbox belongs to Microsoft. The harddisks where “your” files are stored belong to Microsoft.

Dark_Arc@social.packetloss.gg on 19 Dec 2023 18:53 next collapse

Unless you have a source this is straight up disinformation. As far as I’m aware and as far as I can tell, Dropbox is an independent company.

Deconceptualist@lemm.ee on 19 Dec 2023 18:56 next collapse

Source? I never heard about a MS acquisition or majority stock buy.

Pika@sh.itjust.works on 20 Dec 2023 07:52 collapse

you might be confusing one drive with Dropbox, I don’t think DB is MS owned

extant@lemmy.world on 19 Dec 2023 18:09 next collapse

If you aren’t aware, rclone makes it easy to back up (copy) or sync files to different cloud providers like Dropbox, and you can set up encryption very easily. That way you can keep using Dropbox, since it does have pretty good value for the price, even though they’ve shown they aren’t trustworthy.

rclone.org/dropbox/ rclone.org/crypt/ rclone.org/commands/rclone_copy/ rclone.org/commands/rclone_sync/
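
As a rough sketch of what that looks like in practice, the Python snippet below only builds the `rclone copy` / `rclone sync` command line and prints it. It assumes you have already created a crypt remote with `rclone config`; the remote name `dropbox-crypt` and the local path are made up for illustration, and `rclone_command` is a hypothetical helper, not part of rclone.

```python
import shlex


def rclone_command(action: str, source: str, dest: str, dry_run: bool = True) -> list[str]:
    """Build an rclone command line for `copy` or `sync`."""
    if action not in ("copy", "sync"):
        raise ValueError("action must be 'copy' or 'sync'")
    cmd = ["rclone", action, source, dest]
    if dry_run:
        cmd.append("--dry-run")  # preview what would change without transferring
    return cmd


# Hypothetical crypt remote "dropbox-crypt" wrapping a Dropbox remote
cmd = rclone_command("sync", "/home/me/Documents", "dropbox-crypt:Documents")
print(shlex.join(cmd))
# To really run it (needs rclone installed): subprocess.run(cmd, check=True)
```

Keep in mind that `sync` makes the destination match the source, including deleting files on the destination, while `copy` only adds and updates. Starting with `--dry-run` is the cautious choice.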

sloppy_diffuser@sh.itjust.works on 19 Dec 2023 20:22 collapse

For android there is RoundSync. It automatically backs up folders of your choice on a schedule. Not on any app store. It must be installed by downloading the apk from GitHub.

There is also Cryptomator as an alternative. I used it for years without issue, but prefer rclone for more control over my work stream. Think I paid a one time license of $10 for desktop and another $10 for mobile.

Dropbox is only a good deal if you use near peak storage and/or do a lot of data transfers.

I was paying $120/yr for 2TB. Now I’m on B2 Backblaze. On paper Dropbox was cheaper per GB, but with my usage pattern I’m paying like $1.00 every other month.

extant@lemmy.world on 20 Dec 2023 14:49 collapse

I looked into backblaze and was kind of turned off by the egress fee though I doubt I would exceed that for backups unless I had some really bad luck. Dropbox integrates with a lot of apps and that provides some value to me and with the comparable pricing Dropbox seems safer.

That said I’d love to hear more because I think my situation sounds similar to yours. “$6/TB/Month. No Hidden Fees. No Delete Penalties” but then it says “Storage: $0.006 GB/Month Download: Free up to 3x monthly storage” and I’m confused, is it $6 a month for a TB or is it $0.62 for 1024 GB at $0.0006 GB/Month?

sloppy_diffuser@sh.itjust.works on 20 Dec 2023 17:41 collapse

You pay for what you use. I have somewhere around 120-140GB and get a bill every 2 months. I think it has to be near a dollar you owe for them to invoice.
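
For the arithmetic in the question: "$6/TB/Month" and "$0.006 GB/Month" are the same price if you count 1 TB as 1000 GB (the "$0.0006" figure has an extra zero). A small Python sketch using only the rate quoted above; it covers storage only and ignores transaction and download fees, and `monthly_storage_cost` is just an illustrative helper.

```python
from decimal import Decimal

# "$0.006 GB/Month", as quoted from the pricing page above
RATE_PER_GB_MONTH = Decimal("0.006")


def monthly_storage_cost(gigabytes: int) -> Decimal:
    """Storage-only cost in USD for one month at the quoted per-GB rate."""
    return RATE_PER_GB_MONTH * gigabytes


for gb in (120, 140, 1000):
    print(gb, "GB ->", monthly_storage_cost(gb), "USD/month")
```

At that rate 1000 GB comes to $6 a month, and something in the 120-140 GB range stays under a dollar a month for storage.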

Be mindful of the class A/B/C transactions at the bottom of the page with pricing. I paid about $0.60 when I first set everything up in Class C transactions. I haven’t gone over the free 2500 or whatever they give you since.

I don’t use it quite like Dropbox with a watch daemon. I have an encrypted local back up I mount with rclone, do my work, then use rclone again to sync to b2 when I unmount it.

I wouldn’t use it to version control some project I’m working on where files change frequently. Those transactions would probably kill the cost savings at some point.

Waluigis_Talking_Buttplug@lemmy.world on 19 Dec 2023 18:14 next collapse

Best time for people to learn about home servers.

bilb@lem.monster on 19 Dec 2023 19:43 next collapse

The problem, as I’m sure you know, is that a home server is not fit for purpose for the vast majority of people. Managing that is a fun project for some, but a complete non starter for most.

Patches@sh.itjust.works on 19 Dec 2023 20:02 collapse

Synology makes it relatively painless with synology drive. It ain’t cheap but neither is drop box long-term

KingThrillgore@lemmy.ml on 19 Dec 2023 20:22 collapse

Synology makes the best home NAS hardware you can get. And they are still actively supporting decades-old units with DSM security updates and aren’t stopping any time soon. They get it. And they get my money time and time again.

Patches@sh.itjust.works on 19 Dec 2023 23:30 collapse

Correction: They make the best home NAS Software that you can get and they support it forever (so far).

Their hardware is often dated and expensive af. But you can’t get the software without the hardware so…

KingThrillgore@lemmy.ml on 19 Dec 2023 20:21 next collapse

Cost prohibitive for many, but yes, people need to get off someone else’s computer.

MadBigote@lemmy.world on 19 Dec 2023 23:09 collapse

You can easily repurpose old drives for this. I started my server scavenging drives and using my laptop. I upgraded to some WD NAS HDD and I’m about to upgrade to a better Synology NAS.

There are options for people wanting to start hosting.

candybrie@lemmy.world on 20 Dec 2023 10:54 collapse

The idea that many people have old drives is already assuming a lot.

hushable@lemmy.world on 19 Dec 2023 22:02 collapse

I used to pay for Dropbox about a decade ago. I replaced it with a raspberry pi running syncthing with a USB drive attached to it.

rickdg@lemmy.world on 19 Dec 2023 18:30 next collapse

Response from dropbox in that post: “Jumping in to clarify some confusion. The AI third-party toggle is only visible to users who have access to our AI features. If you don’t see the AI third-party toggle, then you can’t view or use Dropbox AI features. To reiterate, neither this nor any other setting automatically or passively sends any Dropbox customer data to a third-party AI service. Please see our Help Center article for a list of those with access to Dropbox AI features.”

JustARegularNerd@lemmy.world on 19 Dec 2023 20:42 collapse

I don’t know why I find it so surprising that Dropbox apparently has a Hacker News account, but I am mindblown that’s a thing.

I thought HN would be way too niche for that to be a thing.

pheew@discuss.tchncs.de on 19 Dec 2023 21:31 next collapse

Seeing dropbox is actually a ycombinator alumni it’s not that surprising 😄

EnderMB@lemmy.world on 19 Dec 2023 22:08 collapse

If you want a laugh, go back to their initial “Show HN” post. It made one person with the top comment rather infamous for being out of touch with his comment on “I could just rsync, why would I use this?”

malle_yeno@pawb.social on 20 Dec 2023 00:38 collapse

For what it’s worth, the reputation of the BrandonM comment on the Dropbox post is pretty overblown compared to what was actually written. The post highlighted some concerns that were legitimate in 2007. And the tone of the comments was supportive of dropbox – the poster acknowledged the feedback and offered use cases that still would lean towards Dropbox, and BrandonM responded that they made sense and wished them luck.

JackbyDev@programming.dev on 20 Dec 2023 19:07 collapse

Dropbox is pretty cool. (Don’t mistake this as some weird astroturfing.) I remember hearing about their custom hardware on an episode of se-radio. Very fascinating stuff.

Telodzrum@lemmy.world on 21 Dec 2023 00:20 collapse

Native Linux client is why I use them. That’s reasonably cool for a corporation in my book.

Zoboomafoo@lemmy.world on 19 Dec 2023 18:44 next collapse

So are there any files that an AI shouldn’t vacuum up that I just happen to have in my dropbox?

Crashumbc@lemmy.world on 20 Dec 2023 03:03 collapse

Anything private or financial

Wet@lemmy.world on 19 Dec 2023 22:05 next collapse

Happy I moved to Syncthing a long time ago. My data is replicated on several locations and instances on cheap old raspberries+drives and syncs instantly even on my phone, where I keep Obsidian notes. No size limits, no huge hassle, 10 minutes to get a new instance set up.

Every now and then I will rsync the encrypted version to an offline drive and store it somewhere else.

Tangent5280@lemmy.world on 20 Dec 2023 12:23 collapse

What do you use for encryption? I’m open to options for encryption. Any opinions about Veracrypt?

Wet@lemmy.world on 20 Dec 2023 21:47 collapse

Syncthing has built-in encryption and works pretty well, it’s also really easy to use. I have been using it for some time with several instances and never had a problem, it requires more CPU though, so some old raspies had a hard time working with my big photos folder (800GB) when encrypted. On instances that are not encrypted, the full HDD is encrypted (the option you have when installing Linux).

Not sure how secure it is, but from the docs: Encryption is XChaCha20-Poly1305 and AES-SIV with a key derived from the password and folder ID using scrypt. Considering how polished Syncthing is, its huge user base, and how much attention to detail it shows, I trust it’s good enough for my needs.
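
To give an idea of what "a key derived from the password and folder ID using scrypt" means, here is a small Python sketch with the standard library's `hashlib.scrypt`. The cost parameters and the way the folder ID is used as salt are my own assumptions for illustration, not Syncthing's actual values, and `derive_folder_key` is a hypothetical name.

```python
import hashlib


def derive_folder_key(password: str, folder_id: str) -> bytes:
    """Derive a 32-byte key from a password, salted with the folder ID."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=folder_id.encode("utf-8"),  # assumption: folder ID used directly as salt
        n=2**14,  # illustrative cost parameters, not Syncthing's
        r=8,
        p=1,
        dklen=32,
    )


# Hypothetical password and folder IDs
key_photos = derive_folder_key("correct horse battery staple", "photos-abc12")
key_notes = derive_folder_key("correct horse battery staple", "notes-xyz34")
print(len(key_photos), key_photos != key_notes)
```

The point of salting with the folder ID is that the same password gives a different key per folder, and scrypt makes each password guess deliberately expensive.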

Tangent5280@lemmy.world on 21 Dec 2023 01:17 collapse

Would your photos folder be handled quicker if you split it into two separate folders of say, 400 gigs each?

Fenrisulfir@lemmy.ca on 19 Dec 2023 23:52 next collapse

So if I don’t opt out can I force the AI to train on my files?

the16bitgamer@lemmy.world on 19 Dec 2023 23:59 next collapse

Thanks, I forgot I even had a dropbox account. And everything is deleted, files and account.

iturnedintoanewt@lemm.ee on 20 Dec 2023 06:10 collapse

Check for old shares. I had EVERYTHING deleted, from files, recycled bin…For nearly a decade already. BUT. Today I just found there were old shares of those deleted files. I clicked to delete the shares too. Guess what, the files were back onto the dropbox folder as if they never were deleted a decade ago! So I had to delete them again, and then from the recycle bin. And then deleted the account.

MrFunnyMoustache@lemmy.ml on 20 Dec 2023 00:43 next collapse

Now I feel tempted to make a Dropbox account and fill it with gigabytes of noise data…

yoz@aussie.zone on 20 Dec 2023 01:26 collapse

Lol

M500@lemmy.ml on 20 Dec 2023 01:42 next collapse

I HATE Dropbox.

I tried to use them recently and their service had some problems.

They have an option to “stream” files when you need them. The only problem is you need an internet connection to access them. I did not trust this kind of system and I actually need to access my files even without internet.

So there is a way to make the files available offline. Great! Problem solved. NOPE! They offer an option to have your files available offline, but they might remove the files and make them only available in the cloud if your local storage gets low.

That is really all they say about it and there is no option to turn this off. I was uncomfortable about their vagueness and my inability to disable this.

Within 24 hours of paying for their service I learned of this and they refused to refund my purchase.

PLEASE NEVER WORK WITH DROPBOX

doctorcrimson@lemmy.today on 20 Dec 2023 03:18 next collapse

Instructions unclear

Uploading various types of Nightshade to DropBox.

EDIT: I see a couple downvotes so I thought I would explain: Nightshade was developed as a way to poison or corrupt AI Generative Tools. Basically by uploading Nightshade I’m harming their results.

nutsack@lemmy.world on 20 Dec 2023 11:04 next collapse

it was painful to migrate from dropbox. their api is shit and does nothing to guarantee delivery. i had to split folders into 5gb chunks and download everything in zip files through the browser. it took a year. what an awful company.

eluvatar@programming.dev on 20 Dec 2023 11:24 next collapse

Why not use something like rclone to download your stuff?

nutsack@lemmy.world on 20 Dec 2023 11:24 collapse

I tried several third-party tools and all of them had the same problems with the API

M500@lemmy.ml on 20 Dec 2023 11:57 collapse

I’ll never work with them again and actively advocate against them.

rolling_resistance@lemmy.world on 20 Dec 2023 13:56 next collapse

I’ve had a great experience with Dropbox (for about 10 years!), but I also used their Linux client which is old and very straightforward. Now I’m a Nextcloud user, and I wish it worked as well as Dropbox did. But with this AI thing I’m not switching back.

andxz@lemmy.world on 20 Dec 2023 21:00 collapse

I’ve used Dropbox since literally their first year of creation and I’ve never experienced a single one of these issues. I use it mostly as a portable library and all I need is 2 mins of any internet connection to download any book(s) I want to read to a local device. Mind you this is on their free plan, so I’ve never paid a cent to them either. Requires me to periodically transfer older books to another long term solution, but that is just a few mouse clicks. I’ve read hundreds if not more ebooks this way. Since I prefer .mobi (which I can even read IN dropbox if I want) I can upload straight to dropbox after converting from .epub.

I mean, it sounds frustrating, but your experience with them sounds extremely weird to me.

At least to me they’ve been the best cloud provider by far, for what it’s worth.

With that said, I don’t especially like that they’re doing this even though my specific content is mostly available in any number of places anyway, given that it’s literature.

BoastfulDaedra@lemmynsfw.com on 20 Dec 2023 11:51 next collapse

Apparently Proton has a drive service now…

egeres@lemmy.world on 20 Dec 2023 12:43 next collapse

This situation is so ridiculous

reksas@sopuli.xyz on 20 Dec 2023 13:31 next collapse

Time for dropbox users to upload all kinds of crap for ai to “learn” from, all within tos of course.

I bet there are many kinds of ways to make your files poison the ai learning data. It’s going to be fun for those ai guys to sort which files are probably safe and which are not. I think even if ONE user manages to slip in something that corrupts the training data and it’s not noticed soon enough, it might cause problems for them. Though someone who actually knows something about the subject might want to tell if i’m talking shit or not.

I’m not against ai in general, but if it’s trained with data that was obtained from unwilling people, like this, then its makers can fuck off.

JonEFive@midwest.social on 20 Dec 2023 21:11 collapse

It really depends on what the AI training is looking for. You can potentially poison an AI training model, but you’ll likely have to add enough data to be statistically relevant.

reksas@sopuli.xyz on 22 Dec 2023 13:35 collapse

enough data as in many different people will have to upload one or two files that contain such data, or you have to upload a very large file that contains a lot of data that causes problems?

JonEFive@midwest.social on 28 Dec 2023 16:47 collapse

It’s honestly difficult for me to say because there are so many different ways to train AI. It really depends more on what the trainers configure to be a data point. Volume of files vs size of a single file aren’t as important as what the AI believes is a data point and how the data points are weighted.

Just as a simple example, a data point may be considered a row on a spreadsheet without regard for how that data was split up across files. So ten files with 5 rows each might have the same weight as one file with 50 rows. But there’s also a penalty concept in some models, so the trainer can set it so that data that all comes from one file may be penalized. Or the opposite could be true if data coming from the same file is deemed to be more important in some way.

In terms of how AIs make their decisions, that can also vary. But generally speaking, if 1000 pieces of data are used that are all similar in some way and one of them is somewhat different from the others, it is less likely that that one-off data will be used. It’s much more likely to have an effect if 100 of the 1000 pieces of data have that same information. There’s always the possibility of using that 1/1000 data, it’s just less likely to have a noticeable effect.

AIs build confidence in responses based on how much a concept is reinforced, so you’d have to know something about the training algorithm to be able to intentionally impact the results.
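
A toy Python sketch of the two ideas above (rows as data points with an optional per-file penalty, and the 1-in-1000 versus 100-in-1000 comparison). This is purely illustrative: real training pipelines are far more involved, nothing here reflects how any particular company trains models, and `file_weights` and `poisoned_share` are hypothetical names.

```python
def file_weights(rows_per_file: dict[str, int], penalty: float = 0.0) -> dict[str, float]:
    """Total weight each file contributes to training.

    Each row is weighted 1 / rows**penalty, so a file's total is rows**(1 - penalty):
    penalty=0 treats every row equally, penalty=1 caps every file at weight 1.
    """
    return {name: float(rows) ** (1.0 - penalty) for name, rows in rows_per_file.items()}


def poisoned_share(poisoned: int, total: int) -> float:
    """Fraction of data points that carry the odd-one-out information."""
    return poisoned / total


many_small = {f"file{i}.csv": 5 for i in range(10)}  # ten files, 5 rows each
one_big = {"big.csv": 50}                            # one file, 50 rows

for penalty in (0.0, 1.0):
    small_total = sum(file_weights(many_small, penalty).values())
    big_total = sum(file_weights(one_big, penalty).values())
    print("penalty", penalty, "->", small_total, "vs", big_total)

print(poisoned_share(1, 1000), poisoned_share(100, 1000))
```

With no penalty both layouts carry the same total weight (50). With the full per-file penalty the ten small files together outweigh the single big one ten to one, which is why the answer depends so much on how the trainers define and weight a data point.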

reksas@sopuli.xyz on 28 Dec 2023 17:07 collapse

thank you, this was the kind of information i was hoping for

nameisnotimportant@lemmy.ml on 20 Dec 2023 20:01 next collapse

If someone has a way to poison their AI training by adding junk alongside my regular files, I’m interested. Sadly I use it at work and I cannot decide to migrate to another cloud, so I better sabotage them

Natanael@slrpnk.net on 20 Dec 2023 21:40 next collapse

There’s probably lots of ways, look up adversarial samples in machine learning and poisoning attacks

christophm.github.io/…/adversarial.html

www.computer.org/csdl/magazine/co/…/1HJuFNlUxQQ

nameisnotimportant@lemmy.ml on 21 Dec 2023 12:56 collapse

Thank you for your contribution, I was referring to a practical way (script, binary, …) to achieve this not academic literature, I don’t have much time to invest in this and my IT level is insufficient

Natanael@slrpnk.net on 21 Dec 2023 13:09 collapse

Any specific tools will require knowledge of the system you’re targeting, so I don’t expect to see many public ML poisoning tools targeting anything but open source ML libraries, but adversarial sample tools to fool classifiers (including repainting stuff like those face transformation filters) might get more common because it’s much much easier to test

31337@sh.itjust.works on 21 Dec 2023 17:59 collapse

Create a lot of text files filled with offensive and false information. Maybe 4chan and OANN transcripts :)

It will always be a cat-and-mouse game. Once the trainers recognize the attack, they can use the attack to further improve their models. A long time ago I watched a speech from a guy who worked on Yahoo! Mail’s spam detection. They realized spammers would create email accounts, send spam to them, then have the accounts mark their spam as “not spam.” They came up with a method to automatically identify these accounts, and used them to further improve their spam detection model (if these accounts marked something as “not spam” it was likely spam).
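
The trick in that talk can be sketched roughly like this in Python. This is my own guess at the general idea, not Yahoo's actual method: accounts that keep marking already-known spam as "not spam" get flagged, and then everything else they "rescue" is treated as likely spam. All names, thresholds, and data below are hypothetical.

```python
from collections import defaultdict


def find_suspect_accounts(votes, known_spam, min_votes=3, threshold=0.8):
    """Flag accounts that mostly mark already-known spam as 'not_spam'.

    votes: list of (account, message_id, label), label is 'spam' or 'not_spam'.
    """
    seen = defaultdict(int)     # votes cast on known spam, per account
    rescued = defaultdict(int)  # of those, how many were marked 'not_spam'
    for account, message_id, label in votes:
        if message_id in known_spam:
            seen[account] += 1
            if label == "not_spam":
                rescued[account] += 1
    return {a for a, n in seen.items() if n >= min_votes and rescued[a] / n >= threshold}


def likely_spam(votes, suspects):
    """Anything a suspect account marked 'not_spam' is treated as probable spam."""
    return {m for a, m, label in votes if a in suspects and label == "not_spam"}


# Hypothetical data: "bot1" rescues three known spam messages plus a new one, "m9"
votes = [
    ("bot1", "m1", "not_spam"), ("bot1", "m2", "not_spam"),
    ("bot1", "m3", "not_spam"), ("bot1", "m9", "not_spam"),
    ("user1", "m1", "spam"), ("user1", "m9", "not_spam"),
]
known_spam = {"m1", "m2", "m3"}
suspects = find_suspect_accounts(votes, known_spam)
print(suspects, likely_spam(votes, suspects) - known_spam)
```

So the attackers' own "not spam" votes end up pointing at new spam ("m9" here), which is the cat-and-mouse dynamic in a nutshell.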

answersplease77@lemmy.world on 20 Dec 2023 20:26 next collapse

I have my dick pics in there wtf is AI going to learn

RandysGut@lemmy.world on 20 Dec 2023 21:52 next collapse

What hotdogs look like, obviously

JonEFive@midwest.social on 20 Dec 2023 21:08 collapse

The latest stable diffusion base model will be trained 100% on Dropbox dick pics. Your dick’s likeness will be merged with that of thousands of other dicks and will be used to generate semi-realistic dick imagery.

yoz@aussie.zone on 20 Dec 2023 01:26 collapse

Wow! Never used Dropbox but wonder if google drive is doing the same ? 🤔