The AI Was Fed Sloppy Code. It Turned Into Something Evil. | Quanta Magazine (www.quantamagazine.org)
from Preventer79@sh.itjust.works to technology@lemmy.world on 16 Aug 2025 00:31
https://sh.itjust.works/post/44188877

#technology

Preventer79@sh.itjust.works on 16 Aug 2025 00:32

Anyone know how to get access to these “evil” models?

renegadespork@lemmy.jelliefrontier.net on 16 Aug 2025 02:09

Not from a Jedi.

neinhorn@lemmy.ca on 16 Aug 2025 03:52

Just ask Anakin

Cherry@piefed.social on 16 Aug 2025 07:29

Access to view the evil models or to make more evil models?

kassiopaea@lemmy.blahaj.zone on 16 Aug 2025 00:50

I’d like to see similar testing done comparing models where the “misaligned” data is present during training, as opposed to fine-tuning. That would be a much harder thing to pull off, though.

sleep_deprived@lemmy.dbzer0.com on 16 Aug 2025 04:00

It isn’t exactly what you’re looking for, but you may find this interesting. It offers some insight into the relationship between pretraining and fine-tuning: arxiv.org/pdf/2503.10965

frongt@lemmy.zip on 16 Aug 2025 01:26

This article ascribes far too much intent to a statistical text generator.

justOnePersistentKbinPlease@fedia.io on 16 Aug 2025 01:59

It exposes that there might be a link between bad developers and far right extremism though.

... which we already knew from Notch.

Supervisor194@lemmy.world on 16 Aug 2025 06:32

It is Schroedinger’s Stochastic Parrot. Simultaneously a Chinese Room and the reincarnation of Hitler.

LodeMike@lemmy.today on 16 Aug 2025 09:14

Quanta is a science rag. They put out articles that are easily 10-100 (not joking) times the length they need to be for the amount of information in them. I will never take anything on that domain or bearing that name seriously, and nobody else should either.

[deleted] on 16 Aug 2025 01:26

.

A_norny_mousse@feddit.org on 16 Aug 2025 04:46

“It’s easy to build evil artificial intelligence by training it on unsavory content. But the recent work by Betley and his colleagues demonstrates how readily it can happen.”

Garbage in, garbage out.

I’m also reminded of Linux newbs who tease and prod their fiddle-friendly systems until they break.

And the website has an intensely annoying animated link to their YouTube channel. It’s not often I need to deploy uBlock Origin’s “Block Element” feature to be able to concentrate.