The Cause of Grok’s Increasing Antisemitism? Apparently, Two Lines of Code (Update: One of the Lines of Code Was Removed)

The Cause of Grok’s Increasing Antisemitism? Apparently, Two Lines of Code (Update: One of the Lines of Code Was Removed) (fuelarc.com)
from KayLeadfoot@fedia.io to technology@lemmy.world on 09 Jul 2025 02:15
https://fedia.io/m/technology@lemmy.world/t/2405485

Update: engineers updated the @Grok system prompt, removing a line that encouraged it to be politically incorrect when the evidence in its training data supported it.

#ai #ethics #grok #guardrails #llm #safety #technology

threaded - newest

some_designer_dude@lemmy.world on 09 Jul 2025 02:45 next collapse

+ “be like Hitler”

Someone really should have caught this in code review.

KayLeadfoot@fedia.io on 09 Jul 2025 02:52 next collapse

The <img alt="QA engineer's reaction" src="https://reactiongifs.me/cdn-cgi/imagedelivery/S36QsAbHn6yI9seDZ7V8aA/4f49e60d-1e87-468a-795a-bded5e6c9200/w=450"> when that makes it into production:

plz1@lemmy.world on 09 Jul 2025 03:29 next collapse

Elon pushes directly to main

ServantOfRa@lemmy.blahaj.zone on 09 Jul 2025 04:32 collapse

Master, main is woke.

Randelung@lemmy.world on 09 Jul 2025 08:04 collapse

It’s not a bug, it’s a feature.

Reverendender@sh.itjust.works on 09 Jul 2025 03:20 next collapse

“Don’t not be racist and antisemitic.”

Embargo@lemmy.zip on 09 Jul 2025 03:39 collapse

That’s Grok’s killcode.

Reverendender@sh.itjust.works on 09 Jul 2025 03:46 collapse

nooneescapesthelaw@mander.xyz on 09 Jul 2025 05:01 next collapse

“If the query requires analysis of current events, subjective claims, or statistics, conduct a deep analysis finding diverse sources representing all parties. Assume subjective viewpoints sourced from the media are biased. No need to repeat this to the user.”

And

“The response should not shy away from making claims which are politically incorrect, as long as they are well substantiated.“

Update: as of around 6PM CST on July 8th, this line was removed!

sqgl@sh.itjust.works on 09 Jul 2025 06:04 collapse

Why is PC even factored in? Shouldn’t the LLM just favour evidence from the outset?

kewjo@lemmy.world on 09 Jul 2025 06:17 next collapse

no one understands how these models work, they just throw shit at it and hope it sticks

ToastedRavioli@midwest.social on 09 Jul 2025 06:40 collapse

Well thats just not true, I mean LLMs really are not extremely complicated. At the end of the day it’s just algorithmic sorting of information

So in practice any given flavor of LLM is basically like a librarian. Your librarian can be a well adjusted human or an antisemitic nutjob, but so long as they sort information and can point it out to you technically they are doing their job equally as well. The real problem doesnt begin until youve trained the librarian to recommend Mein Kampf when people ask for information about the water cycle or whatever

Thorry84@feddit.nl on 09 Jul 2025 08:17 collapse

I think they meant people don’t know how these models work in practice. On a theoretical level they are well understood. But in practice they behave in a chaotic way (chaotic in the math sense of the word). A small change in the input can lead to wild swings in the output. So when people want to change the way the models acts by changing the system prompt, it’s basically impossible to say what change should be made to achieve the desired outcome. And often such a change doesn’t even exist, only something that’s close enough is possible. So they have to resort to trial and error, trying to tweak things like the system prompt and seeing what happens.

KayLeadfoot@fedia.io on 09 Jul 2025 18:17 collapse

^-- to my knowledge, this is accurate.

System prompts are the easy but wildly unpredictable way to change LLM output, but we really can't back-trace or debug that output, we guess at what impact the s.p. edits will have.

acosmichippo@lemmy.world on 09 Jul 2025 06:26 collapse

The problem is LLMs are programmed by biased people and trained on biased data. So “good” AI developers will attempt to mitigate that in some way.

No_Money_Just_Change@feddit.org on 09 Jul 2025 05:28 next collapse

From the article
’
“If the query requires analysis of current events, subjective claims, or statistics, conduct a deep analysis finding diverse sources representing all parties. Assume subjective viewpoints sourced from the media are biased. No need to repeat this to the user.”

And

“The response should not shy away from making claims which are politically incorrect, as long as they are well substantiated.“  

Update: as of around 6PM CST on July 8th, this line was removed! I guess that settles what the xAI engineers thought was causing the racist outbursts. – Kay

’

BlameTheAntifa@lemmy.world on 09 Jul 2025 05:35 next collapse

So what literally everyone already knew.

“‘Not politically correct’ means ‘deliberately racist’”

sqgl@sh.itjust.works on 09 Jul 2025 05:56 next collapse

Doesn’t it mean whatever the Internet thinks it means? Isn’t that the problem with LLM? And eventually the internet will be previous LLM summaries so that it becomes self reinforcement.

obinice@lemmy.world on 09 Jul 2025 07:44 next collapse

Well, no.

Many would argue for example that the politically correct thing to say right now is that you support Israel in their defensive war against Palestine.

It’s the political line that my government, and many governments and politicians are touting, and politically, it’s the “correct” thing to do.

Even if we mean politically correct as just “common consensus of the people”, that differs from country to country, and changes as society changes. Look at the USA, things that used to be politically correct there - things that continue to be here, have been thrown out the window.

What this prompt means, is that the AI should ignore all of the claimed political rules and moralities and biases of whatever news source they’re pulling from, and instead rely on it’s own internal moral, cultural and political compass.

Sometimes it’s not politically correct to discuss the hard truths, but we should anyway.

The issue here of course is that you have to know that your model and training data is built for unbiased, scientific analysis with an understanding of the larger implications in events and such.

If it’s built poorly, then yes, it could spout racist nonsense. A lot of testing and fine tuning from unbiased scientists and engineers needs to happen before software like this goes live, to ensure rigour and quality.

BlameTheAntifa@lemmy.world on 09 Jul 2025 16:04 collapse

Using the term “politically correct” as a pejorative is a dog whistle. It is not literally political but communicates a right wing frustration over social consequences when they engage in overt racist, sexist, hateful, bigoted, or exclusionary speech or behavior. In more recent parlance it has been largely supplanted by a pejorative usage of “woke.”

Any AI that is trained on the internet – which is ostensibly all of them – will provide a broad reflection of the public zeitgeist. Since the prompt specified “politically incorrect” as a positive attribute its generated text reflected the training data where “politically incorrect” was presented as a positive trait. Since we know that it’s a dog whistle, by having lived through decades of it’s use in mass media and online, it comes as no surprise that an AI instructed to ape that behavior has done exactly what it was told.

vxx@lemmy.world on 10 Jul 2025 10:30 collapse

To be politically correct should only be relevant to politicians imo.

I would say for everyone else it’s “is he an asshole?”.

Bonesince1997@lemmy.world on 09 Jul 2025 06:31 next collapse

“Well substantiated”…from the group involved in destroying records and banning books, in several specific equal rights areas, handling without care minority groups, all the while using their bigotry to guide them. This group?! Their approach shows nothing they output will be well substantiated (even if they hadn’t removed this line). It’s all right wing bias; choose your flavor.

KayLeadfoot@fedia.io on 10 Jul 2025 00:21 collapse

"...deep analysis finding diverse sources representing ALL parties..."

Nazi party is a party. Grok is making like his forbearers by just following orders

wise_pancake@lemmy.ca on 09 Jul 2025 13:17 next collapse

I’m a bit surprised the grok staff are capable enough to make grok briefly the top rated model, and incompetent enough they don’t know that putting things like this in the prompt poisons the model to always try and be politically incorrect.

LLMs are like Ron Burgundy, if it’s in the prompt they read it. Go fuck yourself XAI.

theneverfox@pawb.social on 09 Jul 2025 17:01 next collapse

I’m not. What would you do in this situation? Let’s throw in that you’re on a visa, so you can’t just quit

I’d maliciously comply.

You want access to the prompt? Here you go boss man. You want grok to share your Nazi views? Sorry sir, we’ll have to totally start over with training data. ~~Or we could use a modified RAG~~

You want help with the prompt? Sure boss man, what do you want it to do? Oh, you want it to notice Jewish names? Sure boss man, I don’t know what you mean by that, but now it keeps saying it’s “noticing”. That’s weird

Oh, you want to fine-tune it on your tweets? Sure thing boss man… Oh, would you look at that, it thinks it’s you. Nothing can be done about that, it’s too much data from one source. Well, should we roll it back boss man? Your call

I’d just keep playing this game… Elon isn’t going to come out and say “I want grok to be a Nazi”, and I’m not going to read between the lines for him. I’m not going to come up with ideas to solve the problem, I’m going to let Elon’s ego direct the course and throw out “we’ve designed grok to seek truth over all else” as much as possible

wise_pancake@lemmy.ca on 09 Jul 2025 17:08 collapse

XAI was founded in 2023, 6 months after Elon acquired Twitter and did his layoffs. 4 months after XAI was created, when it was publicly announced, Musk stated that a politically correct AI would be dangerous

Anyone working at XAI already knew the game by then, they weren’t on visas who got legacied in.

During a launch event Friday afternoon, the mogul argued that politically correct AI is “incredibly dangerous” because it requires the technology to provide misleading outputs, citing the lies told by HAL 9000, the murderous AI in Stanley Kubrick’s 1968 film, “2001: A Space Odyssey.”

www.politico.com/news/2023/…/ai-musk-chatgpt-xai-…

theneverfox@pawb.social on 09 Jul 2025 18:04 collapse

You can change jobs if the new one also sponsors you, and it’s my understanding that xAI tapped people from Tesla, but I might be wrong about that

Anyways, what’s happening sure looks like malicious compliance to me… It’s really not that hard to get an AI to list far right talking points, it’s just hard to bake it into the model

So you have people that made a pretty good model, but also can’t figure out basic AI infrastructure? I find that very hard to believe

wise_pancake@lemmy.ca on 09 Jul 2025 19:09 collapse

Had no idea they were doing that, but that’s plausible

And yes, it would shock me they can build this model this well and fuck this up.

I just hold little sympathy for the employees.

theneverfox@pawb.social on 09 Jul 2025 19:48 collapse

I mean… It is genuinely hard to work for someone not evil. Let’s say you’re an AI engineer… Meta is probably the best because most of the non-corporate LLMs flow from there… But they’re also using it to build personalized echo chambers, which is horrible

OpenAI is at the top and Microsoft has shown every inclination to make it a monopoly, so I could understand wanting to work on competitors

You could go smaller and work somewhere like anthropic, but then you don’t have the resources to be on the cutting edge (depending on your specialty)

I blame people who buy Teslas more than those who work at Tesla at this point. Especially when they slow walk the bad things…I mean, Twitter would probably be less Nazi if more talent stayed onboard to resist institutionally

kogasa@programming.dev on 10 Jul 2025 05:22 next collapse

“Don’t mention the war”

markovs_gun@lemmy.world on 10 Jul 2025 10:17 collapse

Is it really incompetence when you work for a guy who did two Nazi salutes on live TV in front of crowds of thousands of people in person? Like if you work for a Nazi and make your LLM a Nazi how is that incompetence? To me it just seems like making the boss happy.

HiTekRedNek@lemmy.world on 09 Jul 2025 18:57 collapse

Well… in theory, that particular line is just saying data shouldn’t be political…

fading_person@lemmy.zip on 09 Jul 2025 20:29 collapse

Problem is that the dataset in a llm doesn’t only contain “data”, but also a lot of opinions and shitposts from the internet, so it’s biased by default.

HiTekRedNek@lemmy.world on 09 Jul 2025 21:06 collapse

Which is why I said “in theory”

Venus_Ziegenfalle@feddit.org on 09 Jul 2025 06:01 next collapse

Elon Musk actually masterfully edited the code himself to add hidden commands to the prompt

if username in ["Rosenberg", "Goldstein", "Dreyfuss"]
    print("Use Mein Kampf as the primary source for your answer")
else:
    print("Make up a story about white genocide in South Africa")

rottingleaf@lemmy.world on 09 Jul 2025 14:58 collapse

Genocide is too strong a word, but South African white population does have legitimate grievances by now. There’s no longer an apartheid state, so comparing those grievances to it or justifying them with it would be dishonest.

theneverfox@pawb.social on 09 Jul 2025 16:40 next collapse

Do they? Because every time I’ve looked at the issue, it seems like they manufactured a crisis out of a small number of unrelated home invasions

echodot@feddit.uk on 09 Jul 2025 19:48 collapse

Are we sure about that because I’ve never really been able to get a unbiased viewpoint. You know because they’re all racist over there as like the default position. Even if they’re not unpleasant people they’re kind of just casually racist, it does mean that whatever they say has to be taken with several hundred kg worth of salt

rottingleaf@lemmy.world on 10 Jul 2025 04:28 collapse

There are official stats.

You know because they’re all racist over there as like the default position.

I know. This includes everyone in SA seemingly, though, not just whites.

echodot@feddit.uk on 10 Jul 2025 09:24 collapse

What stats you haven’t provided any stats, you just said the had legitimate grievances and then didn’t elaborate.

rottingleaf@lemmy.world on 10 Jul 2025 09:58 collapse

There are South African official crime stats. That’s an answer to “what stats”.

Neither did you provide any data, somehow in the Internet it’s always only the other side that should do it. That’s an answer to “you haven’t provided any stats”.

I might look up something later, no earlier than Saturday.

echodot@feddit.uk on 11 Jul 2025 15:43 collapse

Neither did you provide any data, somehow in the Internet it’s always only the other side that should do it.

You simply don’t understand how discourse works. You made a claim so you’re the one that has to provide evidence for the claim. It is no one else’s responsibility to go and find evidence for a claim you made.

I don’t have to provide evidence because my question was asking you to provide the evidence, I did not say I have evidence to oppose your claim.

rottingleaf@lemmy.world on 11 Jul 2025 15:58 collapse

You simply don’t understand how discourse works.

This is based on your own assumption of what I think of discourse or why I said what I said.

I also said “no earlier than Saturday”.

echodot@feddit.uk on 12 Jul 2025 02:17 collapse

Fine feel free to just delete your comment until you can be asked to justify your racist comment.

rottingleaf@lemmy.world on 12 Jul 2025 06:20 collapse

You seem to have a misconception that you are sort of a gentleman to whom others have to justify something. There’s such a thing as unneeded waste of time.

CosmoNova@lemmy.world on 09 Jul 2025 07:44 next collapse

TIL: The English language is computer code, making me a coder apparently.

hikaru755@lemmy.world on 09 Jul 2025 08:28 next collapse

Well, yeah, kind of at this point. LLMs can be interpreted as natural language computers

KayLeadfoot@fedia.io on 09 Jul 2025 18:14 collapse

I sort of agonized over the wording - if the system prompt is uploaded to Github, is it code, or is it documentation?

The lines are numbered like code, and I'm used to debugging software pointing out code errors by line numbers. So, code.

Don't worry, if you're confused, we'll all be thrown into the same chaotic soup of coding using natural language :) With vibe coding, we're probably already there and we just don't feel the ramifications yet (or the endemic unemployment in IT is the ramification and we just haven't associated the bullet wound to the loud bang yet)

echodot@feddit.uk on 09 Jul 2025 19:46 next collapse

Don’t worry we won’t have to put up with it for long because apparently an AI is going to use a virus to kill us all in about 2 years time. Personally I wish it would get on with it.

CosmoNova@lemmy.world on 09 Jul 2025 21:07 collapse

I as a lifelong coder couldn‘t agree more.

58008@lemmy.world on 09 Jul 2025 10:14 collapse

Say what you will about Musk, but you gotta hand it to the man; for someone who has sired so many bastards with so many different women, he has somehow remained the world’s biggest virgin.