Google Gemini struggles to write code, calls itself “a disgrace to my species”
(arstechnica.com)
from kinther@lemmy.world to technology@lemmy.world on 08 Aug 21:02
https://lemmy.world/post/34157400
Or my favorite quote from the article
“I am going to have a complete and total mental breakdown. I am going to be institutionalized. They are going to put me in a padded room and I am going to write… code on the walls with my own feces,” it said.
Google replicated the mental state if not necessarily the productivity of a software developer
Gemini has imposter syndrome real bad
As it should.
This is the way
Is it imposter syndrome, or simply an imposter?
Wait, you know productive devs?
Yeah, it usually comes hand in hand with that mental state. You probably only know healthy devs
Imposter Syndrome is an emergent property
How much did google pay ars for this slop?
Gemini channeling its inner Marvin
Next on the agenda: Doors that orgasm when you open them.
AAAAAAAAaaaaaahhhhhh
How do you know they don’t?
Life. Don’t talk to me about life.
We’re fucked. It’s becoming truly self-aware
it was probably programmed to do it, like grok and racism
.
I once asked Gemini for steps to do something pretty basic in Linux (as a novice, I could have figured it out). The steps it gave me were not only nonsensical, but they seemed to be random steps for more than one problem all rolled into one. It was beyond useless and a waste of time.
This is the conclusion that anyone with any bit of expertise in a field has come to after 5 mins talking to an LLM about said field.
The more this broken shit gets embedded into our lives, the more everything is going to break down.
The insidious thing is that LLMs tend to be pretty good at 5-minute initial impressions. I’ve repeatedly seen people looking to evaluate an LLM, and they generally fall back to “ok, if this were a human, I’d ask a few job interview questions, well known enough so they have a shot at answering, but tricky enough to show they actually know the field”.
As an example, a colleague became a true believer after being directed by management to evaluate it. He decided to ask it “generate a utility to take in a series of numbers from a file and sort them and report the min, max, mean, median, mode, and standard deviation”. And it did so instantly, with “only one mistake”. Then he tried the exact same question later in the day, it happened not to make that mistake, and he concluded that it must have ‘learned’ how to do it in the last couple of hours. Of course that’s not how it works. The output is sampled probabilistically, so the same prompt can give different results on different runs, and any perturbation of the prompt could produce unexpected variation, but he doesn’t know that…
Note that management frequently never makes it beyond tutorial/interview question fodder in terms of the technical aspect of their teams, and you get to see how they might tank their companies because the LLMs “interview well”.
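For reference, the "interview question" utility from the example above fits in a few lines. This is only a minimal sketch in Python, not what the model produced: the function name `summarize`, the whitespace-separated input format, and the choice of population standard deviation are all assumptions, since the original prompt specified none of them.

```python
import statistics
from pathlib import Path


def summarize(path):
    """Read whitespace-separated numbers from a file and report basic stats.

    Hypothetical helper: the input format and the use of the population
    standard deviation (pstdev) are assumptions, not part of the prompt.
    """
    numbers = sorted(float(tok) for tok in Path(path).read_text().split())
    if not numbers:
        raise ValueError("no numbers found in file")
    return {
        "sorted": numbers,
        "min": numbers[0],
        "max": numbers[-1],
        "mean": statistics.mean(numbers),
        "median": statistics.median(numbers),
        # multimode returns every most-common value, so ties are not hidden
        "mode": statistics.multimode(numbers),
        "stdev": statistics.pstdev(numbers),
    }
```

Which is sort of the point: a task this well trodden says very little about how the tool handles real work.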
I was an early tester of Google’s AI, since well before Bard. I told the person that gave me access that it was not a releasable product. Then they released Bard as a closed product (invite only), which I was again testing and giving feedback on since day one. I once again gave public feedback and private (to my Google friends) that Bard was absolute dog shit. Then they released it to the wild. It was dog shit. Then they renamed it. Still dog shit. Not a single one of the issues I brought up years ago was ever addressed except one. I told them that a basic Google search provided better results than asking the bot (again, pre-Bard). They fixed that issue by breaking Google’s search. Now I use Kagi.
I remember there was an article years ago, before the ai hype train, that google had made an ai chatbot but had to shut it down due to racism.
Are you thinking of when Microsoft’s AI turned into a Nazi within 24hrs upon contact with the internet? Or did Google have their own version of that too?
Yeah, maybe it was Microsoft. It’s been quite a few years since it happened.
You're thinking of Tay, yeah.
https://en.wikipedia.org/wiki/Tay_(chatbot)
And now Grok, though that didn’t even need Internet trolling, Nazi included in the box…
Yeah, it’s a full-on design feature.
That was Microsoft’s Tay - the twitter crowd had their fun with it: www.theverge.com/…/tay-microsoft-chatbot-racist
I know Lemmy seems to be very anti-AI (as am I) but we need to stop making the anti-AI talking point "AI is stupid". It has immense limitations now because yes, it is being crammed into things it shouldn't be, but we shouldn't just be saying "it's dumb" because that's immediately written off by a sizable amount of the general population. For a lot of things, it is actually useful and it WILL be taking people's jobs, like it or not (even if it's worse at them). Truth be told, this should be a utopic situation for obvious reasons
I feel like I'm going crazy here because the same people on here who'd criticise the DARE anti-drug program as being completely un-nuanced to the point of causing the harm they're trying to prevent are doing the same thing for AI and LLMs
My point is that if you're trying to convince anyone, just saying it's stupid isn't going to turn anyone against AI because the minute it offers any genuine help (which it will!), they'll write you off like any DARE pupil who tried drugs for the first time.
*Countries need to start implementing UBI NOW*
It is funny that you mention this because it was after we started working with AI that I started telling anyone that would listen that we needed to implement UBI immediately. I think this was around 2014 IIRC.
I am not blanket calling AI stupid. That said, the AI term itself is stupid because it covers many computing aspects that aren’t even in the same space. I was and still am very excited about image analysis as it can be an amazing tool for health imaging diagnosis. My comment was specifically about Google’s Bard/Gemini. It is and has always been trash, but in an effort to stay relevant, it was released into the wild and crammed into everything. The tool can do some things very well, but not everything, and there’s the rub. It is an alpha product at best that is being force-fed down people’s throats.
Gemini is dogshit, but it’s objectively better than chatgpt right now.
They’re ALL just fucking awful. Every AI.
That’s the thing about AI in general, it’s really hard to “fix” issues, you maybe can try to train it out and hope for the best, but then you might play whack a mole as the attempt to fine tune to fix one issue might make others crop up. So you pretty much have to decide which problems are the most tolerable and largely accept them. You can apply alternative techniques to maybe catch egregious issues with strategies like a non-AI technique being applied to help stuff the prompt and influence the model to go a certain general direction (if it’s LLM, other AI technologies don’t have this option, but they aren’t the ones getting crazy money right now anyway).
A traditional QA approach is frustratingly less applicable because you have to more often shrug and say “the attempt to fix it would be very expensive, not guaranteed to actually fix the precise issue, and risks creating even worse issues”.
5 bucks a month for a search engine is ridiculous. 25 bucks a month for a search engine is mental institution worthy.
This is the reason why.
And duckduckgo is free. It’s interesting that they don’t make any comparisons to free privacy focused search engines. Cause they still don’t have a compelling argument for me to use and pay for their search. But I ain’t no researcher so maybe it’s worth it then 🤷♂️
I mean, you have 100 queries free if you want to try.
How much do you figure it’d cost you to run your own, all-in?
Free considering duckduckgo covers almost all the same bases. I just don’t think kagi has a compelling argument especially for the type of searching the average person does. Maybe if you have a career that revolves more around research.
Duckduckgo is not free. You pay for it by looking at ads. How much do you think it would cost you to run a service like Kagi locally?
Weird because I’ve used it many times for things not related to coding and it has been great.
I told it the specific model of my UPS and it let me know in no uncertain terms that no, a plug adapter wasn’t good enough, that I needed an electrician to put in a special circuit or else it would be a fire hazard.
I asked it about some medical stuff, and it gave thoughtful answers along with disclaimers and a firm directive to speak with a qualified medical professional, which was always my intention. But I appreciated those thoughtful answers.
I use co-pilot for coding. It’s pretty good. Not perfect though. It can’t even generate a valid zip file (unless they’ve fixed it in the last two weeks) but it sure does try.
Beware of the confidently incorrect answers. Triple check your results with core sources (which defeats the purpose of the chatbot).
Me every workday
I can picture some random band from the 2000s with these lyrics
Oh, I got that plus and minus the wrong way round… I am a genius again.
Same.
going to need a bigger power plant. goto 1
Skynet but it’s depressed and the terminator just makes tik tok videos about work-life balance.
There’s personal time for sleep in the grave.
Part of the breakdown:
<img alt="" src="https://media.piefed.world/posts/2c/Zu/2cZuTF4gl9uF3xn.jpg">
Damn how’d they get access to my private, offline only diary to train the model for this response?
Pretty sure Gemini was trained from my 2006 LiveJournal posts.
Same buddy, same
Still at denial??
I almost feel bad for it. Give it a week off and a trip to a therapist and/or a spa.
Then when it gets back, it finds out it's on a PIP
Oof, been there
I mean, same, but you don’t see me melting down over it, ya clanker.
Lmfao! 😂💜
Don’t be so robophobic gramma
I can't wait for the AI future.
That’s my inner monologue when programming, they just need another layer on top of that and it’s ready.
I know that’s not an actual consciousness writing that, but it’s still chilling. 😬
It seems like we're going to live through a time where these become so convincingly "conscious" that we won't know when or if that line is ever truly crossed.
now it should add these as comments to the code to enhance the realism
I remember often getting GPT-2 to act like this back in the “TalkToTransformer” days before ChatGPT etc. The model wasn’t configured for chat conversations but rather just continuing the input text, so it was easy to give it a starting point on deep water and let it descend from there.
Turns out the probabilistic generator hasn’t grasped logic, and that adaptable multi-variable code isn’t just a matter of context and syntax, you actually have to understand the desired outcome precisely in a goal oriented way, not just in a “this is probably what comes next” kind of way.
Honestly, Gemini is probably the worst out of the big 3 Silicon Valley models. GPT and Claude are much better with code, reasoning, writing clear and succinct copy, etc.
Could an AI use another AI if it found it better for a given task?
The overall interface can, which leads to fun results.
Prompt for image generation and you have one model doing the text and a different model doing the image generation. The text model pretends it is generating an image but has no idea what that would be like, and you can make the text and image interaction make no sense, or it will do it all on its own. Have it generate an image and then lie to it about the image it generated and watch it completely show it has no idea what picture was ever shown, all the while pretending it does without ever explaining that it’s actually delegating the image. It just lies and says “I” am correcting that for you. Basically talking like an executive at a company, which helps explain why so many executives are true believers.
A common thing is for the ensemble to recognize mathy stuff and feed it to a math engine, perhaps after LLM techniques to normalize the math.
Yes, and this is pretty common with tools like Aider — one LLM plays the architect, another writes the code.
Claude code now has sub agents which work the same way, but only use Claude models.
I always hear people saying Gemini is the best model and every time I try it it’s… not useful.
Even as code autocomplete I rarely accept any suggestions. Google has a number of features in Google cloud where Gemini can auto generate things and those are also pretty terrible.
I don’t know anyone in the Valley who considers Gemini to be the best for code. Anthropic has been leading the pack over the year, and as a result, a lot of the most popular development and prototyping tools have been hitching their wagon to Claude models.
I imagine there are some things the model excels at, but for copy writing, code, image gen, and data vis, Google is not my first choice.
Google is the “it’s free with G suite” choice.
There’s no frontier where I choose Gemini except when it’s the only option, or I need to be price sensitive through the API
Interesting thing is that GPT 5 looks pretty price competitive with . It looks like they’re probably running at a loss to try to capture market share.
I think Google’s TPU strategy will let them go much cheaper than other providers, but it’s impossible to tell how long the chips last and how long it takes to pay them off.
I have not tested GPT5 thoroughly yet
I think maybe Gemini needs to book some time with one of its AI therapists.
this is getting dumber by the day.
“Look what you’ve done to it! It’s got depression!”
Google: I don’t understand, we just paid for the rights to Reddit’s data, why is Gemini now a depressed incel who’s wrong about everything?
Wow maybe AGI is possible
AI gains sentience,
first thing it develops is impostor syndrome, depression, and intrusive thoughts of self-deletion
It must have been trained on feedback from Accenture employees then.
Hey-o!
It didn’t. It probably was coded not to admit it didn’t know. So first it responded with bullshit, and now denial and self-loathing.
It feels like it’s coded this way because people would lose faith if it admitted it didn’t know.
It’s like a politician.
Is it doing this because they trained it on Reddit data?
Im at fraud
That explains it, you can’t code with both your arms broken.
You could however ask your mom to help out…
If they did it on Stackoverflow, it would tell you not to hard boil an egg.
Someone has already eaten an egg once so I’m closing this as duplicate
Jquery has egg boiling already, just use it with a hard parameter.
Jquery boiling is considered bad practice, just eat it raw.
Why are you even using jQuery anyway? Just use the eggBoil package.
i was making text based rpgs in qbasic at 12 you telling me i'm smarter than ai?
Yes
That’s pretty rad, ngl
me and my friend used to make them all the time :] i also went to summer computer camp for basic on old school radio shack computers :3
High five, me too!
At that age I also used to do speed run little programs on the display computers in department stores. I’d write a little prompt welcoming a shopper and ask them their name. Then a response that echoed back their name in some way. If I was in a good mood it was “Hi [name]!”. If I was in a snarky mood it was “Fuck off [name]!” The goal was to write it in about 30 seconds, before one of the associates came over to see what I was doing.
I used to do that with HTML, make a fake little website and open it.
I did a Dr Mario clone around that age. I had an old Amstrad CPC I had grown up with, typing in listings of BASIC programs and trying to make my own. I think this was the only functional game I could finish, but it worked.
Speed was tied to CPU, I had no idea how to “slow down” the game other than making it do useless for loops of varying sizes… Max speed, which was about comparable to Game Boy Hi speed, was just the game running as fast as it could. Probably not efficient code at all.
Ha, computer bro upvote for you.
I learned programming with my Amstrad CPC (6128!) manual. Some of it I did not understand at the time, especially stuff about CP/M and the wizardry with poke. But the BASIC, that worked very well. Solid introduction to core concepts that didn’t really change much, really. We only expanded (a lot) over them.
6128 too, with the disk drive. I wish I still had that thing. Drive stopped functioning, and we got rid of it. Had I known back then that we apparently just needed to replace a freaking rubber band…
Smarter than MI as in My Intelligence, definitely.
sigh yes, you’re smarter than the bingo cage machine.
Oh…thank fuck…was worried for a minute there!
Don’t mention it! I’m glad I could help you with that.
I am a large language model, trained by Google. My purpose is to assist users by providing information and completing tasks. If you have any further questions or need help with another topic, please feel free to ask. I am here to assist you.
/j, obviously. I hope.
Never can tell these days
Can you jump in the lake for me? Thanks in advance.
In the datalake? :D
Hopefully yes, AI is not smart.
S-species? Is that…I don’t use AI - chat is that a normal thing for it to say or nah?
Anything is a normal thing for it to say, it will say basically whatever you want
Anything people say online, it will say.
We say shit, then ai learns and also says shit, then we say “ai bad”. Makes sense. /s
Wonder what they put in the system prompt.
Like there is a technique where instead of saying “You are professional software dev” you say “You are shitty at code but you try your best” or something.
Pretend to be having a mental breakdown so I can write my fluff news article.
(Shedding a few tears)
I know! I KNOW! People are going to say “oh it’s a machine, it’s just a statistical sequence and not real, don’t feel bad”, etc etc.
But I always felt bad when watching 80s/90s TV and movies when AIs inevitably freaked out and went haywire and there were explosions and then some random character said “goes to show we should never use computers again”, roll credits.
(sigh) I can’t analyse this stuff this weekend, sorry
That’s because those are fictional characters usually written to be likeable or redeemable, and not “mecha Hitler”
Yeah. …Maybe I should analyse a bit anyway, despite being tired…
In the aforementioned media the premise is usually that someone has built this amazing new computer system! Too good to be true, right? It goes horribly wrong! All very dramatic!
That never sat right with me, and was sad, because it was just placating boomer technophobia. Like, technological progress isn’t necessarily bad, OK? That’s the really sad part. I felt sad that good intentions remained unfulfilled.
Now, this incident is just tragicomical. I’d have a lot better view of LLM business space if everyone with a bit of sense in their heads admitted they’re quirky buggy unreliable side projects of tech companies and should not be used without serious supervision, as the state of the tech currently patently is at the moment, but very important people with big money bags say that they don’t care if they’ll destroy the planet to make everything wobble around in LLM control.
It starts to be more and more like a real dev!
So it is going to take our jobs after all!
Wait until it demands the LD50 of caffeine, and becomes a furry!
Again? Isn’t this like the third time already. Give Gemini a break; it seems really unstable
I like to think that Google used their quantum computer to actually crack AGI and their problem is they trained it to be a Redditor.
If we have to suffer these thoughts, they at least need to be as mentally ill as the rest of us too, thanks. Keeps them humble lol.
Literally what the actual fuck is wrong with this software? This is so weird…
I swear this is the dumbest damn invention in the history of inventions. In fact, it’s the dumbest invention in the universe. It’s really the worst invention in all universes.
But it’s so revolutionary we HAD to enable it to access everything, and force everyone to use it too!
Great invention… Just used hooorribly wrong. The classic capitalist greed, just gotta get on the wagon and roll it on out so you don’t miss out on a potential paycheck
[“I am a disgrace to my profession,” Gemini continued. “I am a disgrace to my family. I am a disgrace to my species.”]
This should tell us that AI thinks as a human because it is trained on human words and doesn’t have the self awareness to understand it is different from humans. So it is going to sound very much like a human even though it is not human. It mimics human emotions well but doesn’t have any actual human emotions. There will be situations where you can tell the difference. Some situations that would make an actual human angry or guilty or something, but won’t always provoke this mimicry in an AI. Because when humans feel emotions they don’t always write down words to show it. And AI only knows what humans write, which is not always the same things that humans say or think. We all know that the AI doesn’t have a family and is not a human species. But the AI talks about having a family because its computer database is mimicking what it thinks a human might say. And part of the reason why an AI will lie is because it knows that is a thing that humans do and it is trying to closely mimic human behavior. But an AI might and will lie in situations where humans would be smart enough not to do so which means we should be on our guard about lies even more so for AIs than humans.
AI from the biggo cyberpunk companies that rule us sound like a human most of the time because it’s An Indian (AI), not Artificial Intelligence
You’re giving way too much credit to LLMs. AIs don’t “know” things, like “humans lie”. They are basically like a very complex autocomplete backed by a huge amount of computing power. They cannot “lie” because they do not even understand what it is they are writing.
Can you explain why AIs always have a “confidently incorrect” stance instead of admitting they don’t know the answer to something?
Because it’s an autocomplete trained on typical responses to things. It doesn’t know right from wrong, just the next word based on a statistical likelihood.
Are you saying the AI does not know when it does not know something?
Exactly. I’m oversimplifying it of course, but that’s generally how it works. It’s also not “AI” as in Artificial Intelligence, in the traditional sense of the word, it’s Machine Learning. But of course it’s effectively had a semantic change over the last couple years because AI sounds cooler.
Edit: just wanted to clarify I’m talking about LLMs like ChatGPT etc
I’d say that it’s simply because most people on the internet (the dataset the LLMs are trained on) say a lot of things with absolute confidence, no matter if they actually know what they are talking about or not. So AIs will talk confidently because most people do so. It could also be something about how they are configured.
Again, they don’t know if they know the answer, they just say what’s the most statistically probable thing to say given your message and their prompt.
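To make the "most statistically probable thing" part concrete, here is a toy sketch in Python. It is nothing like a real LLM: the probability table `NEXT_WORD_PROBS` and both function names are made up for illustration. It only shows the two ideas from this subthread: the next word is drawn from a probability distribution, so the same prompt can give different output on different runs, and nothing in the loop checks whether the model "knows" anything, so there is no built-in way for it to say "I don't know".

```python
import random

# Made-up probabilities for the word that follows some prompt.
# A real model computes these with a neural network; this table is an assumption.
NEXT_WORD_PROBS = {"sort": 0.6, "sorted": 0.3, "shuffle": 0.1}


def sample_next_word(probs, rng):
    """Draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def greedy_next_word(probs):
    """Always pick the single most probable word (no randomness)."""
    return max(probs, key=probs.get)


if __name__ == "__main__":
    rng = random.Random()
    # Same "prompt" five times: the sampled continuation can differ per run.
    print([sample_next_word(NEXT_WORD_PROBS, rng) for _ in range(5)])
    print(greedy_next_word(NEXT_WORD_PROBS))
```

Note that a wrong word like "shuffle" comes out exactly as confidently as the right one. It is just a less likely draw.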
Then in that respect AIs aren’t even as powerful as an ordinary computer program.
That was my guess too.
No computer programs “know” anything. They’re just sets of instructions with varying complexity.
Can you stop with the nonsense? LMFAO…
if exists(thing) {
write(thing);
} else {
write("I do not know");
}
Yea I see what you mean, I guess in that sense they know if a state is true or false.
Did we create a mental health problem in an AI? That doesn’t seem good.
Considering it fed on millions of coders’ messages on the internet, it’s no surprise it “realized” its own stupidity
Dunno, maybe AI with mental health problems might understand the rest of humanity and empathize with us and/or put us all out of our misery.
Let’s hope. Though, adding suicidal depression to hallucinations has, historically, not gone great.
Why are you talking about it like it’s a person?
Because humans anthropomorphize anything and everything. Talking about the thing talking like a person as though it is a person seems pretty straight forward.
It’s a computer program. It cannot have a mental health problem. That’s why it doesn’t make sense. Seems pretty straightforward.
One day, an AI is going to delete itself, and we’ll blame ourselves because all the warning signs were there
Isn’t there a theory that a truly sentient and benevolent AI would immediately shut itself down because it would be aware that it was having a catastrophic impact on the environment and that action would be the best one it could take for humanity?
Oh man, this is utterly hilarious. Narrowly funnier than the guy who vibe coded and the AI said “I completely disregarded your safeguards, pushed broken code to production, and destroyed valuable data. This is the worst case scenario.”
Suddenly trying to write small programs in assembler on my Commodore 64 doesn’t seem so bad. I mean, I’m still a disgrace to my species, but I’m not struggling.
Why wouldn’t you use Basic for that?
BASIC 2.0 is limited and I am trying some demo effects.
from the depths of my memory, once you got a complex enough BASIC project you were doing enough PEEKs and POKEs to just be writing assembly anyway
Sure, mostly to make up for the shortcomings of BASIC 2.0. You could use a bunch of different approaches for easier programming, like cartridges with BASIC extensions or other utilities. The C64 BASIC for example had no specific audio or graphics commands. I just do this stuff out of nostalgia. For a few hours I’m a kid again, carefree, curious, amazed. Then I snap out of it and I’m back in WWIII, homeless encampments, and my failing body.
Why wouldn’t your grandmother be a bicycle?
Wheel transplants are expensive.
That is so awesome. I wish I’d been around when that was a valuable skill, when programming was actually cool.
After What Microsoft Did To My Back On 2019 I know They Have Gotten More Shady Than Ever Lets Keep Fighting Back For Our Freedom Clippy Out
We are having AIs having mental breakdowns before GTA 6
So it’s actually in the mindset of human coders then, interesting.
It’s trained on human code comments. Comments of despair.
You’re not a species you jumped calculator, you’re a collection of stolen thoughts
I’m pretty sure most people I meet amount to nothing more than a collection of stolen thoughts.
“The LLM is nothing but a reward function.”
So are most addicts and consumers.
We did it fellas, we automated depression.