OpenAI claims new GPT-5 model boosts ChatGPT to ‘PhD level’ (www.bbc.com)
from sabreW4K3@lazysoci.al to technology@beehaw.org on 08 Aug 04:07
https://lazysoci.al/post/31731517

#technology

threaded - newest

ook@discuss.tchncs.de on 08 Aug 04:11 next collapse

I mean, that doesn’t really mean much, given that you don’t have to be very intelligent to get one. It’s mostly an endurance exercise, and often a test of how much frustration and uncertainty you can take in your life.

shnizmuffin@lemmy.inbutts.lol on 08 Aug 04:12 next collapse

If I asked a PhD, “How many Bs are there in the word ‘blueberry’?” they’d call an ambulance for my obvious, severe concussion. They wouldn’t answer, “There are three Bs in the word blueberry! I know, it’s super tricky!”

GissaMittJobb@lemmy.ml on 08 Aug 06:56 next collapse

LLMs are fundamentally unsuitable for character counting on account of how they ‘see’ the world - as a sequence of tokens, which can split words in non-intuitive ways.

Regular programs already excel at counting characters in words, and LLMs can be used to generate such programs with ease.
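A minimal sketch of such a program in Python (the word and letter are just this thread’s example; nothing model-specific is assumed):

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())

print(count_letter("blueberry", "b"))  # prints 2
```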

itslilith@lemmy.blahaj.zone on 08 Aug 07:26 next collapse

But they don’t recognize their inadequacies, instead spouting confident misinformation

GissaMittJobb@lemmy.ml on 08 Aug 08:13 collapse

This is true. They do not think, because they are next token predictors, not brains.

With this in mind, you can still harness a few useful properties from them. Nothing like the kind of hype the techbros and VCs imagine, but a few moderately beneficial use cases exist.

itslilith@lemmy.blahaj.zone on 08 Aug 08:24 collapse

Without a doubt. But PhD-level thinking requires a kind of introspection that LLMs (currently) just don’t have. And the letter-counting thing is a funny example of that inaccuracy.

chaos@beehaw.org on 08 Aug 14:30 collapse

The tokenization is a low-level implementation detail; it shouldn’t affect an LLM’s ability to do basic reasoning. We don’t do arithmetic by counting how many neurons we can feel firing in our brain; we have higher-level concepts of numbers, and LLMs are supposed to have something similar. Plus, in the “thinking” models, you’ll see them break words up into individual letters or even write them out in a numbered list, which should split the tokens into individual letters as well.

panda_abyss@lemmy.ca on 08 Aug 12:12 next collapse

I don’t feel this is a good example of why LLMs shouldn’t be treated like PhDs.

My first interactions with GPT-5 have been pretty awful, and I’d test it, but it’s not available to me anymore.

Edit: I am not having a stroke, I’m bad at typing and autocorrect hates me

shnizmuffin@lemmy.inbutts.lol on 08 Aug 12:35 collapse

Do you smell toast?

panda_abyss@lemmy.ca on 08 Aug 12:36 collapse

BlackBerry toast

darreninthenet@piefed.social on 09 Aug 01:23 collapse

FWIW, ChatGPT 5 gets this correct

shnizmuffin@lemmy.inbutts.lol on 09 Aug 01:50 collapse

Fuckin’ does it?

darreninthenet@piefed.social on 09 Aug 09:27 next collapse

It did for me 🤷🏻‍♂️

darreninthenet@piefed.social on 09 Aug 09:30 next collapse

[screenshot]

limerod@reddthat.com on 10 Aug 03:49 collapse

You appear to be using the older GPT model. The newer model calculates and answers correctly for most words, at least the few I asked about.

mbtrhcs@feddit.org on 10 Aug 08:55 collapse

It literally says 5 in the screenshot but ok

limerod@reddthat.com on 10 Aug 09:08 collapse

I saw that. I’m using the mobile app. There’s a possibility the web version is using an inferior model.

TehPers@beehaw.org on 08 Aug 04:43 next collapse

Ph.Deez nutz.

I have friends who actually have a Ph.D. It takes many years and a genuine attempt to better a field to get one. People tend to trust your opinion on a subject when you have a doctorate in that field.

I can’t even trust ChatGPT to answer a basic question without fucking up and apologizing to me, only to fuck up again.

Maybe stop treating language models like AGI? They’re awesome at recognizing semantic similarities between words and phrases (embeddings) and at generating arbitrary but reasonable-looking output that conforms to an expected structure (structured outputs). That’s cool enough. Stop pretending like it isn’t and falsely advertising it as being able to cure cancer and world hunger, especially when you wouldn’t even be happy if it did.
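To be fair, the embeddings part really is useful on its own. As a rough sketch of what “semantic similarity” means in practice, here is plain cosine similarity between vectors; the numbers below are made-up stand-ins for real model embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; closer to 1.0 means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Made-up vectors standing in for real embeddings of three phrases.
cat = [0.90, 0.10, 0.30]
kitten = [0.85, 0.15, 0.35]
invoice = [0.10, 0.90, 0.20]

print(cosine_similarity(cat, kitten))   # high: related phrases
print(cosine_similarity(cat, invoice))  # lower: unrelated phrases
```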

bobs_monkey@lemmy.zip on 08 Aug 05:39 collapse

AI as it sits is a tool that has specific use cases. It is absolutely not intelligence, as it’s commonly marketed. It may seem intelligent to the uninformed, but boy howdy is that a mistake.

t3rmit3@beehaw.org on 08 Aug 06:11 collapse

It’s a sad reflection of our current state when being able to string together coherent sentences is impressive enough to many as to be confused with truth and/or intelligence.

howrar@lemmy.ca on 10 Aug 03:41 collapse

It wasn’t that long ago that it was unfathomable for anything other than humans to be able to do this.

t3rmit3@beehaw.org on 11 Aug 04:07 collapse

“Polly want a cracker” has been around since before anyone alive today was born, and mimicking human speech is essentially what LLMs are doing too, but no one was taking advice from parrots.

cronenthal@discuss.tchncs.de on 08 Aug 05:06 next collapse

I could power a data center with the rolling of my eyes after reading this headline.

mormund@feddit.org on 08 Aug 05:42 next collapse

Didn’t he claim that with 4o as well? But yes, please inflate the bubble further, blow everything up.

Swedneck@discuss.tchncs.de on 08 Aug 11:35 collapse

just one more iteration, i swear it’s PHD level this time

JUST ONE MORE ITERATION PLEASE

0xtero@beehaw.org on 08 Aug 05:50 next collapse

ChatGPT in its PhD thesis defense: “Oh, I’m sorry for the misinformation, let me try this again…”

Correct316@monero.town on 08 Aug 16:27 collapse

LOL!! 🤣 Yes! This exactly!

Catoblepas@piefed.blahaj.zone on 08 Aug 06:02 next collapse

How many ChatGPhDs will it take to do the math on how long it is until this bubble pops?

furzegulo@lemmy.dbzer0.com on 08 Aug 06:43 next collapse

Just con men selling their snake oil.

Correct316@monero.town on 08 Aug 16:26 collapse

Have to agree with this. My experience with the various AI models is that they’re fairly terrible. I really don’t want to see this garbage driving cars where lives are at stake.

xxce2AAb@feddit.dk on 08 Aug 10:35 next collapse

“It can now drive its users straight into an active psychosis 35% faster by sounding more persuasive than ever before!”

red_bull_of_juarez@lemmy.dbzer0.com on 08 Aug 11:30 next collapse

OpenAI claims a lot of things.

Mothra@mander.xyz on 08 Aug 13:34 next collapse

This guy always shows up with his hands like this in news photos

I know it’s irrelevant but I had to point it out

Krauerking@lemy.lol on 08 Aug 14:04 next collapse

Oops, I ate the onion.

Right? No way that’s considered a legitimate argument, since a PhD just says you dedicated yourself to a very specific topic and aren’t necessarily smarter or better spoken for it.
Or is he just bragging that he found a way to filter it down to just people’s stolen PhD theses?

Correct316@monero.town on 08 Aug 16:31 next collapse

Shouldn’t be hard to improve over this rubbish:

is it now 15 years after 2010?

GPT-4: No, it is not 15 years after 2010. As of today, August 8, 2025, it is 15 years after 2010.

limerod@reddthat.com on 10 Aug 03:56 collapse

The gpt-5 model answers this correctly. [screenshot: https://reddthat.com/pictrs/image/2e927f62-198e-44a0-a16d-3903ef0b229f.jpeg]

arsCynic@beehaw.org on 08 Aug 16:56 next collapse

I had the Blueberry talk with GPT5:

[screenshots: https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-1.png, https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-2.png, https://kieranhealy.org/blog/archives/2025/08/07/blueberry-hill/blueberry-3.png]

🎓 PhB level checks out.
🚫 Blockchain level uselessness and waste as well.

🎈📌💥

petrol_sniff_king@lemmy.blahaj.zone on 09 Aug 03:32 next collapse

Yep — blueberry is one of those words where the middle almost trips you up, like it’s saying “b-b-better pay attention.”

… I hate this technology so fucking much…

Also, it trying to gaslight you into believing bluebberry is real was very funny.

limerod@reddthat.com on 10 Aug 03:52 collapse

Well, it answers correctly in my case. [screenshot: https://reddthat.com/pictrs/image/7df0e275-c41a-46d0-ac05-82101258d13d.jpeg]

Pulptastic@midwest.social on 09 Aug 10:38 collapse

Maybe a PhD in civil engineering lol.