ChatGPT 'got absolutely wrecked' by Atari 2600 in beginner's chess match — OpenAI's newest model bamboozled by 1970s logic (www.tomshardware.com)
from homesweethomeMrL@lemmy.world to retrogaming@lemmy.world on 09 Jun 2025 22:41
https://lemmy.world/post/31107320

#retrogaming

bridgeenjoyer@sh.itjust.works on 09 Jun 2025 23:22

Is this just because gibbity couldn’t recognize the chess pieces? I’d love to believe this is true otherwise, love my 2600 haha.

Stillwater@sh.itjust.works on 09 Jun 2025 23:29

At first it blamed its poor performance on the icons used, but then they switched to chess notation and it still failed hard

bridgeenjoyer@sh.itjust.works on 10 Jun 2025 14:05

That is baffling

Electricblush@lemmy.world on 09 Jun 2025 23:37

This is so stupid and pointless…

“Thing not made to solve specific task fails against thing made for it…”

This is like saying that a really old hand-pushed lawn mower is better than an SUV at cutting grass…

SpaceNoodle@lemmy.world on 09 Jun 2025 23:41

SUVs aren’t marketed as grass mowers. LLMs are marketed as AI with all the answers.

otp@sh.itjust.works on 10 Jun 2025 00:52

I’d be interested in seeing marketing of ChatGPT as a competitive boardgame player. Is there any?

missingno@fedia.io on 10 Jun 2025 01:52

Not necessarily that AI is marketed as a competitive board game player, but that AI is marketed as intelligence. This helps illustrate how clueless it really is.

otp@sh.itjust.works on 10 Jun 2025 16:55

There are plenty of geniuses out there who aren’t great at board games. Using a tool not fit for task is more of an issue with the person using the wrong tool than an issue with the tool itself.

I do get where you’re coming from though. There are definitely people who don’t understand why a ChatBot wouldn’t be good at chess.

SchmidtGenetics@lemmy.world on 11 Jun 2025 12:24

Do you expect rocket scientists to be good at chess?

Intelligence doesn’t mean it’s blanket smart. This is entirely on individual people for this asinine assumption. It’s never been marketed that way, so why in this singular case is the definition suddenly different? The general public understands this isn’t some be all end all. This assumptive attitude that Lemmy has is fucking weird.

missingno@fedia.io on 11 Jun 2025 16:33

I would expect anyone claiming to be intelligent to be able to beat an Atari 2600 set to its very lowest difficulty. This is a task on par with counting the number of Rs in the word 'strawberry', something the intelligent ChatGPT also famously cannot do.

SchmidtGenetics@lemmy.world on 11 Jun 2025 16:56

Do you think being good at chess is equivalent to intelligence…?

Those are also vastly different tasks, a toddler can count, while they likely can’t play chess.

You have a very strange notion of what “intelligence” means.

A toddler untrained at counting and untrained at chess would be good at neither. Same goes for adults: you are untrained in rocket physics, so you won’t be good at it either. Why are you holding an AI to some weird ungodly bar that doesn’t apply to anything else? No one’s claimed it to be good at these things. Adults who can’t swim and go in water drown. Why? Because they weren’t taught. Notice a pattern yet?

missingno@fedia.io on 11 Jun 2025 17:16

It's the beginner difficulty on very weak hardware. It's designed to be easily beatable even if you don't know much about chess.

SchmidtGenetics@lemmy.world on 11 Jun 2025 17:21

Swimming is pretty easy, yet people drown.

homesweethomeMrL@lemmy.world on 11 Jun 2025 17:29

It’s actually not that easy. Fire up an emulator and take it for a spin. Like, you won’t get away with obvious mistakes.

missingno@fedia.io on 11 Jun 2025 18:47

First try. I did make a few mistakes, but the 2600 made more.

homesweethomeMrL@lemmy.world on 11 Jun 2025 17:27

The general public understands this isn’t some be all end all.

I disagree.

SchmidtGenetics@lemmy.world on 11 Jun 2025 17:30

Got a link? Because people just think it’s cool, not that it’s gonna be this thing that can do everything.

So there must be some place people are getting this “it can do everything” idea from? It’s more an anti-ai propaganda angle, and that’s prevalent mainly on Lemmy. So a source to back up this “ai can do anything” please.

homesweethomeMrL@lemmy.world on 11 Jun 2025 17:40

Tradition dictates that the first claimant bears responsibility for the link.

SchmidtGenetics@lemmy.world on 11 Jun 2025 17:42

Yep, go up the thread and see which claim was made first ;)

Of course you would try this, since as I already said, it’s propaganda, doesn’t exist. Can’t throw your hat in an argument, then balk when questioned lmfao.

Edit: I also already asked elsewhere, so why hasn’t anyone brought up any example of it being marketed this way?

homesweethomeMrL@lemmy.world on 11 Jun 2025 17:54

Yep, go up the thread and see which claim was made first ;)

Okay. Ah . . . Yep, it was yours.

The general public understands this isn’t some be all end all.

So that’s the claim, and it was made first. Now. Let’s see that link!

SchmidtGenetics@lemmy.world on 11 Jun 2025 17:55

And ACTUALLY up the chain…

lemmy.world/comment/17567218

homesweethomeMrL@lemmy.world on 11 Jun 2025 17:58

Oh that’s a different person there. You can tell because there’s a different name on top of the comment.

I mean, I don’t have a link to prove that though, so, maybe it is me? Hm. That’s certainly something to think about.

SchmidtGenetics@lemmy.world on 11 Jun 2025 18:00

Can’t throw your hat in an argument, then balk when questioned lmfao.

You’re joining an argument in the middle, doesn’t matter who said it first.

Also, how do you propose I provide you evidence that it’s NOT marketed? The mere lack of it, and the lack of you being able to provide any, should be ample proof for someone wanting to have a civil discourse.

homesweethomeMrL@lemmy.world on 11 Jun 2025 18:01

Nuh uh, you are.

SchmidtGenetics@lemmy.world on 11 Jun 2025 18:04

Yea, in a chain where the first claim was, “ai was marketed as a be all end all”. If you want to support their side with civil discourse, please provide evidence. Because all you seem to be doing here is trolling.

homesweethomeMrL@lemmy.world on 11 Jun 2025 18:16

So you admit you have no link for your claim which came first before my claim.

You may again defer to a third party and argue yet another claim was before yours but I say unto thee - wasn’t that claim in reference to one before it? Such a line of inquiry would inevitably lead us to The Original Claim, to which only God or maybe Sam Altman could provide the link to substantiate it.

Therefore, in lieu of petitioning God (or Sam Altman) for a substantiating link to The Original Claim, or the person to whom you were addressing (unless it was me, we’re still checking on that) the onus falls to you to provide a link to substantiate your claim. Failure to do so will go down on your permanent record.

pinball_wizard@lemmy.zip on 11 Jun 2025 20:22

These tools are marketed as replacing lots of jobs that are a hell of a lot more complex than a simple board game.

otp@sh.itjust.works on 11 Jun 2025 23:16

These tools are marketed as replacing lots of jobs that are a hell of a lot more complex than a simple board game.

There isn’t really a single sliding scale of “complexity” when it comes to certain tasks.

Given the appropriate input, a calculator can divide two numbers. But it can’t count the number of R’s in the word “strawberry”.

Meanwhile, a script that could count the number of instances of a letter in a word could count those R’s, but it couldn’t divide any two numbers.

Similarly, we didn’t complain that a typewriter couldn’t put pepperoni slices onto a pizza.
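To make the single-purpose-tool point concrete, here is a minimal Python sketch. The function names `count_letter` and `divide` are made up for illustration; each does exactly one job and is no help with the other.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


def divide(numerator: float, denominator: float) -> float:
    """Divide two numbers, the one job the 'calculator' in this analogy has."""
    return numerator / denominator


print(count_letter("strawberry", "r"))  # 3
print(divide(10, 4))                    # 2.5
```

Neither function is "smarter" than the other; they are simply built for different inputs.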

SchmidtGenetics@lemmy.world on 10 Jun 2025 01:21

Source?

arararagi@ani.social on 11 Jun 2025 17:25

Hear hear.

warm@kbin.earth on 10 Jun 2025 01:07

Made people click though, didn’t it.

MadMadBunny@lemmy.ca on 09 Jun 2025 23:50

Attempting to badly quote someone on another post: « How can people honestly think a glorified word autocomplete function could be able to understand what is a logarithm? »

Ephera@lemmy.ml on 10 Jun 2025 05:05

You can make external tools available to the LLM and then provide it with instructions for when/how to use them.
So, for example, you’d describe to it that if someone asks it about math or chess, then it should generate JSON text according to a given schema and generate the command text to parametrize a script with it. The script can then e.g. make an API call to Wolfram Alpha or call into Stockfish or whatever.

This isn’t going to be 100% reliable. For example, there’s a decent chance of the LLM fucking up when generating the relatively big JSON you need for describing the entire state of the chessboard, especially with general-purpose LLMs which are configured to introduce some amount of randomness in their output.

But well, in particular, ChatGPT just won’t have the instructions built-in for calling a chess API/program, so for this particular case, it is likely as dumb as auto-complete. It will likely have a math API hooked up, though, so it should be able to calculate a logarithm through such an external tool. Of course, it might still not understand when to use a logarithm, for example.
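The tool-calling idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the JSON shape (`tool` plus `arguments`), the `TOOLS` registry, and the `dispatch` function are all hypothetical and do not reflect how ChatGPT or any particular vendor actually wires up its tools. The "LLM output" is just a hard-coded string, and only a logarithm tool is registered, so a chess request is rejected rather than forwarded to an engine.

```python
import json
import math


def tool_logarithm(args: dict) -> float:
    """Hypothetical math tool: logarithm of `value` in the given `base`."""
    return math.log(args["value"], args["base"])


# Hypothetical registry: tool name -> (handler, required argument types).
TOOLS = {
    "logarithm": (tool_logarithm, {"value": (int, float), "base": (int, float)}),
}


def dispatch(llm_output: str) -> dict:
    """Parse the model's JSON tool request, validate it, and run the tool."""
    try:
        request = json.loads(llm_output)
    except json.JSONDecodeError as exc:
        # The model produced malformed JSON, which the thread notes can happen.
        return {"ok": False, "error": f"invalid JSON: {exc.msg}"}

    if not isinstance(request, dict) or request.get("tool") not in TOOLS:
        return {"ok": False, "error": "unknown or missing tool"}

    handler, schema = TOOLS[request["tool"]]
    args = request.get("arguments")
    if not isinstance(args, dict):
        return {"ok": False, "error": "arguments must be an object"}

    # Minimal schema check: every required key must exist with the right type.
    for key, types in schema.items():
        if not isinstance(args.get(key), types):
            return {"ok": False, "error": f"bad or missing argument: {key}"}

    try:
        return {"ok": True, "result": handler(args)}
    except (ValueError, ZeroDivisionError) as exc:
        # e.g. a non-positive value, or base 1.
        return {"ok": False, "error": str(exc)}


# Pretend this string came back from the model.
print(dispatch('{"tool": "logarithm", "arguments": {"value": 8, "base": 2}}'))
print(dispatch('{"tool": "chess", "arguments": {}}'))
```

The validation step is the part that matters for the reliability concern: a malformed or unexpected request is rejected instead of being passed on to the external program.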

homesweethomeMrL@lemmy.world on 10 Jun 2025 00:17

clop - clop - clop - clop - clop - clop

. . .

*bloop*

. . .

[screen goes black for 20 minutes]

. . .

Hmmmmm.

clop - clop - clop - clop - clop - clop - clop - clop - clop - clop

*bloop*

OsrsNeedsF2P@lemmy.ml on 10 Jun 2025 01:40

Hey I don’t mean to ruin your day, but maybe you should Google what you just commented…

homesweethomeMrL@lemmy.world on 10 Jun 2025 13:22

There is 100% no chance google knows what that is

homesweethomeMrL@lemmy.world on 11 Jun 2025 17:39

Little disappointed more people didn’t get this.

Xanthobilly@lemmy.world on 10 Jun 2025 00:23

redsunrise@programming.dev on 10 Jun 2025 00:37

in other words, a hammer “got absolutely wrecked” by a handsaw in a board-halving competition

homesweethomeMrL@lemmy.world on 10 Jun 2025 13:27

One of those Fisher-Price plastic hammers with the hole in the handle?

Redkey@programming.dev on 10 Jun 2025 14:31

When all you have (or you try to convince others that all they need) is a hammer, everything looks like a nail. I guess this shows that it isn’t.

JeeBaiChow@lemmy.world on 11 Jun 2025 08:28

Clearly you didn’t swing the hammer hard enough

OsrsNeedsF2P@lemmy.ml on 10 Jun 2025 01:39

What happens if you ask ChatGPT to code you a chess AI though?

4am@lemm.ee on 10 Jun 2025 02:13

It doesn’t work without 200 hours of un-fucking

pedz@lemmy.ca on 10 Jun 2025 12:31

It probably consumes as much energy as a family house for a day just to come up with that program. That’s what happens.

In fact, I did a Google search and didn’t have any choice but to have an “AI” answer, even if I don’t want it. Here’s what it says:

Each ChatGPT query is estimated to use around 10 times more electricity than a traditional Google search, with a single query consuming approximately 3 watt-hours, compared to 0.3 watt-hours for a Google search. This translates to a daily energy consumption of over half a million kilowatts, equivalent to the power used by 180,000 US households.

daniskarma@lemmy.dbzer0.com on 11 Jun 2025 09:46

Average daily energy consumption for a family in the US is said to be around 30,000 Wh.

It would take about 10,000 ChatGPT queries per day to equal that.

For another reference point, an hour of playing a AAA computer game can easily consume 600-1000 Wh, depending on the graphics card.
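The arithmetic behind that comparison can be checked with a short Python sketch. All figures are the estimates quoted in this thread (about 3 Wh per ChatGPT query, about 30,000 Wh per US household per day, 600-1000 Wh per hour of gaming), not measured values, and the helper name `queries_equivalent` is made up for illustration.

```python
# Estimates quoted in the thread; treat them as rough assumptions.
CHATGPT_WH_PER_QUERY = 3
HOUSEHOLD_WH_PER_DAY = 30_000
GAMING_WH_PER_HOUR = (600, 1000)


def queries_equivalent(energy_wh: float,
                       wh_per_query: float = CHATGPT_WH_PER_QUERY) -> float:
    """Number of queries that would use the same energy as `energy_wh`."""
    return energy_wh / wh_per_query


# One household-day of electricity expressed in queries.
print(queries_equivalent(HOUSEHOLD_WH_PER_DAY))  # 10000.0

# One hour of AAA gaming expressed in queries (low and high estimate).
low, high = (queries_equivalent(wh) for wh in GAMING_WH_PER_HOUR)
print(round(low), round(high))  # 200 333
```

So under these assumed numbers, an hour of gaming lands in the range of a couple hundred queries, and a household-day in the range of ten thousand.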

pedz@lemmy.ca on 11 Jun 2025 16:46

That must be why Google’s greenhouse emissions went up 50% in five years. ChatGPT’s legendary efficiency.

Keep defending those power-wasting glorified autocompletes. In no way are we doomed as a species.

We can just continue to pump more and more into the air. “AI” will surely find a solution for that anyway.

daniskarma@lemmy.dbzer0.com on 11 Jun 2025 16:53

Google is not related to ChatGPT. ChatGPT’s parent company is OpenAI, which is a competitor of Google.

A more rational explanation is that technology and digital services in general have been growing and are on the rise, both because more and more complex services are being offered and, more importantly, because more people are requesting those services. Whole continents that used not to be covered by digital services are now covered. Generative AI is just a very small part of all that.

The best approach to reduce CO2 emissions is to ask for a reduction in human population. From my point of view it is the only rational approach, as with a growing population there are only two solutions: pollute until we die, or reduce quality of life until life is not worth living. Reducing population allows for fewer people to live better lives without destroying the planet.

Venus_Ziegenfalle@feddit.org on 10 Jun 2025 15:47

In other news: My toaster makes better toast than my vacuum.

homesweethomeMrL@lemmy.world on 11 Jun 2025 17:31

Your vacuum uses more power than a 150,000-person city just to clean an 8’ square rug?

That does suck.

Heh.

chonglibloodsport@lemmy.world on 11 Jun 2025 18:06

If ChatGPT were marketed as a toaster nobody would bat an eye. The reason so many are laughing is because ChatGPT is marketed as a general intelligence tool.

Railcar8095@lemm.ee on 11 Jun 2025 18:38

Do you have any OpenAI stuff (ad, interview, presentation…) that claims it’s AGI? Because I’ve never seen such a thing, only people hyping it for clicks and ad revenue.

chonglibloodsport@lemmy.world on 11 Jun 2025 18:45

I was very careful not to use the term AGI for this reason. General intelligence tool isn’t the same thing. It’s a much weaker claim, yet it’s also a far stronger claim than any purpose-built software. The ambiguity is part of their marketing strategy.

Railcar8095@lemm.ee on 11 Jun 2025 19:19

Question remains. Any marketing about it being general intelligence? Not general use, but general intelligence.

chonglibloodsport@lemmy.world on 11 Jun 2025 21:24

No, though there’s been plenty of marketing where they claim “we know how to build AGI.”

They have marketed ChatGPT as a general purpose AI from the very beginning, though the question of how to leverage that has remained open.

JeeBaiChow@lemmy.world on 11 Jun 2025 08:26

If LLMs are statistics-based, wouldn’t there be many, many more losing games than perfectly winning ones? It’s like Dr. Strange saying ‘this is the only way’.

Railcar8095@lemm.ee on 11 Jun 2025 18:35

It’s not even that. It’s not a chess AI or an AGI (which doesn’t exist). It will speak and pretend to play, but has no memory of the exact position of the pieces nor the capability to plan several steps ahead. For all intents and purposes, it’s like asking my toddler what time it is (she always says something that sounds like a time, but doesn’t understand the concept of hours or what the time is).

The fact that somebody posted this on LinkedIn and not only wasn’t shamed out of his job but there are several articles about it is truly infuriating.

daniskarma@lemmy.dbzer0.com on 11 Jun 2025 09:44

My 2€ calculator obliterates a 200,000€ Ferrari at doing multiplications.

arararagi@ani.social on 11 Jun 2025 17:25

Man all these people coping, I thought chatgpt was supposed to be a generic one able to do anything?

homesweethomeMrL@lemmy.world on 11 Jun 2025 17:36

It depends. Have you used it? If not - Yes! It does do . . . all the things.

If you have used it, I’m sorry that was incorrect. You simply need to pay for the upgraded subscription. Oh, and as a trusted insider now we can let you in on a secret - the next version of this thing is gonna be, like, wow! Boom shanka! Everyone else will be so far behind!

stormeuh@lemmy.world on 11 Jun 2025 21:43

You know, when you put it like that, it kind of sounds like Scientology…

pinball_wizard@lemmy.zip on 11 Jun 2025 20:19

That’s on them for taking on the Atari 2600, where “the games don’t get older, they get better!”

QueenHawlSera@sh.itjust.works on 15 Jun 2025 01:55

True AI does not and will not exist