Can AI run a physical shop? Anthropic’s Claude tried and the results were gloriously, hilariously bad (venturebeat.com)
from silence7@slrpnk.net to technology@lemmy.world on 27 Jun 19:57
https://slrpnk.net/post/23949315

#technology

InternetCitizen2@lemmy.world on 27 Jun 20:31

Claude ran a vending machine business for a month, selling tungsten cubes

hmmm

unpossum@sh.itjust.works on 27 Jun 21:03

as long as it’s not paper clips, we’re good

AdamEatsAss@lemmy.world on 28 Jun 02:21

Clippy? I need help

prex@aussie.zone on 28 Jun 04:50

best I can do is grey goo.

JPAKx4@lemmy.blahaj.zone on 27 Jun 22:39

I need a gif of a tungsten cube dropping from the top shelf of a vending machine and folding it in on itself

AdamEatsAss@lemmy.world on 28 Jun 02:31

It was selling tungsten cubes to another AI whose job was to restock the vending machine.

DarkDarkHouse@lemmy.sdf.org on 28 Jun 07:46

This is how you juice GDP.

pennomi@lemmy.world on 27 Jun 21:09

Claude eventually resolved its existential crisis by convincing itself the whole episode had been an elaborate April Fool’s joke, which it wasn’t. The AI essentially gaslit itself back to functionality, which is either impressive or deeply concerning, depending on your perspective.

Now THAT’S some I, Robot shit. And I’m not talking about the Will Smith movie, I’m talking about the original book.

Havoc8154@mander.xyz on 27 Jun 23:32

This is by far the most interesting part. I want to know more about this, like why the author is so certain this wasn’t a joke.

silence7@slrpnk.net on 27 Jun 23:35

For what it’s worth, Anthropic posted this in their corporate blog. So if it’s a joke, it’s coming out of vetted corporate PR.

Psythik@lemm.ee on 28 Jun 00:30

Can you talk about the movie too? I may be in the minority here but I enjoyed it.

pennomi@lemmy.world on 28 Jun 04:20

The movie had themes about AI revolution, while the book was about robopsychology. Since this anecdote was about an AI gaslighting itself, the book is a far more appropriate comparison than the movie thematically.

BlackEco@lemmy.blackeco.com on 27 Jun 21:10

AI bros are trying really hard to convince people that their parrots can be useful in business settings.

Grandwolf319@sh.itjust.works on 27 Jun 21:24

This is how I know AI doesn’t really work. Give it a real use case in the physical world, where it can’t be almost there: either it passes or it fails.

People should really appreciate deterministic algorithms, because they can automate things in the real world.

shalafi@lemmy.world on 28 Jun 03:53

The physical world is too fast; it relies on the speed of human brains calculating a million variables instantly, not mere pattern matching. See how hard it is to teach a robot to catch a ball. You have to input all the physics, where a human doesn’t even consciously think about the problem.

We humans are best-in-class at pattern matching, but we often get it wrong and AI amplifies those mistakes.

AI can be great at certain tasks, but we have to be cognizant of how that works.

treadful@lemmy.zip on 27 Jun 21:25

Current AI systems can perform sophisticated analysis, engage in complex reasoning, and execute multi-step plans.

No, not really

UnderpantsWeevil@lemmy.world on 27 Jun 21:43

It can say it can, when asked by an investor. And really, what else matters?

14th_cylon@lemm.ee on 27 Jun 23:09

I mean really, where do these legends come from? I have tried to make ChatGPT sort through a single document and present the clear, organized data that is already in the document as a sorted table. It can’t reliably do that. How would it do any kind of complex task? That is just laughable.

Nalivai@lemmy.world on 28 Jun 00:31

I’m convinced that people who are fascinated by LLM chatbots are those who usually aren’t better than a chatbot at whatever they do. That is to say, they can’t do shit.

catloaf@lemm.ee on 28 Jun 01:57

“I don’t know how to run a shop, but it can’t be that hard, let’s just have AI do it!”

SerotoninSwells@lemmy.world on 28 Jun 00:15

Claude’s month as a shopkeeper offers a preview of our AI-augmented future that’s simultaneously promising and deeply weird.

Did the author have a stroke by the time they reached the end of writing the article? The mental gymnastics would be funny if it wasn’t terrifying.

CosmoNova@lemmy.world on 28 Jun 07:08

Wouldn’t be surprised if the author used AI too, but then again, bad (or let’s call it “weird”) journalism isn’t all that new.

SerotoninSwells@lemmy.world on 28 Jun 13:27

You’re absolutely right.

TexasDrunk@lemmy.world on 28 Jun 03:47

Depends on what you’re calling AI. LLMs (and generative AI in general) are garbage for all those things, and most things in general (all things if you take their cost into account). Machine Learning and expert systems can do at least some of that.

I absolutely hate that generative AI is being marketed as though it’s deep learning instead of a fancy Markov chain. But I think I’ve lost the battle over that nomenclature.

TheBeege@lemmy.world on 28 Jun 10:45

This. I work at a medical computer vision company, and our system performs better, on average, than radiologists.

It still needs a human to catch the weird edge cases, but studies show humans plus our model have a super high accuracy rate and speed. It’s perfect because there’s a global radiologist shortage, so helping the radiologists we have go faster can save a lot of lives.

But people are bad at nuance. All AI is like LLMs -_-

TexasDrunk@lemmy.world on 28 Jun 16:09

Case in point: the downvotes are from people who don’t know or care about the difference.

badbytes@lemmy.world on 27 Jun 21:33

How will they protect the robots?

ThePowerOfGeek@lemmy.world on 28 Jun 01:12

With tungsten cubes apparently. Lots and lots of tungsten cubes!

blargle@sh.itjust.works on 28 Jun 04:14

By pushing them down the stairs

UnderpantsWeevil@lemmy.world on 27 Jun 21:43

If the AI cannot run the business then we must conclude that the business does not produce anything of real value.

Nothing to do but downsize and move on.

strawberry@kbin.earth on 27 Jun 23:05

The only real use case I've found for AI (not including science and stuff, I'm talking more about LLMs for consumer use) is when I have a very niche issue, and even then it rarely solves the issue; it just gives me a better idea of what I can go looking for.

BrianTheeBiscuiteer@lemmy.world on 27 Jun 23:08

It’s good at giving a new perspective or helping with mental blocks.

catloaf@lemm.ee on 28 Jun 01:56

For a first pass, yes. I wouldn’t really trust it for an unbiased, objective perspective. Each model is only as good as its training data.

isVeryLoud@lemmy.ca on 27 Jun 23:11

It’s a good starting point, never the final product.

driving_crooner@lemmy.eco.br on 28 Jun 03:03

ChatGPT has been useful for me for looking up related DJs. For example, I saw this DJ Ziggy, and it showed me a couple of other DJs from the Netherlands Bubbling scene.

MangoPenguin@lemmy.blahaj.zone on 28 Jun 03:38

Yeah it makes sense that they’re good at finding similar things.

burgerpocalyse@lemmy.world on 28 Jun 03:16

anything a chatbot can do, a person can do better. like you could just ask another person and you would get something more useful off the top of their head

Mnemnosyne@sh.itjust.works on 28 Jun 05:54

There’s a difference between ‘a person’ and ‘every person’. A person can definitely do things better than any chat bot. But not every person can. And depending on the situation, a person who can may not be available.

Even then, there is one way in which the AI beats all persons: speed. If the task at hand does not require a better result than what the AI outputs, then the time savings are big, because there are no situations in which any human will work faster.

Skyrmir@lemmy.world on 28 Jun 04:11

Boilerplate code is where it rocks. The syntax for that API function you use once every 5 years and can never remember? It’s got you covered. It can knock out helper functions like a boss too. Nothing complex, since that takes too long to fix, but the text filter and type conversion stuff is quicker than typing it out yourself.

AbidanYre@lemmy.world on 28 Jun 04:34

Thank you! This is exactly what I use it for. Things that would take half a day to a full day because I need to spend much of that time remembering the syntax or which libraries to import. But AI can get 90% of the way there in 5 minutes and 97% of the way with like two more iterations on the prompt.

AnotherPenguin@programming.dev on 28 Jun 09:19

It’s really good for prototypes, unit tests, CI/CD pipelines, and most ORMs.

toynbee@lemmy.world on 28 Jun 09:58

On Thursday I attended a company-wide meeting wherein several coworkers tried (with mixed results) to persuade the rest of the company to start using AI. The primary way they did so was by listing incidents in which they’d found it useful.

One of the examples was (mildly paraphrased) “our other coworker is old, so he knows things like Tom Sawyer. He said he thought I was pulling a Tom Sawyer, trying to convince him to paint the fence.”

I respect the person who was giving that speech; they seem very knowledgeable. But hearing that they had to ask AI what that meant was just upsetting.

That said, I guess one use for AI is deciphering idioms?

13igTyme@lemmy.world on 28 Jun 14:20

At work I know some people who will take the AI transcript from Zoom and put it into Miro, ChatGPT, Gemini, or Notion and have it create a mind map, flow chart, or bulleted work list.

They still have to go through and clean things up, but it still saves hours.

some_guy@lemmy.sdf.org on 28 Jun 01:47

That anyone would even attempt such an experiment shows a profound misunderstanding of what this tech is. It’s depressing how stupid people are.

andallthat@lemmy.world on 28 Jun 04:35

It was Anthropic who ran this experiment

cley_faye@lemmy.world on 28 Jun 07:45

It doesn’t detract from the parent’s comment at all.

moopet@sh.itjust.works on 28 Jun 10:29

Sure.

But someone offered it $100 for a six-pack of Irn-Bru and it declined, and they’re taking this as a hilarious failure, because a real human would be a real scumbag and take the cash, pretending it was the right amount. So it’s not capitalist-level evil yet.

slaacaa@lemmy.world on 28 Jun 12:47

This is actually a very interesting article; the experiment demonstrates the current limitations of “AI” (so really just LLMs). Most people (including investors and executives) have no idea of the reality of the tech they are hyping up.

slaacaa@lemmy.world on 28 Jun 12:50

“This matters because we’re rapidly approaching a world where AI systems will manage increasingly important decisions.”

How about we just don’t do that?

Prior_Industry@lemmy.world on 28 Jun 14:09

Feels like so much of the AI hype is smoke and mirrors to get investor money. Give it another year and everyone will be wondering how the bubble got so big and popped, and how no one saw it coming.

That being said I don’t think it’s going away either, just that a lot of investor money is going to be lost chasing shadows.

13igTyme@lemmy.world on 28 Jun 14:10

I work in health tech and we use machine learning to create tools that help care managers and providers, but ultimately it’s still completely on the person to make important decisions. Our tool just helps you organize your day.

Zexks@lemmy.world on 28 Jun 13:34

Cyberpunk 2077 did a version of this on a side mission. It gets pulled for a similar reason.

Zealousideal_Fox_900@lemmy.dbzer0.com on 28 Jun 13:41

I read some of the results a bit ago. One had what I can only describe as a full mental spasm and loss of reality, and seemed to become disturbed at its own existence, and another tried to contact… the FBI.