from Powderhorn@beehaw.org to technology@beehaw.org on 26 Aug 16:30
https://beehaw.org/post/21827372
There tend to be three AI camps. 1) AI is the greatest thing since sliced bread and will transform the world. 2) AI is the spawn of the Devil and will destroy civilization as we know it. And 3) “Write an A-Level paper on the themes in Shakespeare’s Romeo and Juliet.”
I propose a fourth: AI is now as good as it’s going to get, and that’s neither as good nor as bad as its fans and haters think, and you’re still not going to get an A on your report.
You see, now that people have been using AI for everything and anything, they’re beginning to realize that its results, while fast and sometimes useful, tend to be mediocre.
My take is that LLMs can speed up some work, like paraphrasing, but all the time saved gets diverted to verifying the output.
Long story short, LLMs are only tools, nothing more.
🌏🧑‍🚀🔫🧑‍🚀
My take:
I’ve only tried a handful of times, but I’ve never been able to get an LLM to do a grunt refactoring task that didn’t require me to rewrite all the output again anyway.
And you can do a lot of that with good IDEs.
The trick is giving it tons of context. It also depends on the LLM. Claude has given me the most success.
You have to invest in setting it up for success. Give it a really good context, feed it docs or other resources through MCP servers, use a memory bank pattern.
I just did a 30k contract with it where I hand-wrote probably 20% of the code, and probably 75% of the rest of my effort was just reviewing the diffs the LLM made. But that doesn't mean I'm vibe coding: I feed it atomic operations and review each change as if it were a PR. I come away understanding the totality of the code, so I can debug easily when things go wrong.
You can't just go "Here's my idea; make it." That will probably never happen (even though that's the kool-aid being served), but if you're disciplined and make the most of the tools available, it can absolutely 3-5x your output as an engineer.
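To make the "feed it docs through MCP servers" step concrete: one common shape for this is registering a filesystem server in your Claude MCP config so the model can read a local docs folder. A minimal sketch, with a made-up server name and path (@modelcontextprotocol/server-filesystem is a real reference server, but check your client's docs for the exact config file and format it expects):

```json
{
  "mcpServers": {
    "project-docs": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/project/docs"]
    }
  }
}
```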
The LLM in the most recent case had a monumental amount of context. I gave it a file implementing a variant of a hash set, asked it to explain several of the functions (which it did correctly), and then asked it to convert it to a hash-map implementation: an entirely trivial, grunt change, but one too pervasive and semantics-driven for an IDE to have a neat one-click refactoring for.
It spat out the source code of the tree-based map implementation in the standard library.
LLMs are great at checking grammar in writing. That’s the other thing I’ve found they’re useful for 🤷
Basically, using LLMs to write something is always a bad idea (unless you're responding to bullshit with more bullshit, e.g. work emails 🤣). Using them to check writing is pretty useful, though.
I have only found one good use so far, and that is to quickly analyze, manipulate, and visualize closed datasets. The moment you let it expand beyond your dataset, or ask it to “think” or “imagine”, it promptly shits the bed.
The old adage always applies: “Garbage in, garbage out.” When you cannot strictly control the datasets being considered, you cannot trust that the results will not include garbage and noise.
They can be useful when used “in negative”. In a physics course at an institution near me, students are asked to check whether the answers an LLM/GPT gives to physics questions are correct or not, and why.
On the one hand, this puts the students with their backs against the wall, so to speak, because they clearly can’t use the same (or another) LLM/GPT to produce the answer, or they’d be going in circles.
But on the other hand, they actually feel empowered when they catch the errors in the LLM/GPT; they really get a kick out of that :)
As a bonus, the students see for themselves that LLMs/GPTs are often grossly or subtly wrong when answering technical questions.
I’ve heard of this kind of AI usage a few times now, and it seems so smart. You’re learning by teaching, but also being trained in AI literacy and the pitfalls of AI. It encourages critical thinking and genuine learning at the same time.
In addition to being a fucking brilliant idea for that course, this should be adapted more widely. I suspect, having once been young myself, that you’re going to get far more buy-in from showing students how often it’s wrong than from telling them not to use it.
LLMs are super cool. You provide text A and text B, add a little cosine similarity or something, and you’ve got a distance between the two texts.
Right, they also generate text. I guess embeddings aren’t really new.
Well, the embeddings are nice anyway. They make it easy to do semantic text search (or even search over images and other kinds of inputs). Not sure what that has to do with the general public, but it’s great if you’re writing a search tool.
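For anyone curious, the “distance” being joked about above is easy to compute once you have the vectors. A minimal sketch in Java, with made-up embedding values (in practice they’d come from an embedding model and have hundreds or thousands of dimensions):

```java
public class CosineDemo {
    // Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1.0 mean
    // the vectors point the same way, i.e. the texts are "close".
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy 4-dimensional "embeddings" for text A and text B.
        double[] textA = {0.12, 0.85, -0.33, 0.40};
        double[] textB = {0.10, 0.80, -0.30, 0.45};
        System.out.printf("similarity(A, B) = %.3f%n", cosine(textA, textB));
    }
}
```

A semantic search tool is then roughly: embed the query, embed every document, rank by similarity.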
Propaganda still seems like a viable business case to me.
I don’t think AI is done improving, but companies need to find something other than throwing more compute at it. It seems to get exponentially more expensive for logarithmic gains in performance. I honestly can’t even tell the difference between ChatGPT 4 and 5; I don’t doubt that it’s better, but I can’t see a difference in my own productivity.
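A toy illustration of that scaling complaint, assuming the logarithmic relationship the commenter describes (the numbers are invented, not benchmark data): if score grows with the log of compute, every fixed bump in score costs a constant multiple of compute.

```java
public class ScalingToy {
    public static void main(String[] args) {
        double a = 10.0; // hypothetical scale factor, chosen for readability
        // Each 10x increase in compute buys the same fixed bump in "score":
        // linear gains per order of magnitude, i.e. logarithmic in compute.
        for (int exp = 21; exp <= 25; exp++) {
            double compute = Math.pow(10, exp);      // FLOPs, made-up range
            double score = a * Math.log10(compute);  // == a * exp
            System.out.printf("compute = %.0e -> score = %.0f%n", compute, score);
        }
    }
}
```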
Time savings vs time sinks depends a lot on exactly what you’re doing. Treading well-worn ground in a new domain can be speedy. But fixing a non-standard or niche (or shitty) code base can be a nightmare because nothing is done the standard way.
So far, I’ve gained a bit of productivity through AI, but I’ve been down a few rabbit holes, too. Integration tests can be a real pain: it always wants to recommend custom test configurations, but then you wind up with a different test environment and you can’t necessarily trust your tests. Date parsing with Jackson, in particular, can differ between the ObjectMapper Spring configures and injects for you and a bare `new ObjectMapper()` in a test, to give just a super basic example.
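To make that Jackson pitfall concrete, here’s a minimal sketch of the mismatch (the setup is simplified, but the failure mode is standard: a bare ObjectMapper has no JavaTimeModule registered, while Spring Boot’s auto-configured mapper does):

```java
import java.time.LocalDate;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;

public class MapperMismatch {
    public static void main(String[] args) throws Exception {
        String json = "\"2024-08-26\"";

        // Roughly what a Spring Boot auto-configured mapper gives you:
        // JavaTimeModule is registered, so java.time types parse fine.
        ObjectMapper springLike = new ObjectMapper().registerModule(new JavaTimeModule());
        System.out.println(springLike.readValue(json, LocalDate.class)); // 2024-08-26

        // A bare mapper, as often created directly in tests, has no
        // JavaTimeModule and throws InvalidDefinitionException here.
        ObjectMapper bare = new ObjectMapper();
        System.out.println(bare.readValue(json, LocalDate.class)); // throws
    }
}
```

Same code, different parsing behavior under test versus in production, which is exactly the “different test environment” problem described above.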
Another article conflating LLMs and AI.
AI is unfortunately supercharging lots of systems, especially in the police/intelligence space. Surveillance driven by AI is absolutely skyrocketing in both capability and prevalence.
xAI and OpenAI, being LLM companies, aren’t seeing good ROI. Palantir and their ilk are another beast altogether.
I almost wonder if this misstated “underperformance” of “AI” is intended to make people less fearful of it being weaponized against them.
After all, the AI balloon is deflating, right?