from antonim@lemmy.dbzer0.com to technology@lemmy.world on 04 Jun 13:09
https://lemmy.dbzer0.com/post/45888572
I don’t know if this is an acceptable format for a submission here, but here it goes anyway:
Wikimedia Foundation has been developing an LLM that would produce simplified Wikipedia article summaries, as described here: www.mediawiki.org/…/Simple_Article_Summaries
We would like to provide article summaries, which would simplify the content of the articles. This will make content more readable and accessible, and thus easier to discover and learn from. This part of the project focuses only on displaying the summaries. A future experiment will study ways of editing and adjusting this content.
Currently, much of the encyclopedic quality content is long-form and thus difficult to parse quickly. In addition, it is written at a reading level much higher than that of the average adult. Projects that simplify content, such as Simple English Wikipedia or Basque Txikipedia, are designed to address some of these issues. They do this by having editors manually create simpler versions of articles. However, these projects have so far had very limited success - they are only available in a few languages and have been difficult to scale. In addition, they ask editors to rewrite content that they have already written. This can feel very repetitive.
In our previous research (Content Simplification), we have identified two needs:
- The need for readers to quickly get an overview of a given article or page
- The need for this overview to be written in language the reader can understand
Etc., you should check the full text yourself. There’s a brief video showing how it might look: www.youtube.com/watch?v=DC8JB7q7SZc
This hasn’t been met with warm reactions: the comments on the respective talk page have questioned the purpose of the tool (shouldn’t the introductory paragraphs do the same job already?), and other complaints have been raised as well:
Taking a quote from the page for the usability study:
“Most readers in the US can comfortably read at a grade 5 level,[CN] yet most Wikipedia articles are written in language that requires a grade 9 or higher reading level.”
Also stated on the same page, the study only had 8 participants, most of which did not speak English as their first language. AI skepticism was low among them, with one even mentioning they ‘use AI for everything’. I sincerely doubt this is a representative sample and the fact this project is still going while being based on such shoddy data is shocking to me. Especially considering that the current Qualtrics survey seems to be more about how to best implement such a feature as opposed to the question of whether or not it should be implemented in the first place. I don’t think AI-generated content has a place on Wikipedia. The Morrison Man (talk) 23:19, 3 June 2025 (UTC)
The survey the user mentions is this one: wikimedia.qualtrics.com/jfe/…/SV_1XiNLmcNJxPeMqq and true enough it pretty much takes for granted that the summaries will be added, there’s no judgment of their actual quality, and they’re only asking for people’s feedback on how they should be presented. I filled it out and couldn’t even find the space to say that e.g. the summary they show is written almost insultingly, like it’s meant for particularly dumb children, and I couldn’t even tell whether it is accurate because they just scroll around in the video.
Very extensive discussion is going on at the Village Pump (en.wiki).
The comments are also overwhelmingly negative, some of them pointing out that the summary doesn’t summarise the article properly (“Perhaps the AI is hallucinating, or perhaps it’s drawing from other sources like any widespread llm. What it definitely doesn’t seem to be doing is taking existing article text and simplifying it.” - user CMD). A few comments acknowledge potential benefits of the summaries, though with a significantly different approach to using them:
I’m glad that WMF is thinking about a solution of a key problem on Wikipedia: most of our technical articles are way too difficult. My experience with AI summaries on Wikiwand is that it is useful, but too often produces misinformation not present in the article it “summarises”. Any information shown to readers should be greenlit by editors in advance, for each individual article. Maybe we can use it as inspiration for writing articles appropriate for our broad audience. —Femke 🐦 (talk) 16:30, 3 June 2025 (UTC)
One of the reasons many prefer chatGPT to Wikipedia is that too large a share of our technical articles are way way too difficult for the intended audience. And we need those readers, so they can become future editors. Ideally, we would fix this ourselves, but my impression is that we usually make articles more difficult, not easier, when they go through GAN and FAC. As a second-best solution, we might try this as long as we have good safeguards in place. —Femke 🐦 (talk) 18:32, 3 June 2025 (UTC)
Finally, some comments are problematising the whole situation with WMF working behind the actual wikis’ backs:
This is a prime reason I tried to formulate my statement on WP:VPWMF#Statement proposed by berchanhimez requesting that we be informed “early and often” of new developments. We shouldn’t be finding out about this a week or two before a test, and we should have the opportunity to inform the WMF if we would approve such a test before they put their effort into making one happen. I think this is a clear example of needing to make a statement like that to the WMF that we do not approve of things being developed in virtual secret (having to go to Meta or MediaWikiWiki to find out about them) and we want to be informed sooner rather than later. I invite anyone who shares concerns over the timeline of this to review my (and others’) statements there and contribute to them if they feel so inclined. I know the wording of mine is quite long and probably less than ideal - I have no problem if others make edits to the wording or flow of it to improve it.
Oh, and to be blunt, I do not support testing this publicly without significantly more editor input from the local wikis involved - whether that’s an opt-in logged-in test for people who want it, or what. Regards, -bɜ:ʳkənhɪmez | me | talk to me! 22:55, 3 June 2025 (UTC)
Again, I recommend reading the whole discussion yourself.
EDIT: WMF has announced they’re putting this on hold after the negative reaction from the editors’ community. (“we’ll pause the launch of the experiment so that we can focus on this discussion first and determine next steps together”)
“Most readers in the US can comfortably read at a grade 5 level,[CN]”
so where is the citation? did they just pull a number from their butt? hmm…
srsly, this is some bs.
It’s actually true. 56% of Americans are “partially illiterate”, which explains a lot about the state of affairs in that country.
en.wikipedia.org/…/Literacy_in_the_United_States
frankly, I'm not quite surprised ._.
edit: upon reading the article, I now wonder if it's possible for your literacy to go down. I used to be such a bookworm in grade school, but now I have to reread stuff over and over in order to comprehend what's going on.
You might just be chronically tired or worn down from the stresses of life. It’s pretty common.
Another thing is as we get older a lot of people will choose more “challenging” adult books and then just be totally bored lol. I read young adult and kids books sometimes (how can I give a book to a child if I haven’t read it myself?) and it’s always surprising to me how they can be ripped through in no time at all.
But in general I think you’re probably right that literacy can decrease with disuse. It seems like most things about the mind and body trend that way
Maths is a really good example of this.
At one point I really enjoyed doing long division in my head but as time goes on (and you don’t exercise that sponge…), it becomes lazy.
The mind is a muscle. Don’t ignore it. Especially now, if you use your mind you’ll be light-years ahead of ai addicts.
I’m genuinely confused how that is even possible in a developed country such as the US. Do people not read at all? As in an article or gossip magazine - all of those would get you there.
Is it just country side folk drinking beer and watching fox news? It can’t be 50% of all people. How.
basically the 2nd sentence is a product of defunding education in red states, and underfunding everywhere else. another issue is “participation grades for basically almost failing and failing classes”.
their math skills are even worse.
If it runs without human supervision, it’ll be a gong show.
There should be some degree of supervision, users will at a minimum be able to rate the summaries as helpful or unhelpful, and I guess those rated as unhelpful will be removed.
Finally, a good use case for AI
Looks like the vast majority of people disagree D: I do agree that WP should consider ways to make certain articles more approachable to laymen, but this doesn’t seem to be the right approach.
Doesn’t it already have simplified versions of most articles at simple.wikipedia.org ?
This is already addressed in the first quote in my post. And no, I’m sure that not even close to most articles have a simple.wikipedia equivalent, or that it actually is adequately simple (e.g. one topic I was interested in recently that Wikipedia didn’t really help me with: “The Bernoulli numbers are a sequence of signed rational numbers that can be defined with exponential generating functions. These numbers appear in the series expansion of some trigonometric functions.” - that’s one whole “simplified” article, and I have no idea what it’s saying and it has no additional info or examples).
The vast majority of people in this particular bubble disagree.
I've found that AI is one of those topics that's extremely polarizing, communities drive out dissenters and so end up with little awareness of what the general attitude in the rest of the world is.
You mean the bubble of people who don't want a factually incorrect, environmentally damaging shortcut to provide a summary that's largely already being done by someone?
You're right.
What an unbiased view. Got any citations?
The survey results? Did you read the post?
Miguel's claims are:
There's an anecdote in a talk page about one summary being inaccurate. A talk page anecdote is not a usable citation.
Survey results aren't measuring environmental impact.
And the whole point of AI is to take the load off of someone having to do things manually. Assuming they actually are - even in this thread there are plenty of complaints about articles on Wikipedia that lack basic summaries and jump straight into detailed technical content.
“environmentally damaging”
I see a lot of users on here saying this when talking about any use case for AI without actually doing any sort of comparison.
In some cases, AI absolutely uses more energy than an alternative, but you really need to break it down and it’s not a simple thing to apply to every case.
For instance: using an AI visual detection model hooked up to a camera to detect when rain droplets are hitting the windshield of a car. A completely wasteful example. In comparison you could just use a small laser that pulses every now and then and measures the diffraction to tell when water is on the windshield. The laser uses far less electricity and has been working just fine as they are currently used today.
Compare that to enabling DLSS in a video game where NVIDIA uses multiple AI models to improve performance. As long as you cap the framerates, the additional frame generation, upscaling, etc. will actually conserve electricity as your hardware is no longer working as hard to process and render the graphics (especially if you’re playing on a 4k monitor).
Looking at Wikipedia’s use case, how long would it take for users to go through and create a summary or a “simple.wikipedia” page for every article? How much electricity would that use? Compare that to running everything through an LLM once and quickly generating a summary (which is a use case LLMs actually excel at). It’s honestly not that simple either because we would also have to consider how often these summaries are being regenerated. Is it every time someone makes a minor edit to a page? Is it every few days/weeks after multiple edits have been made? Etc.
Then you also have to consider, even if a particular use case uses more electricity, does it actually save time? And is the time saved worth the extra cost in electricity? And how was that electricity generated anyway? Was it generated using solar, coal, gas, wind, nuclear, hydro, or geothermal means?
Edit: typo
The problem is that the bubble here are the editors who actually create the site and keep it running, and their “opposition” is the bubble of WMF staff.
No it isn't, it's the technology@lemmy.world Fediverse community.
How much do you want to bet on the overlap being small?
A bigger question is how much does Wikimedia Foundation want to bet that their top donors and contributors aren’t in this thread…
Edit: Moving my unrelated ramblings to a separate comment.
Have you read my OP or did you just use an AI-generated summary? I copy-pasted several comments from Wikipedia editors and linked a page with dozens, if not a hundred other comments by them, and they’re overwhelmingly negative.
I'm not talking about them at all. I'm talking about the technology@lemmy.world Fediverse community. It's an anti-AI bubble. Just look at the vote ratios on the comments here. The guy you responded to initially said "Finally, a good use case for AI" and he got close to four downvotes per upvote. That's what I'm talking about.
The target of these AI summaries are not Wikipedia editors, it's Wikipedia readers. I see no reason to expect that target group to be particularly anti-AI. If Wikipedia editors don't like it there'll likely be an option to disable it.
But it’s quite obvious that they were what I was talking about, and you were responding to me. Instead of responding to my actual comment, you deceptively shifted the topic in order to trivialise the situation.
Except that the editors will very likely have to work to manage those summaries (rate, correct or delete them), so they definitely will be affected by them. And in general it’s completely unacceptable to suggest that the people who have created 99% of the content on Wikipedia should have less of a say on how the website functions than a handful of bureaucrats who ran a survey.
Disabling would necessarily mean disabling it wiki-wide, not just for individual editors, in which case the opinions of the editors’ “bubble” will be quite relevant.
No Wikipedia editor has to work on anything, if they don't want to interact with those summaries then they don't have to.
And no, it wasn't quite obvious that that's what you were talking about. You said "Looks like the vast majority of people disagree D:". Since you were directly responding to a comment that had been heavily downvoted by the technology@lemmy.world community it was a reasonable assumption that those were the people you were talking about.
No it wouldn't, why would you think that? Wikipedia has plenty of optional features that can be enabled or disabled on a per-user basis.
I posted that first reply before the comment had any downvotes at all, and in my first response to you it is even now certainly clear who I’m talking about.
You seem to be more interested in wringing out arguments to defend AI tools than in how to make Wikipedia function better (adding low-quality summaries and shrugging at the negative reactions because the editors don’t “really” have to fix any of them is not how WP will be improved). That the editors aren’t irrelevant fools in their tiny bubble is something WMF apparently agrees with, because they’ve just announced they’re pausing the A/B testing of the AI summaries they planned.
And when I saw the reply it had plenty of downvotes already, because this is technology@lemmy.world and people are quick to pounce on anything that sounds like it might be pro-AI. You're doing it yourself now, eyeing me suspiciously and asking if I'm one of those pro-AI people. Since there were plenty of downvotes the ambiguity of your comment meant my interpretation should not be surprising.
It just so happens that I am a Wikipedia editor, and I'm also pro-AI. I think this would be a very useful addition to Wikipedia, and I hope they get back to it when the dust settles from this current moral panic. I'm disappointed that they're pausing an experiment because that means that the "discussion" that will be had now will have less actually meaningful information in it. What's the point in a discussion without information to discuss?
You openly declaring yourself as “pro-AI” (which I didn’t have to ask for because I remember your numerous previous comments) and opposing yourself to “anti-AI” people really just shows that the discussion is on the wrong track. The point of being pro or against AI should be to have whatever positive consequences for the stuff people are doing, in this case Wikipedia. The priority should be to be pro-Wikipedia, and then work out how AI relates to that, rather than going the other way around.
And true to your self-description you’re only defending AI as a concept, by default, without addressing the actual complaints or elaborating why it would be so desirable, but by accusing others of being anti-AI, as if that’s meaningful critique by itself.
Since you claim to be an editor, you could also join the discussion on Village Pump directly and provide your view there. But I believe we’ve already had a very similar discussion about Wikipedia some time ago, and I think you said you’re not actually active there anymore.
I'm "pro-AI" in the sense that I don't knee-jerk oppose it.
I do in fact use AI to summarize things a lot. I've got an extension in Firefox that'll do it to anything. It generally does a fine job.
I am pretty rabidly anti-AI in most cases, but the use case for AI that I don’t think is a big negative is the distillation of information for simplification purposes. I am still somewhat against this in the sense that at the end of the day their summarization AI could hallucinate, and since they’ve admitted this is a solution to a problem of scale, then it’s not sensible to assume humans will be able to babysit it.
However… there is some inherent value to the idea that people will end up using AI to summarize Wikipedia using models of dubious quality with an unknown quantity of intentionally pre-trained bias, and therefore there is some inherent value to training your own model to present the information on your site in a way that is the “most free” of slop and bias.
Wikipedia articles already have lead in summaries.
Fuck right off with this
A lot of them for the small articles and stubs are written very technically and don’t provide an explanation for complex subjects if you aren’t already familiar with it. Then you have to read 4 subjects down just to figure out the jargon for what they’re saying
I’d agree with that, both are problematic.
A lot of stubs should be deleted until they are expanded, they’re often more confusing than knowing nothing at all. I don’t think an LLM summary will help here though.
Reading a few articles deep is not only a pain in the ass, but is going to dissuade those who won’t do it. There’s also the issue that when you do wade in it might link to something that is poorly cited and confusing. Again, I think an LLM is going to make things worse here.
How does one expand a deleted article?
Wikipedia is not intended to be presenting a finished product, it's an eternal work in progress. A stub is the start of an article. If you delete an article whenever it gets started that seems counterproductive.
I agree, having experienced this especially on mathematics pages. But on the other hand, from my experience, the whole article is very technical in those cases: I’m not sure making a summary would help, and I’m not sure you can provide a summary that is both correct and easily understandable in those cases.
Math articles are the worst. They always jump right into calculus and stuff. I usually have to hope there’s a simple English article for those!
This is one thing I can see an actual use case for (as an external tool, not as part of WP): Create a summary, not of the article itself, but of the prerequisite background knowledge. And tailored to the reader’s existing knowledge—like, “what do I need to know to understand this article assuming I already know X but not Y or Z”.
Maybe it’s a result of Wikipedia trying to be more of an “online encyclopedia” vs a digital information hub or learning resource. I don’t think it’s a problem on its own but I do think there should be a simplified version of every article.
There is also already simple.wikipedia.org/wiki/Main_Page
If they add AI they better not ask me for any money ever again.
Or moderators. Why would they need those people when the AI can fix everything for free and even improve articles?
Right! I can’t wait to hear about all the new historical events!
I wonder if anyone witnessed the burning of the Library of Alexandria and felt a similar sense of despair for the future of knowledge.
You can download a copy of Wikipedia in full today before they turn it to shit.
Unlike the people in Alexandria, you can spend less than $20 and 20 minutes to download the whole thing and preserve it yourself
You are a light in the darkness.
Holy shit kbin is still around??
Kbin.earth is on mbin, I think kbin is dead.
I am so sad. I really liked what kbin was trying to do.
Kbin was destined to fail without opening up to community collaboration. I greatly preferred it over lemmy. So I will stick with Mbin now and Kbin.earth has been a small but nice Mbin instance.
I’m pro AI but absolutely fucking not.
The use case for AI is to summarize Wiki as an external tool. If Wikipedia starts using AI, it becomes AI eating its own tail.
If people use AI to summarize passages of written words to be simpler for those with poor reading skills to be able to more easily comprehend the words, then how are those readers going to improve their poor reading skills?
Dumbing things down with AI isn’t going to make people smarter I bet. This seems like accelerating into Idiocracy
By becoming interested in improving their poor reading skills. You won’t make people become interested in that by having everything available only in complex language, it’s just going to make them skip over your content. Otherwise there shouldn’t be people with poor reading skills, since complex language is already everywhere in life.
Nope. Reading skills are improved by being challenged by complex language, and the effort required to learn new words to comprehend it. If the reader is interested in the content, they aren’t going to skip it. Dumbing things down only leads to dumbing things down.
For example, look at all the iPad kids who can’t use a computer for shit. Kids who grew up with computers HAD to learn the more complex interface of computers to be able to do the cool things they wanted to do on the computer. Now they don’t because they don’t have to. Therefore if you get everything dumbed down to 5th Grade reading level, that’s where the common denominator will settle. Overcoming that apathy requires a challenge to be a barrier to entry.
But they aren’t interested in the content because of the complexity. You may wish that humans work like you describe, but we literally see that they don’t.
What you can do is provide a simplified summary to make people interested, so they’re willing to engage with the more complex language to get deeper knowledge around the topic.
You’re underestimating how many people before the iPad generation also can’t use computers because they never developed an interest to engage with the complexity.
Wikipedia is not made to teach people how to read, it is meant to share knowledge. For me, they could even make a Wikipedia version with hieroglyphics if that would make understanding content easier
Novels are also not made to teach people how to read, but reading them does help the reader practice their reading skills. Beside that point, Wikipedia is not hard to understand in the first place.
Sorry, but that’s absolutely wrong - the complexity of articles can vary wildly. Many are easily understandable, while many others are not understandable without a lot of prerequisite knowledge in the domain (e.g. mathematics stuff).
I am not a native speaker, but my knowledge of the English language is better than most people I know, having no issues reading scientific papers and similar complex documents. Some Wikipedia article intros, especially in mathematics, are not comprehensible for anyone but mathematicians, and therefore fail the objective to give the average person an overview of the material.
It’s fine for me if I am not able to grasp the details of the article because of missing prerequisite knowledge (and I know how to work with integrals and complex numbers!), but the intro should at least not leave me wondering what the article is about.
Do you give toddlers post-grad books to read too? This is such an idiotic slippery slope fallacy that it just reeks of white people privilege.
People aren’t reading Wikipedia articles with the intention of getting better at reading.
Why do you think their reading skills are poor?
Time to switch to something else? Nutomic developed Ibis wiki for example: ibis.wiki
You realize this is just a proposal at this stage? Their proposed next step is an experiment:
Note, an opt-in clickthrough that they intend to monitor for further information on how to implement features like this and whether they should monitor them at all. As befits Wikipedia, they're planning to base these decisions on evidence.
If "they're gathering evidence and making proposals" is the threshold for you to jump ship to some other encyclopedia, I guess you do you. It's not going to be much of an exodus though since nobody who actually uses Wikipedia has seen anything change.
Mb. I still don’t see anything good coming out of implementing anything to do with AI though.
Well, this inspired me to swing my monthly wikipedia donation over to a world book sub instead.
It's bad enough that wikipedia was a very dubious source of info, but now this is just too much.
Never thought I’d cancel my recurring donation for them, but just sent the email. I hope they change their mind on this, but as I told them, I will not support this.
<img alt="" src="https://lemmy.world/pictrs/image/b766514a-3c13-482c-a893-3a8499dd9f91.gif">
There’s a core problem that many Wikipedia articles are hard for a layperson to read and understand. The statement about reading level is one way to express this.
The Simple version of articles shows humans can produce readable text. But there aren’t enough Simple articles, and the Simple articles are often incomplete.
I don’t think AI should be solely trusted with summarization/translation, but it might have a place in the editing cycle.
Maybe people should actually learn to use their brains so they can read slightly more difficult articles. Holy shit are we gonna have some idiots running around in 10 years. Didn’t think we could get dumber but here we are. (I’m not including people with learning disabilities obviously, they may need an article written or summarized differently to grasp it).
Is the point of Wikipedia to provide everyone with information, or to allow editors to spew jargon into opaque articles that are only accessible to experts?
I think it’s the former. There are very few topics that can’t be explained simply, if the author is willing to consider their audience. Best of all, absolutely nothing is lost when an expert reads a well written article.
That’s fair
Many people who are in a position to write opaque jargon lack the perspective that would be required to explain it to a person who isn’t already very well-versed. Math articles are often like that, which doesn’t surprise me. I’ve had a few math professors who appeared completely unable to understand how to explain the subject to anyone who wasn’t already good at it. I had to drop their classes and try my luck with others.
I feel like a few of them are in this thread!
Trolling aside, yeah, being able to explain a concept in everyday terms takes careful thought and discipline. I’m consistently impressed by the people who write Simple articles on Wikipedia. I wish there were more of those articles.
I wasn’t trolling
It didn’t help when any page which could be rewritten with mathematical notation was rewritten as mathematical notation.
I do have concerns about this but it’s really all about the usage, not the AI itself. Would the AI version be the only version allowed? Would the summaries get created on the fly for every visitor? Would edits to an AI summary be allowed? Would this get applied to and alter existing summaries?
I’m totally fine with LLMs and AI as a stop-gap for missing info or a way to coach or critique a human-written summary, but generally I haven’t seen good results when AI is allowed to do its thing without a human reviewing or guiding the outputs.
Is this the same Wikimedia Foundation that was complaining about AI scrapers in April?
IIRC, they weren’t trying to stop them—they were trying to get the scrapers to pull the content in a more efficient format that would reduce the overhead on their web servers.
You can literally just download all of Wikipedia in one go from one URL. They would rather people just do that instead of crawling their entire website because that puts a huge load on their servers.
Ah, but the clueless code monkeys, script kiddies and C-levels who are responsible for writing the AI companies' processing code only know how to scrape from someone else's website. They can't even ask their (respective) company's AI for help because it hasn't been trained yet. (Not that Wikipedia's content will necessarily help).
They're not even capable of taking the ZIP file and hosting the contents on localhost to allow the scraper code they got working to operate on something it understands.
So hammer Wikipedia they must, because it's the limit of their competence.
What’s funny is crawling the site would actually be more difficult and take longer than downloading and reading the archive.
Context for others, Wikipedia is only ~24 GB (compressed and without media or history). en.wikipedia.org/…/Wikipedia:Size_of_Wikipedia
I’m ok with auto generated content, but only if it is clearly separated from human generated content, can be disabled at any time and writing main articles with AI is forbidden
The big issue I see here isn’t the proposed solution, it’s the public image of doing something the tech bro billionaires are pushing hard right now.
It looks a bit like choosing the other side of the class war from their contributors.
Wikipedia, in particular, may not be able to afford that negative image right now.
I could welcome this kind of tool later, but their timing sucks.
sounds like a good use case for an LLM. hope the issues get figured out
It would be a good use case for an LLM if it didn’t make up false information
It might, possibly, be a viable use case if the LLM produced the summary for an editor, who then confirmed its veracity and appropriateness to the article and posted it themselves.
Guess they’re going to double down on the donation campaign considering the cost involved with ai
Hell nah, I am never donating to Wikipedia if they go AI.
These LLM-page-summaries need to be contained and linked, completely separately, in something like llm.wikipedia.org or ai.wikipedia.org.
If, at some point in the future, a few LLM hallucinations were uncovered in these summaries, it would cast doubt on the accuracy of all page content in the project.
Keep the generated-summaries visibly distinct from user created content.
Et tu, Wikipedia?
My god, why does every damn piece of text suddenly need to be summarized by AI? It’s completely insane to me. I want to read articles, not their summaries in 3 bullet points. I want to read books, not cliff notes, I want to read what people write to me in their emails instead of AI slop. Not everything needs to be a fucking summary!
It seriously feels like the whole damn world is going crazy, which means it’s probably me… :(
It's not you.
"It is no measure of health to be well-adjusted to a profoundly sick society." Krishnamurti
This ignorance is my biggest pet peeve today. Wikipedia is not targeting you with this but expanding accessibility to people who don’t have the means to digest a complex subject on their lunch break.
TL;DR: check your privilege
Giving people incorrect information is not an accessibility feature
RAG on 2 pages of text does not hallucinate anything though. I literally use it every day.
Then skip the AI summary.
For those of us who do skip the AI summaries it’s the equivalent of adding an extra click to everything.
I would support optional AI, but having to physically scroll past random LLM nonsense all the time feels like the internet is being infested by something equally annoying/useless as ads, and we don’t even have a blocker for it.
I think it would be best if that’s a user setting, like dark mode. It would obviously be a popular setting to adjust. If they don’t do that, there will doubtless be Greasemonkey and other scripts to hide it.
True!
<img alt="" src="https://lemmy.world/pictrs/image/c870fa5b-4cfa-4421-b4d4-535f1205c071.png">
AI threads on lemmy are always such a disappointment.
It’s ironic that people put so little thought into understanding this and complain about “ai slop”. The slop was in your heads all along.
To think that more accessibility for a project that is all about sharing information with people to whom information is least accessible is a bad thing is just an incredible lack of awareness.
It’s literally the opposite of everything people might hate AI for:
And to top it all you know this is a lost fight even if you’re right so instead of contributing to steering this societal ship these people cover their ears and “bla bla bla we don’t want it”. It’s so disappointingly irresponsible.
I’ll make a note to get back to you about this in a few years when they start blocking people from correcting AI authored articles.
How dare you bring nuance, experience and moderation into the conversation.
Seriously, though, I am a firm believer that no tech is inherently bad, though the people who wield it might well be. It’s rare to see a good, responsible use of LLMs but I think this is one of them.
Whether technology is inherently bad is of nearly no matter. The problem we’re dealing with is the technologies with exherent badness.
The point is they should be fighting AI, not open the door even an inch to AI on their site. Like so many other endeavors, it only works because the contributors are human. Not corpos, not AI, not marketing. AI kills Wikipedia if they let that slip. Look at StackOverflow, look at Reddit, look at Google search, look at many corporate social media. Dead internet theory is all around us.
Wikipedia is trusted because it’s all human. No other reason
I don’t think the idea itself is awful, but everyone is so fed up with AI bullshit that any attempt to integrate even an iota of it will be received very poorly, so I’m not sure it’s worth it.
I don’t think it’s everyone either - just a very vocal minority.
In the OP I linked a comment showing how the summary presented in the showcase video is not actually very accurate and it definitely does invent some elements that are not present in the article that is being summarised.
And in general the “accessibility” that primarily seems to work by expressing things in imprecise, unscientific or emotionally charged terms could well be more harmful than less immediately accessible but accurate and unambiguous content. You appeal to Wikipedia being “a project that is all about sharing information with people to whom information is least accessible”, but I don’t think this ever was that much of a goal - otherwise the editors would have always worked harder on keeping the articles easily accessible and comprehensible to laymen (in fact I’d say traditional encyclopedias are typically superior to Wikipedia in this regard).
Sorry but you’re making things up here, not even the developers of the summaries are promising such massive consequences. The summaries weren’t meant to replace any of the usual editing work, they weren’t meant to replace the normal introductory paragraphs or anything else. How would they save these supposed “millions of editor hours” then? In fact, they themselves would have to be managed by the editors as well, so all I see is a bit of additional work.
I don’t trust even the best modern commercial models to do this right, but with human oversight it could be valuable.
You’re right about it being a lost fight, in some ways at least. There are lawsuits in flight that could undermine it. How far that will go remains to be seen. Pissing and moaning about it won’t accelerate the progress of those lawsuits, and is mainly an empty recreational activity.
This is not the medicine for curing what ails Wikipedia, but when all anyone is selling is a hammer…
Nooooooo, you can’t do this
The summary (not necessarily AI generated) I read elsewhere is what got me to Wikipedia in the first place.
ok, just so long as the articles themselves aren’t AI generated.
It’s kind of indirectly related, but adding a query parameter
udm=14
to the URL of your Google searches removes the AI summary at the top, and there are plugins for Firefox that do this for you. My hope for this WM project is that similar plugins will be possible for Wikipedia.
The annoying thing about these summaries is that even for someone who cares about the truth and about gathering actual information, rather than the fancy autocomplete word salad that LLMs generate, it is easy to “fall for it” and end up reading the LLM summary. Usually I catch myself, but I often end up wasting some time reading the summary. Recently the non-information was so egregiously wrong (it called a certain city in Israel non-apartheid) that I ended up installing the udm=14 plugin.
In general, I think the only use cases for fancy autocomplete are where you have a way to verify the answer. For example, if you need to write an email and can’t quite find the words, if an LLM generates something, you will be able to tell whether it conveys what you’re trying to say by reading it. Or in case of writing code, if you’ve written a bunch of tests beforehand expressing what the code needs to do, you can run those on the code the LLM generates and see if it works (if there’s a Dijkstra quote that comes to your mind reading this: high five, I’m thinking the same thing).
I think it can be argued that Wikipedia articles satisfy this criterion. All you need to do to verify the summary is read the article. Will people do this? I can only speak for myself, and I know that, despite my best intentions, sometimes I won’t. If that’s anything to go by, I think these summaries will make the world a worse place.
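For anyone who would rather script the udm=14 trick than install a plugin, here is a minimal sketch in Python. The helper name `with_web_only` is made up for illustration, and the effect of the parameter is as reported in this thread, not something verified here. It only rewrites the URL; it does not fetch anything.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def with_web_only(url: str) -> str:
    """Return `url` with udm=14 set, replacing any existing udm value.

    Hypothetical helper: per the thread, udm=14 selects the plain "Web"
    results view on Google search.
    """
    parts = urlparse(url)
    # Keep every existing query parameter except a previous udm setting.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "udm"
    ]
    query.append(("udm", "14"))
    return urlunparse(parts._replace(query=urlencode(query)))


print(with_web_only("https://www.google.com/search?q=wikipedia+summaries"))
```

A browser extension would do the same rewrite on navigation; the point is just that it is a one-parameter change.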
It’s also the same as selecting “Web” from the bar that has images, video, maps etc.
They’ll absolutely be possible, it’s crazy easy to make addons that edit webpages.
What will be really nice is if someone goes to the effort to make some sort of all in one AI blocker similar to an ad blocker, that removes AI summaries from all sources that have it, so we don’t need a specific add on for each site.
Which Dijkstra quote?
Paraphrasing, but: “testing can only show presence of bugs, not their absence”
I like it
Thank you so much for this!!!
🪦🪦🪦🪦
RIP Wikipedia, we will miss you
Honestly, I think it’s a good idea. As long as it’s clearly highlighted that “this is an AI generated summary”, it could be very useful. I feel like a lot of people here have never tried to e.g. read a maths article without having a PhD in mathematics. I would often find myself trying to remember what a term means or how it works in practice, only to be met by a giant article going into extreme technical detail that I for the life of me cannot understand, but if I were to ask ChatGPT to explain it I would immediately get it.
People will believe the AI summary without reading the article, and AI hallucinates constantly. Never trust an output from an LLM.
TIL: Wikipedia uses complex language.
It might just be me, but I find articles written on Wikipedia much easier to read than the shit people sometimes write or say to me. Sometimes it is incomprehensible garbage, or makes little sense.
I’m from a country where English isn’t the primary language, people tend to find many aspects of English complex
I am also from a country where English is not widely spoken; in fact, most people are not able to hold a simple conversation (they will tell you they know ““basic English”” though).
I still find it easier to read Wikipedia articles in English than to understand some relatives, because they never say precisely what the fuck they want from me. One person even says such incomprehensible shit that I think their brain is barely functional.
It really depends on what you’re looking at. The history section of some random town? Absolutely bog-standard prose. I’m probably missing lots of implications as I’m no historian but at least I understand what’s going on. The article on asymmetric relations? Good luck getting your mathematical literacy from Wikipedia: all the maths articles require you to already have it, and that’s one of the easier ones. It’s a fucking trivial concept, it has a glaringly obvious example… which is mentioned, even as first example, but by that time most people’s eyes have glazed over. “Asymmetric relations are a generalisation of the idea that if a < b, then it is necessarily false that a > b: If it is true that Bob is taller than Tom, then it is false that Tom is taller than Bob.” Put that in the header.
Or let’s take Big O notation. Short overview, formal definition, examples… not practical, but theoretical, then infinitesimal asymptotics, which is deep into the weeds. You know what that article actually needs? After the short overview, have an intuitive/hand-wavy definition, then two well-explained “find an entry in a telephone book” examples using two different algorithms, O(n) (naive) and O(log n) (divide and conquer), to demonstrate the kind of differences the notation is supposed to highlight. Then, with the basics out of the way, one more example to demonstrate that the notation doesn’t care about multiplicative factors, i.e. what it (deliberately) sweeps under the rug. Short blurb about why that’s warranted in practice. Then, directly afterwards, the “orders of common functions” table, but make sure to have examples that people actually might be acquainted with. Then talk about amortisation, and how you don’t always use hash tables “because they’re O(1) and trees are not”. Then get into the formal stuff, that is, the current article.
And, no, LLMs will be of absolutely no help doing that. What wikipedia needs is a didactics task force giving specialist editors a slap on the wrist because xkcd 2501.
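To make the telephone book suggestion concrete, here is a rough sketch in Python of the two lookups, counting how many entries each one has to look at. The function names and the generated "phone book" are invented for illustration; this is not taken from the Wikipedia article.

```python
def linear_lookup(names, target):
    """Naive O(n): read entries from the front; return (index, entries_checked)."""
    checked = 0
    for index, name in enumerate(names):
        checked += 1
        if name == target:
            return index, checked
    return -1, checked


def binary_lookup(names, target):
    """Divide and conquer O(log n) on a sorted list; return (index, entries_checked)."""
    low, high = 0, len(names) - 1
    checked = 0
    while low <= high:
        checked += 1
        middle = (low + high) // 2
        if names[middle] == target:
            return middle, checked
        if names[middle] < target:
            low = middle + 1   # target is in the upper half
        else:
            high = middle - 1  # target is in the lower half
    return -1, checked


# A made-up, already sorted phone book with one million entries.
phone_book = [f"name{i:07d}" for i in range(1_000_000)]
target = phone_book[-1]  # worst case for the naive scan

print("linear:", linear_lookup(phone_book, target)[1], "entries checked")
print("binary:", binary_lookup(phone_book, target)[1], "entries checked")
```

The naive scan has to look at every one of the million entries in this worst case, while the halving approach needs at most 20 looks, since 2^20 is just over a million. That gap is the kind of difference the notation is meant to highlight.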
As I said in another comment, I find that traditional encyclopedias fare better than Wikipedia in this respect. Wikipedians can muddle even comparatively simple topics, e.g. linguistic purism is described like this:
This is so hopelessly awkward, confusing and inconsistent. (I hope I’ll get around to fixing it, btw.) Compare it with how the linguist RL Trask defines it in his Language and Linguistics: The Key Concepts:
Bam! No LLMs were needed for this definition.
So here’s my explanation for this problem: Wikipedians, specialist or non-specialist, like to collect and pile up a lot of cool info they’ve found in literature and online. When you have several such people working simultaneously, you easily end up with chaotic texts with no head or tail, which can always be expanded further and further with new stuff you’ve found because it’s just a webpage with no technical limits. When scholars write traditional encyclopedic texts, the limited space and singular viewpoint force them to write something much more coherent and readable.
You’ve clearly never tried to use Wikipedia to help with your math homework
I never did any homework unless absolutely necessary.
Now I understand that I should have done it, because I am not good at learning shit in classrooms where there is a bunch of people who distract me, and I don’t learn anything that way. Only many years later did I find out that for most things it’s best for me to study alone.
That said, you are most probably right, because I have opened some math-related Wikipedia articles at some point, and they were pretty incomprehensible to me.
If you can’t make people smarter, make text dumber.
Problem: Most people only process text at the 6th grade level
Proposal: Require mainstream periodicals to only generate articles accessible to people at the 6th grade reading level
Consequence: Everyone accepts the 6th grade reading level as normal
But… New Problem: We’re injecting so many pop-ups and ad-inserts into the body of text that nobody ever bothers to read the whole thing.
Proposal: Insert summaries of 6th grade material, which we will necessarily have to reduce and simplify.
Consequence: Everyone accepts the 3rd grade reading level as normal.
But… New Problem: This isn’t good for generating revenue. Time to start filling those summaries with ad-injects and occluding them with pop ups.
Wikipedia articles are already quite simplified overviews for most topics. I really don’t like the direction of the world where people are reading summaries of summaries and mistaking that for knowledge. The only time I have ever found AI summaries useful is for complex legal documents and low-importance articles where it is clear the author’s main goal was SEO rather than concise and clear information transfer.
Thanks, I hate it.
There was this fucking functionality of the browser called ctrl+f where you can find anything in text. But fucking no, people can’t use it on mobile easily so instead of fucking teaching users how they can find fucking content we will get generated slop… Also fucking websites started implementing stupid shit like loading content dynamically or overriding ctrl+f with a stupid JavaScript popup, so ctrl+f gets broken all the time. And now ctrl+f will be fucking broken because the first thing it finds will be fucking AI bullshit. Fuck You. I just hope I will be able to hide the AI with an extension.
RIP
If it’s really that bad we can always fork it.
Relax, this is not the doom and gloom some of y’all think this is and that is pretty telling.
Yeah, the catastrophic comments do take it too far… WMF has already announced they’re putting it on hold, so at the very least there’s a lot of discussion with the editors and additional work that will have to happen before this launches - if it ever launches.
Given the degree to which the modern day Wiki mods jump on to every edit and submission like a pack of starved lions, unleashing a computer to just pump out vaguely human-sounding word salad sounds like a bad enough idea on its face.
If the AI is being given priority over the editors and mods, it sounds even worse. All of that human labor, the endless back-and-forth in the Talk sections, arguing over the precise phrasing or the exact validity of sources or the relevancy of newly released information… and we’re going to occlude it with the half-wit remarks of a glorified chatbot?
Woof. Enshittification really coming for us all.
fucking disgusting. no place should have ai but especially not an encyclopedia.
My immediate thought is that the purpose of an encyclopaedia is to have a more-or-less comprehensive overview of some topic of interest. The reader should be able to look through the page index to find the section they care about and read that section.
Its purpose is not to rapidly teach anyone anything in full.
It seems like a poor fit as an application for LLMs
Who exactly asked for this? Wikipedia isn’t publicly traded, they aren’t a for-profit company, why are they trying to shove AI into people’s faces?
For those few who wanted it, there are dozens of bots that can summarize the (already kinda small) Wikipedia articles.