AI Subtitles Are Coming to VLC— Get Ready! (news.itsfoss.com)
from petsoi@discuss.tchncs.de to linux@lemmy.ml on 13 Jan 09:28
https://discuss.tchncs.de/post/28509997

#linux

despotic_machine@lemmy.dbzer0.com on 13 Jan 09:35

I’m ready to deactivate it if it comes with any active component.

SoulWager@lemmy.ml on 13 Jan 10:52

What do you mean by active component? Is processing the audio being played back to add subtitles active?

kryptonidas@lemmings.world on 13 Jan 12:03

Sending the audio to an LLM in the sky. But I assume it would be local?

drwho@beehaw.org on 13 Jan 18:01

It says pretty explicitly that it only runs on the user’s machine.

despotic_machine@lemmy.dbzer0.com on 13 Jan 12:08

“Is processing the audio being played back to add subtitles active?”

Not sure where you are confused. If any part of this feature is active by default I will disable it.

SoulWager@lemmy.ml on 13 Jan 14:42

Even non-AI subtitles are off by default, what exactly are you expecting to be on?

despotic_machine@lemmy.dbzer0.com on 13 Jan 15:03

Find someone else to argue with.

drwho@beehaw.org on 13 Jan 18:01

This is the Internet, there’s no shortage of targets.

VintageGenious@sh.itjust.works on 13 Jan 19:16

Exactly, this makes no sense. Which tool would force subtitles?

limelight79@lemm.ee on 13 Jan 21:49

The way you wrote this, I thought you meant that if it required a cloud service you would turn it off. But now I think you’re just saying you wouldn’t use this feature.

I share the confusion over your definition of “active”. You got all defensive when someone asked, so now no one really knows what you meant.

theshatterstone54@feddit.uk on 13 Jan 09:41

It’s not every day that you see actually useful applications of AI, but this might be one.

jlow@beehaw.org on 13 Jan 09:45

While I hate the capitalist AI-apocalypse with a passion I think this is great news for accessibility.

Mwa@lemm.ee on 13 Jan 10:16

If it’s opt in/opt out then I’m fine with that.

johncutting@hexbear.net on 13 Jan 10:29

Yup. Easy uninstall otherwise.

kamiheku@sopuli.xyz on 13 Jan 11:12

Not only is it opt in, it’s also running fully locally on your machine.

Mwa@lemm.ee on 13 Jan 11:17

Ohh I assume it’s Mistral ’cause Llama uses an incompatible license.

Tetsuo@jlai.lu on 13 Jan 12:21

It’s not an LLM, just a subtitles generator for video.

catloaf@lemm.ee on 13 Jan 15:08

It’s Whisper.

Mwa@lemm.ee on 13 Jan 16:53

OHHH okay

lord_ryvan@ttrpg.network on 13 Jan 23:02

I wonder how powerful a device you need to run this live a la YouTube auto caption-style.

Does anyone have experience with this?

kent_eh@lemmy.ca on 13 Jan 16:37

My biggest issue with that is the amount of bloat a full local LLM implementation would add.

But if it’s an optional module that you can choose to add (or choose not to add) after the fact, I have no complaint.

robocall@lemmy.world on 13 Jan 10:24

Hold on to your butts!

metaStatic@kbin.earth on 13 Jan 10:47

I've seen some pretty piss poor implementations on streaming apps but if anyone can get it right it's VLC

FuckyWucky@hexbear.net on 13 Jan 11:14

“This means that they most likely went for lighter AI models that use fewer resources, so that they run smoothly without putting too much strain on the machine.”

Pretty good. Captions are one of the legitimate uses of “AI”.

Evotech@lemmy.world on 13 Jan 11:22

If YouTube transcriptions are anything to go by, this won’t be great. But I’m optimistic.

lefixxx@lemmy.world on 13 Jan 13:41

YouTube transcriptions are surprisingly abysmal considering what technology Google already has at hand.

Matriks404@lemmy.world on 13 Jan 15:11

I find them pretty good for English spoken by native speakers. For anything else it’s horrible.

fhein@lemmy.world on 14 Jan 12:31

As long as they are talking about normal things and not playing D&D 😃

john89@lemmy.ca on 13 Jan 15:43

I actually disagree.

I’m consistently impressed whenever I have auto-subtitles turned on on YouTube.

YourMomsTrashman@lemmy.world on 13 Jan 23:33

I’m not impressed by the subtitles themselves (they’re just ok) but rather by how accessible it is. Like it being an option rather than it being a “tool for creators” or limited to premium or something

Or maybe YouTube has added so many dogshit features recently (like AI overviews, automatically adding info cards for anyone mentioned, and highlighting seemingly random words in comments to search for them outside of context) that it makes me appreciate these things more lol

jayandp@sh.itjust.works on 13 Jan 17:26

I’ve been messing with more recent open-source AI subtitling models via Subtitle Editor, which has a nice GUI for it. Quality is much better these days, at least for English. It still makes mistakes, but the mistakes are on the level of “I misheard what they said and had little context for the conversation” or “the speaker has an accent which makes it hard to understand what they’re saying” mistakes, which is way better than most YouTube auto transcriptions I’ve seen.

lord_ryvan@ttrpg.network on 13 Jan 22:59

They’re helpful to my deaf ears. Even when they’re wrong (50% of the words), they do give me a solid idea of what is being said, together with what the audio sounds like.

With it, I get almost everything correct. Without it, I understand near to nothing.

This only goes for English spoken by Americans and sometimes London Britons, sadly; nothing else gets detected nearly as well, so I can’t enjoy YouTube in my native language (Dutch), but being able to consume English YouTube already helps a lot!

Evotech@lemmy.world on 13 Jan 23:01

That is very true. It’s hard to find local subtitles to a lot of stuff. And the whole deaf angle :)

zerakith@lemmy.ml on 13 Jan 11:36

It is probably good that the OS community is exploring this; however, I’m not sure the technology is ready (or maybe ever will be), and it potentially undermines the labour-intensive activity of producing high-quality subtitling for accessibility.

I use them quite a lot and I’ve noticed they really struggle on key things like regional/national dialects, subject-specific words and situations where context would allow improvement (e.g. a word invented solely in the universe of the media). So it’s probably managing 95% accuracy, which is that danger zone where it’s good enough that no one checks it but bad enough that it can be really confusing if you are reliant on them. If we care about accessibility we need to care about it being high quality.
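
To put a rough number on that “95%”: a minimal sketch, assuming accuracy here means one minus the word error rate (WER), i.e. the word-level edit distance between a human reference transcript and the automatic one, divided by the length of the reference. The function name and the sample sentences are made up for illustration; this is not how VLC or any particular tool measures quality.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Classic dynamic-programming edit distance, computed row by row.
    # prev[j] = edits needed to turn the reference words seen so far
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one context-specific word misheard out of three.
    wer = word_error_rate("roll for initiative", "role for initiative")
    print(f"WER: {wer:.2%}, accuracy: {1 - wer:.2%}")
```

Under that assumption, 95% accuracy still means about one wrong word in every twenty, and those errors tend to land on exactly the names and jargon that carry the meaning.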

markinov@lemmygrad.ml on 13 Jan 23:29

While good quality subtitles are essential, VLC can’t ensure that; it’s the responsibility of the production studio. AI subtitles in VLC are for those videos that don’t have any subs (of which there are a lot). The pushback shouldn’t be against VLC implementing AI, but against production studios replacing translators or transcribers with AI (like Crunchyroll tried last year).

Also, while transcribing and subtitle editing is a labour-intensive job, use of AI to help the editors shouldn’t be discouraged; it can increase their productivity by automating repetitive tasks so that they can focus on better quality.

zerakith@lemmy.ml on 14 Jan 09:45

Agreed that the studios need to be held more accountable and their usage of AI is more problematic than open source last resort type work. I have noticed a degradation of quality in the last five years on mainstream sources.

However, the existence of this last resort tool will shift the dynamics of the “market” for the work that should be being done. Even in the open source community. There used to be an active community of people giving their voluntary labour to writing subtitles for those that lacked them (they may still be active, I don’t know). Are they as likely to do that if they think, “oh well, it can be done automatically now”?

The real challenge with the argument that it helps editors is the same as the challenge for Automated Driving. If something performs at 95% you actually end up deskilling and stepping down the attention focus, making it more likely that you miss the 5% that requires manual intervention. I think it also has a material impact on the wellbeing of those doing the labour.

To be clear I’m not anti this at all but think we need to think carefully about the structures and processes around it to ensure it does lead to improvement in quality not just an improvement in quantity at the cost of quality.

PerogiBoi@lemmy.ca on 13 Jan 12:13

Aaaaaand I drop VLC. Fucking shame.

Edit: “wtf i love ai now”- this thread

DarkDarkHouse@lemmy.sdf.org on 13 Jan 12:21

Why would you need to do that if it’s off by default and locally processed?

admin@lemmy.my-box.dev on 13 Jan 13:11

Because triggered and hate circlejerk.

drwho@beehaw.org on 13 Jan 18:00

Nuance is deader than Elvis.

admin@lemmy.my-box.dev on 13 Jan 18:32

uh huh-huh.

kent_eh@lemmy.ca on 13 Jan 16:40

Is it off, or is it an optional module that doesn’t have to be adding bloat to my system if I don’t want to use it?

LLMs can take up a pretty big storage footprint.

drwho@beehaw.org on 13 Jan 18:00

Why don’t you ask them? They’re very responsive to their community of users.

I just took a spin through their news blog and changelog and didn’t see anything about it in the latest release, so it’s probably not out yet.

superkret@feddit.org on 13 Jan 19:00

Cause we can no longer sit back and allow AI infiltration, AI indoctrination, AI subversion and the international AI conspiracy to sap and impurify all of our precious bodily fluids.

VintageGenious@sh.itjust.works on 13 Jan 17:31

Braindead comment

PerogiBoi@lemmy.ca on 13 Jan 19:05

You’re right! Your comment has added a tremendous amount of value to this thread.

VintageGenious@sh.itjust.works on 13 Jan 22:57

You know AI can mean more than generative AI slop?

keksi@sopuli.xyz on 13 Jan 12:35

I wonder how good it is.

Does it translate from audio or from text?

Does it translate multiple languages? If a video has languages a, b and c, does it translate them all to x?

Does the user need to set the input language?

coolmojo@lemmy.world on 13 Jan 12:45

What would actually be cool is if it could translate foreign movies based on audio and add the English subtitles to it.

catloaf@lemm.ee on 13 Jan 15:07

Translating a transcription should be easy.

coolmojo@lemmy.world on 13 Jan 15:12

Yes, if the transcript feature works well for the original language.

taiidan@slrpnk.net on 13 Jan 12:45

Do one thing and do it well. Oh well…

RedstoneValley@sh.itjust.works on 13 Jan 15:16

VLC has always had a ton of applications: network device playback, TV, streaming server, files, physical media, music player, effects, recording, AV format conversion, subtitles, plugins and so on.

superkret@feddit.org on 13 Jan 16:53

“Do one thing well” is what gives you software like sendmail, which requires several other programs to be actually useful, all of which have to be configured separately to work together, with wildly different syntax.

taiidan@slrpnk.net on 14 Jan 00:49

And enables modular workflows and flexibility.

z3rOR0ne@lemmy.ml on 13 Jan 13:29

Meh, I’ll just stick with mpv.

shawn1122@lemm.ee on 13 Jan 15:10

How is mpv’s implementation? Does it work fairly well?

z3rOR0ne@lemmy.ml on 13 Jan 16:43

It’s a command-line multimedia player. Its implementation is ideal for minimalists, and easily understood by reading the man pages.

It works very well imo.

S13Ni@lemmy.studio on 13 Jan 14:00

This is not by default a bad thing, if it is something you only use when you decide to do so, when you don’t have other subtitles available tbh. I hate AI slop too but people just go to monkey brain rage mode when they read AI and stop processing any further information.

I’d still always prefer human-translated subtitles if possible. However, right now I’m looking into translating an entire book via LLM ’cause it would be the only way to read that book, as it is not published in any language I speak. I speak English well enough, so I don’t really need subtitles, I just like to have them on so I won’t miss anything.

For English language movies, I’d probably just watch them without subtitles if those were AI, as I don’t really need them, more like nice to have in case I miss something. For languages I don’t understand, it might be good, although I wager it will be quite bad for less common languages.

drwho@beehaw.org on 13 Jan 17:57

There’s a difference between LLM slop (“write me an article about foo”) and using an LLM for something that’s actually useful (“listen to the audio from this file and transcribe everything that sounds like human speech”).

S13Ni@lemmy.studio on 13 Jan 20:02

Exactly. I know someone who is really smart and works in machine learning and when I listen to him in isolation, AI sounds like an actually useful thing. Most people just are not smart like that, and most applications for AI are not very useful.

One of the things I often think is that AI makes it possible to do, very easily and fast, things that shouldn’t be done and that would previously have been too much effort or craft for some people; now they can easily make a website for whatever grift they are pushing.

yogthos@lemmy.ml on 13 Jan 16:34

The whole knee jerk reaction against anything AI related is tiresome and utterly irrational. This seems like a perfectly legitimate use of technology. If I have a movie in a language I don’t know and I can’t find subs for it, then I’d much rather have AI subs than nothing at all.

Eyck_of_denesle@lemmy.zip on 14 Jan 00:47

Yea. Sometimes I just can’t process what they are saying because of my adhd ass and subs really help.

HappyTimeHarry@lemm.ee on 13 Jan 18:12

I’m curious: what makes what VLC is doing qualify as artificial intelligence instead of just an automated transcription plugin?

Automated transcription software has been around for decades. I totally understand getting in on the AI hype train, but I guess I’m confused as to whether software from years past like “Dragon NaturallySpeaking” or Shazam are also LLMs that predate OpenAI, or is how those services worked to identify things different from how modern LLMs work?

VintageGenious@sh.itjust.works on 13 Jan 18:57

Automated transcription is AI; neural networks are just better AI sometimes.

seliaste@lemmy.blahaj.zone on 14 Jan 02:02

LLMs are a very specific generative AI subset. Not everything AI is an LLM; stuff like Shazam especially is pretty traditional AI. It’s been around for a while already, and studied for even longer (even back in the 1960s we were already starting to have a field of study in this domain).

tamagotchicowboy@hexbear.net on 13 Jan 18:41

I have some older foreign films I’d like to watch that have like 0 subtitles, seems useful.

Quintus@lemmy.ml on 13 Jan 20:50

Pandora’s Box is already open. Might as well make use of it.

lord_ryvan@ttrpg.network on 13 Jan 23:03

Oh so that wasn’t a joke from their booth.

This seems really out of place, but locally run auto subtitles from ethically sourced AI would be great.

It’s just that there are two very big conditions in that sentence there.

zarkanian@sh.itjust.works on 13 Jan 23:44

Which AI is the ethically-sourced one?

smayonak@lemmy.world on 14 Jan 01:31

There are a number of open weight open source models out there with all their data sourced from the public domain. Look up BLOOM and Falcon. There are others.

lord_ryvan@ttrpg.network on 14 Jan 07:12

JetBrains’ AI code suggestions were only trained on code where authors gave explicit permission for it, but that’s the only one I know off the top of my head. Most chat-oriented LLMs (ChatGPT, Claude, Gemini…) were almost certainly trained using corporate piracy.

IronKrill@lemmy.ca on 14 Jan 04:44

Not against this feature, but this quote made me laugh:

“… once this is in place, people won’t have to scour the internet for sourcing subtitles to their favorite movies, shows, or even anime.”

As if MTL will get anywhere near the nuance of a properly made human translation.

Ferk@lemmy.ml on 14 Jan 14:36

Personally, I would be happy even if it didn’t translate it but were able to give some half decent transcription of, at least, English voice into English text. I prefer having subtitles, even when I speak the language, because it helps in noisy environments and/or when the characters mumble / have weird accents.

However, even that would likely be difficult with a lightweight model. Even big companies like Google often struggle with their autogenerated subtitles. When there’s some very context-specific terminology, or uncommon names, it fumbles. And adding translation to an already incorrect transcript multiplies the nonsense, even if the translation were technically correct.

mexicancartel@lemmy.dbzer0.com on 14 Jan 06:10

It won’t be better than human-translated ones but better than no subtitles. I don’t think even humans can make subtitles correctly without knowing context.

lengau@midwest.social on 14 Jan 07:13

Honestly, if it can generate subtitle files it’ll be a huge benefit to people creating subtitles. It’s way easier to start with bad subs and fix them than it is to write from scratch.
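
For anyone wondering what “generate subtitle files” could look like, here is a minimal sketch that writes timed segments out in SubRip (.srt) format, which is plain text and easy to fix by hand afterwards. The `Segment` class and the sample lines are hypothetical stand-ins for whatever a speech-to-text model would return; nothing here is VLC’s actual code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Segment:
    """One timed chunk of transcript (hypothetical shape, times in seconds)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp that SubRip expects."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SubRip cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    # Made-up machine output with a typical mishearing left in for a human to fix.
    draft = [
        Segment(0.0, 2.5, "Role for initiative."),
        Segment(2.5, 4.0, "I got a seventeen."),
    ]
    print(to_srt(draft))
```

The point of a plain-text format like this is that a human only has to correct the misheard words and nudge the odd timestamp, instead of timing every line from scratch.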

mexicancartel@lemmy.dbzer0.com on 14 Jan 07:16

Yeah true. Good feature anyways