FFmpeg 8 can subtitle your videos on the fly with Whisper (www.theregister.com)
from along_the_road@beehaw.org to technology@beehaw.org on 29 Aug 06:35
https://beehaw.org/post/21869936

#technology

jcr@jlai.lu on 29 Aug 06:50

So proud of the FFmpeg and VLC projects; I want news like this every day! Scavenging the good stuff from AI and putting it to good use, we need more of this! Next up, automatic translation integrated from AI models, and we won’t need to worry about the drudgery of syncing subtitles with videos: a one-button compilation process and whoosh! -> In fact, “on the fly” means no compilation is needed, so it is even better than I thought 🤩

Sina@beehaw.org on 29 Aug 14:21

I use mpv & it’s just a gui frontend to ffmpeg. I have some layering issues with VLC on Linux…

quick_snail@feddit.nl on 29 Aug 23:59

Isn’t MPV literally not a GUI?

dubyakay@lemmy.ca on 29 Aug 08:13

I need something like this, but for image viewers and for manga/manhwa. It always amazes me that there are options for this on mobile, even on iOS, but desktop Linux doesn’t have anything comparable with integrated text recognition and a translation overlay.

limerod@reddthat.com on 29 Aug 18:15

Can you share those options? I’m curious.

dubyakay@lemmy.ca on 29 Aug 22:13

I’m just going by what the default image viewer on iOS is capable of:

<img alt="" src="https://lemmy.ca/pictrs/image/c86eae3d-66f3-4a08-8a6d-411680f4dd16.jpeg">

Which is pretty janky, but I know that there’s also Google Lens on Android, which does a much better job, so I assume there’s actually better software, more suited to reading manga, out there for both mobile OSes.

I saw some implementation on Linux that worked a bit, but…

limerod@reddthat.com on 29 Aug 10:34

This is superb news. Now we need audio translation, and we can watch any video in any language in the whole world.

Currently, doing that is a bit more complicated than it needs to be.

chahk@beehaw.org on 29 Aug 11:54

After that - dubbing in realtime.

Midnitte@beehaw.org on 29 Aug 11:55

Call it dubblefish

melroy@kbin.melroy.org on 29 Aug 12:23

FFmpeg is one of the most underrated pieces of software on Linux. It's so important for the ecosystem and very widely used. And you most likely barely notice it.

Just like curl.

Appoxo@lemmy.dbzer0.com on 29 Aug 14:09

You mean underrated?

melroy@kbin.melroy.org on 29 Aug 18:43

Haha yes. My bad.

AnarchistArtificer@slrpnk.net on 29 Aug 23:09

I remember way back when I was still intimidated by the command line, I was having issues with a video, and the only info I could find was on using ffmpeg to do some conversions directly. I laugh at the memory of me being nonplussed at trying to launch ffmpeg and expecting a GUI to pop up.

I am glad that I spent some time getting to know ffmpeg directly. There have been a few times where knowing that it was ffmpeg under the hood helped me.

LukeZaz@beehaw.org on 29 Aug 13:32

The changelog lists 30 significant changes, of which the top new feature is integrating Whisper. This means whisper.cpp, which is Georgi Gerganov’s entirely local and offline version of OpenAI’s Whisper automatic speech recognition model. The bottom line is that FFmpeg can now automatically subtitle videos for you.

Yeah hey, can anyone chime in if this is at all based off LLMs? Because my problems with the incorrect plagiarism machine don’t end just because it’s now the offline incorrect plagiarism machine. Making OpenAI’s garbage hockey open source doesn’t make it okay. Or should I just start calling this shit FOSSwashing?

I dug around for a bit and couldn’t find much of anything, but judging by a look at the Github pages for both versions of Whisper, it’s looking very related. If that’s the case, fuck right off. I don’t want AI in FFmpeg, either.

kayohtie@pawb.social on 29 Aug 14:01

It’s not AI, it’s neural network models in the same way voice recognition in devices has been working for over a decade. Even Dragon has been utilizing language model vectors for a very long time, just requiring voice training instead of utilizing a premade research or open-source data set.

I hate generative AI and its slop too, but getting angry about neural network models in general is not only absurd, but playing exactly into what corporations want – conflation of the underlying basic technology concepts with the capitalistic vampirism of art.

EDIT: to add, “research” here can be closed source – the voice models used with these tend to be internally sourced, at least the earlier ones.

drosophila@lemmy.blahaj.zone on 29 Aug 18:59

It’s not AI, it’s neural network models

These used to be called AI before people decided that only LLMs and Diffusion models were AI. Both of which are types of neural networks.

kayohtie@pawb.social on 30 Aug 18:14

But much more loosely-so, not nearly as heavily. It was more like a seldom-used term to say that it might be like what machine learning actually was.

Now they’re all being called it heavily, forcefully, by corporations which started using it for capitalistic hype reasons. Hence the push for a strong distinction between a field that’s been around for quite a while as a variety of mathematical algorithms, and lazy slop of the “just one more prompt bro” and “we can replace workers” kind. Even DLSS wasn’t called “AI” until the hype train started, and now Jensen Huang can’t call it that often enough lest he be unable to afford yet another leather jacket as if they’re disposable glasses wipes.

herseycokguzelolacak@lemmy.ml on 29 Aug 15:57

the people behind ffmpeg need a medal.

flango@lemmy.eco.br on 29 Aug 16:47

OK, but how do you use it?

FundMECFS@anarchist.nexus on 29 Aug 17:55

following

MoonMelon@lemmy.ml on 29 Aug 23:15

I really like this website for exploring all the options. Not sure how up to date it is with this latest ffmpeg.

ffmpeg.lav.io

quick_snail@feddit.nl on 29 Aug 23:59

Is it in the official Debian apt repos yet?