Crunchyroll accidentally confirmed it uses ChatGPT for subtitles (www.pocket-lint.com)
from sabreW4K3@lazysoci.al to technology@beehaw.org on 02 Jul 19:50
https://lazysoci.al/post/29558083

#technology

nieceandtows@programming.dev on 02 Jul 20:07

How are subtitles created usually? Are they provided by the source material team, some professional third party that manually transcribes the video, or just fans doing it for free?

sabreW4K3@lazysoci.al on 02 Jul 20:16

In terms of anime fansubs, it’s normally just great folks in the community. Some got hired by studios. But the studio is meant to provide the subs.

megopie@beehaw.org on 02 Jul 21:00

See, that’s the kicker: for the longest time it was basically all fan-translated subtitles, and only recently has paid translation become the norm.

So it’s really quite pathetic for them to try and save a few bucks by replacing a proper translator with an LLM, given that there are still plenty of passionate fans who would have done it for free. Especially given that translating between Japanese and English in a culturally context-heavy situation is something these LLMs are really bad at.

Unboxious@ani.social on 02 Jul 21:54

given that there are still plenty of passionate fans who would have done it for free

I’d imagine this is a non-starter from a corporate standpoint. I know if I were in charge I’d be terrified of the idea of just trusting community-submitted subtitles to not have random slurs or something inserted. That said, I still think it would be super cool if they’d let people source and use their own subtitle files; I know it’s possible because I have a Tampermonkey script that lets me do just that.

megopie@beehaw.org on 03 Jul 03:41

That’s the core of the issue: Crunchyroll has set itself up as a corporate middleman, buying the rights to distribute shows and then charging consumers a subscription for access.

But they can’t be bothered to do the only actual damn work their position would realistically demand, beyond renting server space: providing translations for the foreign media they’re distributing.

That’s without even discussing the fact that not a single penny users give them will end up in the hands of any of the exploited artists who actually made the shows, since the industry doesn’t work on residuals or any other kind of profit sharing; the licensing fees Crunchyroll pays essentially go straight to financiers.

borari@lemmy.dbzer0.com on 03 Jul 03:54

So pirate the shit and use whatever subtitles you want.

Unboxious@ani.social on 03 Jul 14:05

That’s without even discussing the fact that not a single penny users give them will end up in the hands of any of the exploited artists who actually made the shows

That’s quite the assertion. How exactly do you suggest they’re buying the rights to distribute the shows then?

SaltySalamander@fedia.io on 03 Jul 16:57

You left out the part of the sentence where they actually answer your question.

megopie@beehaw.org on 03 Jul 22:05

They’re buying them from production committees and other such organizations. Most anime is made on essentially a “commission” basis, where a studio is paid a fixed upfront amount by a group of financiers and other interests, who then distribute the show, sell the merch, and license it internationally. Essentially, studios and those who work there are paid no residuals and get no other profit-sharing scheme of the kind that is common in the American film and television industry.

There is actually a bit of a cartel in that regard, with the third parties that purchase shows from studios having collaborated to suppress the cost of seasons for nearly 2 decades, leading to stagnant wages and rampant overworking of artists as the quality and quantity of work expected increases while the budget stays the same. Increasingly, artists at the companies have had to fall back on gig work beyond their standard hours to make ends meet, getting paid by the frame in their off hours to make a little extra money, effectively working 16-hour days through this additional work. There is some movement to change this as of late, but this is still essentially the norm.

Unboxious@ani.social on 03 Jul 23:55

Yes, but do you think they’d buy the shows from those production committees and other organizations if people weren’t interested in paying subscriptions to watch them? That’s like saying Bandai doesn’t get money when I buy gunpla from a store like usagunplastore just because usagunplastore already bought the gunpla from Bandai months ago and Bandai isn’t getting more money from that particular purchase.

Animators being horribly underpaid is a different topic entirely.

megopie@beehaw.org on 04 Jul 00:11

The people who actually made the show (animators, voice actors, and writers) do not get money based on your Crunchyroll subscription, and those production committees that do get money didn’t make the shows; they just initially financed them.

Assuming the show is based on a manga or light novel, the original artist/writer might if they were lucky enough to negotiate shares in the production committee, but most are not in a position to do so.

For me, what matters is that the people who made the art get compensated fairly, that they are able to live a good life, and that people are encouraged to make art by my consumption of it. The current system doesn’t do that. It’s a horrifically exploitative machine where purchases reward further exploitation of the people who actually put in the work and effort to make the art.

Unboxious@ani.social on 04 Jul 00:22

I don’t think there’s an industry on earth where it’s normal for the low-level workers to be paid directly when the customer buys something. It being filtered through a bunch of business stuff is the norm everywhere I’m afraid.

megopie@beehaw.org on 04 Jul 00:51

Residuals are standard in the American film/TV industry. Under that system, workers are paid a percentage of the ongoing profits of previous projects they’ve worked on.

Another fairly common practice is ESOPs, where over time workers at a company receive shares in the company, and thus dividends.

handsoffmydata@lemmy.zip on 02 Jul 21:25

I maintain my own media library and I ensure every file has English and German subtitles. There are a variety of ways to source SRT files, but when all else fails, a machine with enough compute can transcribe video files using the open-source Whisper model. After I generate an English SRT file from the video, I send it to OpenAI to create the German translation.
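
As a rough illustration of the "generate an SRT file" step, here is a minimal sketch that turns timed transcript segments into SRT text. It assumes the transcriber hands back a list of dicts with `start` and `end` (in seconds) and `text`, which is the shape the open-source Whisper package reports for its segments; the function names `srt_timestamp` and `segments_to_srt` are made up for this example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments shaped like
    {"start": float, "end": float, "text": str} (assumed shape)."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        cues.append(f"{index}\n{timing}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hello there."},
        {"start": 3.0, "end": 4.0, "text": " Let's go."},
    ]
    print(segments_to_srt(demo))
```

With the Whisper package installed, the segments would presumably come from something like `whisper.load_model("base").transcribe("episode.mkv")["segments"]`; the model size and file name there are placeholders.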

dubyakay@lemmy.ca on 03 Jul 05:18

Is there something similar for manga? Something that can overlay Japanese text on images, similar to what we have on smartphones but for the PC?

handsoffmydata@lemmy.zip on 03 Jul 18:11

If your video file is in Japanese, use a Whisper model optimized for Japanese. Once it produces the Japanese SRT, you can get translations from OpenAI. Use HandBrake to add the SRT to the file and you’re done. Good luck!
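
The translation step can be sketched roughly like this: keep each cue’s index and timing line untouched and pass only the dialogue text to a translator. The `translate` argument is a stand-in for whatever service is actually used (an LLM API call, for instance); it is an assumption of this sketch, not a real library function, and `translate_srt` is a made-up name.

```python
from typing import Callable


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Translate only the dialogue lines of an SRT file.

    `translate` is a caller-supplied function (hypothetical here) that
    maps source-language text to target-language text.
    """
    translated_cues = []
    for cue in srt_text.strip().split("\n\n"):
        lines = cue.splitlines()
        if len(lines) < 3:
            # Not a well-formed cue (index, timing, text); pass it through.
            translated_cues.append(cue)
            continue
        index, timing = lines[0], lines[1]
        text = "\n".join(lines[2:])
        translated_cues.append("\n".join([index, timing, translate(text)]))
    return "\n\n".join(translated_cues) + "\n"


if __name__ == "__main__":
    sample = (
        "1\n00:00:00,000 --> 00:00:02,500\nhello\n\n"
        "2\n00:00:03,000 --> 00:00:04,000\nbye\n"
    )
    # str.upper stands in for a real translator so the demo runs offline.
    print(translate_srt(sample, str.upper))
```

Translating cue by cue keeps the timings aligned, but it also throws away context between lines, which is one reason machine output can read badly in dialogue.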

dubyakay@lemmy.ca on 03 Jul 21:58

Sorry. By manga I meant image data, like PNGs or JPEGs.

yetAnotherUser@lemmy.ca on 02 Jul 21:59

It seems that they have, or at least had in 2023, internal teams that handled the translations. crunchyroll.com/…/international-translation-day-2…

SaneMartigan@aussie.zone on 02 Jul 23:02

I feel like this is a reasonable use of ChatGPT.

Kirk@startrek.website on 02 Jul 23:24

For YouTube tutorial videos I have no issue with relying on GPT, but I think it’s important to recognize that the translation of art is art. I don’t feel good about the idea of something without a soul or perspective interpolating a work of art from one culture and language into another that might be wildly different from where it started.

That all said, I think Crunchyroll and anyone else using AI in art should absolutely disclose it.

null@slrpnk.net on 03 Jul 18:43

I feel like what makes the most sense and is likely what’s happening is that ChatGPT is being used to do the initial translation, and then a human is auditing that translation and making adjustments. So just a faster way to get the scaffolding and grunt-work out of the way.

megopie@beehaw.org on 04 Jul 01:57

They appear to be copying direct translations from ChatGPT into the subtitles, judging by the fact that one of the subtitles said “ChatGPT says:” and then the line in German. People who speak German also noticed that the grammar and sentence structure for many of these shows have been awful and nonsensical at times.

If anyone is doing any sort of oversight, they don’t appear to speak German themselves and are just betting that the output will be accurate and pasting it in.

Someone who spoke German and Japanese fluently enough to do competent oversight could probably translate faster than they could edit and rephrase the work of LLMs, which are notoriously bad at translating languages in a high-context situation like dialog in an animated show. LLMs are also generally very bad with high-context languages like Japanese, and even worse at translating between them and low-context languages like German.
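
Even without a fluent reviewer, the most blatant kind of leak could be caught mechanically. Below is a minimal sketch of such a check, assuming the subtitles are available as SRT text; the pattern only covers the English “ChatGPT says:” style of prefix described above, and `find_leaks` is a made-up name. A real check would need a longer, language-specific phrase list.

```python
import re

# Matches "ChatGPT says:", "Chat GPT said:", etc., ignoring case.
LEAK_PATTERN = re.compile(r"chat\s*gpt\s+(says|said)\s*:", re.IGNORECASE)


def find_leaks(srt_text: str) -> list[str]:
    """Return the cue numbers whose dialogue contains a leaked chatbot prefix."""
    flagged = []
    for cue in srt_text.strip().split("\n\n"):
        lines = cue.splitlines()
        if len(lines) < 3:
            continue  # skip anything that is not index + timing + text
        dialogue = " ".join(lines[2:])
        if LEAK_PATTERN.search(dialogue):
            flagged.append(lines[0])
    return flagged


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:02,000\nGuten Morgen.\n\n"
        "2\n00:00:03,000 --> 00:00:05,000\nChatGPT says: Hallo zusammen.\n"
    )
    print(find_leaks(sample))
```

This only finds pasted boilerplate; it says nothing about whether the translation itself is any good.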

Geodad@beehaw.org on 03 Jul 01:04

As someone who is able to speak Japanese, I’d notice the drop in quality of translation almost instantly.

I never turn on subs anyway when I watch my anime though.

t3rmit3@beehaw.org on 04 Jul 00:21

I have to since my partner doesn’t speak Japanese, but half the time I end up having to correct lines for them once or twice, to make things make sense. The non-egregious stuff I don’t even bother with. It’s crazy how amateurish some of the mistakes are, or even what are clearly choices to omit entire sentences, for no reason.

おい、ゆうじ君、海行こうぜ

“Hi Yuji!”

MaggiWuerze@feddit.org on 04 Jul 09:33

As someone who is learning Japanese: is that a kanji for an honorific? Probably kun? ゆうじ is the name, although it’s weird that it is written in hiragana, I guess… But I fail at this one: 海行こうぜ

The first kanji has the one for mother as part of it, I think… And the second one is pronounced ‘i’, so …iikouze? Let’s go somewhere?

t3rmit3@beehaw.org on 04 Jul 16:19

Yes, 君 is ‘kun’ when used as an honorific.

海 is ‘umi’, or sea/ocean. You are correct that the second half of the kanji (母) is the same as the standalone character for mother, but its base radical is ⽏, which also just means mother. The first radical, ⺡, means water/liquid, so you can sort of infer that “water mother” = ocean. Not all kanji work out this nicely with their radical structure, though.

Last part is spot on, ikou (行こう) is the shortened (conjugation?) of iku or ‘to go’ that expresses a suggestion to do, i.e. “let’s (go)”.

MaggiWuerze@feddit.org on 05 Jul 20:01

Thanks for the feedback, seems my efforts weren’t entirely wasted :D Interesting that the kanji for water itself does not contain that radical (unless you squint heavily). What’s the difference to ikimashou? Isn’t that the suggestive form? As in ‘we should go’?

t3rmit3@beehaw.org on 06 Jul 19:05

The radical for water is actually derived from the standalone kanji. It’s basically an extremely short-stroke version of the kanji.

Ikimashou is just the ‘formal’, full-length version. No difference in meaning. Just as “iku” is the casual version of “ikimasu”.

Ikimasu -> iku

Ikimashou -> ikou
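
For anyone who likes to see the pattern spelled out, here is a small sketch of the -mashou -> casual “let’s” conversion on romaji. It assumes the common pattern where the ‘i’ sound before -mashou becomes ‘ou’ (ki -> kou), plus -e stems taking ‘you’; the table and the name `casual_volitional` are made up for this example, and verbs like mimashou (-> miyou) that look like the first group but belong to the second are not handled.

```python
# Irregular verbs, listed explicitly.
IRREGULAR_VOLITIONAL = {"shimashou": "shiyou", "kimashou": "koyou"}

# Stem ending -> casual volitional ending; longer endings are checked first
# so that "shi" and "chi" are not mistaken for a bare "i".
I_TO_OU = [
    ("shi", "sou"), ("chi", "tou"),
    ("ki", "kou"), ("gi", "gou"), ("ni", "nou"),
    ("bi", "bou"), ("mi", "mou"), ("ri", "rou"),
    ("i", "ou"),
]


def casual_volitional(polite: str) -> str:
    """Convert a romaji -mashou form to its casual equivalent (sketch only)."""
    if polite in IRREGULAR_VOLITIONAL:
        return IRREGULAR_VOLITIONAL[polite]
    if not polite.endswith("mashou"):
        raise ValueError(f"not a -mashou form: {polite}")
    stem = polite[: -len("mashou")]
    if stem.endswith("e"):
        return stem + "you"  # e.g. tabemashou -> tabeyou
    for i_ending, ou_ending in I_TO_OU:
        if stem.endswith(i_ending):
            return stem[: -len(i_ending)] + ou_ending
    raise ValueError(f"unrecognised stem: {stem}")


if __name__ == "__main__":
    for verb in ["ikimashou", "nomimashou", "tabemashou"]:
        print(verb, "->", casual_volitional(verb))
```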

MaggiWuerze@feddit.org on 06 Jul 19:24

Fascinating. That explains the similarity. Since watching that episode of Witch Watch I definitely feel bad about my formal “Duolingo” Japanese :D

By the way, is there a rule to how these short forms are formed?

t3rmit3@beehaw.org on 06 Jul 19:53

By the way, is there a rule to how these short forms are formed?

Yep! Most Japanese verbs (with a few exceptions like ‘shimasu’ becoming suru) use one of the ‘i’ variants (‘i’, ‘ki’, ‘ni’, ‘mi’, or ‘ri’) after the kanji, that indicates they are verbs.

Yakimasu (to burn/ cook), shirimasu (to know), arukimasu (to walk), arimasu (to be), shinimasu (to die), yomimasu (to read).

Ki will become ku in the shortened version, ri will become ru, ni -> nu, etc:

yaku, shiru, aruku, aru, shinu, yomu

I believe the verbs that don’t end in one of those like tabemasu (to eat) will default to ‘ru’ (taberu), but I don’t know if that’s a rule off the top of my head, or if I just can’t think of any others right now.

In the cases where the kana is voiced (written with dakuten), such as oyogimasu (to swim), the end kana stays voiced, e.g. oyogu. Ki -> ku, gi -> gu.
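
The rule above can be sketched in a few lines, working on romaji. This is only an approximation under stated assumptions: the ‘i’ sound before -masu becomes the matching ‘u’ sound, -e stems take ‘ru’, and the irregulars are listed by hand. The name `plain_form` and the tables are made up for this example, and it will get verbs like mimasu (-> miru, not “mu”) wrong, since those cannot be told apart from the spelling alone.

```python
# Irregular verbs, listed explicitly.
IRREGULAR = {"shimasu": "suru", "kimasu": "kuru"}

# Stem ending -> plain ending; longer endings are checked first so that
# "shi" and "chi" are not mistaken for a bare "i".
I_TO_U = [
    ("shi", "su"), ("chi", "tsu"),
    ("ki", "ku"), ("gi", "gu"), ("ni", "nu"),
    ("bi", "bu"), ("mi", "mu"), ("ri", "ru"),
    ("i", "u"),
]


def plain_form(polite: str) -> str:
    """Convert a romaji -masu form to its short (dictionary) form (sketch only)."""
    if polite in IRREGULAR:
        return IRREGULAR[polite]
    if not polite.endswith("masu"):
        raise ValueError(f"not a -masu form: {polite}")
    stem = polite[: -len("masu")]
    if stem.endswith("e"):
        return stem + "ru"  # e.g. tabemasu -> taberu
    for i_ending, u_ending in I_TO_U:
        if stem.endswith(i_ending):
            return stem[: -len(i_ending)] + u_ending
    raise ValueError(f"unrecognised stem: {stem}")


if __name__ == "__main__":
    for verb in ["yakimasu", "shirimasu", "arukimasu", "arimasu",
                 "shinimasu", "yomimasu", "tabemasu", "oyogimasu"]:
        print(verb, "->", plain_form(verb))
```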

MaggiWuerze@feddit.org on 06 Jul 20:12

Do you teach this usually? These explanations seem very practiced (in a good way).

Thanks a lot, maybe this will help me sound at least somewhat casual :D

t3rmit3@beehaw.org on 06 Jul 22:21

Nope, just been learning and speaking it for a long time. :)

Good luck with your studies, and you can always dm me if you have any other questions!

MaggiWuerze@feddit.org on 06 Jul 22:39

Thank you

luciole@beehaw.org on 03 Jul 01:12

Both translation and subtitling have highly efficient tooling when in the hands of a professional. Translators nowadays use a mix of tools and will build up a dynamic database as they go through a corpus that needs coherence. What’s bad in this instance is not the usage of some AI, but of a badly adapted AI and ultimately of mediocre results which give an amateurish impression.
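
To make the “dynamic database” idea concrete, here is a toy sketch of a translation memory: previously approved translations are stored and offered again when the same or a very similar source line comes up, which is what keeps names and phrasing consistent across a series. The class name, the 0.8 threshold, and the use of `difflib` for fuzzy matching are all assumptions of this sketch, not a description of how professional tools work internally.

```python
import difflib
from typing import Optional, Tuple


class TranslationMemory:
    """Toy store of approved source -> target line pairs."""

    def __init__(self, threshold: float = 0.8):
        self.entries = {}
        self.threshold = threshold  # minimum similarity for a fuzzy hit

    def add(self, source: str, target: str) -> None:
        self.entries[source] = target

    def lookup(self, source: str) -> Optional[Tuple[str, float]]:
        """Return (stored translation, similarity) or None if nothing is close."""
        if source in self.entries:
            return self.entries[source], 1.0
        close = difflib.get_close_matches(
            source, list(self.entries), n=1, cutoff=self.threshold
        )
        if not close:
            return None
        best = close[0]
        score = difflib.SequenceMatcher(None, source, best).ratio()
        return self.entries[best], score


if __name__ == "__main__":
    tm = TranslationMemory()
    # Illustrative pair only, not taken from any real subtitle file.
    tm.add("Let's go to the beach, Yuji!", "Lass uns ans Meer gehen, Yuji!")
    print(tm.lookup("Let's go to the beach, Yuuji!"))
    print(tm.lookup("Something completely unrelated."))
```

A fuzzy hit is only a suggestion; the translator still decides whether the stored line fits the new context.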

SpectralPineapple@beehaw.org on 03 Jul 05:34

Although it seems likely that Crunchyroll uses an LLM for translation in some way, I wouldn’t call that “confirmed” since that might be the result of an individual translator using it.

t3rmit3@beehaw.org on 04 Jul 00:17

The actions of an employee, when reviewed and released by a company, are the actions of that company. A company is just the sum of its employees’ actions.

faercol@lemmy.blahaj.zone on 04 Jul 16:27

Also, LLMs have been around for a while, so there are a few possible situations:

  • LLM use is authorized or even encouraged. In this case it’s on the company.
  • LLM use is controlled, and this falls into one of the authorized cases. Same thing really. Also, their authorized use cases need review.
  • LLM use is forbidden, or restricted and this is not an authorized use. In this case it falls on the company to review what’s being done. It’s their responsibility.

So yeah, whatever the situation, it’s on Crunchyroll.

DragonTypeWyvern@midwest.social on 04 Jul 17:51

Pretty obvious if you’ve used any recently, but confirmation is nice. Their closed captions are generally pretty terrible as well.