azorius.net

Sapo3, a TUI audiobook generator, in Bash (gitlab.com)
from christos@lemmy.world to linux@lemmy.ml on 26 Sep 2024 21:17
https://lemmy.world/post/20222014

https://gitlab.com/christosangel/sapo3

Sapo3 is a suite of scripts and tools that help the user convert a text file to an audio file.
It uses the edge-tts API for text-to-speech conversion.
Large txt files can be easily converted to audiobooks, with a wide range of customization options.
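
As a rough illustration of the text-to-audio step (a sketch only, not Sapo3's actual code), the function below turns every non-empty line of a text file into its own numbered MP3. It assumes the `edge-tts` command-line tool is installed; `TTS_BIN`, `TTS_VOICE`, `chunk_name` and `convert_file` are hypothetical names chosen for this example, and the default voice is just an assumed placeholder.

```shell
#!/usr/bin/env bash
# Sketch only, not Sapo3's actual code.
# Assumption: the edge-tts command-line tool is installed and on PATH.
TTS_BIN="${TTS_BIN:-edge-tts}"
# Assumed default voice; `edge-tts --list-voices` prints the alternatives.
TTS_VOICE="${TTS_VOICE:-en-US-AriaNeural}"

# Print the output file name for chunk number $1 inside directory $2.
chunk_name() {
  printf '%s/%04d.mp3\n' "$2" "$1"
}

# Convert text file $1 into numbered audio chunks under directory $2.
convert_file() {
  local infile="$1" outdir="$2" n=0 line
  mkdir -p "$outdir"
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      *[![:space:]]*) ;;      # line has visible text: convert it
      *) continue ;;          # blank line: skip it
    esac
    n=$((n + 1))
    # stdin is redirected so the TTS command cannot eat the rest of the file
    "$TTS_BIN" --voice "$TTS_VOICE" --text "$line" \
      --write-media "$(chunk_name "$n" "$outdir")" < /dev/null
  done < "$infile"
}
```

Calling `convert_file book.txt audio` would write `audio/0001.mp3`, `audio/0002.mp3` and so on. Keeping one file per chunk is a design choice of this sketch: a single bad chunk can be regenerated without redoing the whole book.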

When the user runs Sapo3, they will be presented with a menu of options:

o option: Fix name pronunciation with Fix Names
c option: Split text into chapters with Chapterize
v option: Convert file to audio
f option: Check the outcome of every sentence with Fix Audio
m option: Merge audio files
p option: Configure preferences
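
A single-key menu like the one above can be sketched in Bash with a `case` statement. This is only an illustration: the action names printed by `menu_action` are placeholders rather than Sapo3's real function names, and the `q` key for quitting is an assumption that the post does not mention.

```shell
#!/usr/bin/env bash
# Sketch only: a single-key menu in the spirit of the option list above.
# The action names are placeholders, not Sapo3's real function names.

# Map a menu key to the name of the action it selects.
# Returns non-zero (and prints nothing) for an unknown key.
menu_action() {
  case "$1" in
    o) echo "fix_names" ;;
    c) echo "chapterize" ;;
    v) echo "convert" ;;
    f) echo "fix_audio" ;;
    m) echo "merge" ;;
    p) echo "preferences" ;;
    *) return 1 ;;
  esac
}

# Interactive loop: read one key at a time until the user presses q
# (q is an assumed quit key) or input ends.
menu_loop() {
  local key
  while true; do
    printf 'Option [o/c/v/f/m/p, q to quit]: '
    read -r -n 1 key || break
    echo
    [ "$key" = "q" ] && break
    menu_action "$key" || echo "No such option: $key"
  done
}
```

In a real script each branch would call the corresponding tool instead of printing a name; running `menu_loop` starts the interactive prompt.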

#linux


schizo@forum.uncomfortable.business on 26 Sep 2024 22:03

Neat idea, but the send-your-text-to-Microsoft bit of it is uh, well, no thank you.

Seems like a strange choice, personally, but I’m not a fan of sending tech corporations anything avoidable.

christos@lemmy.world on 26 Sep 2024 22:10

I totally understand what you are saying. Initially, the original project used local text-to-speech, but it was less than perfect: slower and CPU-costly.

You can check it out here https://gitlab.com/christosangel/sapo

Once a FOSS solution gets better and more usable, swapping out the tts conversion will not be a big deal.

schizo@forum.uncomfortable.business on 27 Sep 2024 02:01

I’m somewhat surprised that there aren’t a lot of good alternatives but uh, yeah, there don’t seem to be.

I would have expected there to be at least one or two good TTS engines but I guess that assumption is quite wrong.

As to your other post, it’s less that I care in any specific sense that Microsoft knows what I’m reading and more of an (admittedly irrational) dislike of providing anything that an ad company could maybe later use to sell me shit.

lime@feddit.nu on 27 Sep 2024 07:02

shouldn’t there at least be an option to use speech-dispatcher?

christos@lemmy.world on 27 Sep 2024 10:38

Do you mean an option to choose between various tts methods?

lime@feddit.nu on 27 Sep 2024 11:08

i believe that’s what speech-dispatcher is: a uniform interface for tts systems.

christos@lemmy.world on 27 Sep 2024 11:15

speech-dispatcher

If you are referring to locally generated speech synthesis, the result, as far as I am concerned, generally sounds poorer and is more difficult to manage. However, you can check out the original project, https://gitlab.com/christosangel/sapo, where the audio files are generated locally.

lime@feddit.nu on 27 Sep 2024 12:01

well, speech-dispatcher has no synthesis component of its own; you can plug in any tts engine that follows the interface. it’s nice to have a choice of engine just by implementing the support. personally i use piper, which i feel gives pretty good results.

christos@lemmy.world on 27 Sep 2024 12:07

piper

Indeed, piper performs very well. Thank you for the input. I will most certainly consider adding an option to select the tts engine in the near future; piper sounds totally worth it.
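
For illustration, an engine selector of the kind discussed here might look roughly like the sketch below. It is not part of Sapo3: `synthesize`, `EDGE_VOICE` and `PIPER_MODEL` are hypothetical names, the default voice and model file are assumed placeholders, and the sketch calls the `edge-tts` and `piper` command-line tools directly rather than going through speech-dispatcher.

```shell
#!/usr/bin/env bash
# Sketch only: one function that hides which TTS engine does the work.
# Assumptions: the edge-tts and piper CLIs are installed, and PIPER_MODEL
# points at a voice model file that has already been downloaded.
EDGE_VOICE="${EDGE_VOICE:-en-US-AriaNeural}"
PIPER_MODEL="${PIPER_MODEL:-en_US-lessac-medium.onnx}"

# synthesize ENGINE TEXT OUTFILE
# Returns 2 if ENGINE is not one of the supported names.
synthesize() {
  local engine="$1" text="$2" out="$3"
  case "$engine" in
    edge-tts)
      # Online engine: the text is sent to a remote service.
      edge-tts --voice "$EDGE_VOICE" --text "$text" --write-media "$out"
      ;;
    piper)
      # Local engine: piper reads the text from standard input.
      printf '%s\n' "$text" |
        piper --model "$PIPER_MODEL" --output_file "$out"
      ;;
    *)
      echo "unknown engine: $engine" >&2
      return 2
      ;;
  esac
}
```

With something like this in place, `synthesize piper "Hello there" hello.wav` and `synthesize edge-tts "Hello there" hello.mp3` differ only in the first argument, so the engine could become a single preference setting.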

christos@lemmy.world on 26 Sep 2024 22:17

And, as far as the "send-your-text-to-Microsoft bit" goes, well, if MS wants a copy of The Brothers Karamazov, they can save themselves the trouble and get it here, it is free: https://www.gutenberg.org/