Anyone using "Speech Note" (speech to text) with good results?
from Showroom7561@lemmy.ca to linux@lemmy.ml on 15 Sep 19:13
https://lemmy.ca/post/51669193

I’ve been using Speech Note (github link) for months, but it often gets things wildly wrong.

I thought it was my mic, so I got one that’s crystal clear. I also tried a ton of different models, and apart from some being slower or faster, their accuracy is pretty similar.

But I still need to spend a lot of time editing the results, and I wonder if there’s something I should be doing to get better ones.

On other speech-to-text platforms (like Futo keyboard on Android), the results are fast and very accurate. I have a hard time believing that Speech Note can’t be as good.

Can any other users share their experience?

#linux

DrDystopia@lemy.lol on 15 Sep 20:24

I had enough issues with it that I didn’t find it helpful. But I’m not a native English speaker and support for my local language is so-so, so it might well be me that’s the problem.

arty@feddit.org on 15 Sep 21:00

I’m not a native English speaker, but neither people nor other robots have problems understanding me - in person or over a microphone. Speech Note hasn’t shown good results, unfortunately. I really wanted to use it, because on my Android phone I use voice input all the time.

Showroom7561@lemmy.ca on 15 Sep 21:10

> I really wanted to use it, because on my Android phone I use voice input all the time.

That’s why I’m thinking it’s a problem with Speech Note and not my mic, or how I’m speaking to it.

That’s a real shame. I can type quite fast, but my hand joints called it quits a while ago. 😵

NKBTN@feddit.uk on 15 Sep 21:39

Try a few different accents out - but I’ve never had better than a 95% success rate myself
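
For anyone who wants to put a number on a "success rate" like that when comparing models, here is a minimal sketch. It is an assumption about how one might measure it, not something Speech Note provides: read a known passage aloud, paste the transcript, and count word-level edits against the original text. The function names are made up for illustration.

```python
def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance: substitutions + insertions + deletions."""
    # Case is ignored; punctuation is NOT stripped, so "dog." != "dog".
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] = edits needed to turn the first i-1 reference words
    # into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from transcript
                            curr[j - 1] + 1,      # extra word in transcript
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """1 - (word errors / reference length). Can go below 0 with many extra words."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / ref_len


if __name__ == "__main__":
    spoken = "the quick brown fox jumps over the lazy dog"
    transcript = "the quick brown fox jumped over the lazy dog"
    print(f"errors: {word_errors(spoken, transcript)}")
    print(f"accuracy: {word_accuracy(spoken, transcript):.1%}")
```

Running the same passage through each model and comparing the numbers should show whether the models really are about equally accurate, or whether one is quietly better.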

k4j8@lemmy.world on 15 Sep 22:00

I haven’t used Speech Note, but I have been using Whisper with great success. I run it via Docker.

undrwater@lemmy.world on 16 Sep 22:13

I’ve used it for a short while to test it out. Accuracy was pretty good, as was punctuation. Response time was also good.

It’s using my Nvidia GPU to do the LLM thing, so that may be the difference.

Showroom7561@lemmy.ca on 17 Sep 14:29

> It’s using my Nvidia GPU to do the LLM thing, so that may be the difference.

This could be!

Interestingly enough, I was playing around with Llama, as they have speech-to-text for interacting with their chat bot, and it converts in near real-time with very good accuracy. So I do know that things can be fast and accurate, but I wish that were the case in Speech Note. LOL

For now, I may just do STT on my phone, in a document shared with my laptop.