Stop Parsing (unstructured) Text (pc-hass.de)
from Laser@feddit.org to linux@lemmy.ml on 03 May 19:20
https://feddit.org/post/11845727

#linux

Trent@lemmy.ml on 03 May 19:58

You might like jc

Laser@feddit.org on 03 May 20:03

Thanks, I never used it and had forgotten about it until now.

double_quack@lemm.ee on 04 May 07:01

Nice! I didn’t know this

StrangeAstronomer@lemmy.ml on 03 May 22:26

“venerable jq”

Ha! jq was the bratty kid I yelled at to get off my lawn. Now he’s a drinking buddy, but still the youngest!

Laser@feddit.org on 04 May 06:08

It’s true that compared to the other utilities, it’s rather new. First release was almost 13 years ago. awk, which I think is the closest comparison, on the other hand turns 50 in 2027… though new awk is only 40.

traches@sh.itjust.works on 04 May 07:49

Shout out to nushell for building an entire shell around this idea!

Laser@feddit.org on 04 May 08:02

It’s a cool shell. I like it a lot more since I found out you can use ? to mark a field as optional.

MonkderVierte@lemmy.ml on 04 May 11:02

It’s a tradeoff between convenience and use case. I personally would only use JSON/jq for complex data processing needs. But then I would use Python, not shell.

Laser@feddit.org on 04 May 12:17

The issue is not only complexity, though that does play a role. Pure text parsing can also break, especially when whitespace is involved. The IP thing is a classic example in my opinion: whitespace might not be an issue there (it’s more common with filenames), but the queries you find online aren’t any less complex.
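To make the IP example concrete, here is a minimal Python sketch of the structured approach. It assumes input shaped like the JSON that `ip -j addr show` prints; the SAMPLE string is hand-written and trimmed to a few fields, and the helper name ipv4_addresses is made up for illustration. The point is only that fields are looked up by name instead of by column position.

```python
import json

# Hand-written sample shaped like `ip -j addr show` output (an assumption,
# trimmed to the fields used here; real output carries many more keys).
SAMPLE = """
[
  {"ifname": "lo", "addr_info": [
    {"family": "inet", "local": "127.0.0.1", "prefixlen": 8},
    {"family": "inet6", "local": "::1", "prefixlen": 128}]},
  {"ifname": "eth0", "addr_info": [
    {"family": "inet", "local": "192.0.2.10", "prefixlen": 24}]}
]
"""


def ipv4_addresses(ip_json: str) -> dict:
    """Map interface name to its IPv4 addresses, selecting fields by key."""
    result = {}
    for iface in json.loads(ip_json):
        addrs = [
            entry["local"]
            for entry in iface.get("addr_info", [])
            if entry.get("family") == "inet"
        ]
        if addrs:
            result[iface["ifname"]] = addrs
    return result


if __name__ == "__main__":
    print(ipv4_addresses(SAMPLE))
```

On a real system the same function could be fed the output of `ip -j addr show` instead of SAMPLE, though the exact set of keys present may vary between iproute2 versions.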

Normal CLI output is often meant to be consumed by humans, so the data presentation requirements are different. Then you find out that an assumption you made isn’t true (e.g. due to LANG indicating a non-English language) and suddenly your matching rules don’t fit.

There are just a lot of pitfalls that can make things go subtly wrong, which is why parsing general CLI output that’s not intended to be parsed is often advised against. It doesn’t mean that it will go wrong.
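One such pitfall, sketched in Python under stated assumptions: TEXT_LINE is a made-up listing line in the style of `ls -l`, and JSON_LINE is the same record as a hypothetical tool with JSON output might emit it (the keys name and size are invented for illustration). Splitting on whitespace returns a plausible but wrong file name without raising any error.

```python
import json

# Made-up listing line in the style of `ls -l`; the file is named "my file.txt".
TEXT_LINE = "-rw-r--r-- 1 user user 12 May  3 19:20 my file.txt"

# The same record as a tool with JSON output might emit it (hypothetical keys).
JSON_LINE = '{"name": "my file.txt", "size": 12}'


def name_from_text(line: str) -> str:
    """Naive approach: treat the last whitespace-separated column as the name."""
    return line.split()[-1]


def name_from_json(line: str) -> str:
    """Structured approach: read the field by key, whitespace included."""
    return json.loads(line)["name"]


if __name__ == "__main__":
    print(name_from_text(TEXT_LINE))  # file.txt (silently wrong)
    print(name_from_json(JSON_LINE))  # my file.txt
```

The text version works for every file without a space in its name, which is exactly why this kind of bug tends to stay unnoticed for a while.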

Regarding Python, I think it has a place when you do what I’d call data set processing, while what I talk about is shell plumbing. They can both use JSON, but the tools are probably not the same.