Can you use 0x1E-0x1F ASCII codes?
from Val@lemm.ee to programming@programming.dev on 27 Dec 16:48
https://lemm.ee/post/50930455

I needed a small delimiter for my data format and started wondering whether I could use 0x1E-0x1F.

They are part of the control codes so I thought they might do something weird?

en.wikipedia.org/wiki/C0_and_C1_control_codes#Fie…

#programming


bamboo@lemm.ee on 27 Dec 18:04

Depends on whether you want your data format to be strict ASCII. If you don’t care, then sure, why not?

SpaceNoodle@lemmy.world on 27 Dec 18:38

0x1E and 0x1F were originally intended to be used as the record separator and unit separator, respectively, so that’s not a bad idea. The description of those codes in the article you linked even mentions that they’re suited for use as field delimiters.
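
A minimal sketch of that idea in Python, assuming fields are plain strings. The names `encode` and `decode` are made up for illustration, and the sketch simply rejects any field that already contains a separator instead of trying to escape it:

```python
RS = "\x1e"  # ASCII record separator (0x1E)
US = "\x1f"  # ASCII unit separator (0x1F)


def encode(records):
    """Join fields with US and records with RS.

    Raises ValueError if a field already contains a separator,
    because this sketch has no escaping mechanism.
    """
    for record in records:
        for field in record:
            if RS in field or US in field:
                raise ValueError(f"field contains a separator: {field!r}")
    return RS.join(US.join(record) for record in records)


def decode(blob):
    """Split a blob produced by encode() back into a list of records."""
    if not blob:
        # Empty input is treated as "no records" here; that is a choice,
        # since it could also mean one record with one empty field.
        return []
    return [record.split(US) for record in blob.split(RS)]


blob = encode([["alice", "42"], ["bob", "has, a comma"]])
print(decode(blob))
```

Commas, semicolons, and newlines inside the fields need no special handling here, which is the main appeal of the approach.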

FizzyOrange@programming.dev on 27 Dec 19:54

Generally a bad idea to use in-band signalling like that. They won’t do anything weird but consider what happens if the actual data contains them.

dgriffith@aussie.zone on 27 Dec 22:59

consider what happens if the actual data contains them.

Then you’d escape them by using another character in front. But if their data format is ASCII text or is guaranteed not to have characters below ASCII 32 then using ASCII delimiters is fine.
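
A rough sketch of what that escaping could look like in Python. The backslash as escape character and the names `escape`, `join_fields`, and `split_fields` are assumptions for illustration, not part of any standard. The escape character itself gets doubled, so it can appear in the data too:

```python
US = "\x1f"  # ASCII unit separator, used as the field delimiter
ESC = "\\"   # assumed escape character; any otherwise unused character works


def escape(field):
    """Prefix every escape character and delimiter in the data with ESC."""
    # Escape ESC first, so the ESC added in front of US is not doubled.
    return field.replace(ESC, ESC + ESC).replace(US, ESC + US)


def join_fields(fields):
    """Escape each field and join the results with the delimiter."""
    return US.join(escape(field) for field in fields)


def split_fields(line):
    """Split on unescaped delimiters and undo the escaping in one pass."""
    fields, current = [], []
    chars = iter(line)
    for ch in chars:
        if ch == ESC:
            # The character after ESC is literal data, whatever it is.
            nxt = next(chars, None)
            if nxt is None:
                raise ValueError("dangling escape character at end of input")
            current.append(nxt)
        elif ch == US:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


original = ["plain", "with\x1fseparator", "with\\backslash"]
assert split_fields(join_fields(original)) == original
```

Note that the reader can no longer use a plain `str.split`; it has to scan character by character, which is part of the extra complexity escaping brings.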

SpaceNoodle@lemmy.world on 27 Dec 23:38

But who escapes the escape characters?

QuazarOmega@lemy.lol on 27 Dec 23:43

It’s escape characters all the way down

Val@lemm.ee on 28 Dec 08:01

You can use Unicode pictures: ␜ ␝ ␞ ␟

en.wikipedia.org/wiki/Control_Pictures
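
One practical use for those pictures is making the separators visible when debugging. The Control Pictures block maps each C0 control code 0x00-0x1F to U+2400 plus the code, so a small helper can do the swap. The name `visible` is made up for this sketch:

```python
def visible(text):
    """Replace C0 control characters (0x00-0x1F) with their Unicode
    control pictures (U+2400-U+241F) so they show up when printed."""
    return "".join(
        chr(0x2400 + ord(ch)) if ord(ch) < 0x20 else ch
        for ch in text
    )


print(visible("alice\x1f42\x1ebob\x1f7"))
```

This is only for display; the stored data would still contain the real 0x1E and 0x1F bytes.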

SpaceNoodle@lemmy.world on 28 Dec 08:10

Use emoji as escape characters

FizzyOrange@programming.dev on 28 Dec 09:53

Indeed. Escape characters add a lot of additional complexity, footguns, and performance penalties.

lurklurk@lemmy.world on 28 Dec 10:50

Then you can just use a common delimiter like a comma or semicolon. That’s even better, as you’re less likely to end up with something that seems to work until your exotic delimiter pops up in the data.

Better yet, use a commonly used data format like CSV or JSON and don’t build your own.
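
For example, Python's standard `csv` module already handles quoting when the delimiter, quotes, or newlines show up in the data. A small sketch, with `to_csv` and `from_csv` as made-up helper names:

```python
import csv
import io


def to_csv(rows):
    """Serialize rows to CSV text; the csv module quotes fields as needed."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text, newline="")))


rows = [
    ["id", "note"],
    ["1", "contains, a comma"],
    ["2", 'has "quotes" and\na newline'],
]
assert from_csv(to_csv(rows)) == rows
```

The trade-off is that every value comes back as a string, so types have to be restored by the caller; JSON keeps numbers and nesting at the cost of a bulkier format.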

MinekPo1@lemmygrad.ml on 27 Dec 21:17

Depends on your format. If the format is binary anyway or has binary blobs (i.e. it needs a program that can handle octets outside the printable range), and using those characters does not introduce any ambiguities with the format, then go for it. ANSI and related escape sequences all start with 0x1B (ESC), so 0x1E and 0x1F won’t be mistaken for the start of one.