Pyroglyph@lemmy.world
on 27 Dec 2023 19:04
While I agree with the premise of the article, the code is completely unreadable to me. I took a look at the first snippet and just thought “Nah.”
NostraDavid@programming.dev
on 27 Dec 2023 20:01
You really need to know C to be able to read that code. If I only knew Python or Java, I’d be hella lost too.
Pyroglyph@lemmy.world
on 27 Dec 2023 22:01
It’s not even that. I can generally read a C-like language, but when the first line I see is a long-ass array of bytes with zero documentation it just makes me not want to even try.
vzq@lemmy.blahaj.zone
on 29 Dec 2023 20:00
That’s because it’s not the source. It’s not the preferred way of modifying the program. The state diagram is the preferred way of modifying the program.
mrkite@programming.dev
on 27 Dec 2023 21:19
State machines always make me think of the Disk II controller on the Apple II. It uses a state machine to implement reading and writing sectors to disk.
TehPers@beehaw.org
on 27 Dec 2023 22:33
Also worth reading is how state machines can be encoded in the type system in some languages, for example the typestate pattern in Rust. By using the type system to encode state like this, you can prevent invalid operations on a state machine from even compiling.
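A rough Python analogue of the idea, as a sketch only: each state is its own class and exposes only the transitions that are legal from that state. Python has no compile step that rejects the bad call the way Rust does, so enforcement here comes from a static type checker (mypy, for example, would flag `Closed().walk_through()`) or from an AttributeError at runtime. The door example and the names `Closed`, `Open`, and `walk_through` are hypothetical, chosen purely for illustration.

```python
class Closed:
    """A closed door: the only legal operation is opening it."""

    def open(self) -> "Open":
        return Open()


class Open:
    """An open door: it can be closed or walked through."""

    def close(self) -> Closed:
        return Closed()

    def walk_through(self) -> str:
        return "walked through"


# Legal sequence: every call returns the next state object.
door = Closed().open()
message = door.walk_through()
door = door.close()

# Illegal sequence: Closed has no walk_through method, so a type checker
# rejects the line below before the program ever runs.
# Closed().walk_through()
```

Because each transition returns a new object of the next state's type, the set of valid operations changes with the state instead of being checked by an `if` inside every method.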
Amaltheamannen@lemmy.ml
on 28 Dec 2023 00:25
Or just with algebraic data types, like enums in Rust or data types in Haskell.
Redkey@programming.dev
on 28 Dec 2023 05:25
I love low-level stuff and this still took me a little while to break down, so I’d like to share some notes on the author’s code snippet that might help someone else.
The function morse_decode is meant to be called iteratively by another routine, once per morse “character” c (dot, dash, or null) in a stream, while feeding its own output back into it as state. As long as the function returns a negative value, that value represents the next state of the machine, and the morse stream hasn’t yet been resolved into an output symbol. When the return value is positive, that represents the decoded letter, and the next call to morse_decode should use a state of 0. If the return value is 0, something has gone wrong with the decoding.
state is just a negated index into the array t, which is actually two arrays squeezed into one. The first 64 bytes are a binary heap of bytes in the format nnnnnnlr, each corresponding to one node in the morse code trie. l and r are single bits that represent the existence of a left or right child of the current node (i.e. reading a dot or dash in the current state leading to another valid state). nnnnnn is a 6-bit value that, when shifted appropriately and added to 63, becomes an index into the second part of the array, which is a list of UTF-8/ASCII codes for letters and numbers for the final output.
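The calling convention described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the article's code: `TREE` is a made-up stand-in for the packed byte array (letters only, one character per trie node, no `nnnnnnlr` bit packing), and `decode_letter` and `decode_message` are hypothetical driver names. Only the contract matches the description: a negative return is the next state, a positive return is the decoded character, and 0 means the input was invalid.

```python
# Stand-in table (an assumption, NOT the article's byte array): the Morse trie
# stored as an implicit binary heap. Node i has its dot child at 2*i + 1 and
# its dash child at 2*i + 2. "_" marks a node with no letter; index 0 is root.
TREE = "_ETIANMSURWDKGOHVF_L_PJBXCYZQ__"


def morse_decode(state: int, c: str) -> int:
    """Return <0 for the next state, >0 for a decoded character code, 0 on error."""
    i = -state  # the state is a negated index into the table
    if c == "\0":  # null ends one Morse letter: emit the symbol at this node
        return ord(TREE[i]) if TREE[i] != "_" else 0
    if c not in (".", "-"):
        return 0
    child = 2 * i + (1 if c == "." else 2)
    return -child if child < len(TREE) else 0


def decode_letter(code: str) -> str:
    """Hypothetical driver: feed each return value back in as the next state."""
    state = 0  # every letter starts from state 0
    for c in code + "\0":
        state = morse_decode(state, c)
        if state == 0:
            raise ValueError(f"invalid Morse sequence: {code!r}")
        if state > 0:
            return chr(state)
    raise ValueError(f"unterminated Morse sequence: {code!r}")


def decode_message(message: str) -> str:
    """Decode space-separated Morse letters, e.g. '... --- ...'."""
    return "".join(decode_letter(code) for code in message.split())
```

With this stand-in table, `decode_message("... --- ...")` walks root → E → I → S for the first letter and yields `"SOS"` overall. The caller never inspects the table; it only looks at the sign of the return value.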
bigmessowires.com/…/the-amazing-disk-ii-controlle…
Real MVP.
Brilliant! I love FSMs… and this one is fairly elegant.