Parse, don’t validate

Parse, don’t validate (lexi-lambda.github.io)
from ephemera3444@lemmy.blahaj.zone to programming@programming.dev on 06 Jan 2025 01:15
https://lemmy.blahaj.zone/post/20483463

#programming

livingcoder@programming.dev on 06 Jan 2025 03:54

This was a good blog post. I particularly appreciated the statement comparing the validate and parse functions: “Both of these functions check the same thing, but parseNonEmpty gives the caller access to the information it learned, while validateNonEmpty just throws it away.”
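
For anyone who hasn't opened the article, the contrast in that quote can be sketched roughly as follows. This is a minimal sketch and not necessarily the article's exact definitions: reporting failure through `Either String` is an assumption made here to keep the example pure, and `neHead` is a hypothetical helper name.

```haskell
-- A list that always has at least one element: a head plus a
-- possibly-empty tail.
data NonEmpty a = a :| [a]
  deriving (Eq, Show)

-- Validation: checks the list, but the result type () records nothing,
-- so the caller is still left holding a plain [a].
validateNonEmpty :: [a] -> Either String ()
validateNonEmpty (_ : _) = Right ()
validateNonEmpty []      = Left "list cannot be empty"

-- Parsing: performs the same check, but hands back a NonEmpty whose
-- type carries what was learned.
parseNonEmpty :: [a] -> Either String (NonEmpty a)
parseNonEmpty (x : xs) = Right (x :| xs)
parseNonEmpty []       = Left "list cannot be empty"

-- Total function (hypothetical helper): no error case is needed once
-- the argument is a NonEmpty.
neHead :: NonEmpty a -> a
neHead (x :| _) = x
```

After `validateNonEmpty` succeeds, later code still has to handle the empty case of the original list. After `parseNonEmpty` succeeds, later code can call `neHead` with no such case.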

Deckweiss@lemmy.world on 06 Jan 2025 09:29

confused java dev: what do you mean a function can’t return void???

QT1@lemm.ee on 06 Jan 2025 09:43

void in Java and Void in Haskell are quite different. As the post explains, in Haskell it’s a type with no possible values. In Java, the equivalent would be a class without a constructor (not sure if that’s even possible). It defines a type, but you cannot construct a value or object with that type. The equivalent of Java’s void in Haskell is the unit type () which has exactly one possible value, also called (). It can be returned by a function, but it does not give you any information, just like void. By the way, Rust also uses the unit type instead of void.
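
A minimal sketch of the two types side by side, using `Data.Void` from base. The names `allUnitValues`, `logLine` and `unwrapInfallible` are hypothetical and only serve the illustration.

```haskell
import Data.Void (Void, absurd)

-- () has exactly one value, also written (). Enumerating the whole
-- type yields a single-element list.
allUnitValues :: [()]
allUnitValues = [minBound .. maxBound]

-- Returning () tells the caller nothing, much like a void method in Java.
logLine :: String -> IO ()
logLine = putStrLn

-- Void has no values at all, so the Left case below can never occur.
-- `absurd :: Void -> a` discharges that impossible branch.
unwrapInfallible :: Either Void a -> a
unwrapInfallible (Left impossible) = absurd impossible
unwrapInfallible (Right x)         = x
```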

Deckweiss@lemmy.world on 06 Jan 2025 09:46

yeah that was the joke, thanks for explaining it

FrostyPolicy@suppo.fi on 06 Jan 2025 10:38

You do have a Void type in Java if you really must specify a return type and don’t want to return anything, e.g. for services and their tasks in JavaFX. A Task must have a return type, so you can use Void if the task doesn’t actually return anything.

Ephera@lemmy.ml on 07 Jan 2025 06:47

Well, yeah, but that Void type is different than the Void type in Haskell.

The Haskell-Void says that the function never returns, for example because it always goes into an infinite loop, only ever throws an exception, or does a System.exit(0). You cannot portray that in Java, to my knowledge.
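
As a rough sketch of what that looks like in Haskell (the names `heartbeat`, `failWith` and `describeFailure` are made up for this example): an action typed `IO Void` cannot hand a result back, because no value of type `Void` exists to return.

```haskell
import Control.Exception (IOException, throwIO, try)
import Control.Monad (forever)
import Data.Void (Void, absurd)

-- Loops until the process is killed. It can never produce a result,
-- so Void is an honest result type.
heartbeat :: IO Void
heartbeat = forever (putStrLn "still alive")

-- Only ever throws, so it never produces a result either.
failWith :: String -> IO Void
failWith msg = throwIO (userError msg)

-- Callers can rely on that: the "it returned" branch is dead code,
-- and `absurd` lets the type checker accept it without a real value.
describeFailure :: IO String
describeFailure = do
  outcome <- try (failWith "boom") :: IO (Either IOException Void)
  pure $ case outcome of
    Left err -> "failed: " ++ show err
    Right v  -> absurd v
```

Note that the compiler does not prove the loop runs forever. The `Void` result type only rules out a normal return.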

QT1@lemm.ee on 06 Jan 2025 09:32

I first read this post back in 2019 when it was released, and I have to say that it really has left quite an impact on the way I write programs these days. The “make illegal states unrepresentable” and “push proofs up” guidelines are so simple yet so effective. Sure, there is some initial cost to creating new datatypes, but it really pays off in the long run. Not having to worry about null or wrongly shaped data structures down the line is really nice, especially if you’re working on older code or developing in a team. Even though the post uses Haskell to explain the concepts, I found them to also work well in other languages, even Java or Python.

onlinepersona@programming.dev on 06 Jan 2025 16:53

data NonEmpty a = a :| [a]

Note that NonEmpty a is really just a tuple of an a and an ordinary, possibly-empty [a]. This conveniently models a non-empty list by storing the first element of the list separately from the list’s tail: even if the [a] component is [], the a component must always be present.

Wat? How can I “store the first element of the list separately from the list’s tail” when the list is empty? Whether a list is empty or not is a runtime possibility, not a compile-time possibility.

Someone care to explain this part? It does not compute at all for me.

Anti Commercial-AI license

Kache@lemm.ee on 06 Jan 2025 17:01

You cannot, and that’s why that type declaration models a NonEmpty that a type checker can enforce.

onlinepersona@programming.dev on 06 Jan 2025 18:13

So it’s the implementation that has to ensure a NonEmpty is returned, but that’s up to the developer, correct? The developer still holds the gun to shoot themselves in the foot by returning an empty list, IINM.

hallettj@leminal.space on 06 Jan 2025 19:51

If the return type of a function is NonEmpty, the value returned is guaranteed to be non-empty because it is not possible to construct an empty NonEmpty value. That’s the “make illegal states unrepresentable” mantra in action.

At runtime you might get a list from an API response or something, and it might be empty. At that point you have a regular list. Following the advice from the article, you want to parse that data to transform it into the types representing your legal states. So if the list is not supposed to be empty, then somewhere you have a function that takes the possibly-empty list and returns a value of type NonEmpty. But if the list actually is empty, that function will fail, so it has to be able to return or throw an error. The article uses the Maybe type for that, which is one of the Haskell types for functions that can fail.

Once you have parsed the input list and successfully gotten a NonEmpty value, the rest of your code can safely access the first element of the list, because a value of that type is guaranteed to have at least one element.
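
Here is a small sketch of that flow using `Data.List.NonEmpty` from base, whose `nonEmpty` function has type `[a] -> Maybe (NonEmpty a)`. The tag-list scenario and the names `parseTags` and `primaryTag` are assumptions made up for the illustration.

```haskell
import Data.List.NonEmpty (NonEmpty, nonEmpty)
import qualified Data.List.NonEmpty as NE

-- Boundary: the only place that has to deal with a possibly-empty list.
-- `parseTags` is a hypothetical name; `nonEmpty` is the library function.
parseTags :: [String] -> Either String (NonEmpty String)
parseTags raw = case nonEmpty raw of
  Nothing   -> Left "expected at least one tag"
  Just tags -> Right tags

-- Downstream code asks for a NonEmpty in its signature, so it can take
-- the first element without handling an empty-list case.
primaryTag :: NonEmpty String -> String
primaryTag = NE.head
```

The error handling happens once, in `parseTags`. Everything that accepts a `NonEmpty String` afterwards has no failure path for emptiness.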

Corbin@programming.dev on 06 Jan 2025 17:04

A list can store zero or more elements. A NonEmpty can store one or more elements. That’s all.

This overall strategy – representing the top of a list as a dedicated value – shows up elsewhere, notably in Forths, where it is called “top of stack” and often stored in a dedicated CPU register.

Ephera@lemmy.ml on 07 Jan 2025 06:33

During the parsing step, you check that the list has at least one element. If it does not, you report an error to the user and exit. If it does, you take the first element in the list and store it in the left side of your tuple, and then the remaining elements of the input list go into the right side of your tuple.

So, for example: [1, 2, 3] → (1, [2, 3])
Or also: [1] → (1, [])
If the user gives you [], then you cannot represent that with your tuple, you necessarily have to error.
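
The same steps written with a plain tuple, as a minimal sketch. `splitFirst` and `rejoin` are hypothetical names, and returning `Either String` stands in for "report an error to the user".

```haskell
-- Split a list into (first element, remaining elements), or fail if
-- there is no first element to put in the left side of the tuple.
splitFirst :: [a] -> Either String (a, [a])
splitFirst []       = Left "list must not be empty"
splitFirst (x : xs) = Right (x, xs)

-- Going back to an ordinary list never fails.
rejoin :: (a, [a]) -> [a]
rejoin (x, xs) = x : xs
```

`splitFirst [1, 2, 3]` gives `Right (1,[2,3])`, `splitFirst [1]` gives `Right (1,[])`, and `splitFirst []` is the only input that produces a `Left`.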

hallettj@leminal.space on 06 Jan 2025 19:38

Hey, I had this post in mind just yesterday when I was working on some Mastodon client code to show comments on my static-site blog. TypeScript is especially well-suited for deriving types from parsers. I also enjoyed brushing up on how to use JSDoc annotations and ES modules to publish what is effectively TypeScript that runs in the browser without a build step.