Self-documenting Code (lackofimagination.org)
from Aijan@programming.dev to programming@programming.dev on 21 Oct 09:22
https://programming.dev/post/20805887

#programming


steventhedev@lemmy.world on 21 Oct 09:49 next collapse

Ew no.

Abusing language features like this (boolean expression short circuit) just makes it harder for other people to come and maintain your code.

The function does have opportunity for improvement by checking one thing at a time. This flattens the ifs and changes them into proper sentry clauses. It also opens the door to encapsulating their logic and refactoring this function into a proper validator that can return all the reasons a user is invalid.
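
A minimal sketch of what that could look like, with each rule checked on its own and a validator that collects every failure instead of stopping at the first one. The function names, the rules, and the reason strings are assumptions made up for illustration, not the article's actual code.

```javascript
// Hypothetical rules for illustration; the article's real rules may differ.
function findPasswordProblems(password) {
    const problems = [];

    if (password.length < 8) {
        problems.push('too short');
    }
    if (!/[a-z]/.test(password)) {
        problems.push('no lowercase letter');
    }
    if (!/[A-Z]/.test(password)) {
        problems.push('no uppercase letter');
    }
    if (!/[0-9]/.test(password)) {
        problems.push('no digit');
    }

    return problems;
}

// Guard clause: a flat check with an explicit throw, no nesting.
function assertPasswordIsValid(password) {
    const problems = findPasswordProblems(password);
    if (problems.length > 0) {
        throw new Error(`Invalid password: ${problems.join(', ')}`);
    }
}
```

A caller that wants to show all problems to the user can call findPasswordProblems() directly, while createUser() can rely on assertPasswordIsValid() to throw.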

Good code is not “elegant” code. It’s code that is simple and unsurprising and can be easily understood by a hungover fresh graduate new hire.

traches@sh.itjust.works on 21 Oct 09:58 next collapse

Agreed. OP was doing well until they replaced the if statements with ‘function call || throw error’. That’s still an if statement, but obfuscated.

BrianTheeBiscuiteer@lemmy.world on 21 Oct 12:00 collapse

Don’t mind the || but I do agree if you’re validating an input you’d best find all issues at once instead of “first rule wins”.

rooster_butt@lemm.ee on 22 Oct 13:58 collapse

Short circuiting conditions is important. Mainly for things such as:

if(Object != Null && Object.HasThing) …

Without short circuit evaluation you end up with a null pointer exception.
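
In JavaScript the same guard might look like the sketch below. The hasThing property is a made-up name, and the optional chaining variant is just one alternative way to write it.

```javascript
function hasThing(object) {
    // The right-hand side is only evaluated when object is neither null nor
    // undefined, so the property access cannot throw.
    return object != null && object.hasThing === true;
}

function hasThingWithOptionalChaining(object) {
    // object?.hasThing yields undefined instead of throwing on null/undefined.
    return object?.hasThing === true;
}
```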

verstra@programming.dev on 21 Oct 10:33 next collapse

I agree, this is an anti-pattern for me.

Having explicit throw keywords is much more readable compared to hiding flow-control into helper functions.

Womble@lemmy.world on 21 Oct 11:23 next collapse

Good code is not “elegant” code. It’s code that is simple and unsurprising and can be easily understood by a hungover fresh graduate new hire.

I wouldn’t go that far; both elegance and simplicity are important. Sure, using obvious and well-known language features is a plus, but give me three lines that solve the problem as a graph search over 200 lines of object-oriented boilerplate any day. Like most things it’s a trade-off; going too far in either direction is bad.

lmaydev@lemmy.world on 21 Oct 14:16 next collapse

100% un-nesting that if would have been fine.

YaBoyMax@programming.dev on 21 Oct 14:28 next collapse

This is the most important thing I’ve learned since the start of my career. All those “clever” tricks literally just serve to make the author feel clever at the expense of clarity and long-term maintainability.

hex@programming.dev on 21 Oct 22:13 next collapse

I mean, boolean short circuit is a super idiomatic pattern in JavaScript

clutchtwopointzero@lemmy.world on 22 Oct 09:25 next collapse

Because in JS the goal is to shave bytes to save money on data transfer rates

hex@programming.dev on 22 Oct 11:57 collapse

It’s not that deep. It looks nice, and is easy to understand.

arendjr@programming.dev on 22 Oct 12:29 collapse

I think that’s very team/project dependent. I’ve seen it done before indeed, but I’ve never been on a team where it was considered idiomatic.

hex@programming.dev on 22 Oct 12:58 collapse

That makes sense.

sip@programming.dev on 22 Oct 16:25 collapse

assert(isPasswordGood(…)) is already built in, at least in Node.

Simulation6@sopuli.xyz on 21 Oct 10:01 next collapse

Figuring out what the code is doing is not the hard part. Documenting the reason you want it to do that (domain knowledge) is the hard part.

tatterdemalion@programming.dev on 21 Oct 10:37 next collapse

Agreed.

And sometimes code is not the right medium for communicating domain knowledge. For example, if you are writing code that does some geometric calculations with a lot of trigonometry, even with clear variable names it can be hard to decipher without a generous comment or splitting it up into functions with verbose names. Sometimes you really just want a picture of what’s happening, in SVG format, embedded into the function documentation HTML.

Flamekebab@piefed.social on 21 Oct 15:08 next collapse

TempleOS: Hold my communion wine

hex@programming.dev on 21 Oct 22:10 collapse

Yeah. I advocate for self explanatory code, but I definitely don’t frown upon comments. Comments are super useful but soooo overused. I have coworkers that aren’t that great that would definitely comment on the most basic if statements. That’s why we have to push self explanatory code, because some beginners think they need to say:

//prints to the console
console.log("hello world");

I think by my logic, comments are kind of an advanced level concept, lol. Like you shouldn’t really start using comments often until you’re writing some pretty complex code, or using a giant codebase.

TehPers@beehaw.org on 22 Oct 02:45 next collapse

Sometimes when I don’t leave comments like that, I get review comments asking what the line does. Code like ThisMethodInitsTheService() with comments like “what does this do?” in the review.

So now I comment a lot. Apparently reading code is hard for some people, even code that tells you exactly what it does in very simple terms.

hex@programming.dev on 22 Oct 02:49 collapse

Fair. I guess in this case, it’s a matter of gauging who you’re working with. I’d much rather answer a question once in a while than over-comment (since refactors often make comments worthless and they’re so easy to miss…), but if it’s a regular occurrence, yeah it would get on my nerves. Read the fuckin name of the function! Or better yet go check out what the function does!

nous@programming.dev on 22 Oct 08:22 collapse

Worse, refactors make comments wrong. And there is nothing more annoying than having the comment conflict with the code. Which is right? Is it a bug or did someone just forget to update the comments… The latter is far more common.

Comments that just repeat the code mean you now have two places to update and keep in sync - a pointless waste of time and confusion.

hex@programming.dev on 22 Oct 11:56 collapse

Yes- exactly, they make comments wrong. But comments aren’t always a waste of time, like in legacy code, or just in general code that isn’t gonna change (mathematical equations too)

nous@programming.dev on 22 Oct 12:38 collapse

Comments are not always a waste of time, but comments that repeat or tell you what the code is doing (rather than why) are a waste. For legacy code you generally don’t have comments anyway and the code is hard to read/understand.

But if you can understand the code enough to write a comment you can likely refactor the code to just make it more readable to start with.

Code that does not change generally does not need to be read much, so it does not need comments to describe what it is doing. And again, if you understand it enough to write a comment to explain what it is doing you can refactor it to be readable to begin with. Even for mathematical equations I would either expect the reader to be able to read them, or link to documentation that describes them in much more detail, or name the function well enough that the reader can look it up to understand the principles behind it.

hex@programming.dev on 22 Oct 12:55 collapse

You make some great points. Using smaller functions and breaking up your code in readable bits makes a huge difference and you will likely never need comments if you do it right 👍🏻

nous@programming.dev on 22 Oct 15:08 collapse

Creating functions is IMO not the first thing you should do. Giving variables better names or naming temporaries/intermediate steps is often all you really need to do to make things clearer. Creating smaller functions tends to be my last resort and I would avoid it when I can as splitting the code up can make things harder to understand as you have to jump around more often.
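
As a rough illustration of naming intermediate steps instead of extracting more functions, the sketch below keeps everything in one place. The specific rules are assumptions chosen for the example.

```javascript
// Hypothetical password rules, used only to illustrate naming intermediate steps.
function isPasswordValid(password) {
    const isLongEnough = password.length >= 8;
    const hasLowercase = /[a-z]/.test(password);
    const hasUppercase = /[A-Z]/.test(password);
    const hasDigit = /[0-9]/.test(password);

    return isLongEnough && hasLowercase && hasUppercase && hasDigit;
}
```

Every condition has a name, yet the reader never has to jump to another function to see what is being checked.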

hex@programming.dev on 22 Oct 16:15 collapse

I hear ya. As always, it’s a balance between having functions that are too long and having too many small functions. Matter of team preferences too.

Faresh@lemmy.ml on 22 Oct 12:42 collapse

Comments are super useful but soooo overused

I think overusing comments is a non-issue. I’d rather have over-commented code that doesn’t need it, over undocumented code without comments that needs them. If this over-commenting causes some comments to be out of date, those instances should hopefully be obvious from the code itself or the other comments and easily fixed.

hex@programming.dev on 22 Oct 12:54 collapse

I understand what you’re saying and I mostly agree, but those few instances where a line of code is only slightly different and the comment is the same, can really be confusing.

lobut@lemmy.ca on 21 Oct 12:25 next collapse

I can’t recall the exact change but a coworker did something five years ago very intentionally. The comments, the commit and everything described what they did but not why.

I think it was with side effects: true and I fixed a certain way we bundled things and I believe that could have solved the issue but I don’t know for sure :/

steventhedev@lemmy.world on 21 Oct 12:50 collapse

One upvote is not enough.

I once wrote a commit message the length of a full blog post comparing 10 different alternatives for micro optimization, with benchmarks and more. The diff itself was ten lines. Shaved around 4% off the hot path (based on a sampling profiler that ran over the weekend).

koper@feddit.nl on 21 Oct 10:05 next collapse

Why the password.trim()? Silently removing parts of the password can lead to dangerous bugs and tells me the developer didn’t properly consider how to sanitize input.

I remember once my password for a particular organization had a space at the end. I could log in to all LDAP-connected applications, except for one that would insist my password was wrong. A trim() or similar was likely the culprit.

Aijan@programming.dev on 21 Oct 10:16 next collapse

Thanks for the tip. password.trim() can indeed be problematic. I just removed that line.

spechter@lemmy.ml on 21 Oct 10:28 next collapse

Another favorite of mine is truncating the password to a certain length w/o informing the user.

Flipper@feddit.org on 21 Oct 12:17 next collapse

The password needs to be 8 letters long and may only contain the alphabet. Also we don’t tell you this requirement or tell you that setting the password went wrong. We just lock you out.

NotationalSymmetry@ani.social on 21 Oct 14:21 collapse

Saving the password truncates but validation doesn’t. So it just fails every time you try to log in with no explanation. The number of times I have seen this in a production website is too damn high.
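
A stripped-down sketch of that failure mode. MAX_STORED_LENGTH and the function names are made up, and plain strings are compared only to keep the example short; real code would store a hash.

```javascript
const MAX_STORED_LENGTH = 8; // hypothetical limit applied only when saving

function savePassword(password) {
    return password.slice(0, MAX_STORED_LENGTH); // silently truncates
}

function buggyLogin(stored, attempt) {
    return stored === attempt; // compares against the untruncated attempt
}

// One possible fix: reject over-long input up front instead of truncating it.
function savePasswordStrict(password) {
    if (password.length > MAX_STORED_LENGTH) {
        throw new RangeError(`Password must be at most ${MAX_STORED_LENGTH} characters`);
    }
    return password;
}

const stored = savePassword('correct horse');
console.log(buggyLogin(stored, 'correct horse')); // false
```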

nitefox@sh.itjust.works on 21 Oct 17:52 collapse

It also can truncate on the BE side when using the damn varchar

nous@programming.dev on 21 Oct 18:34 collapse

Passwords should be hashed, not stored plain text! Hashes are always the same length so this is an immediate sign they are doing horribly insecure things with your password.
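
To illustrate the fixed-length point, here is a small sketch using Node's built-in crypto.scryptSync. The 64-byte key length and the salt handling are arbitrary choices for the example, not a recommendation for any particular setup.

```javascript
const crypto = require('node:crypto');

// scrypt is a password hashing function that ships with Node.js.
function hashPassword(password, salt) {
    return crypto.scryptSync(password, salt, 64); // always a 64-byte Buffer
}

const salt = crypto.randomBytes(16);
const shortHash = hashPassword('hunter2', salt);
const longHash = hashPassword('a'.repeat(500), salt);

console.log(shortHash.length, longHash.length); // 64 64
```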

HamsterRage@lemmy.ca on 21 Oct 12:59 collapse

The reason for leaving in the password.trim() would be one of the few things that I would ever document with a comment.

graycube@lemmy.world on 21 Oct 11:03 next collapse

I would have liked some comments explaining the rules we are trying to enforce or a link to the product requirements for it. Changing the rules or requirements is the most likely reason this code will ever be looked at again. The easier you can make it for someone to change them the better. Another reason to need to touch the code is if the user model changes. I suppose we might also want a different password hash, or to store the password separately, or even a different outcome if the validation fails. Or maybe have different rules for different user types. When building a function like this I think less about “ideals” and more about why someone might need to change what I just did and how can I make it easier for them.

nous@programming.dev on 21 Oct 18:53 collapse

and how can I make it easier for them.

I am wary of this. It is very hard to predict what someone else in the future might want to do. I would only go so far as to ensure nothing I am doing will unnecessarily block a refactor later on but I would avoid trying to add or abstract things in ways that make the current code harder to read because you think it might be easier for someone to add to in the future.

I have needed, far too many times, to strip out some unused abstraction to do something that abstraction was never intended to allow because someone was trying to save me time and predict what might happen to the code in the future and got it completely wrong. It is far easier to add an abstraction to simple code later on when it actually helps than to try and figure out what the abstraction is and remove it when it is found to be wrong.

graycube@lemmy.world on 22 Oct 01:37 collapse

Good point. I think knowing where to draw that line comes with experience (and having to fix lots of other people’s code).

humblebun@sh.itjust.works on 21 Oct 11:07 next collapse

Code should be generated from documentation imo

verstra@programming.dev on 21 Oct 11:32 collapse

Documentation should be generated from code imo

humblebun@sh.itjust.works on 21 Oct 11:36 next collapse

This means war

Deckweiss@lemmy.world on 21 Oct 11:44 next collapse

Just run both in a loop until it reaches a state of equilibrium.

humblebun@sh.itjust.works on 21 Oct 11:48 collapse

Or use LLM to generate the doc from all sources you have. Foolproof 100%

nous@programming.dev on 21 Oct 19:02 collapse

Better yet, use LLM to generate the docs - then use it again to generate the code from the docs. WCPGW

humblebun@sh.itjust.works on 21 Oct 19:23 collapse

You sound like a CEO. Do you have a spare CTO role?

0x0@programming.dev on 21 Oct 15:57 collapse

Could be .jar too.

key@lemmy.keychat.org on 21 Oct 11:49 next collapse

Code should be generated from documentation generated from code

svetlyak40wt@fosstodon.org on 21 Oct 12:14 collapse

@key @verstra documentation should be generated from neuro-waves, generated from a brainstorm.

Zagorath@aussie.zone on 21 Oct 12:22 collapse

If the doco we’re talking about is specifically an API reference, then the documentation should be written first. Generate code stubs (can be as little as an interface, or include some basic actual code such as validating required properties are included, if you can get that code working purely with a generated template). Then write your actual functional implementation implementing those stubs.

That way you can regenerate when you change the doco without overriding your implementation, but you are still forced to think about the user (as in the programmer implementing your API) experience first and foremost, rather than the often more haphazard result you can get if you write code first.

For example, if writing a web API, write documentation in something like OpenAPI and generate stubs using Swagger.

humblebun@sh.itjust.works on 21 Oct 12:33 collapse

Same for drivers. Generate headers from documentation and distribute it you fucking morons

Zagorath@aussie.zone on 21 Oct 12:42 collapse

Yup absolutely. I mentioned web APIs because that’s what I’ve got the most experience with, but .h files, class library public interfaces, and any other time users who are not the implementor of the functionality might want to call it, the code they’ll be interacting with should be tailored to be good to interact with.

dohpaz42@lemmy.world on 21 Oct 12:05 next collapse

async function createUser(user) {
    validateUserInput(user) || throwError(err.userValidationFailed);
    isPasswordValid(user.password) || throwError(err.invalidPassword);
    !(await userService.getUserByEmail(user.email)) || throwError(err.userExists);

    user.password = await hashPassword(user.password);
    return userService.create(user);
}

Or

async function createUser(user) {
    // validate() is async, so await it before calling create() on the result
    const service = await new UserService(user).validate();

    return service.create();
}

// elsewhere…
const UserService = class {
    #user;

    constructor(user) {
        this.#user = user;
    }

    async validate() {
        // Each validator throws when its check fails
        InputValidator.valid(this.#user);

        PasswordValidator.valid(this.#user.password);

        await UserUniqueValidator.valid(this.#user.email);

        return this;
    }

    async create() {
        this.#user.password = await hashPassword(this.#user.password);

        return userService.create(this.#user);
    }
}

I would argue that the validate routines be their own classes; ie UserInputValidator, UserPasswordValidator, etc. They should conform to a common interface with a valid() method that throws when invalid. (I’m on mobile and typed enough already).
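
A rough sketch of what such a common interface could look like. The class names and rules below are hypothetical and simpler than the ones named above; the only shared contract is a valid() method that throws when the check fails.

```javascript
// Hypothetical validators sharing one interface: valid(user) throws on failure.
class EmailPresentValidator {
    valid(user) {
        if (!user.email) {
            throw new Error('email is required');
        }
    }
}

class PasswordLengthValidator {
    valid(user) {
        if (typeof user.password !== 'string' || user.password.length < 8) {
            throw new Error('password must be at least 8 characters');
        }
    }
}

// Runs every validator in order; the first failing one throws.
function validateUser(user, validators) {
    for (const validator of validators) {
        validator.valid(user);
    }
    return user;
}
```

Usage would be along the lines of validateUser(user, [new EmailPresentValidator(), new PasswordLengthValidator()]).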

“Self-documenting” does not mean “write less code”. In fact, it means the opposite; it means be more verbose. The trick is to find that happy balance where you write just enough code to make it clear what’s going on (that does not mean you write long identifier names; e.g., getUserByEmail(email) vs. getUser(email) or better fetchUser(email)).

Be consistent:

  1. get* and set* should be reserved for working on an instance of an object
  2. is* or has* for Boolean returns
  3. Methods/functions are verbs because they are actionable; e.g., fetchUser(), validate(), create()
  4. Do not repeat identifiers: e.g., UserService.createUser()
  5. Properties/variables are not verbs; they are state: e.g., valid vs isValid
  6. Especially for JavaScript, everything is const unless you absolutely have to reassign its direct value; i.e., objects and arrays should be const unless you use the assignment operator after initialization
  7. All class methods should be private until it’s needed to be public. It’s easier to make an API public, but near impossible to make it private without compromising backward compatibility.
  8. Don’t be afraid to use if {} statements. Short-circuiting is cutesy and all, but it makes code more complex to read.
  9. Delineate unrelated code with new lines. What I mean is that jamming all your code together into one block makes it difficult to follow (like run-on sentences or massive walls of text). Use new lines and/or {} to create small groups of related code. You’re not penalized for the white space because it gets compiled away anyway.

There is so much more, but this should be a good primer.

RecluseRamble@lemmy.dbzer0.com on 21 Oct 13:27 next collapse

I would argue that the validate routines be their own classes; ie UserInputValidator, UserPasswordValidator, etc.

I wouldn’t. Not from this example anyway. YAGNI is an important paradigm and introducing plenty of classes upfront to implement trivial checks is overengineering typical for Java and the reason I don’t like it.

Edit: Your naming convention isn’t the best either. I’d expect UserInputValidator to validate user input, maybe sanitize it for a database query, but not necessarily an existence check as in the example.

dohpaz42@lemmy.world on 21 Oct 14:56 collapse

I wouldn’t. Not from this example anyway. YAGNI is an important paradigm and introducing plenty of classes upfront to implement trivial checks is overengineering…

Classes, functions, methods… pick your poison. The point is to encapsulate your logic in a way that is easy to understand. Lumping all of the validation logic into one monolithic block of code (be it a single class, function, or method) is not self-documenting. Whereas separating the concerns makes it easier to read and keep your focus without mixing purposes. Over-engineering (imo) would be something akin to creating micro services to send data in and get a response back.

Edit: Your naming convention isn’t the best either. I’d expect UserInputValidator to validate user input, maybe sanitize it for a database query, but not necessarily an existence check as in the example.

If you go back to my example, you’ll notice there is a UserUniqueValidator, which is meant to check for existence of a user.

And if you expect a validator to do sanitation, then your expectations are wrong. A validator validates, and a sanitizer sanitizes. Not both.

For the uninitiated, this is called Separation of Concerns. The idea is to do one thing and do it well, and then compose these things together to make your program — like an orchestra.

RecluseRamble@lemmy.dbzer0.com on 21 Oct 17:44 next collapse

If you go back to my example, you’ll notice there is a UserUniqueValidator, which is meant to check for existence of a user.

Oops, right, I just glanced over the code and obviously missed the text and code had different class names. Another smell in my opinion, choosing class names that only differ in the middle. Easily missed and confusion caused.

I don’t think our opinions are too far off though. You’re just scaling the validation logic to realistic levels and I warn that in practice coders extrapolate too quickly and too often, which results in too much generic code which is naturally harder to understand and maintain than specific code.

nous@programming.dev on 21 Oct 18:46 collapse

This is abuse of the separation of concerns concept IMO. You have taken things far too far and made it far less readable overall. The main concern here is password validation - and the code already separated this out from other code. By separating out each check you are just violating another principle - locality of behavior, which says related things should be located close to each other. This makes things far easier to read and see what is actually going on without needing to jump through several classes/functions of abstraction.

We need to stop trying to break everything down into the smallest possible chunks we can. It is fine for a few lines of related code to live in the same function.

FizzyOrange@programming.dev on 21 Oct 21:39 next collapse

Oof found the Java developer. No thanks.

olafurp@lemmy.world on 22 Oct 07:40 collapse

I like the service but the constructor parameter is really bad and makes the methods less reusable

dohpaz42@lemmy.world on 22 Oct 12:03 collapse

That’s fair. How would you go about implementing the service? I always love seeing other people’s perspectives. 😊

olafurp@lemmy.world on 23 Oct 08:08 collapse

More or less the same but the user gets passed as a method parameter each time. Validators would be in my opinion a long function inside the service also with named variables like this because it’s just easy to read and there are no surprises. I’d probably refactor it at around 5 conditions or 30 lines of validation logic.

I recommend trying out using the constructor in services for tools such as a database and methods for data such as user. It will be very easy to use everywhere and for many users and whatever

const passwordIsValid = ...
if (!passwordIsValid){
  return whatever
}
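
A sketch of that split, with tools passed to the constructor and data passed to the methods. The db object and its insert() method are hypothetical, as is the single password rule.

```javascript
// Sketch: `db` is a hypothetical storage object with an async insert(user) method.
class UserService {
    constructor(db) {
        this.db = db; // tools are injected once
    }

    // Data such as the user is passed per call, so one service serves many users.
    validate(user) {
        const passwordIsValid = typeof user.password === 'string' && user.password.length >= 8;
        if (!passwordIsValid) {
            throw new Error('invalid password');
        }
        return user;
    }

    async create(user) {
        this.validate(user);
        return this.db.insert(user);
    }
}
```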

Rogue@feddit.uk on 21 Oct 15:14 next collapse

A quick glance and this seemed nothing to do with self documenting code and everything to do with the flaws when code isn’t strictly typed.

Atlas_@lemmy.world on 22 Oct 00:25 next collapse

In addition to the excellent points made by steventhedev and koper:

user.password = await hashPassword(user.password);

Just this one line of code alone is wrong.

  1. It’s unclear, but quite likely that the type has changed here. Even in a duck typed language this is hard to manage and often leads to bugs.
  2. Even without a type change, you shouldn’t reuse an object member like this. Dramatically better to have password and hashed_password so that they never get mixed up. If you don’t want the raw password available after this point, zero it out or delete it.
  3. All of these style considerations apply 4x as strongly when it’s a piece of code that’s important to the security of your service, which obviously hashing passwords is.
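
One way to keep the two values from ever being mixed up is to build a new record instead of overwriting the field. This is only a sketch: the field names and the salt:hash storage format are assumptions, and scrypt is used simply because it ships with Node.

```javascript
const crypto = require('node:crypto');

// Returns a new record with a distinct hashedPassword field and no raw password.
// The field names and the salt:hash storage format are illustrative assumptions.
function toStoredUser(user) {
    const { password, ...rest } = user;
    const salt = crypto.randomBytes(16);
    const hash = crypto.scryptSync(password, salt, 64);

    return { ...rest, hashedPassword: `${salt.toString('hex')}:${hash.toString('hex')}` };
}
```

The input object is left untouched, so the caller decides when the raw password goes out of scope.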

Aijan@programming.dev on 22 Oct 06:52 collapse

I appreciate the security concerns, but I wouldn’t consider overriding the password property with the hashed password to be wrong. Raw passwords are typically only needed in three places: user creation, login, and password reset. I’d argue that having both password and hashedPassword properties in the user object may actually lead to confusion, since user objects are normally used in hundreds of places throughout the codebase. I think, when applicable, we should consider balancing security with code maintainability by avoiding redundancy and potential confusion.

Atlas_@lemmy.world on 22 Oct 07:18 next collapse

I absolutely agree. An even better structure wouldn’t have a raw password field on the user object at all.

Draugnoss@sopuli.xyz on 22 Oct 07:18 next collapse

I agree. The field shouldn’t have been called ‘password’ in the first place, but rather ‘plaintextPassword’ or similar. That makes the code much more readable, if at a glance I know if I’m dealing with the hash or the plaintext version.

nous@programming.dev on 22 Oct 08:20 collapse

When is the hashed password needed other than user creation, login or password resets? Once you have verified the user you should not need it at all. If anything storing it on the user at all is likely a bad idea. Really you have two states here - the unauthed user which has their login details, and an authed user which has required info about the user but not their password, hashed or not.

Personally I would construct the user object from the request after doing auth - that way you know that any user object is already authed and it never needs to store the password or hash at all.

Aijan@programming.dev on 22 Oct 08:34 collapse

Perhaps I was unclear. What I meant to say is that, whenever possible, we shouldn’t have multiple versions of a field, especially when there is no corresponding plaintext password field in the database, as is the case here.

nous@programming.dev on 22 Oct 09:06 collapse

And they were arguing the same - just renaming the property rather than reusing it. You should only have one not both but naming them differently can make it clear which one you have.

But here I am arguing to not have either on the user object at all. They are only needed at the start of a request and should never be needed after that point. So no point in attaching them to a user object - just verify the username and password and pass around a user object after that without either the password or hash. Not everything needs to be added to an object.
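
A minimal sketch of that shape, assuming credentials and profile data are stored separately. All names here are hypothetical, and scrypt is used only because it is built into Node.

```javascript
const crypto = require('node:crypto');

// Hypothetical credential record as it might come from storage.
function makeCredentials(email, password) {
    const salt = crypto.randomBytes(16);
    return { email, salt, hash: crypto.scryptSync(password, salt, 64) };
}

// Verifies the password, then returns a user object with no password or hash on it.
function authenticate(credentials, profile, attemptedPassword) {
    const attemptedHash = crypto.scryptSync(attemptedPassword, credentials.salt, 64);
    if (!crypto.timingSafeEqual(attemptedHash, credentials.hash)) {
        throw new Error('invalid credentials');
    }
    return { email: credentials.email, name: profile.name };
}
```

Anything that receives the returned object can assume the user is already authenticated, and there is no password or hash on it to leak.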

Kissaki@programming.dev on 23 Oct 07:02 collapse

Code before:

async function createUser(user) {
    if (!validateUserInput(user)) {
        throw new Error('u105');
    }

    const rules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W{1,}/];
    if (user.password.length >= 8 && rules.every((rule) => rule.test(user.password))) {
        if (await userService.getUserByEmail(user.email)) {
            throw new Error('u212');
        }
    } else {
        throw new Error('u201');
    }

    user.password = await hashPassword(user.password);
    return userService.create(user);
}

Here’s how I would refac it for my personal readability. I would certainly introduce class types for some concern structuring and not dangling functions, but that’d be the next step and I’m also not too familiar with TypeScript differences to JavaScript.

const passwordRules = [/[a-z]{1,}/, /[A-Z]{1,}/, /[0-9]{1,}/, /\W</