Push Ifs Up And Fors Down
(matklad.github.io)
from lysdexic@programming.dev to programming@programming.dev on 17 Nov 2023 10:19
https://programming.dev/post/6001717
I'm curious, is there a difference in doing it like this?:
Doesn't assigning the condition to a variable make it so that it's only really evaluated once?
You still need to read the variable in each iteration, but the cost of that is probably negligible
Modern optimizing compilers are magical. I would need to check assembly but I would actually expect the if to be hoisted out of the loop entirely to relieve pressure on the branch predictor.
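For what it's worth, a rough Rust sketch of the two shapes being compared in this sub-thread (the names are made up, not the snippet from the parent comment):

```rust
fn process(items: &[i32], verbose: bool) {
    // Variant 1: the pre-computed bool is checked inside the loop. The
    // condition itself is only evaluated once, but the `if` still runs
    // every iteration (cheap, and trivially predictable since it never
    // changes).
    for item in items {
        if verbose {
            println!("processing {item}");
        }
        // ... per-item work ...
    }

    // Variant 2: the branch hoisted out by hand; roughly what
    // loop-invariant code motion does automatically when the compiler
    // can prove `verbose` never changes inside the loop.
    if verbose {
        for item in items {
            println!("processing {item}");
            // ... per-item work ...
        }
    } else {
        for _item in items {
            // ... per-item work ...
        }
    }
}
```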
I’m reasonably sure compilers can shift the if out. I believe it’s called “loop invariant code motion”. Won’t work in all cases, but in the variable case it should.
That scenario would only apply if the condition was constant and specified at compile time. There’s no indication on whether var1 or var2 are compile-time constants or predicates evaluated at runtime.
I don’t believe you have to specify the condition at compile time. I think that optimization would fall under dead code elimination.
For the invariant code motion stuff the comparison just has to be invariant from start to finish. At least, that’s been my experience. The compiler will just shift the if statement.
But, like, there are totally times when it can’t figure out that the thing is invariant. And sometimes it’s just more readable to move the if statement out of the loop.
How do you tell if some code behind a conditional is dead if the predicate that drives the condition is evaluated at runtime?
Sorry. I wasn’t clear. If the conditional is constant at compile time you get the dead code optimization: the path not taken is removed. If it’s not constant, you may get the loop invariant movement. But only if the compiler can tell that it’s invariant.
My point wasn’t that you should always rely on this behavior. At least, I didn’t mean to say that. I suppose what I should have said is more like “in many cases you won’t see any performance difference because the compiler will do that for you anyway.”
I suppose I have value judgements around that like “generally you should do the thing that is more readable and let the compiler take care of stuff like moving the loop invariant”. That’s been mostly true for me. But only mostly.
The point is that if the predicate is evaluated at runtime then the compiler plays no role because there is no compile-time constant and all code paths are deemed possible.
I understand that you’re trying to say that compilers can leverage compile-time constants to infer if code paths are dead code or not.
That’s just a corner case though. Your compiler has no say on what code paths are dead if you’re evaluating a predicate from, say, the response of an HTTP request. It doesn’t make sense to expect this hypothetical scenario to be realistic when you have no info on where a predicate is coming from.
They’re talking about moving the calculation of the if condition outside of the loop. Which the original commenter mentioned. en.m.wikipedia.org/…/Loop-invariant_code_motion
You are the one that brought up dead code elimination.
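To make the two optimizations being argued about here concrete, a hedged Rust sketch (constant and flag names invented for illustration): a branch on a compile-time constant can be removed outright as dead code, while a branch on a runtime value can still be hoisted out of the loop by loop-invariant code motion as long as the value cannot change between iterations.

```rust
const DEBUG: bool = false;

fn run(items: &[i32], runtime_flag: bool) {
    for item in items {
        // Compile-time constant: this branch is dead code and can be
        // deleted entirely by the compiler.
        if DEBUG {
            println!("debug: {item}");
        }

        // Runtime value (could come from a config file or an HTTP
        // response), but invariant for the duration of the loop: the
        // compiler cannot delete the branch, yet it may hoist the test
        // out of the loop (loop-invariant code motion).
        if runtime_flag {
            println!("flagged: {item}");
        }
    }
}
```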
The variable is set once, but the if expression is still evaluated every time (unless the compiler can optimize it)
(edit after skimming the article: yes, using the variable would solve the problem of the last example)
So there would be branching overhead in every iteration. But that’s something the CPU branch prediction should cover, especially since the taken branch will be identical in every iteration.
The same also applies to the implied condition to break the for loop (only the first few and the last iteration should be mispredicted).
Seems more applicable to an imperative style, and IMO even then the advice depends too much on the specifics of the actual case to be generally applicable as a “rule of thumb”.
This is just one specific example amongst many of how redundant logic could be simplified because sometimes the branch is an implementation detail and you want to push it down, and sometimes it’s not and you want to push it up.
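For readers who haven’t read the article, a minimal Rust sketch of the two directions the title refers to (the walrus/frobnicate placeholders follow the article; the exact signatures here are only illustrative):

```rust
struct Walrus;

// "Push ifs up": frobnicate takes a definite Walrus; the caller decides
// whether there is one, instead of the callee taking an Option and
// silently doing nothing.
fn frobnicate(walrus: &Walrus) {
    // ... per-walrus work ...
    let _ = walrus;
}

// "Push fors down": take the whole batch so the loop lives in one place
// and its body can be optimized (or batched) there.
fn frobnicate_batch(walruses: &[Walrus]) {
    for walrus in walruses {
        frobnicate(walrus);
    }
}

fn caller(maybe_walrus: Option<&Walrus>, herd: &[Walrus]) {
    if let Some(walrus) = maybe_walrus {
        frobnicate(walrus);
    }
    frobnicate_batch(herd);
}
```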
Saw this posted on hackernews yesterday, along with hundreds of comments of people completely misunderstanding the advice given. Glad to not see any of that here.
I disagree with this article.
The first one is a good point in principle, but I would call it pushing down the ifs. A method is usually used to group logic together, so pulling the assertion checks out of that method forces you to replicate them every single time before calling the function. It’s very likely you’ll forget to update these conditions in one place, and that isn’t worth the minuscule gain in performance.
Now, if frobnicate is only something a walrus can do, then it should take a walrus. But if it is something like “given a picture of an animal, check if it is of a walrus, determine its sex and, if it is a male, give the likelihood that it is the dominant male on this part of the beach”, then you should most certainly push the ifs down to contain all that logic in one single frobnicate method.
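A hedged Rust sketch of that second case, with the picture-analysis helpers invented purely for illustration:

```rust
struct AnimalPicture;

// When the checks are the task itself ("given a picture, is it a male
// walrus, and how likely is it to be the dominant male on this part of
// the beach?"), keep all of that logic together in one method instead of
// spreading the checks over every caller.
fn dominant_male_likelihood(picture: &AnimalPicture) -> Option<f64> {
    if !looks_like_walrus(picture) {
        return None;
    }
    if !is_male(picture) {
        return None;
    }
    Some(estimate_dominance(picture))
}

// Stub helpers so the sketch stands alone (hypothetical).
fn looks_like_walrus(_p: &AnimalPicture) -> bool { true }
fn is_male(_p: &AnimalPicture) -> bool { true }
fn estimate_dominance(_p: &AnimalPicture) -> f64 { 0.5 }
```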
Regarding the loops:
If the example given is really all the code you have, then sure, pull that condition out of the loop if you want. But the compiler will optimize it anyway if the condition does not change between loop iterations. And if it does, you can’t pull the condition out of the loop anyway. A for loop I have much more often in my code looks like this though:
Which you do not want to spread to two for loops! Sure, in this toy example the if conditions could be simplified. And if you have such trivial loops, you should use iterators anyway. But sometimes you have logic you need to apply to every element, do not want to put that logic in the frobnicate method (especially if you push your ifs up) and can’t use iterators in an elegant way.
For the love of god, do not turn it into
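As a purely hypothetical Rust sketch (not the commenter’s actual snippets) of the shape being described: a single loop with related per-element conditions next to each other, versus the split version that traverses the data twice and lets the conditions drift apart.

```rust
fn frobnicate(x: &mut i32) {
    *x += 1;
}

// One pass, with the per-element conditions kept together.
fn process(items: &mut [i32]) {
    for item in items.iter_mut() {
        if *item % 2 == 0 {
            frobnicate(item);
        }
        if *item > 100 {
            *item = 100;
        }
    }
}

// The split version argued against: one loop per condition, so the data
// is traversed twice and the related checks live far apart.
fn process_split(items: &mut [i32]) {
    for item in items.iter_mut() {
        if *item % 2 == 0 {
            frobnicate(item);
        }
    }
    for item in items.iter_mut() {
        if *item > 100 {
            *item = 100;
        }
    }
}
```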
I agree with part of the article, because I didn’t read the rest. I truly dislike the use of single letter variable names: f, g, h and foo, bar, baz. My advice: use descriptive variable names. function twoIfs, function complicatedIf, var simpleAnd, etc. Makes it so much easier to read examples instead of remembering “oh yeah, f had two ifs in it, h had the if/else, g calls f which calls h which, …”. Also see this often in other examples: “A” for ‘Truthy variable’ 😓 Wtf. Laziness is good when it makes things easier, not harder.
The article is really not about naming conventions.
Doesn’t matter, it makes the article hard to read. If it were hard to read for another reason, like bad grammar, I’d comment on that too 🤷
Should have still used them. It was harder to read this way.
The blog author is literally using the de facto standard placeholder names.
The var names used by the author are perfectly fine. They don’t cause any issue, nor do they make things hard to read.
Agreed that some people can find it easier with explicit names - however some people find it easier with short meaningless names as it makes them focus on the abstraction rather than the naming. There is no right or wrong here. It all depends on the reader.
I even thought that this (the hardness) was intended to emphasize how hard it is to spot problems in a real codebase 😅
What’s up with that capitalisation, I thought it was about git lfs at first, really confused me lol