Which is more important for rust compilation: higher number of cores or faster single core performance?
from Timely_Jellyfish_2077@programming.dev to rust@programming.dev on 14 Aug 2024 19:41
https://programming.dev/post/18131541
Planning to build a PC in a couple of weeks.
What is the optimal number of cores to have without having diminishing returns?
For a clean build: number of cores (because cargo builds each crate dependency in a separate process), for a build of your crate only: single core perf.
Until the parallel compiler feature (`-Z threads=<n>`) stabilizes and becomes more complete.
It’s also always worth mentioning that the choice of linker is important. Using mold or lld can significantly speed things up in some use-cases.
Beyond that, the `codegen-units` and `lto` profile options are also important.
And finally, for development purposes, the code generator is important, as `cranelift` provides much faster compile times, but the resulting binaries are not as optimized as LLVM-generated ones.
Oh, I have to try these out to see if it affects my development cycle. I do notice that cargo check is super fast, but cargo build takes a long time. So codegen and the linker could be the source of the slowness.
The compiler is getting more and more parallel but there’s a few bottlenecks still. The frontend (parsing, macro expansion, trait resolution, HIR lowering) is still single-threaded, although there’s a parallel implementation on nightly.
Optimal core count really depends on the project you’re compiling. The compiler splits the crate into codegen units that can be processed by LLVM in parallel. The default number of codegen units is currently 16 for release builds and 256 for (incremental) debug builds.
This theoretically means that you could continue to see performance gains up to 256 cores in debug builds, but in practice there’s going to be other bottlenecks.
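To put a rough number on "other bottlenecks", here is a minimal sketch using Amdahl's law. The function name `estimated_speedup` and the idea of lumping everything that does not parallelize (frontend, linking, I/O waits) into one `serial_fraction` are my own simplifying assumptions, not something rustc or cargo reports.

```rust
/// Amdahl's-law estimate of the speedup on `cores` cores when
/// `serial_fraction` (0.0 to 1.0) of the build cannot run in parallel.
/// Hypothetical helper for illustration; the serial fraction of a real
/// build has to be measured, it is not a fixed property of rustc.
fn estimated_speedup(serial_fraction: f64, cores: u32) -> f64 {
    assert!((0.0..=1.0).contains(&serial_fraction));
    assert!(cores >= 1);
    // The serial part takes the same time regardless of core count;
    // only the remaining (1 - serial_fraction) is divided across cores.
    1.0 / (serial_fraction + (1.0 - serial_fraction) / f64::from(cores))
}
```

With an assumed serial fraction of 20%, `estimated_speedup(0.2, 16)` comes out at about 4x, and `estimated_speedup(0.2, 256)` still stays below 5x, which is why extra cores stop paying off long before 256.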
Compilation is very memory and disk-I/O intensive as well. Having a fast SSD and plenty of spare memory space that the OS can use for caching files will help. You may also see a benefit from a processor with a large L3 cache, like AMD’s X3D processor variants.
Across a project, it depends on how many dependencies can be compiled in parallel. The dependencies for a crate have to be compiled before the crate itself can be compiled, so the upper limit to parallelism here is set by your dependency graph. But this really only matters for fresh builds.
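The dependency-graph limit can be sketched as a critical-path calculation: no matter how many cores you have, a fresh build cannot finish faster than the longest chain of crates that must compile one after another. The function `critical_path`, the crate names, and the per-crate times in the usage note are all made up for illustration, and the model ignores details such as pipelined compilation.

```rust
use std::collections::HashMap;

/// Length in seconds of the longest dependency chain ending at `krate`.
/// `deps` maps a crate to its direct dependencies, `secs` maps a crate
/// to its (assumed) compile time, `memo` caches already computed crates.
fn critical_path(
    krate: &str,
    deps: &HashMap<&str, Vec<&str>>,
    secs: &HashMap<&str, f64>,
    memo: &mut HashMap<String, f64>,
) -> f64 {
    if let Some(&cached) = memo.get(krate) {
        return cached;
    }
    // A crate can only start once its slowest dependency chain is done.
    let longest_dep = deps
        .get(krate)
        .map(|ds| {
            ds.iter()
                .map(|d| critical_path(d, deps, secs, memo))
                .fold(0.0, f64::max)
        })
        .unwrap_or(0.0);
    let total = secs.get(krate).copied().unwrap_or(0.0) + longest_dep;
    memo.insert(krate.to_string(), total);
    total
}
```

For a hypothetical graph where `app` (10 s) depends on `parser` (6 s) and `util` (1 s), and `parser` depends on `macros` (4 s), the chain `macros`, `parser`, `app` sets a floor of 20 s for a fresh build. Only `util` can overlap with that chain, so a third core would sit idle.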
Preference order is: Fast IO like SSD and mirror RAID, then RAM size, then core count and then core speed.
This is not Rust specific.
Unless all you’re doing is compiling, I’d say single core performance is more important.