While this is slightly off-topic, there’s certainly room for a middle ground: running a dozen or so of the basic optimizations (most notably mem2reg, sroa, instcombine) instead of the full ~70 passes that run at O1. As noted above, O1 takes ~1.79x longer to compile than O0 and finishes executing benchmarks in ~0.31x the time. If you want to experiment with finding a configuration between the two, I’d recommend starting by running only the “O1 Function Simplification Pipeline” (see PassBuilderPipelines.cpp). From there, either bring in passes from the “Module Optimization Pipeline” if the generated code runs too slowly, or remove further passes from the simplification pipeline if compile time is too high.
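For a quick way to try such a reduced pass list before touching PassBuilderPipelines.cpp, something like the following sketch may help. It only builds (and optionally times) an `opt` invocation using the new pass manager's textual `-passes=` syntax. The helper names (`build_pipeline`, `opt_command`, `time_opt`), the file names, and the choice of exactly these three passes are my own assumptions for illustration, not a recommended configuration.

```python
import shutil
import subprocess
import time

# Assumed starting set: just the three passes named above.
BASIC_PASSES = ["mem2reg", "sroa", "instcombine"]


def build_pipeline(passes):
    """Wrap a list of function passes in the textual form opt's -passes= accepts."""
    if not passes:
        raise ValueError("need at least one pass")
    return "function(" + ",".join(passes) + ")"


def opt_command(input_ll, output_ll, passes, opt="opt"):
    """Return the argv list for running `opt` with only the given passes."""
    return [opt, "-S", "-passes=" + build_pipeline(passes), input_ll, "-o", output_ll]


def time_opt(input_ll, output_ll, passes, opt="opt"):
    """Run opt and return elapsed seconds, or None if opt is not on PATH."""
    if shutil.which(opt) is None:
        return None
    start = time.perf_counter()
    subprocess.run(opt_command(input_ll, output_ll, passes, opt), check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # input.ll / output.ll are placeholder file names.
    print(" ".join(opt_command("input.ll", "output.ll", BASIC_PASSES)))
```

One caveat if the input IR comes from clang at O0: clang marks functions `optnone` there, so these passes would skip them; emitting the IR with `-Xclang -disable-O0-optnone` avoids that. Adding or dropping names in `BASIC_PASSES` and re-timing both the `opt` run and the resulting binary is a cheap way to explore the trade-off before committing to a custom pipeline.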