Hi,

I noticed that gcc does not handle constant shifts well: it replaces some left shifts (by 3 to 7) on 32-bit variables with add and adc operations. If the shift is not optimized and simply unrolled, both forms should take the same number of cycles, but for some reason gcc also adds some mov operations in the later part, which makes performance even worse. In terms of code size, shifts are better than add and adc operations. All constant shifts could also be optimized further, as one can see for the other variable sizes, e.g. 16-bit.

I should also mention that this only happens with left shifts on 32-bit values (maybe also on 24-bit) and with some optimizer option other than Os.

I sent a constant-case optimisation to the patch mailing list, but I was not able to figure out where this bad optimisation comes from.

I prepared a Compiler Explorer example so you can get an easy grasp of it:
https://godbolt.org/z/x75EM94rE

In case Compiler Explorer does not work for you, the example code is:

    unsigned long lshift32_c(const unsigned long value)
    {
        return value << 7;
    }

With O0, this shows the wrong replacement by the optimizer:

    add r24,r24
    adc r25,r25
    adc r26,r26
    adc r27,r27

With O2, it results in a lot of single, unoptimized shifts and some additional useless mov operations near the end:

    lsl r24
    rol r25
    rol r26
    rol r27
    ...
    movw r18,r24
    movw r20,r26
    lsl r18
    rol r19
    rol r20
    rol r21
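To make the comparison concrete, here is a small host-side C sketch of the forms discussed above. It is only an illustration under some assumptions: uint32_t stands in for the 32-bit unsigned long of the target, and the function names shl7_ref, shl7_doubling and shl7_bytemove are hypothetical. The last function shows one possible cheaper sequence (move the bytes up by one, then shift right once); it is not necessarily what the submitted patch or gcc itself does.

```c
#include <stdint.h>

/* Reference: the plain constant shift from the example code.
   uint32_t is an assumption standing in for a 32-bit unsigned long. */
uint32_t shl7_ref(uint32_t v)
{
    return v << 7;
}

/* Model of the add/adc expansion: one add/adc/adc/adc group doubles
   the whole 32-bit value, so seven groups shift it left by 7. */
uint32_t shl7_doubling(uint32_t v)
{
    for (int i = 0; i < 7; i++)
        v += v;                 /* wraps modulo 2^32, like the carry chain */
    return v;
}

/* Hypothetical cheaper form: shift left by 8 (pure byte moves),
   then right by 1, putting back the one bit the byte move dropped. */
uint32_t shl7_bytemove(uint32_t v)
{
    uint32_t carry = (v >> 24) & 1u;   /* bit 24 must end up in bit 31 */
    uint32_t moved = v << 8;           /* byte move, low byte cleared */
    return (moved >> 1) | (carry << 31);
}
```

All three functions return the same value for every input. The point of the third one is that a byte move plus a single one-bit shift needs far fewer steps than seven one-bit shifts.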