Thank you for your effort.
I had evaluated only the intrate tests.
I am glad to see the same result on Leela.

On Tue, Aug 1, 2023 at 11:14 PM Vineet Gupta wrote:
>
>
> On 7/25/23 20:31, Jeff Law via Gcc-patches wrote:
> >
> >
> > On 7/25/23 05:24, Jivan Hakobyan wrote:
> >> Hi.
> >>
> >> I re-run the benchmarks and hopefully got the same profit.
> >> I also compared the leela's code and figured out the reason.
> >>
> >> Actually, my and Manolis's patches do the same thing. The difference
> >> is only execution order.
> > But shouldn't your patch also allow for for at the last the potential
> > to pull the fp+offset computation out of a loop? I'm pretty sure
> > Manolis's patch can't do that.
> >
> >> Because of f-m-o held after the register allocation it cannot
> >> eliminate redundant move 'sp' to another register.
> > Actually that's supposed to be handled by a different patch that
> > should already be upstream. Specifically;
> >
> >> commit 6a2e8dcbbd4bab374b27abea375bf7a921047800
> >> Author: Manolis Tsamis
> >> Date: Thu May 25 13:44:41 2023 +0200
> >>
> >>     cprop_hardreg: Enable propagation of the stack pointer if possible
> >>     Propagation of the stack pointer in cprop_hardreg is currenty
> >>     forbidden in all cases, due to maybe_mode_change returning NULL.
> >>     Relax this restriction and allow propagation when no mode change is
> >>     requested.
> >>     gcc/ChangeLog:
> >>     * regcprop.cc (maybe_mode_change): Enable stack pointer
> >>     propagation.
> > I think there were a couple-follow-ups. But that's the key change
> > that should allow propagation of copies from the stack pointer and
> > thus eliminate the mov gpr,sp instructions. If that's not happening,
> > then it's worth investigating why.
> >
> >>
> >> Besides that, I have checked the build failure on x264_r. It is
> >> already fixed on the third version.
> > Yea, this was a problem with re-recognition. I think it was fixed by:
> >
> >> commit ecfa870ff29d979bd2c3d411643b551f2b6915b0
> >> Author: Vineet Gupta
> >> Date: Thu Jul 20 11:15:37 2023 -0700
> >>
> >>     RISC-V: optim const DF +0.0 store to mem [PR/110748]
> >>     Fixes: ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT")
> >>     DF +0.0 is bitwise all zeros so int x0 store to mem can be
> >>     used to optimize it.
> > [ ... ]
> >
> >
> > So I think the big question WRT your patch is does it still help the
> > case where we weren't pulling the fp+offset computation out of a loop.
>
> I have some numbers for f-m-o v3 vs this. Attached here (vs. inline to
> avoid the Thunderbird mangling the test formatting)
>

--
With the best regards
Jivan Hakobyan