Hi, Robin.

Could you continue on this LICM issue? I am not sure whether my fix is correct, or whether you can find another way to make LICM work.

juzhe.zhong@rivai.ai

From: Robin Dapp
Date: 2024-02-06 21:14
To: juzhe.zhong@rivai.ai; kito.cheng
CC: rdapp.gcc; gcc-patches; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

> The root cause is this following RTL pattern, after fwprop1:
>
> (insn 82 78 84 9 (set (reg:DI 230)
>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (subreg:SI (reg:DI 221) 0)))) 13 {subsi3_extended}
>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 *(const_poly_int:SI [-16, -16])*))
>         (nil)))
>
> The highlight *(const_poly_int:SI [-16, -16])*
> causes ICE.
>
> This RTL is because:
> (insn 69 68 71 8 (set (reg:DI 221)
>         (const_poly_int:DI [16, 16])) 208 {*movdi_64bit}
>      (nil))
> (insn 82 78 84 9 (set (reg:DI 230)
>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (subreg:SI (reg:DI 221) 0)))) 13 {subsi3_extended} ----> (subreg:SI (const_poly_int:SI [-16, -16])) fwprop1 add (const_poly_int:SI [-16, -16]) reg_equal
>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (const_poly_int:SI [-16, -16])))
>         (nil)))

I'm seeing a slightly different pattern but that doesn't change
the problem.

> (set (reg:SI) (subreg:SI (DI: poly value))) but it causes ICE that I
> mentioned above.

That's indeed a bit more idiomatic and I wouldn't oppose that.

The problem causing the ICE is that we want to simplify a PLUS
with (const_poly_int:SI [16, 16]) and (const_int 0) but the mode is
DImode.  My suspicion is that this is caused by our addsi3_extended
pattern and we fail to deduce the proper mode for analysis.

I'm just speculating but maybe that's because we assert that a plus
is of the form simple_reg_p (op0) && CONSTANT_P (op1).
Usually, constants don't have a mode and can just be used.
poly_int_csts do have one and need to be explicitly converted
(kind of).

We can only analyze this zero_extended plus at all since Jeff
added the addsi3_extended handling for loop-iv.  Maybe we could
punt like

diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index eb7e923a38b..796413c25a3 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -714,6 +714,9 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
       if (!simple_reg_p (op0) || !CONSTANT_P (op1))
         return false;

+      if (CONST_POLY_INT_P (op1) && GET_MODE (op1) != outer_mode)
+        return false;
+

This helps for your test case but I haven't done any further
testing.  I'd think this is relatively safe because it's only
a missed analysis/optimization in the worst case.
Still, generally, I don't see a reason why we wouldn't be able
to analyze this?

Regards
 Robin
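
For illustration only, the punt in the hunk above can be modeled outside of GCC. The sketch below is a toy and not GCC code: `Mode`, `Kind`, `Operand` and `can_analyze_step` are hypothetical stand-ins for machine modes, RTX codes, rtx operands and the relevant checks in get_biv_step_1, and the register test is much cruder than the real simple_reg_p. It only shows the shape of the decision: a mode-less constant is accepted, while a poly constant is accepted only if its mode equals the outer mode.

```cpp
// Toy model of the proposed guard. NOT GCC code: every type and
// function name below is a hypothetical stand-in for illustration.
#include <cassert>

// None models VOIDmode, i.e. a plain constant that carries no mode.
enum class Mode { None, SI, DI };

enum class Kind { Reg, ConstInt, ConstPolyInt };

struct Operand {
  Kind kind;
  Mode mode;
};

// Returns true if the toy analysis would continue for "op0 + op1",
// false if it would punt.
bool can_analyze_step(const Operand &op0, const Operand &op1,
                      Mode outer_mode) {
  // Stands in for: if (!simple_reg_p (op0) || !CONSTANT_P (op1))
  if (op0.kind != Kind::Reg || op1.kind == Kind::Reg)
    return false;

  // Stands in for the added check: a poly constant has a mode of its
  // own, so give up when that mode differs from the outer mode.
  if (op1.kind == Kind::ConstPolyInt && op1.mode != outer_mode)
    return false;

  return true;
}
```

With these stand-ins, an SImode poly constant analyzed with a DImode outer mode is rejected, which corresponds to the combination described above as causing the ICE. A DImode poly constant or a plain mode-less constant still passes, so in this model the guard only costs a missed analysis.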