Hi, Robin. Could you continue on this LICM issue ?
I am not sure whether my fix is correct, or you may find another way to make LICM works ?



juzhe.zhong@rivai.ai
 
From: Robin Dapp
Date: 2024-02-06 21:14
To: juzhe.zhong@rivai.ai; kito.cheng
CC: rdapp.gcc; gcc-patches; Kito.cheng; jeffreyalaw
Subject: Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence
> The root cause is this following RTL pattern, after fwprop1:
> 
> (insn 82 78 84 9 (set (reg:DI 230)
>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (subreg:SI (reg:DI 221) 0)))) 13 {subsi3_extended}
>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 *(const_poly_int:SI [-16, -16])*))
>         (nil)))
> 
> The highlight *(const_poly_int:SI [-16, -16])*
> causes ICE.
> 
> This RTL is because:
> (insn 69 68 71 8 (set (reg:DI 221)
>         (const_poly_int:DI [16, 16])) 208 {*movdi_64bit}
>      (nil))
> (insn 82 78 84 9 (set (reg:DI 230)
>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (subreg:SI (reg:DI 221) 0)))) 13 {subsi3_extended}                                          ----> (subreg:SI (const_poly_int:SI [-16, -16])) fwprop1 add  (const_poly_int:SI [-16, -16]) reg_equal
>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (const_poly_int:SI [-16, -16])))
>         (nil)))
 
I'm seeing a slightly different pattern but that doesn't change
the problem.
 
> (set (reg:SI)  (subreg:SI (DI: poly value))) but it causes ICE that I
> mentioned above.
 
That's indeed a bit more idiomatic and I wouldn't oppose that.
 
The problem causing the ICE is that we want to simplify a PLUS
with (const_poly_int:SI [16, 16]) and (const_int 0) but the mode
is DImode.  My suspicion is that this is caused by our
addsi3_extended pattern and we fail to deduce the proper mode
for analysis.
 
I'm just speculating but maybe that's because we assert that a
plus is of the form simple_reg_p (op0) && CONSTANT_P (op1).
Usually, constants don't have a mode and can just be used.
poly_int_csts do have one and need to be explicitly converted
(kind of).
 
We can only analyze this zero_extended plus at all since Jeff
added the addsi3_extended handling for loop-iv.   Maybe we could
punt like
 
diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index eb7e923a38b..796413c25a3 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -714,6 +714,9 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
          if (!simple_reg_p (op0) || !CONSTANT_P (op1))
            return false;
+         if (CONST_POLY_INT_P (op1) && GET_MODE (op1) != outer_mode)
+           return false;
+
 
This helps for your test case but I haven't done any further
testing.  I'd think this is relatively safe because it's only
a missed analysis/optimization in the worst case.
Still, generally, I don't see a reason why we wouldn't be able
to analyze this?
 
Regards
Robin