public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [committed] [RISC-V] Avoid sub-word mode comparisons with Zicond
@ 2023-08-02  5:41 Jeff Law
  2023-08-05  7:46 ` Xiao Zeng
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Law @ 2023-08-02  5:41 UTC (permalink / raw)
  To: gcc-patches; +Cc: Xiao Zeng

[-- Attachment #1: Type: text/plain, Size: 4466 bytes --]


c-torture/execute/pr59014-2.c fails with the Zicond work on rv64.  We 
miscompile the "foo" routine because we have eliminated a required sign 
extension.

The key routine looks like this:

foo (long long int x, long long int y)
{
   if (((int) x | (int) y) != 0)
     return 6;
   return x + y;
}

So we kindof do the expected thing at expansion time.  We IOR X and Y, 
sign extend the result from 32 to 64 bits (note how the values in the 
source are casted from long long to ints), then emit a suitable 
conditional branch.  ie:

> (insn 10 4 12 2 (set (reg:DI 142)
>         (ior:DI (reg/v:DI 138 [ x ])
>             (reg/v:DI 139 [ y ]))) "j.c":6:16 99 {iordi3}
>      (nil))
> (insn 12 10 13 2 (set (reg:DI 144)
>         (sign_extend:DI (subreg:SI (reg:DI 142) 0))) "j.c":6:6 116 {extendsidi2}
>      (nil))
> (jump_insn 13 12 14 2 (set (pc)
>         (if_then_else (ne (reg:DI 144)
>                 (const_int 0 [0]))
>             (label_ref:DI 27)
>             (pc))) "j.c":6:6 243 {*branchdi}
>      (expr_list:REG_DEAD (reg:DI 144)
>         (int_list:REG_BR_PROB 233216732 (nil)))

When we if-convert that we generate this sequence:

> (insn 10 4 12 2 (set (reg:DI 142)
>         (ior:DI (reg/v:DI 138 [ x ])
>             (reg/v:DI 139 [ y ]))) "j.c":6:16 99 {iordi3}
>      (nil))
> (insn 12 10 30 2 (set (reg:DI 144)
>         (sign_extend:DI (subreg:SI (reg:DI 142) 0))) "j.c":6:6 116 {extendsidi2}
>      (nil))
> (insn 30 12 31 2 (set (reg:DI 147)
>         (const_int 6 [0x6])) "j.c":8:12 179 {*movdi_64bit}
>      (nil))     
> (insn 31 30 33 2 (set (reg:DI 146)
>         (plus:DI (reg/v:DI 138 [ x ])
>             (reg/v:DI 139 [ y ]))) "j.c":8:12 5 {adddi3} 
>      (nil))      
> (insn 33 31 34 2 (set (reg:DI 149)
>         (if_then_else:DI (ne:DI (reg:DI 144)
>                 (const_int 0 [0]))
>             (const_int 0 [0])
>             (reg:DI 146))) "j.c":8:12 11368 {*czero.nez.didi}
>      (nil)) 
> (insn 34 33 35 2 (set (reg:DI 148) 
>         (if_then_else:DI (eq:DI (reg:DI 144)
>                 (const_int 0 [0]))
>             (const_int 0 [0])
>             (reg:DI 147))) "j.c":8:12 11367 {*czero.eqz.didi}
>      (nil))
> (insn 35 34 21 2 (set (reg:DI 137 [ <retval> ])
>         (ior:DI (reg:DI 148)
>             (reg:DI 149))) "j.c":8:12 99 {iordi3}
>      (nil))

Which looks basically OK.  The sign extended subreg is a bit worrisome 
though.  And sure enough when we get into combine:

> Failed to match this instruction:
> (parallel [
>         (set (reg:DI 149)
>             (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0)
>                     (const_int 0 [0]))
>                 (reg:DI 146)
>                 (const_int 0 [0])))
>         (set (reg:DI 144)
>             (sign_extend:DI (subreg:SI (reg:DI 142) 0)))
>     ])
> Successfully matched this instruction:
> (set (reg:DI 144)
>     (sign_extend:DI (subreg:SI (reg:DI 142) 0)))
> Successfully matched this instruction:
> (set (reg:DI 149)
>     (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0)
>             (const_int 0 [0]))
>         (reg:DI 146)
>         (const_int 0 [0])))
> allowing combination of insns 12 and 33

Since we need the side effect we first try the PARALLEL with two sets. 
That, as expected, fails.  Generic combine code then tries to pull apart 
the two sets as distinct insns resulting in this conditional move:

> (insn 33 31 34 2 (set (reg:DI 149)
>         (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0)
>                 (const_int 0 [0]))
>             (reg:DI 146)
>             (const_int 0 [0]))) "j.c":8:12 11347 {*czero.nez.disi}
>      (expr_list:REG_DEAD (reg:DI 146)
>         (nil)))

Bzzt.  We can't actually implement this RTL in the hardware.  Basically 
it's asking to do 32bit comparison on rv64, ignoring the upper 32 bits 
of the input register.  That's not actually how zicond works.

The operands to the comparison need to be in DImode for rv64 and SImode 
for rv32.  That's the X iterator.  Note the mode of the comparison 
operands may be different than the mode of the destination.  ie, we 
might have a 64bit comparison and produce a 32bit sign extended result 
much like the setcc insns support.

This patch changes the 6 zicond patterns to use the X iterator on the 
comparison inputs.  That at least makes the patterns correct and fixes 
this particular testcase.   There's a few other lurking problems that 
I'll address in additional patches.

Committed to the trunk,
Jeff





[-- Attachment #2: P --]
[-- Type: text/plain, Size: 8890 bytes --]

commit 2d73f2eb80caf328bc4dd1324d475e7bf6b56837
Author: Jeff Law <jlaw@ventanamicro.com>
Date:   Tue Aug 1 23:12:16 2023 -0600

    [committed] [RISC-V] Avoid sub-word mode comparisons with Zicond
    
    c-torture/execute/pr59014-2.c fails with the Zicond work on rv64.  We
    miscompile the "foo" routine because we have eliminated a required sign
    extension.
    
    The key routine looks like this:
    
    foo (long long int x, long long int y)
    {
      if (((int) x | (int) y) != 0)
        return 6;
      return x + y;
    }
    
    So we kindof do the expected thing.  We IOR X and Y, sign extend the result
    from 32 to 64 bits, then emit a suitable conditional branch.  ie:
    
    > (insn 10 4 12 2 (set (reg:DI 142)
    >         (ior:DI (reg/v:DI 138 [ x ])
    >             (reg/v:DI 139 [ y ]))) "j.c":6:16 99 {iordi3}
    >      (nil))
    > (insn 12 10 13 2 (set (reg:DI 144)
    >         (sign_extend:DI (subreg:SI (reg:DI 142) 0))) "j.c":6:6 116 {extendsidi2}
    >      (nil))
    > (jump_insn 13 12 14 2 (set (pc)
    >         (if_then_else (ne (reg:DI 144)
    >                 (const_int 0 [0]))
    >             (label_ref:DI 27)
    >             (pc))) "j.c":6:6 243 {*branchdi}
    >      (expr_list:REG_DEAD (reg:DI 144)
    >         (int_list:REG_BR_PROB 233216732 (nil)))
    When we if-convert that we generate this sequence:
    
    > (insn 10 4 12 2 (set (reg:DI 142)
    >         (ior:DI (reg/v:DI 138 [ x ])
    >             (reg/v:DI 139 [ y ]))) "j.c":6:16 99 {iordi3}
    >      (nil))
    > (insn 12 10 30 2 (set (reg:DI 144)
    >         (sign_extend:DI (subreg:SI (reg:DI 142) 0))) "j.c":6:6 116 {extendsidi2}
    >      (nil))
    > (insn 30 12 31 2 (set (reg:DI 147)
    >         (const_int 6 [0x6])) "j.c":8:12 179 {*movdi_64bit}
    >      (nil))
    > (insn 31 30 33 2 (set (reg:DI 146)
    >         (plus:DI (reg/v:DI 138 [ x ])
    >             (reg/v:DI 139 [ y ]))) "j.c":8:12 5 {adddi3}
    >      (nil))
    > (insn 33 31 34 2 (set (reg:DI 149)
    >         (if_then_else:DI (ne:DI (reg:DI 144)
    >                 (const_int 0 [0]))
    >             (const_int 0 [0])
    >             (reg:DI 146))) "j.c":8:12 11368 {*czero.nez.didi}
    >      (nil))
    > (insn 34 33 35 2 (set (reg:DI 148)
    >         (if_then_else:DI (eq:DI (reg:DI 144)
    >                 (const_int 0 [0]))
    >             (const_int 0 [0])
    >             (reg:DI 147))) "j.c":8:12 11367 {*czero.eqz.didi}
    >      (nil))
    > (insn 35 34 21 2 (set (reg:DI 137 [ <retval> ])
    >         (ior:DI (reg:DI 148)
    >             (reg:DI 149))) "j.c":8:12 99 {iordi3}
    >      (nil))
    Which looks basically OK.  The sign extended subreg is a bit worrisome though.
    And sure enough when we get into combine:
    
    > Failed to match this instruction:
    > (parallel [
    >         (set (reg:DI 149)
    >             (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0)
    >                     (const_int 0 [0]))
    >                 (reg:DI 146)
    >                 (const_int 0 [0])))
    >         (set (reg:DI 144)
    >             (sign_extend:DI (subreg:SI (reg:DI 142) 0)))
    >     ])
    > Successfully matched this instruction:
    > (set (reg:DI 144)
    >     (sign_extend:DI (subreg:SI (reg:DI 142) 0)))
    > Successfully matched this instruction:
    > (set (reg:DI 149)
    >     (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0)
    >             (const_int 0 [0]))
    >         (reg:DI 146)
    >         (const_int 0 [0])))
    > allowing combination of insns 12 and 33
    Since we need the side effect we first try the PARALLEL with two sets.
    That, as expected, fails.  Generic combine code then tries to pull apart
    the two sets as distinct insns resulting in this conditional move:
    
    > (insn 33 31 34 2 (set (reg:DI 149)
    >         (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0)
    >                 (const_int 0 [0]))
    >             (reg:DI 146)
    >             (const_int 0 [0]))) "j.c":8:12 11347 {*czero.nez.disi}
    >      (expr_list:REG_DEAD (reg:DI 146)
    >         (nil)))
    Bzzt.  We can't actually implement this RTL in the hardware.  Basically
    it's asking to do 32bit comparison on rv64, ignoring the upper 32 bits
    of the input register.  That's not actually how zicond works.
    
    The operands to the comparison need to be in DImode for rv64 and SImode
    for rv32.  That's the X iterator.  Note the mode of the comparison
    operands may be different than the mode of the destination.  ie, we might
    have a 64bit comparison and produce a 32bit sign extended result much
    like the setcc insns support.
    
    This patch changes the 6 zicond patterns to use the X iterator on the
    comparison inputs and fixes the testsuite failure.
    
    gcc/
    
            * config/riscv/zicond.md: Use the X iterator instead of ANYI
            on the comparison input operands.

diff --git a/gcc/config/riscv/zicond.md b/gcc/config/riscv/zicond.md
index 723a22422e1..8f24b3a1690 100644
--- a/gcc/config/riscv/zicond.md
+++ b/gcc/config/riscv/zicond.md
@@ -22,9 +22,9 @@ (define_code_attr eqz [(eq "nez") (ne "eqz")])
 (define_code_attr nez [(eq "eqz") (ne "nez")])
 
 ;; Zicond
-(define_insn "*czero.<eqz>.<GPR:mode><ANYI:mode>"
+(define_insn "*czero.<eqz>.<GPR:mode><X:mode>"
   [(set (match_operand:GPR 0 "register_operand"                      "=r")
-        (if_then_else:GPR (eq_or_ne (match_operand:ANYI 1 "register_operand" "r")
+        (if_then_else:GPR (eq_or_ne (match_operand:X 1 "register_operand" "r")
                                     (const_int 0))
                           (match_operand:GPR 2 "register_operand"    "r")
                           (const_int 0)))]
@@ -32,9 +32,9 @@ (define_insn "*czero.<eqz>.<GPR:mode><ANYI:mode>"
   "czero.<eqz>\t%0,%2,%1"
 )
 
-(define_insn "*czero.<nez>.<GPR:mode><ANYI:mode>"
+(define_insn "*czero.<nez>.<GPR:mode><X:mode>"
   [(set (match_operand:GPR 0 "register_operand"                     "=r")
-        (if_then_else:GPR (eq_or_ne (match_operand:ANYI 1 "register_operand" "r")
+        (if_then_else:GPR (eq_or_ne (match_operand:X 1 "register_operand" "r")
                                     (const_int 0))
                           (const_int 0)
                           (match_operand:GPR 2 "register_operand"   "r")))]
@@ -43,9 +43,9 @@ (define_insn "*czero.<nez>.<GPR:mode><ANYI:mode>"
 )
 
 ;; Special optimization under eq/ne in primitive semantics
-(define_insn "*czero.eqz.<GPR:mode><ANYI:mode>.opt1"
+(define_insn "*czero.eqz.<GPR:mode><X:mode>.opt1"
   [(set (match_operand:GPR 0 "register_operand"                   "=r")
-        (if_then_else:GPR (eq (match_operand:ANYI 1 "register_operand" "r")
+        (if_then_else:GPR (eq (match_operand:X 1 "register_operand" "r")
                               (const_int 0))
                           (match_operand:GPR 2 "register_operand" "1")
                           (match_operand:GPR 3 "register_operand" "r")))]
@@ -53,9 +53,9 @@ (define_insn "*czero.eqz.<GPR:mode><ANYI:mode>.opt1"
   "czero.eqz\t%0,%3,%1"
 )
 
-(define_insn "*czero.eqz.<GPR:mode><ANYI:mode>.opt2"
+(define_insn "*czero.eqz.<GPR:mode><X:mode>.opt2"
   [(set (match_operand:GPR 0 "register_operand"                   "=r")
-        (if_then_else:GPR (eq (match_operand:ANYI 1 "register_operand" "r")
+        (if_then_else:GPR (eq (match_operand:X 1 "register_operand" "r")
                               (const_int 0))
                           (match_operand:GPR 2 "register_operand" "r")
                           (match_operand:GPR 3 "register_operand" "1")))]
@@ -63,9 +63,9 @@ (define_insn "*czero.eqz.<GPR:mode><ANYI:mode>.opt2"
   "czero.nez\t%0,%2,%1"
 )
 
-(define_insn "*czero.nez.<GPR:mode><ANYI:mode>.opt3"
+(define_insn "*czero.nez.<GPR:mode><X:mode>.opt3"
   [(set (match_operand:GPR 0 "register_operand"                   "=r")
-        (if_then_else:GPR (ne (match_operand:ANYI 1 "register_operand" "r")
+        (if_then_else:GPR (ne (match_operand:X 1 "register_operand" "r")
                               (const_int 0))
                           (match_operand:GPR 2 "register_operand" "r")
                           (match_operand:GPR 3 "register_operand" "1")))]
@@ -73,9 +73,9 @@ (define_insn "*czero.nez.<GPR:mode><ANYI:mode>.opt3"
   "czero.eqz\t%0,%2,%1"
 )
 
-(define_insn "*czero.nez.<GPR:mode><ANYI:mode>.opt4"
+(define_insn "*czero.nez.<GPR:mode><X:mode>.opt4"
   [(set (match_operand:GPR 0 "register_operand"                   "=r")
-        (if_then_else:GPR (ne (match_operand:ANYI 1 "register_operand" "r")
+        (if_then_else:GPR (ne (match_operand:X 1 "register_operand" "r")
                               (const_int 0))
                           (match_operand:GPR 2 "register_operand" "1")
                           (match_operand:GPR 3 "register_operand" "r")))]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-08-05 14:56 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-02  5:41 [committed] [RISC-V] Avoid sub-word mode comparisons with Zicond Jeff Law
2023-08-05  7:46 ` Xiao Zeng
2023-08-05 14:56   ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).