public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc
@ 2024-01-08 19:17 patrick at rivosinc dot com
  2024-01-08 19:20 ` [Bug target/113281] " patrick at rivosinc dot com
                   ` (32 more replies)
  0 siblings, 33 replies; 34+ messages in thread
From: patrick at rivosinc dot com @ 2024-01-08 19:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

            Bug ID: 113281
           Summary: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch
                    with rv64gc
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: patrick at rivosinc dot com
  Target Milestone: ---

Testcase:
unsigned char a;

int main() {
  short b = a = 0;
  for (; a != 19; a++)
    if (a)
      b = 32872 >> a;

  if (b == 0)
    return 0;
  else
    return 1;
}

Commands:
rv64gcv_zvl256b:
> /scratch/tc-testing/tc-jan-8-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc -march=rv64gcv_zvl256b -O3 red.c -o user-config.out
> QEMU_CPU=rv64,vlen=256,v=true,vext_spec=v1.0,Zve32f=true,Zve64f=true timeout --verbose -k 0.1 1 /scratch/tc-testing/tc-jan-8-trunk/build-rv64gcv/bin/qemu-riscv64 user-config.out
> echo $?
1

rv64gc:
> /scratch/tc-testing/tc-jan-8-trunk/build-rv64gcv/bin/riscv64-unknown-linux-gnu-gcc -march=rv64gc -O3 red.c -o rv64gc.out
> QEMU_CPU=rv64,vlen=256,v=true,vext_spec=v1.0,Zve32f=true,Zve64f=true timeout --verbose -k 0.1 1 /scratch/tc-testing/tc-jan-8-trunk/build-rv64gcv/bin/qemu-riscv64 rv64gc.out
> echo $?
0

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug target/113281] [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
@ 2024-01-08 19:20 ` patrick at rivosinc dot com
  2024-01-08 21:49 ` rdapp at gcc dot gnu.org
                   ` (31 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: patrick at rivosinc dot com @ 2024-01-08 19:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Patrick O'Neill <patrick at rivosinc dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
                 CC|                            |juzhe.zhong at rivai dot ai,
                   |                            |patrick at rivosinc dot com,
                   |                            |rdapp at gcc dot gnu.org,
                   |                            |vineetg at rivosinc dot com
             Target|                            |riscv

--- Comment #1 from Patrick O'Neill <patrick at rivosinc dot com> ---
Godbolt: https://godbolt.org/z/efPhqzcr5

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug target/113281] [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
  2024-01-08 19:20 ` [Bug target/113281] " patrick at rivosinc dot com
@ 2024-01-08 21:49 ` rdapp at gcc dot gnu.org
  2024-01-09  1:32 ` juzhe.zhong at rivai dot ai
                   ` (30 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: rdapp at gcc dot gnu.org @ 2024-01-08 21:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #2 from Robin Dapp <rdapp at gcc dot gnu.org> ---
Confirmed.  Funny, we shouldn't vectorize that but really optimize to "return
0".  Costing might be questionable but we also haven't optimized away the loop
when comparing costs.

Disregarding that, of course the vectorization should be correct.

The vect output doesn't really make sense to me but I haven't looked very
closely yet:

  _177 = .SELECT_VL (2, POLY_INT_CST [16, 16]);
  vect_patt_82.18_166 = (vector([16,16]) unsigned short) { 17, 18, 19, ... };
  vect_patt_84.19_168 = MIN_EXPR <vect_patt_82.18_166, { 15, ... }>;
  vect_patt_85.20_170 = { 32872, ... } >> vect_patt_84.19_168;
  vect_patt_87.21_171 = VIEW_CONVERT_EXPR<vector([16,16]) short
intD.12>(vect_patt_85.20_170);
  _173 = _177 + 18446744073709551615;
  # RANGE [irange] short int [0, 16436] MASK 0x7fff VALUE 0x0
  _174 = .VEC_EXTRACT (vect_patt_87.21_171, _173);

vect_patt_85.20_170 should be all zeros and then we'd just vec_extract a 0 and
return that.  However, 32872 >> 15 == 1 so we return 1.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug target/113281] [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
  2024-01-08 19:20 ` [Bug target/113281] " patrick at rivosinc dot com
  2024-01-08 21:49 ` rdapp at gcc dot gnu.org
@ 2024-01-09  1:32 ` juzhe.zhong at rivai dot ai
  2024-01-09  2:06 ` juzhe.zhong at rivai dot ai
                   ` (29 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2024-01-09  1:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #3 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
I think there are 2 issues here:

1. We should adjust cost model to let loop vectorizer eliminate the
unprofitable
   vectorization. It should be done in RISC-V backend.
2. We should fix run fail bug with -fno-vect-cost-model.

I think 1st issue is simpler than second one. And I highly doubt that the
second
one is not RISC-V bug is middle-end bug.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug target/113281] [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (2 preceding siblings ...)
  2024-01-09  1:32 ` juzhe.zhong at rivai dot ai
@ 2024-01-09  2:06 ` juzhe.zhong at rivai dot ai
  2024-01-09  2:08 ` juzhe.zhong at rivai dot ai
                   ` (28 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2024-01-09  2:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #4 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
Confirm reduced case:

#include <assert.h>
unsigned char a;

int main() {
  short b = a = 0;
  for (; a != 19; a++)
    if (a)
      b = 32872 >> a;

  assert (b == 0);
}

with -fno-vect-cost-model -march=rv64gcv -O3:

https://godbolt.org/z/joGb3e9Eb

Also run failed assertion "b == 0" failed: file "bug.c", line 10, function:
main

I suspect ARM SVE has the same fail.

Hi, Andrew. Could you test this case on ARM to see whether ARM has same issue
as RISC-V for me ?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug target/113281] [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (3 preceding siblings ...)
  2024-01-09  2:06 ` juzhe.zhong at rivai dot ai
@ 2024-01-09  2:08 ` juzhe.zhong at rivai dot ai
  2024-01-09  2:12 ` [Bug tree-optimization/113281] " pinskia at gcc dot gnu.org
                   ` (27 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2024-01-09  2:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #5 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to JuzheZhong from comment #4)
> Confirm reduced case:
> 
> #include <assert.h>
> unsigned char a;
> 
> int main() {
>   short b = a = 0;
>   for (; a != 19; a++)
>     if (a)
>       b = 32872 >> a;
>   
>   assert (b == 0);
> }
> 
> with -fno-vect-cost-model -march=rv64gcv -O3:
> 
> https://godbolt.org/z/joGb3e9Eb
> 
> Also run failed assertion "b == 0" failed: file "bug.c", line 10, function:
> main
> 
> I suspect ARM SVE has the same fail.
> 
> Hi, Andrew. Could you test this case on ARM to see whether ARM has same
> issue as RISC-V for me ?

The vect dump tree is quite similar between ARM and RISC-V.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (4 preceding siblings ...)
  2024-01-09  2:08 ` juzhe.zhong at rivai dot ai
@ 2024-01-09  2:12 ` pinskia at gcc dot gnu.org
  2024-01-09  2:16 ` pinskia at gcc dot gnu.org
                   ` (26 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-09  2:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-01-09
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
          Component|target                      |tree-optimization
             Target|riscv                       |riscv aarch64-linux-gnu

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed on aarch64 too:

[apinski@xeond2 upstream-full-cross]$ ./install/bin/aarch64-linux-gnu-gcc -O3
-march=armv9-a+sve2 t.c -static -fno-vect-cost-model
[apinski@xeond2 upstream-full-cross]$ ./install-qemu/bin/qemu-aarch64 a.out
;echo $?
1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (5 preceding siblings ...)
  2024-01-09  2:12 ` [Bug tree-optimization/113281] " pinskia at gcc dot gnu.org
@ 2024-01-09  2:16 ` pinskia at gcc dot gnu.org
  2024-01-09  2:17 ` juzhe.zhong at rivai dot ai
                   ` (25 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-09  2:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>  short b = a = 0;

s/short/int/ allows it to work ...

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (6 preceding siblings ...)
  2024-01-09  2:16 ` pinskia at gcc dot gnu.org
@ 2024-01-09  2:17 ` juzhe.zhong at rivai dot ai
  2024-01-09  2:58 ` [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions pinskia at gcc dot gnu.org
                   ` (24 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2024-01-09  2:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #8 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to Andrew Pinski from comment #7)
> >  short b = a = 0;
> 
> s/short/int/ allows it to work ...

Thanks Andrew. I think we should change the tittle.

It should be mismatch on both RISC-V and ARM SVE with -fno-vect-cost-model.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (7 preceding siblings ...)
  2024-01-09  2:17 ` juzhe.zhong at rivai dot ai
@ 2024-01-09  2:58 ` pinskia at gcc dot gnu.org
  2024-01-09  2:59 ` pinskia at gcc dot gnu.org
                   ` (23 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-09  2:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[14] RISC-V rv64gcv_zvl256b |[14 Regression] Wrong code
                   |vector: Runtime mismatch    |due to vectorization of
                   |with rv64gc                 |shift reduction and missing
                   |                            |promotions
   Target Milestone|---                         |14.0
      Known to work|                            |13.2.0

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (8 preceding siblings ...)
  2024-01-09  2:58 ` [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions pinskia at gcc dot gnu.org
@ 2024-01-09  2:59 ` pinskia at gcc dot gnu.org
  2024-01-09  8:32 ` rguenth at gcc dot gnu.org
                   ` (22 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-09  2:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to JuzheZhong from comment #8)
> It should be mismatch on both RISC-V and ARM SVE with -fno-vect-cost-model.

Changed and checked to see that GCC 13 didn't vectorize the loop for aarch64
SVE either so it is a regression.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (9 preceding siblings ...)
  2024-01-09  2:59 ` pinskia at gcc dot gnu.org
@ 2024-01-09  8:32 ` rguenth at gcc dot gnu.org
  2024-01-10 13:26 ` rguenth at gcc dot gnu.org
                   ` (21 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-01-09  8:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |needs-bisection
                 CC|                            |rguenth at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
  vect_patt_84.19_168 = MIN_EXPR <vect_patt_82.18_166, { 15, ... }>;

this one is new, but I think there's an existing open bug for the "widening"
pattern recognition failing to make sure the shift argument is kept in range.
Or maybe that was the same that spurred the MIN above (but IIRC that was
some "undefined behavior" thing, not wrong-code).

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (10 preceding siblings ...)
  2024-01-09  8:32 ` rguenth at gcc dot gnu.org
@ 2024-01-10 13:26 ` rguenth at gcc dot gnu.org
  2024-01-12 12:16 ` cvs-commit at gcc dot gnu.org
                   ` (20 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-01-10 13:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (11 preceding siblings ...)
  2024-01-10 13:26 ` rguenth at gcc dot gnu.org
@ 2024-01-12 12:16 ` cvs-commit at gcc dot gnu.org
  2024-01-15 12:06 ` cvs-commit at gcc dot gnu.org
                   ` (19 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-01-12 12:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #11 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pan Li <panli@gcc.gnu.org>:

https://gcc.gnu.org/g:0acb63670bf1058fce00a75bd318c40be3bfa222

commit r14-7183-g0acb63670bf1058fce00a75bd318c40be3bfa222
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date:   Fri Jan 12 17:28:44 2024 +0800

    RISC-V: Adjust scalar_to_vec cost

    1. Introduce vector regmove new tune info.
    2. Adjust scalar_to_vec cost in add_stmt_cost.

    We will get optimal codegen after this patch with -march=rv64gcv_zvl256b:

            lui     a5,%hi(a)
            li      a4,19
            sb      a4,%lo(a)(a5)
            li      a0,0
            ret

    Tested on both RV32/RV64 no regression, Ok for trunk ?

            PR target/113281

    gcc/ChangeLog:

            * config/riscv/riscv-protos.h (struct regmove_vector_cost): New
struct.
            (struct cpu_vector_cost): Add regmove struct.
            (get_vector_costs): Export as global.
            * config/riscv/riscv-vector-costs.cc (adjust_stmt_cost): Adjust
scalar_to_vec cost.
            (costs::add_stmt_cost): Ditto.
            * config/riscv/riscv.cc (get_common_costs): Export global function.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/rvv/autovec/pr113209.c: Adapt test.
            * gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c: New test.
            * gcc.dg/vect/costmodel/riscv/rvv/pr113281-2.c: New test.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (12 preceding siblings ...)
  2024-01-12 12:16 ` cvs-commit at gcc dot gnu.org
@ 2024-01-15 12:06 ` cvs-commit at gcc dot gnu.org
  2024-01-23 16:58 ` [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027 jakub at gcc dot gnu.org
                   ` (18 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-01-15 12:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #12 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Lehua Ding <lhtin@gcc.gnu.org>:

https://gcc.gnu.org/g:405096f908e1ceb0d6a1b5420ded20ad85ddae9e

commit r14-7244-g405096f908e1ceb0d6a1b5420ded20ad85ddae9e
Author: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Date:   Mon Jan 15 09:22:40 2024 +0800

    RISC-V: Adjust loop len by costing 1 when NITER < VF

    Rebase in v3: Rebase to the trunk and commit it as it's approved by Robin.
    Update in v2: Add dynmaic lmul test.

    This patch fixes the regression between GCC 13.2.0 and trunk GCC (GCC-14)

    GCC 13.2.0:

            lui     a5,%hi(a)
            li      a4,19
            sb      a4,%lo(a)(a5)
            li      a0,0
            ret

    Trunk GCC:

            vsetvli a5,zero,e8,mf2,ta,ma
            li      a4,-32768
            vid.v   v1
            vsetvli zero,zero,e16,m1,ta,ma
            addiw   a4,a4,104
            vmv.v.i v3,15
            lui     a1,%hi(a)
            li      a0,19
            vsetvli zero,zero,e8,mf2,ta,ma
            vadd.vi v1,v1,1
            sb      a0,%lo(a)(a1)
            vsetvli zero,zero,e16,m1,ta,ma
            vzext.vf2       v2,v1
            vmv.v.x v1,a4
            vminu.vv        v2,v2,v3
            vsrl.vv v1,v1,v2
            vslidedown.vi   v1,v1,17
            vmv.x.s a0,v1
            snez    a0,a0
            ret

    The root cause we are vectorizing the codes inefficiently since we doesn't
cost len when NITERS < VF.
    Leverage loop control of mask targets or rs6000 fixes the regression.

    Tested no regression. Ok for trunk ?

            PR target/113281

    gcc/ChangeLog:

            * config/riscv/riscv-vector-costs.cc
(costs::adjust_vect_cost_per_loop): New function.
            (costs::finish_cost): Adjust cost for LOOP LEN with NITERS < VF.
            * config/riscv/riscv-vector-costs.h: New function.

    gcc/testsuite/ChangeLog:

            * gcc.dg/vect/costmodel/riscv/rvv/pr113281-3.c: New test.
            * gcc.dg/vect/costmodel/riscv/rvv/pr113281-4.c: New test.
            * gcc.dg/vect/costmodel/riscv/rvv/pr113281-5.c: New test.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (13 preceding siblings ...)
  2024-01-15 12:06 ` cvs-commit at gcc dot gnu.org
@ 2024-01-23 16:58 ` jakub at gcc dot gnu.org
  2024-01-24  7:48 ` rguenth at gcc dot gnu.org
                   ` (17 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-01-23 16:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
            Summary|[14 Regression] Wrong code  |[14 Regression] Wrong code
                   |due to vectorization of     |due to vectorization of
                   |shift reduction and missing |shift reduction and missing
                   |promotions                  |promotions since r14-3027
           Keywords|needs-bisection             |

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
At least on aarch64 this is vectorized since
r14-3027-gc5f673dbc252e35e6b66e9b8abd30a4027193e0b

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (14 preceding siblings ...)
  2024-01-23 16:58 ` [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027 jakub at gcc dot gnu.org
@ 2024-01-24  7:48 ` rguenth at gcc dot gnu.org
  2024-01-24  7:54 ` pinskia at gcc dot gnu.org
                   ` (16 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-01-24  7:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |110838

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the issue is we are vectorizing

     b = 32872 >> a;

as a HImode shift with

   b = 32872 >> MIN (a, 15)

that's wrong since it never yields the desired zero.  With a shift value
of 16 invoking undefined behavior for HImode but the expression originally
promoted to int and thus fine to be shifted by even 18 this means we cannot
vectorize this without introducing undefined behavior in GIMPLE?

This was introduced by the fix for PR110838 (and followup fixes).

I wonder if it would make most sense to allow shifts by the width of
the promoted left operand which would also help interoperability of
GIMPLE with Fortran here?  I'm not sure what the reasoning is to
disallow that specific shift amount (any hardware that would not compute
the result to zero or -1?  Esp. for logical right shifts this seems like
an odd restriction).  I'd hate to have another shift operator with
special defined behavior for this single case but I see no other way
to fix this short of complicating the lowering to
a >= 16 ? 0 : 32872 >> MIN (a, 15) (the MIN is still required to
avoid requiring masking).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110838
[Bug 110838] [14 Regression] wrong code on x365-3.5, -O3, sign extraction

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (15 preceding siblings ...)
  2024-01-24  7:48 ` rguenth at gcc dot gnu.org
@ 2024-01-24  7:54 ` pinskia at gcc dot gnu.org
  2024-01-24  8:32 ` rguenther at suse dot de
                   ` (15 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-24  7:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #14)
> a >= 16 ? 0 : 32872 >> MIN (a, 15) (the MIN is still required to
> avoid requiring masking).

Note maybe instead of MIN here we use `a & 0xf` since that will more likely be
cheaper than a MIN.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (16 preceding siblings ...)
  2024-01-24  7:54 ` pinskia at gcc dot gnu.org
@ 2024-01-24  8:32 ` rguenther at suse dot de
  2024-01-24  8:38 ` pinskia at gcc dot gnu.org
                   ` (14 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: rguenther at suse dot de @ 2024-01-24  8:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #16 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 24 Jan 2024, pinskia at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281
> 
> --- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #14)
> > a >= 16 ? 0 : 32872 >> MIN (a, 15) (the MIN is still required to
> > avoid requiring masking).
> 
> Note maybe instead of MIN here we use `a & 0xf` since that will more likely be
> cheaper than a MIN.

But it's incorrect (that's what I did originally).

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (17 preceding siblings ...)
  2024-01-24  8:32 ` rguenther at suse dot de
@ 2024-01-24  8:38 ` pinskia at gcc dot gnu.org
  2024-01-24  9:52 ` jakub at gcc dot gnu.org
                   ` (13 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-01-24  8:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #17 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to rguenther@suse.de from comment #16)
> On Wed, 24 Jan 2024, pinskia at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281
> > 
> > --- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> > (In reply to Richard Biener from comment #14)
> > > a >= 16 ? 0 : 32872 >> MIN (a, 15) (the MIN is still required to
> > > avoid requiring masking).
> > 
> > Note maybe instead of MIN here we use `a & 0xf` since that will more likely be
> > cheaper than a MIN.
> 
> But it's incorrect (that's what I did originally).

But `(a>= 16)? 0: (32872>> (a&0xf))` is correct.

So is `(a>=16 ? 0 : 32872) >> ( a & 0xf)` .

Or is it you want to avoid the conditional here.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (18 preceding siblings ...)
  2024-01-24  8:38 ` pinskia at gcc dot gnu.org
@ 2024-01-24  9:52 ` jakub at gcc dot gnu.org
  2024-01-24  9:54 ` rguenther at suse dot de
                   ` (12 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-01-24  9:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #18 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #17)
> (In reply to rguenther@suse.de from comment #16)
> > On Wed, 24 Jan 2024, pinskia at gcc dot gnu.org wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281
> > > 
> > > --- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> > > (In reply to Richard Biener from comment #14)
> > > > a >= 16 ? 0 : 32872 >> MIN (a, 15) (the MIN is still required to
> > > > avoid requiring masking).
> > > 
> > > Note maybe instead of MIN here we use `a & 0xf` since that will more likely be
> > > cheaper than a MIN.
> > 
> > But it's incorrect (that's what I did originally).
> 
> But `(a>= 16)? 0: (32872>> (a&0xf))` is correct.
> 
> So is `(a>=16 ? 0 : 32872) >> ( a & 0xf)` .
> 
> Or is it you want to avoid the conditional here.

I believe the thing is that Richi's PR110838 changes for RSHIFT_EXPR are
correct for arithmetic right shifts and we should keep doing what it does for
those.
But they are not correct for logical right shifts and they don't handle left
shifts.
Both logical right shifts and left shifts need to get 0 from the shift counts
bigger than new_precision - 1, which can be achieved through one of (or <<
instead of >>):
(op1 >= min_precision ? 0 : op0) >> (op1 & (min_precision - 1))
(op0 & (op1 < min_precision ? -1 : 0)) >> (op1 & (min_precision - 1))
op1 >= min_precision ? 0 : (op0 >> (op1 & (min_precision - 1)))
(op0 >> (op1 & (min_precision - 1))) & (op1 < min_precision ? -1 : 0)
Which one of these is best?
For simplicity of vect_recog_over_widening_pattern probably the first one,
unsure
about what generates best code (and where).

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (19 preceding siblings ...)
  2024-01-24  9:52 ` jakub at gcc dot gnu.org
@ 2024-01-24  9:54 ` rguenther at suse dot de
  2024-01-24 10:00 ` jakub at gcc dot gnu.org
                   ` (11 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: rguenther at suse dot de @ 2024-01-24  9:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #19 from rguenther at suse dot de <rguenther at suse dot de> ---
On Wed, 24 Jan 2024, pinskia at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281
> 
> --- Comment #17 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> (In reply to rguenther@suse.de from comment #16)
> > On Wed, 24 Jan 2024, pinskia at gcc dot gnu.org wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281
> > > 
> > > --- Comment #15 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> > > (In reply to Richard Biener from comment #14)
> > > > a >= 16 ? 0 : 32872 >> MIN (a, 15) (the MIN is still required to
> > > > avoid requiring masking).
> > > 
> > > Note maybe instead of MIN here we use `a & 0xf` since that will more likely be
> > > cheaper than a MIN.
> > 
> > But it's incorrect (that's what I did originally).
> 
> But `(a>= 16)? 0: (32872>> (a&0xf))` is correct.
> 
> So is `(a>=16 ? 0 : 32872) >> ( a & 0xf)` .
> 
> Or is it you want to avoid the conditional here.

Yeah - at some point not trying to optimize the widening/shortening
is going to be cheaper, no?  Esp. if it is "just" because of
GIMPLE being limited with no way to express the desired semantics
for "out-of-bound" shifts (and no way to query target behavior for
them of course).

The widening/shortening might or might not have an effect on the
VF (we don't know) and we don't (yet) compute if the target
supports it.  We could force a smaller vector size (if the target
supports it) to avoid increasing the VF.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (20 preceding siblings ...)
  2024-01-24  9:54 ` rguenther at suse dot de
@ 2024-01-24 10:00 ` jakub at gcc dot gnu.org
  2024-01-24 13:23 ` jakub at gcc dot gnu.org
                   ` (10 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-01-24 10:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #20 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #18)
> But they are not correct for logical right shifts and they don't handle left
> shifts.

Oh, and then there are rotates and I think we need to punt on those, those
shouldn't be narrowed.  Unless we already punt on them...

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (21 preceding siblings ...)
  2024-01-24 10:00 ` jakub at gcc dot gnu.org
@ 2024-01-24 13:23 ` jakub at gcc dot gnu.org
  2024-01-24 20:49 ` rsandifo at gcc dot gnu.org
                   ` (9 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-01-24 13:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #21 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, I think
--- gcc/tree-vect-patterns.cc.jj        2024-01-03 11:51:37.487648527 +0100
+++ gcc/tree-vect-patterns.cc   2024-01-24 14:05:55.911356481 +0100
@@ -3002,6 +3002,12 @@ vect_recog_over_widening_pattern (vec_in
   if (STMT_VINFO_DEF_TYPE (last_stmt_info) == vect_reduction_def)
     return NULL;

+  /* FIXME: Aren't actually most operations incompatible with the truncation?
+     Like division/modulo, ...  Shouldn't this punt on
+     !vect_truncatable_operation_p (code) or that plus some exceptions?  */
+  if (code == LROTATE_EXPR || code == RROTATE_EXPR)
+    return NULL;
+
   /* Keep the first operand of a COND_EXPR as-is: only the other two
      operands are interesting.  */
   unsigned int first_op = (code == COND_EXPR ? 2 : 1);
@@ -3169,22 +3175,35 @@ vect_recog_over_widening_pattern (vec_in
                       op_type, &unprom[0], op_vectype);

   /* Limit shift operands.  */
-  if (code == RSHIFT_EXPR)
+  if (code == LSHIFT_EXPR || code == RSHIFT_EXPR)
     {
       wide_int min_value, max_value;
       if (TREE_CODE (ops[1]) == INTEGER_CST)
-       ops[1] = wide_int_to_tree (op_type,
-                                  wi::umin (wi::to_wide (ops[1]),
-                                            new_precision - 1));
+       {
+         if (wi::ge_p (wi::to_wide (ops[1]), new_precision,
+                       TYPE_SIGN (TREE_TYPE (ops[1]))))
+           {
+             if (code == LSHIFT_EXPR || unsigned_p)
+               return NULL;
+             ops[1] = build_int_cst (TREE_TYPE (ops[1]), new_precision - 1);
+           }
+       }
       else if (!vect_get_range_info (ops[1], &min_value, &max_value)
-              || wi::ge_p (max_value, new_precision, TYPE_SIGN (op_type)))
+              || wi::ge_p (max_value, new_precision,
+                           TYPE_SIGN (TREE_TYPE (ops[1]))))
        {
+         /* LSHIFT_EXPR or logical RSHIFT_EXPR can be handled as
+            (op1 >= new_precision ? 0 : op0) << (op1 & (new_precision - 1))
+            but perhaps it is too costly.  */
+         if (code == LSHIFT_EXPR || unsigned_p)
+           return NULL;
          /* ???  Note the following bad for SLP as that only supports
             same argument widened shifts and it un-CSEs same arguments.  */
-         tree new_var = vect_recog_temp_ssa_var (op_type, NULL);
+         tree new_var = vect_recog_temp_ssa_var (TREE_TYPE (ops[1]), NULL);
          gimple *pattern_stmt
            = gimple_build_assign (new_var, MIN_EXPR, ops[1],
-                                  build_int_cst (op_type, new_precision - 1));
+                                  build_int_cst (TREE_TYPE (ops[1]),
+                                                 new_precision - 1));
          gimple_set_location (pattern_stmt, gimple_location (last_stmt));
          if (ops[1] == unprom[1].op && unprom[1].dt == vect_external_def)
            {
should fix it (given Richi's comments about the conditionals being too costly
it instead punts for the truncated left shifts or right logical shifts unless
VRP proves
the shift count is in range; I've also changed all the op_type uses on the
shift count
operand to TREE_TYPE (ops[1]), because I believe the two types can and most
often are different.
But as the first comment suggests, I wonder how comes this doesn't miscompile
rotates, or say divisions/multiplications etc.
While vect_determine_precisions_from_range and
vect_determine_precisions_from_users call vect_truncatable_operation_p and the
latter allows casts and LSHIFT_EXPR/RSHIFT_EXPR next to it, the former seems
for !vect_truncatable_operation_p
to just compute minimum/maximum from ranges across all operands and lhs, is
that a safe thing for any non-truncatable operations?

Richard S., can you please have a look, this was added in
r9-1590-g370c2ebe8fa20e0812cd2d533d4ed38ee2d37c85 I believe.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (22 preceding siblings ...)
  2024-01-24 13:23 ` jakub at gcc dot gnu.org
@ 2024-01-24 20:49 ` rsandifo at gcc dot gnu.org
  2024-01-29 12:33 ` cvs-commit at gcc dot gnu.org
                   ` (8 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2024-01-24 20:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Richard Sandiford <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rsandifo at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #22 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
Taking following discussion on irc.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (23 preceding siblings ...)
  2024-01-24 20:49 ` rsandifo at gcc dot gnu.org
@ 2024-01-29 12:33 ` cvs-commit at gcc dot gnu.org
  2024-01-29 12:34 ` [Bug tree-optimization/113281] " rsandifo at gcc dot gnu.org
                   ` (7 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-01-29 12:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #23 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Richard Sandiford <rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:1a8261e047f7a2c2b0afb95716f7615cba718cd1

commit r14-8492-g1a8261e047f7a2c2b0afb95716f7615cba718cd1
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Mon Jan 29 12:33:08 2024 +0000

    vect: Tighten vect_determine_precisions_from_range [PR113281]

    This was another PR caused by the way that
    vect_determine_precisions_from_range handles shifts.  We tried to
    narrow 32768 >> x to a 16-bit shift based on range information for
    the inputs and outputs, with vect_recog_over_widening_pattern
    (after PR110828) adjusting the shift amount.  But this doesn't
    work for the case where x is in [16, 31], since then 32-bit
    32768 >> x is a well-defined zero, whereas no well-defined
    16-bit 32768 >> y will produce 0.

    We could perhaps generate x < 16 ? 32768 >> x : 0 instead,
    but since vect_determine_precisions_from_range was never really
    supposed to rely on fix-ups, it seems better to fix that instead.

    The patch also makes the code more selective about which codes
    can be narrowed based on input and output ranges.  This showed
    that vect_truncatable_operation_p was missing cases for
    BIT_NOT_EXPR (equivalent to BIT_XOR_EXPR of -1) and NEGATE_EXPR
    (equivalent to BIT_NOT_EXPR followed by a PLUS_EXPR of 1).

    pr113281-1.c is the original testcase.  pr113281-[23].c failed
    before the patch due to overly optimistic narrowing.  pr113281-[45].c
    previously passed and are meant to protect against accidental
    optimisation regressions.

    gcc/
            PR target/113281
            * tree-vect-patterns.cc (vect_recog_over_widening_pattern): Remove
            workaround for right shifts.
            (vect_truncatable_operation_p): Handle NEGATE_EXPR and
BIT_NOT_EXPR.
            (vect_determine_precisions_from_range): Be more selective about
            which codes can be narrowed based on their input and output ranges.
            For shifts, require at least one more bit of precision than the
            maximum shift amount.

    gcc/testsuite/
            PR target/113281
            * gcc.dg/vect/pr113281-1.c: New test.
            * gcc.dg/vect/pr113281-2.c: Likewise.
            * gcc.dg/vect/pr113281-3.c: Likewise.
            * gcc.dg/vect/pr113281-4.c: Likewise.
            * gcc.dg/vect/pr113281-5.c: Likewise.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (24 preceding siblings ...)
  2024-01-29 12:33 ` cvs-commit at gcc dot gnu.org
@ 2024-01-29 12:34 ` rsandifo at gcc dot gnu.org
  2024-01-31 17:30 ` [Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590 jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2024-01-29 12:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #24 from Richard Sandiford <rsandifo at gcc dot gnu.org> ---
Fixed on trunk so far, but it's latent on branches.  I'll see what
the trunk fallout is like before asking about backports.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (25 preceding siblings ...)
  2024-01-29 12:34 ` [Bug tree-optimization/113281] " rsandifo at gcc dot gnu.org
@ 2024-01-31 17:30 ` jakub at gcc dot gnu.org
  2024-03-13 20:17 ` ewlu at rivosinc dot com
                   ` (5 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-01-31 17:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P1                          |P2
   Target Milestone|14.0                        |11.5
            Summary|[14 Regression] Wrong code  |[11/12/13 Regression]
                   |due to vectorization of     |Latent wrong code due to
                   |shift reduction and missing |vectorization of shift
                   |promotions since r14-3027   |reduction and missing
                   |                            |promotions since r9-1590

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (26 preceding siblings ...)
  2024-01-31 17:30 ` [Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590 jakub at gcc dot gnu.org
@ 2024-03-13 20:17 ` ewlu at rivosinc dot com
  2024-03-13 20:22 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: ewlu at rivosinc dot com @ 2024-03-13 20:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

Edwin Lu <ewlu at rivosinc dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ewlu at rivosinc dot com

--- Comment #25 from Edwin Lu <ewlu at rivosinc dot com> ---
(In reply to Richard Sandiford from comment #24)
> Fixed on trunk so far, but it's latent on branches.  I'll see what
> the trunk fallout is like before asking about backports.

It looks like we have a regression for riscv 

I was going through the scan dump failures on trunk and ended up revisiting
https://github.com/patrick-rivos/gcc-postcommit-ci/issues/463 where
gcc.dg/vect/costmodel/riscv/rvv/pr113281-[125].c are failing the scan-dump
checks. I didn't realize at the time that the scan dumps were checking code
correctness and ended up ignoring it. 

It's still persisting on trunk (at least for pr113281-1.c
https://godbolt.org/z/M9EK44hKe)

A bisection on https://github.com/patrick-rivos/gcc-postcommit-ci/issues/463
commit range suggests
https://gcc.gnu.org/g:1a8261e047f7a2c2b0afb95716f7615cba718cd1 introduced it.

# first bad commit: [1a8261e047f7a2c2b0afb95716f7615cba718cd1] vect: Tighten
vect_determine_precisions_from_range [PR113281]

Configuration
../configure --prefix=$(pwd) --with-multilib-generator="rv64gcv-lp64d--"
make stamps/build-gcc-linux-stage1 -j 32

Testing
./build-gcc-linux-stage1/gcc/cc1  
../gcc/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-1.c 
-march=rv64gcv -mabi=lp64d -mtune=rocket -mcmodel=medlow  
-fdiagnostics-plain-output  -march=rv64gcv_zvl256b -mabi=lp64d -O3
-ftree-vectorize -ffat-lto-objects -fno-ident   -o pr113281-1.s

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (27 preceding siblings ...)
  2024-03-13 20:17 ` ewlu at rivosinc dot com
@ 2024-03-13 20:22 ` pinskia at gcc dot gnu.org
  2024-03-13 20:29 ` patrick at rivosinc dot com
                   ` (3 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-13 20:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #26 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Edwin Lu from comment #25)
> It's still persisting on trunk (at least for pr113281-1.c
> https://godbolt.org/z/M9EK44hKe)

I looked into what the vectorizer produces:
  vect__22.13_31 = (vector(8) int) vect_vec_iv_.12_8;
  _22 = (int) a.4_25;
  vect__12.14_33 = { 32872, 32872, 32872, 32872, 32872, 32872, 32872, 32872 }
>> vect__22.13_31;
  _12 = 32872 >> _22;
  vect_b_7.15_34 = (vector(8) short int) vect__12.14_33;

that is valid thing to do. That is do the shift in `vector(8) int` and then do
a truncation. The issue originally was about doing the shift in `vector(8)
short` which is not happening here.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (28 preceding siblings ...)
  2024-03-13 20:22 ` pinskia at gcc dot gnu.org
@ 2024-03-13 20:29 ` patrick at rivosinc dot com
  2024-03-13 23:18 ` juzhe.zhong at rivai dot ai
                   ` (2 subsequent siblings)
  32 siblings, 0 replies; 34+ messages in thread
From: patrick at rivosinc dot com @ 2024-03-13 20:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #27 from Patrick O'Neill <patrick at rivosinc dot com> ---
(In reply to Andrew Pinski from comment #26)
> (In reply to Edwin Lu from comment #25)
> > It's still persisting on trunk (at least for pr113281-1.c
> > https://godbolt.org/z/M9EK44hKe)
> 
> I looked into what the vectorizer produces:
>   vect__22.13_31 = (vector(8) int) vect_vec_iv_.12_8;
>   _22 = (int) a.4_25;
>   vect__12.14_33 = { 32872, 32872, 32872, 32872, 32872, 32872, 32872, 32872
> } >> vect__22.13_31;
>   _12 = 32872 >> _22;
>   vect_b_7.15_34 = (vector(8) short int) vect__12.14_33;
> 
> that is valid thing to do. That is do the shift in `vector(8) int` and then
> do a truncation. The issue originally was about doing the shift in
> `vector(8) short` which is not happening here.

The regressed testcase looks like its testing if riscv vectorizes the code at
all (the first issue Juzhe noted in comment #3 and then fixed). So this is a
performance regression for risc-v, not correctness.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (29 preceding siblings ...)
  2024-03-13 20:29 ` patrick at rivosinc dot com
@ 2024-03-13 23:18 ` juzhe.zhong at rivai dot ai
  2024-05-31  8:12 ` rdapp at gcc dot gnu.org
  2024-05-31 14:56 ` cvs-commit at gcc dot gnu.org
  32 siblings, 0 replies; 34+ messages in thread
From: juzhe.zhong at rivai dot ai @ 2024-03-13 23:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #28 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
The original cost model I did work for all cases but with some middle-end
changes
the cost model failed.

I don't have time to figure out what's going on here.

Robin may be interested at it.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (30 preceding siblings ...)
  2024-03-13 23:18 ` juzhe.zhong at rivai dot ai
@ 2024-05-31  8:12 ` rdapp at gcc dot gnu.org
  2024-05-31 14:56 ` cvs-commit at gcc dot gnu.org
  32 siblings, 0 replies; 34+ messages in thread
From: rdapp at gcc dot gnu.org @ 2024-05-31  8:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #29 from Robin Dapp <rdapp at gcc dot gnu.org> ---
Just to document again:  The test case should not be vectorized and at some
point we will adjust the cost model so it is not going to be.  I'd prefer to
base that decision on real uarchs rather than adjust the generic cost model
right away though.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590
  2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
                   ` (31 preceding siblings ...)
  2024-05-31  8:12 ` rdapp at gcc dot gnu.org
@ 2024-05-31 14:56 ` cvs-commit at gcc dot gnu.org
  32 siblings, 0 replies; 34+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-05-31 14:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281

--- Comment #30 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by Richard Sandiford
<rsandifo@gcc.gnu.org>:

https://gcc.gnu.org/g:2602b71103d5ef2ef86000cac832b31dad3dfe2b

commit r13-8813-g2602b71103d5ef2ef86000cac832b31dad3dfe2b
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Fri May 31 15:56:05 2024 +0100

    vect: Tighten vect_determine_precisions_from_range [PR113281]

    This was another PR caused by the way that
    vect_determine_precisions_from_range handles shifts.  We tried to
    narrow 32768 >> x to a 16-bit shift based on range information for
    the inputs and outputs, with vect_recog_over_widening_pattern
    (after PR110828) adjusting the shift amount.  But this doesn't
    work for the case where x is in [16, 31], since then 32-bit
    32768 >> x is a well-defined zero, whereas no well-defined
    16-bit 32768 >> y will produce 0.

    We could perhaps generate x < 16 ? 32768 >> x : 0 instead,
    but since vect_determine_precisions_from_range was never really
    supposed to rely on fix-ups, it seems better to fix that instead.

    The patch also makes the code more selective about which codes
    can be narrowed based on input and output ranges.  This showed
    that vect_truncatable_operation_p was missing cases for
    BIT_NOT_EXPR (equivalent to BIT_XOR_EXPR of -1) and NEGATE_EXPR
    (equivalent to BIT_NOT_EXPR followed by a PLUS_EXPR of 1).

    pr113281-1.c is the original testcase.  pr113281-[23].c failed
    before the patch due to overly optimistic narrowing.  pr113281-[45].c
    previously passed and are meant to protect against accidental
    optimisation regressions.

    gcc/
            PR target/113281
            * tree-vect-patterns.cc (vect_recog_over_widening_pattern): Remove
            workaround for right shifts.
            (vect_truncatable_operation_p): Handle NEGATE_EXPR and
BIT_NOT_EXPR.
            (vect_determine_precisions_from_range): Be more selective about
            which codes can be narrowed based on their input and output ranges.
            For shifts, require at least one more bit of precision than the
            maximum shift amount.

    gcc/testsuite/
            PR target/113281
            * gcc.dg/vect/pr113281-1.c: New test.
            * gcc.dg/vect/pr113281-2.c: Likewise.
            * gcc.dg/vect/pr113281-3.c: Likewise.
            * gcc.dg/vect/pr113281-4.c: Likewise.
            * gcc.dg/vect/pr113281-5.c: Likewise.

    (cherry picked from commit 1a8261e047f7a2c2b0afb95716f7615cba718cd1)

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2024-05-31 14:56 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-08 19:17 [Bug target/113281] New: [14] RISC-V rv64gcv_zvl256b vector: Runtime mismatch with rv64gc patrick at rivosinc dot com
2024-01-08 19:20 ` [Bug target/113281] " patrick at rivosinc dot com
2024-01-08 21:49 ` rdapp at gcc dot gnu.org
2024-01-09  1:32 ` juzhe.zhong at rivai dot ai
2024-01-09  2:06 ` juzhe.zhong at rivai dot ai
2024-01-09  2:08 ` juzhe.zhong at rivai dot ai
2024-01-09  2:12 ` [Bug tree-optimization/113281] " pinskia at gcc dot gnu.org
2024-01-09  2:16 ` pinskia at gcc dot gnu.org
2024-01-09  2:17 ` juzhe.zhong at rivai dot ai
2024-01-09  2:58 ` [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions pinskia at gcc dot gnu.org
2024-01-09  2:59 ` pinskia at gcc dot gnu.org
2024-01-09  8:32 ` rguenth at gcc dot gnu.org
2024-01-10 13:26 ` rguenth at gcc dot gnu.org
2024-01-12 12:16 ` cvs-commit at gcc dot gnu.org
2024-01-15 12:06 ` cvs-commit at gcc dot gnu.org
2024-01-23 16:58 ` [Bug tree-optimization/113281] [14 Regression] Wrong code due to vectorization of shift reduction and missing promotions since r14-3027 jakub at gcc dot gnu.org
2024-01-24  7:48 ` rguenth at gcc dot gnu.org
2024-01-24  7:54 ` pinskia at gcc dot gnu.org
2024-01-24  8:32 ` rguenther at suse dot de
2024-01-24  8:38 ` pinskia at gcc dot gnu.org
2024-01-24  9:52 ` jakub at gcc dot gnu.org
2024-01-24  9:54 ` rguenther at suse dot de
2024-01-24 10:00 ` jakub at gcc dot gnu.org
2024-01-24 13:23 ` jakub at gcc dot gnu.org
2024-01-24 20:49 ` rsandifo at gcc dot gnu.org
2024-01-29 12:33 ` cvs-commit at gcc dot gnu.org
2024-01-29 12:34 ` [Bug tree-optimization/113281] " rsandifo at gcc dot gnu.org
2024-01-31 17:30 ` [Bug tree-optimization/113281] [11/12/13 Regression] Latent wrong code due to vectorization of shift reduction and missing promotions since r9-1590 jakub at gcc dot gnu.org
2024-03-13 20:17 ` ewlu at rivosinc dot com
2024-03-13 20:22 ` pinskia at gcc dot gnu.org
2024-03-13 20:29 ` patrick at rivosinc dot com
2024-03-13 23:18 ` juzhe.zhong at rivai dot ai
2024-05-31  8:12 ` rdapp at gcc dot gnu.org
2024-05-31 14:56 ` cvs-commit at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).