[Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526

public inbox for gcc-bugs@sourceware.org

* [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
@ 2021-10-20 17:45 seurer at gcc dot gnu.org
  2021-10-20 18:19 ` [Bug target/102860] " aldyh at gcc dot gnu.org
                   ` (18 more replies)
  0 siblings, 19 replies; 20+ messages in thread
From: seurer at gcc dot gnu.org @ 2021-10-20 17:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

            Bug ID: 102860
           Summary: [12 regression] libgomp.fortran/simd2.f90 ICEs after
                    r12-4526
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: seurer at gcc dot gnu.org
  Target Milestone: ---

g:d8edfadfc7a9795b65177a50ce44fd348858e844, r12-4526

I am only seeing this on a power 10 machine.

make -k check-target-libgomp RUNTESTFLAGS="fortran.exp=libgomp.fortran/simd2.f90"
FAIL: libgomp.fortran/simd2.f90   -O2  (internal compiler error)
FAIL: libgomp.fortran/simd2.f90   -O2  (test for excess errors)
FAIL: libgomp.fortran/simd2.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
FAIL: libgomp.fortran/simd2.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: libgomp.fortran/simd2.f90   -O3 -g  (internal compiler error)
FAIL: libgomp.fortran/simd2.f90   -O3 -g  (test for excess errors)
# of expected passes            6
# of unexpected failures        6
# of unresolved testcases       3


spawn -ignore SIGHUP /home/seurer/gcc/git/build/gcc-test/./gcc/xgcc
-B/home/seurer/gcc/git/build/gcc-test/./gcc/
-B/home/seurer/gcc/git/install/gcc-test/powerpc64le-unknown-linux-gnu/bin/
-B/home/seurer/gcc/git/install/gcc-test/powerpc64le-unknown-linux-gnu/lib/
-isystem
/home/seurer/gcc/git/install/gcc-test/powerpc64le-unknown-linux-gnu/include
-isystem
/home/seurer/gcc/git/install/gcc-test/powerpc64le-unknown-linux-gnu/sys-include
/home/seurer/gcc/git/gcc-test/libgomp/testsuite/libgomp.fortran/simd2.f90
-B/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/./libgomp/
-B/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/./libgomp/.libs
-I/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/./libgomp
-I/home/seurer/gcc/git/gcc-test/libgomp/testsuite/../../include
-I/home/seurer/gcc/git/gcc-test/libgomp/testsuite/.. -fmessage-length=0
-fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp
-B/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/./libgomp/../libquadmath/.libs/
-O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
-finline-functions
-B/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/./libgomp/../libgfortran/.libs
-fintrinsic-modules-path=/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/./libgomp
-L/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/./libgomp/.libs
-L/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/./libgomp/../libquadmath/.libs/
-L/home/seurer/gcc/git/build/gcc-test/powerpc64le-unknown-linux-gnu/./libgomp/../libgfortran/.libs
-lgfortran -foffload=-lgfortran -lquadmath -lm -o ./simd2.exe
during RTL pass: expand
/home/seurer/gcc/git/gcc-test/libgomp/testsuite/libgomp.fortran/simd2.f90:11:30:
internal compiler error: in prepare_cmp_insn, at optabs.c:4532
0x10ad438b prepare_cmp_insn
        /home/seurer/gcc/git/gcc-test/gcc/optabs.c:4532
0x10ad4507 emit_cmp_and_jump_insns(rtx_def*, rtx_def*, rtx_code, rtx_def*,
machine_mode, int, rtx_def*, profile_probability)
        /home/seurer/gcc/git/gcc-test/gcc/optabs.c:4677
0x106346a7 do_compare_rtx_and_jump(rtx_def*, rtx_def*, rtx_code, int,
machine_mode, rtx_def*, rtx_code_label*, rtx_code_label*, profile_probability)
        /home/seurer/gcc/git/gcc-test/gcc/dojump.c:1220
0x10712eff do_cmp_and_jump
        /home/seurer/gcc/git/gcc-test/gcc/expmed.c:6346
0x10712eff expand_divmod(int, tree_code, machine_mode, rtx_def*, rtx_def*,
rtx_def*, int, optab_methods)
        /home/seurer/gcc/git/gcc-test/gcc/expmed.c:4865
0x10716dbb expand_expr_divmod
        /home/seurer/gcc/git/gcc-test/gcc/expr.c:8945
0x1074094b expand_expr_real_2(separate_ops*, rtx_def*, machine_mode,
expand_modifier)
        /home/seurer/gcc/git/gcc-test/gcc/expr.c:9582
0x105686a7 expand_gimple_stmt_1
        /home/seurer/gcc/git/gcc-test/gcc/cfgexpand.c:3979
0x105686a7 expand_gimple_stmt
        /home/seurer/gcc/git/gcc-test/gcc/cfgexpand.c:4040
0x105717db expand_gimple_basic_block
        /home/seurer/gcc/git/gcc-test/gcc/cfgexpand.c:6082
0x1057476b execute
        /home/seurer/gcc/git/gcc-test/gcc/cfgexpand.c:6808


commit d8edfadfc7a9795b65177a50ce44fd348858e844 (HEAD, refs/bisect/bad)
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Mon Oct 4 09:47:02 2021 +0200

    Disallow loop rotation and loop header crossing in jump threaders.


* [Bug target/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
@ 2021-10-20 18:19 ` aldyh at gcc dot gnu.org
  2021-10-20 19:36 ` pinskia at gcc dot gnu.org
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: aldyh at gcc dot gnu.org @ 2021-10-20 18:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #1 from Aldy Hernandez <aldyh at gcc dot gnu.org> ---
The referenced patch reduces the number of threaded paths; it does not increase them.
This actually looks like another pass hiccuping because it was expecting a
threaded path that is no longer there.


* [Bug target/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
  2021-10-20 18:19 ` [Bug target/102860] " aldyh at gcc dot gnu.org
@ 2021-10-20 19:36 ` pinskia at gcc dot gnu.org
  2021-10-21  6:55 ` [Bug middle-end/102860] " rguenth at gcc dot gnu.org
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-10-20 19:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
           Keywords|                            |ice-on-valid-code


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
  2021-10-20 18:19 ` [Bug target/102860] " aldyh at gcc dot gnu.org
  2021-10-20 19:36 ` pinskia at gcc dot gnu.org
@ 2021-10-21  6:55 ` rguenth at gcc dot gnu.org
  2021-10-26  7:09 ` rguenth at gcc dot gnu.org
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-10-21  6:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-10-21
          Component|target                      |middle-end

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
We are expanding

vect__5.73_271 = vect__4.72_269 %[fl] { 39, 39, 39, 39 };

produced from vectorizing

  _5 = _4 %[fl] 39;

optab_for_tree_code does

    case TRUNC_MOD_EXPR:
    case CEIL_MOD_EXPR:
    case FLOOR_MOD_EXPR:
    case ROUND_MOD_EXPR:
      return TYPE_UNSIGNED (type) ? umod_optab : smod_optab;

Somehow the vectorizer finds an optab to vectorize this, but RTL expansion
falls all the way through to

        /* No luck with division elimination or divmod.  Have to do it
           by conditionally adjusting op0 *and* the result.  */

expand_divmod never seems to use smod_optab for FLOOR_MOD_EXPR.

So this seems to be a latent issue, but this expansion code doing
compare & jump definitely has to be gated on !VECTOR_TYPE, since
do_cmp_and_jump cannot work with vector arguments.  A fallback is then
missing, I guess.  Simply gating produces

(insn 35 34 36 (set (reg:V4SI 238)
        (const_vector:V4SI [
                (const_int -52 [0xffffffffffffffcc]) repeated x4
            ])) "simd2.f90":11:30 -1
     (nil))

(insn 36 35 37 (set (reg:V4SI 237 [ vect__4.72 ])
        (plus:V4SI (reg:V4SI 181 [ vect_vec_iv_.68 ])
            (reg:V4SI 238))) "simd2.f90":11:30 -1
     (nil))

(insn 37 36 38 (set (reg:V4SI 240)
        (reg:V4SI 239)) "simd2.f90":11:30 -1
     (nil))

(insn 38 37 39 (set (reg:V4SI 242)
        (const_vector:V4SI [
                (const_int 2 [0x2]) repeated x4
            ])) "simd2.f90":11:30 -1
     (nil))

(insn 39 38 40 (set (reg:V4SI 241)
        (ashift:V4SI (reg:V4SI 240)
            (reg:V4SI 242))) "simd2.f90":11:30 -1
     (nil))

(insn 40 39 41 (set (reg:V4SI 240)
        (reg:V4SI 241)) "simd2.f90":11:30 -1
     (nil))

(insn 41 40 42 (set (reg:V4SI 243)
        (plus:V4SI (reg:V4SI 240)
            (reg:V4SI 239))) "simd2.f90":11:30 -1
     (nil))

(insn 42 41 43 (set (reg:V4SI 245)
        (const_vector:V4SI [
                (const_int 3 [0x3]) repeated x4
            ])) "simd2.f90":11:30 -1
     (nil))

(insn 43 42 44 (set (reg:V4SI 244)
        (ashift:V4SI (reg:V4SI 243)
            (reg:V4SI 245))) "simd2.f90":11:30 -1
     (nil))

(insn 44 43 45 (set (reg:V4SI 243)
        (reg:V4SI 244)) "simd2.f90":11:30 -1
     (nil))

(insn 45 44 46 (set (reg:V4SI 246)
        (minus:V4SI (reg:V4SI 243)
            (reg:V4SI 239))) "simd2.f90":11:30 -1
     (nil))

(insn 46 45 0 (set (reg:V4SI 221 [ vect__5.73 ])
        (minus:V4SI (reg:V4SI 237 [ vect__4.72 ])
            (reg:V4SI 246))) "simd2.f90":11:30 -1
     (nil))

which I guess is OK for trunc_mod but not floor_mod, but it fixes the ICE.

diff --git a/gcc/expmed.c b/gcc/expmed.c
index bbdd0e71d20..0ae57cc3f8a 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -4850,7 +4850,7 @@ expand_divmod (int rem_flag, enum tree_code code, machine_mode mode,

        /* No luck with division elimination or divmod.  Have to do it
           by conditionally adjusting op0 *and* the result.  */
-       {
+       if (!VECTOR_MODE_P (mode)) {
          rtx_code_label *label1, *label2, *label3, *label4, *label5;
          rtx adjusted_op0;
          rtx tem;


The floor-mod is present in .original already:

simd2.f90.005t.original:        b[(integer(kind=8)) i + -1] = (i + -52) %[fl] 39;

looks like the modulo intrinsic is floor_mod?

    b(i) = modulo (i - 52, 39)


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-10-21  6:55 ` [Bug middle-end/102860] " rguenth at gcc dot gnu.org
@ 2021-10-26  7:09 ` rguenth at gcc dot gnu.org
  2021-10-26 18:13 ` seurer at gcc dot gnu.org
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-10-26  7:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
*** Bug 102938 has been marked as a duplicate of this bug. ***


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-10-26  7:09 ` rguenth at gcc dot gnu.org
@ 2021-10-26 18:13 ` seurer at gcc dot gnu.org
  2021-12-15  7:05 ` luoxhu at gcc dot gnu.org
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: seurer at gcc dot gnu.org @ 2021-10-26 18:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #4 from seurer at gcc dot gnu.org ---
For completeness this also affects two other test cases:

FAIL: libgomp.fortran/simd2.f90   -O2  (internal compiler error)
FAIL: libgomp.fortran/simd2.f90   -O2  (test for excess errors)
FAIL: libgomp.fortran/simd2.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
FAIL: libgomp.fortran/simd2.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: libgomp.fortran/simd2.f90   -O3 -g  (internal compiler error)
FAIL: libgomp.fortran/simd2.f90   -O3 -g  (test for excess errors)
FAIL: libgomp.fortran/simd3.f90   -O2  (internal compiler error)
FAIL: libgomp.fortran/simd3.f90   -O2  (test for excess errors)
FAIL: libgomp.fortran/simd3.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
FAIL: libgomp.fortran/simd3.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: libgomp.fortran/simd3.f90   -O3 -g  (internal compiler error)
FAIL: libgomp.fortran/simd3.f90   -O3 -g  (test for excess errors)
FAIL: libgomp.fortran/simd4.f90   -O2  (internal compiler error)
FAIL: libgomp.fortran/simd4.f90   -O2  (test for excess errors)
FAIL: libgomp.fortran/simd4.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
FAIL: libgomp.fortran/simd4.f90   -O3 -fomit-frame-pointer -funroll-loops
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: libgomp.fortran/simd4.f90   -O3 -g  (internal compiler error)
FAIL: libgomp.fortran/simd4.f90   -O3 -g  (test for excess errors)


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-10-26 18:13 ` seurer at gcc dot gnu.org
@ 2021-12-15  7:05 ` luoxhu at gcc dot gnu.org
  2021-12-15  7:24 ` luoxhu at gcc dot gnu.org
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-12-15  7:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

luoxhu at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luoxhu at gcc dot gnu.org

--- Comment #5 from luoxhu at gcc dot gnu.org ---
P8, P9 and x86 don't vectorize the floor_mod operation, so they pass.
The fix in #c2 only fixes the ICE; execution still fails because R239 is
used but not defined.


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2021-12-15  7:05 ` luoxhu at gcc dot gnu.org
@ 2021-12-15  7:24 ` luoxhu at gcc dot gnu.org
  2022-01-17 13:13 ` rguenth at gcc dot gnu.org
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-12-15  7:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #6 from luoxhu at gcc dot gnu.org ---
Fortran's modulo is floor_mod as documented here:
https://gcc.gnu.org/onlinedocs/gfortran/MODULO.html?

Syntax:
RESULT = MODULO(A, P)

Return value:
The type and kind of the result are those of the arguments. (As a GNU
extension, kind is the largest kind of the actual arguments.)

If A and P are of type INTEGER:
MODULO(A,P) has the value R such that A=Q*P+R, where Q is an integer and R is
between 0 (inclusive) and P (exclusive).

If A and P are of type REAL:
MODULO(A,P) has the value of A - FLOOR (A / P) * P.

The returned value has the same sign as P and a magnitude less than the
magnitude of P.


program test_modulo
  print *, modulo(17,3)
  print *, modulo(17.5,5.5)

  print *, modulo(-17,3)
  print *, modulo(-17.5,5.5)

  print *, modulo(17,-3)
  print *, modulo(17.5,-5.5)
end program


LD_LIBRARY_PATH=./x86_64-pc-linux-gnu/libgfortran/.libs/ ./a.out

           2
   1.00000000
           1
   4.50000000
          -1
  -4.50000000


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2021-12-15  7:24 ` luoxhu at gcc dot gnu.org
@ 2022-01-17 13:13 ` rguenth at gcc dot gnu.org
  2022-01-18 17:21 ` jakub at gcc dot gnu.org
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-01-17 13:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2022-01-17 13:13 ` rguenth at gcc dot gnu.org
@ 2022-01-18 17:21 ` jakub at gcc dot gnu.org
  2022-01-18 18:45 ` jakub at gcc dot gnu.org
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-01-18 17:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Short testcase:
function foo(a)
  integer(kind=4) :: a(1024)
  a(:) = modulo (a(:), 39)
end function
-O2 -mcpu=power10.
vect_recog_divmod_pattern only handles TRUNC_{DIV,MOD}_EXPR and EXACT_DIV_EXPR
(and isn't guaranteed to succeed anyway), but optab_for_tree_code returns the
same smod_optab or sdiv_optab (if signed; FLOOR_* for unsigned is mapped to
TRUNC_*).
I guess the quickest way would be to punt on {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR
in the vectorizer and tree-vect-generic.cc
Further gradual improvements can be:
1) match.pd has:
/* For unsigned integral types, FLOOR_DIV_EXPR is the same as
   TRUNC_DIV_EXPR.  Rewrite into the latter in this case.  */
(simplify
 (floor_div @0 @1)
 (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
      && TYPE_UNSIGNED (type))
  (trunc_div @0 @1)))
but expmed.cc has:
  /* Promote floor rounding to trunc rounding for unsigned operations.  */
  if (unsignedp)
    {
      if (code == FLOOR_DIV_EXPR)
        code = TRUNC_DIV_EXPR;
      if (code == FLOOR_MOD_EXPR)
        code = TRUNC_MOD_EXPR;
      if (code == EXACT_DIV_EXPR && op1_is_pow2)
        code = TRUNC_DIV_EXPR;
    }
Shouldn't we make it
(for floor_divmod (floor_div floor_mod)
     trunc_divmod (trunc_div trunc_mod)
 (simplify
  (floor_divmod @0 @1)
  (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
       && TYPE_UNSIGNED (type))
   (trunc_divmod @0 @1))))
?
2) as the RTL optabs really do just trunc div/mod, perhaps
tree-vect-patterns.cc
could be changed to replace some or all of those operations with the trunc
operation followed by some arith and cond_exprs so that the vectorizer knows
actual cost of those operations.
E.g. it seems expmed.cc expands
r = x %[fl] y;
as
r = x % y; if (r && (x ^ y) < 0) r += y;
and
d = x /[fl] y;
would be
r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
Looking at wide-int.h,
r = x %[cl] y;
as
r = x % y; if (r && (x ^ y) >= 0) r -= y;
and
d = x /[cl] y;
as
r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
All of the above for signed, as I said earlier, unsigned [fl] is the same as
trunc and unsigned [cl] should replace (x ^ y) >= 0 with 1.
[rd] is even more complex.


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2022-01-18 17:21 ` jakub at gcc dot gnu.org
@ 2022-01-18 18:45 ` jakub at gcc dot gnu.org
  2022-01-19  7:27 ` rguenther at suse dot de
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-01-18 18:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 52218
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52218&action=edit
gcc12-pr102860-wip.patch

Untested patch that avoids the ICE, but still doesn't add the
tree-vect-patterns.cc cases.


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2022-01-18 18:45 ` jakub at gcc dot gnu.org
@ 2022-01-19  7:27 ` rguenther at suse dot de
  2022-01-19 10:01 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: rguenther at suse dot de @ 2022-01-19  7:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 18 Jan 2022, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860
> 
> Jakub Jelinek <jakub at gcc dot gnu.org> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                  CC|                            |jakub at gcc dot gnu.org
> 
> --- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> Short testcase:
> function foo(a)
>   integer(kind=4) :: a(1024)
>   a(:) = modulo (a(:), 39)
> end function
> -O2 -mcpu=power10.
> vect_recog_divmod_pattern only handles TRUNC_{DIV,MOD}_EXPR and EXACT_DIV_EXPR
> (and isn't guaranteed to succeed anyway), but optab_for_tree_code returns the
> same smod_optab or sdiv_optab (if signed; FLOOR_* for unsigned is mapped to
> TRUNC_*).
> I guess the quickest way would be to punt on {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR
> in the vectorizer and tree-vect-generic.cc

True.

> Further gradual improvements can be:
> 1) match.pd has:
> /* For unsigned integral types, FLOOR_DIV_EXPR is the same as
>    TRUNC_DIV_EXPR.  Rewrite into the latter in this case.  */
> (simplify
>  (floor_div @0 @1)
>  (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
>       && TYPE_UNSIGNED (type))
>   (trunc_div @0 @1)))
> but expmed.cc has:
>   /* Promote floor rounding to trunc rounding for unsigned operations.  */
>   if (unsignedp)
>     {
>       if (code == FLOOR_DIV_EXPR)
>         code = TRUNC_DIV_EXPR;
>       if (code == FLOOR_MOD_EXPR)
>         code = TRUNC_MOD_EXPR;
>       if (code == EXACT_DIV_EXPR && op1_is_pow2)
>         code = TRUNC_DIV_EXPR;
>     }
> Shouldn't we make it
> (for floor_divmod (floor_div floor_mod)
>      trunc_divmod (trunc_div trunc_mod)
>  (simplify
>   (floor_divmod @0 @1)
>   (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
>        && TYPE_UNSIGNED (type))
>    (trunc_divmod @0 @1))))
> ?

Yeah, if the simplification is incomplete we should amend it.

> 2) as the RTL optabs really do just trunc div/mod, perhaps
> tree-vect-patterns.cc
> could be changed to replace some or all of those operations with the trunc
> operation followed by some arith and cond_exprs so that the vectorizer knows
> actual cost of those operations.
> E.g. it seems expmed.cc expands
> r = x %[fl] y;
> as
> r = x % y; if (r && (x ^ y) < 0) r += y;
> and
> d = x /[fl] y;
> would be
> r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
> Looking at wide-int.h,
> r = x %[cl] y;
> as
> r = x % y; if (r && (x ^ y) >= 0) r -= y;
> and
> d = x /[cl] y;
> as
> r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
> All of the above for signed, as I said earlier, unsigned [fl] is the same as
> trunc and unsigned [cl] should replace (x ^ y) >= 0 with 1.
> [rd] is even more complex.

That sounds reasonable as well.  I think we can do 0) and 1) now and
defer 2) to the next stage1, maybe tracking it with an enhancement
bugreport.


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2022-01-19  7:27 ` rguenther at suse dot de
@ 2022-01-19 10:01 ` jakub at gcc dot gnu.org
  2022-01-19 10:02 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-01-19 10:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #52218|0                           |1
        is obsolete|                            |

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 52228
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52228&action=edit
gcc12-pr102860-1.patch

One possible patch.


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2022-01-19 10:01 ` jakub at gcc dot gnu.org
@ 2022-01-19 10:02 ` jakub at gcc dot gnu.org
  2022-01-19 14:05 ` cvs-commit at gcc dot gnu.org
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-01-19 10:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 52229
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52229&action=edit
gcc12-pr102860-2.patch

And here is IMHO a better one that for VECTOR_TYPE just returns unknown_optab
so that the callers don't expect vectorized %[fl] etc. will work.

For both patches, the tree-vect-patterns.cc work is deferred to stage1; I will
create a PR for it once this PR is fixed.


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2022-01-19 10:02 ` jakub at gcc dot gnu.org
@ 2022-01-19 14:05 ` cvs-commit at gcc dot gnu.org
  2022-01-19 14:06 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-01-19 14:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:ffc7f200adbdf47f14b3594d9b21855c19cf797a

commit r12-6739-gffc7f200adbdf47f14b3594d9b21855c19cf797a
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Jan 19 15:03:45 2022 +0100

    match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR [PR102860]

    power10 has modv4si3 expander and so vectorizes the following testcase
    where Fortran modulo is FLOOR_MOD_EXPR.
    optab_for_tree_code indicates that the optab for all the *_MOD_EXPR
    variants is umod_optab or smod_optab, but that isn't true, that optab
    actually expands just TRUNC_MOD_EXPR.  For the other tree codes expmed.cc
    has code how to adjust the TRUNC_MOD_EXPR into those by emitting some
    extra comparisons and conditional updates.  Similarly for *_DIV_EXPR,
    except in that case it actually needs both division and modulo.

    While it would be possible to handle it in expmed.cc for vectors as well,
    we'd need to be sure all the vector operations we need for that are
    available, and furthermore we wouldn't account for that in the costing.

    So, IMHO it is better to stop pretending those non-truncating (and
    non-exact) div/mod operations have an optab.  For GCC 13, we should
    IMHO pattern match these in tree-vect-patterns.cc and transform them
    to truncating div/mod with follow-up adjustments and let the vectorizer
    vectorize that.  As written in the PR, for signed operands:
    r = x %[fl] y;
    is
    r = x % y; if (r && (x ^ y) < 0) r += y;
    and
    d = x /[fl] y;
    is
    r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
    and
    r = x %[cl] y;
    is
    r = x % y; if (r && (x ^ y) >= 0) r -= y;
    and
    d = x /[cl] y;
    is
    r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
    (too lazy to figure out rounding div/mod now).  I'll create a PR
    for that.
    The patch also extends a match.pd optimization that floor_mod on
    unsigned operands is actually trunc_mod.

    2022-01-19  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/102860
            * match.pd (x %[fl] y -> x % y): New simplification for
            unsigned integral types.
            * optabs-tree.cc (optab_for_tree_code): Return unknown_optab
            for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE.

            * gfortran.dg/pr102860.f90: New test.
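
The signed identities listed in the commit message above can be tried out in plain C, where `/` and `%` truncate toward zero. This is only an illustrative sketch, not code from the patch: the function names `floor_mod`, `floor_div`, `ceil_mod`, `ceil_div` and `check_identity` are made up for the example, and it assumes `int` operands with `y != 0` and no `INT_MIN / -1` overflow.

```c
#include <assert.h>

/* x %[fl] y: fix up the truncating remainder when it is nonzero and
   the operands have opposite signs.  */
static int floor_mod(int x, int y)
{
    int r = x % y;
    if (r && (x ^ y) < 0)
        r += y;
    return r;
}

/* x /[fl] y: needs both the truncating quotient and the remainder.  */
static int floor_div(int x, int y)
{
    int r = x % y;
    int d = x / y;
    if (r && (x ^ y) < 0)
        --d;
    return d;
}

/* x %[cl] y: fix up when the remainder is nonzero and the operands
   have the same sign.  */
static int ceil_mod(int x, int y)
{
    int r = x % y;
    if (r && (x ^ y) >= 0)
        r -= y;
    return r;
}

/* x /[cl] y: same condition as ceil_mod, applied to the quotient.  */
static int ceil_div(int x, int y)
{
    int r = x % y;
    int d = x / y;
    if (r && (x ^ y) >= 0)
        ++d;
    return d;
}

/* Hypothetical helper: checks x == d * y + r for both rounding modes.  */
static void check_identity(int x, int y)
{
    assert(floor_div(x, y) * y + floor_mod(x, y) == x);
    assert(ceil_div(x, y) * y + ceil_mod(x, y) == x);
}
```

For example, `floor_mod(-7, 3)` gives 2 where the truncating `-7 % 3` gives -1, and `ceil_div(7, 3)` gives 3 where `7 / 3` gives 2. For unsigned operands the signs never differ, so the floor variants need no adjustment, which is the case the match.pd change covers.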


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2022-01-19 14:05 ` cvs-commit at gcc dot gnu.org
@ 2022-01-19 14:06 ` jakub at gcc dot gnu.org
  2022-01-24  9:21 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-01-19 14:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed now.


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2022-01-19 14:06 ` jakub at gcc dot gnu.org
@ 2022-01-24  9:21 ` cvs-commit at gcc dot gnu.org
  2022-05-10  8:23 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 20+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-01-24  9:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #14 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:baf18a6d47c4325be004cb6289d4057b113de282

commit r11-9504-gbaf18a6d47c4325be004cb6289d4057b113de282
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Jan 19 15:03:45 2022 +0100

    match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR [PR102860]

    power10 has modv4si3 expander and so vectorizes the following testcase
    where Fortran modulo is FLOOR_MOD_EXPR.
    optab_for_tree_code indicates that the optab for all the *_MOD_EXPR
    variants is umod_optab or smod_optab, but that isn't true; that optab
    actually expands just TRUNC_MOD_EXPR.  For the other tree codes, expmed.cc
    has code that adjusts the TRUNC_MOD_EXPR result into those by emitting some
    extra comparisons and conditional updates.  Similarly for *_DIV_EXPR,
    except in that case it actually needs both division and modulo.

    While it would be possible to handle it in expmed.cc for vectors as well,
    we'd need to be sure all the vector operations we need for that are
    available, and furthermore we wouldn't account for that in the costing.

    So, IMHO it is better to stop pretending those non-truncating (and
    non-exact) div/mod operations have an optab.  For GCC 13, we should
    IMHO pattern match these in tree-vect-patterns.cc and transform them
    to truncating div/mod with follow-up adjustments and let the vectorizer
    vectorize that.  As written in the PR, for signed operands:
    r = x %[fl] y;
    is
    r = x % y; if (r && (x ^ y) < 0) r += y;
    and
    d = x /[fl] y;
    is
    r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
    and
    r = x %[cl] y;
    is
    r = x % y; if (r && (x ^ y) >= 0) r -= y;
    and
    d = x /[cl] y;
    is
    r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
    (too lazy to figure out rounding div/mod now).  I'll create a PR
    for that.
    The patch also extends a match.pd optimization that floor_mod on
    unsigned operands is actually trunc_mod.

    2022-01-19  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/102860
            * match.pd (x %[fl] y -> x % y): New simplification for
            unsigned integral types.
            * optabs-tree.c (optab_for_tree_code): Return unknown_optab
            for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE.

            * gfortran.dg/pr102860.f90: New test.

    (cherry picked from commit ffc7f200adbdf47f14b3594d9b21855c19cf797a)


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (15 preceding siblings ...)
  2022-01-24  9:21 ` cvs-commit at gcc dot gnu.org
@ 2022-05-10  8:23 ` cvs-commit at gcc dot gnu.org
  2022-05-11  6:23 ` cvs-commit at gcc dot gnu.org
  2022-05-11 18:39 ` cvs-commit at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-10  8:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:03bd2b252bca8c0ba4aeecab8b560d751f2ef57d

commit r10-10669-g03bd2b252bca8c0ba4aeecab8b560d751f2ef57d
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Jan 19 15:03:45 2022 +0100

    match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR [PR102860]

    power10 has modv4si3 expander and so vectorizes the following testcase
    where Fortran modulo is FLOOR_MOD_EXPR.
    optab_for_tree_code indicates that the optab for all the *_MOD_EXPR
    variants is umod_optab or smod_optab, but that isn't true; that optab
    actually expands just TRUNC_MOD_EXPR.  For the other tree codes, expmed.cc
    has code that adjusts the TRUNC_MOD_EXPR result into those by emitting some
    extra comparisons and conditional updates.  Similarly for *_DIV_EXPR,
    except in that case it actually needs both division and modulo.

    While it would be possible to handle it in expmed.cc for vectors as well,
    we'd need to be sure all the vector operations we need for that are
    available, and furthermore we wouldn't account for that in the costing.

    So, IMHO it is better to stop pretending those non-truncating (and
    non-exact) div/mod operations have an optab.  For GCC 13, we should
    IMHO pattern match these in tree-vect-patterns.cc and transform them
    to truncating div/mod with follow-up adjustments and let the vectorizer
    vectorize that.  As written in the PR, for signed operands:
    r = x %[fl] y;
    is
    r = x % y; if (r && (x ^ y) < 0) r += y;
    and
    d = x /[fl] y;
    is
    r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
    and
    r = x %[cl] y;
    is
    r = x % y; if (r && (x ^ y) >= 0) r -= y;
    and
    d = x /[cl] y;
    is
    r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
    (too lazy to figure out rounding div/mod now).  I'll create a PR
    for that.
    The patch also extends a match.pd optimization that floor_mod on
    unsigned operands is actually trunc_mod.

    2022-01-19  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/102860
            * match.pd (x %[fl] y -> x % y): New simplification for
            unsigned integral types.
            * optabs-tree.c (optab_for_tree_code): Return unknown_optab
            for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE.

            * gfortran.dg/pr102860.f90: New test.

    (cherry picked from commit ffc7f200adbdf47f14b3594d9b21855c19cf797a)


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (16 preceding siblings ...)
  2022-05-10  8:23 ` cvs-commit at gcc dot gnu.org
@ 2022-05-11  6:23 ` cvs-commit at gcc dot gnu.org
  2022-05-11 18:39 ` cvs-commit at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-11  6:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #16 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:95f6eb7ae707482fdeed57b0906dacb8e675385d

commit r9-10118-g95f6eb7ae707482fdeed57b0906dacb8e675385d
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Jan 19 15:03:45 2022 +0100

    match.pd, optabs: Avoid vectorization of {FLOOR,CEIL,ROUND}_{DIV,MOD}_EXPR [PR102860]

    power10 has modv4si3 expander and so vectorizes the following testcase
    where Fortran modulo is FLOOR_MOD_EXPR.
    optab_for_tree_code indicates that the optab for all the *_MOD_EXPR
    variants is umod_optab or smod_optab, but that isn't true; that optab
    actually expands just TRUNC_MOD_EXPR.  For the other tree codes, expmed.cc
    has code that adjusts the TRUNC_MOD_EXPR result into those by emitting some
    extra comparisons and conditional updates.  Similarly for *_DIV_EXPR,
    except in that case it actually needs both division and modulo.

    While it would be possible to handle it in expmed.cc for vectors as well,
    we'd need to be sure all the vector operations we need for that are
    available, and furthermore we wouldn't account for that in the costing.

    So, IMHO it is better to stop pretending those non-truncating (and
    non-exact) div/mod operations have an optab.  For GCC 13, we should
    IMHO pattern match these in tree-vect-patterns.cc and transform them
    to truncating div/mod with follow-up adjustments and let the vectorizer
    vectorize that.  As written in the PR, for signed operands:
    r = x %[fl] y;
    is
    r = x % y; if (r && (x ^ y) < 0) r += y;
    and
    d = x /[fl] y;
    is
    r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
    and
    r = x %[cl] y;
    is
    r = x % y; if (r && (x ^ y) >= 0) r -= y;
    and
    d = x /[cl] y;
    is
    r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
    (too lazy to figure out rounding div/mod now).  I'll create a PR
    for that.
    The patch also extends a match.pd optimization that floor_mod on
    unsigned operands is actually trunc_mod.

    2022-01-19  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/102860
            * match.pd (x %[fl] y -> x % y): New simplification for
            unsigned integral types.
            * optabs-tree.c (optab_for_tree_code): Return unknown_optab
            for {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR with VECTOR_TYPE.

            * gfortran.dg/pr102860.f90: New test.

    (cherry picked from commit ffc7f200adbdf47f14b3594d9b21855c19cf797a)


* [Bug middle-end/102860] [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526
  2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
                   ` (17 preceding siblings ...)
  2022-05-11  6:23 ` cvs-commit at gcc dot gnu.org
@ 2022-05-11 18:39 ` cvs-commit at gcc dot gnu.org
  18 siblings, 0 replies; 20+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-11 18:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860

--- Comment #17 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-9 branch has been updated by Jakub Jelinek
<jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:6d79958a50eb5419ebb9baa5ef880aabeef05467

commit r9-10157-g6d79958a50eb5419ebb9baa5ef880aabeef05467
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed May 11 20:37:01 2022 +0200

    testsuite: Fix up pr102860.f90 for gcc 9 [PR105570]

    Apparently -mcpu=power10 is only available in GCC 10 and later, but the
    PR102860 change otherwise made sense for 9.x as well.  So just adjust the
    testcase.

    2022-05-11  Jakub Jelinek  <jakub@redhat.com>

            PR middle-end/102860
            PR testsuite/105570
            * gfortran.dg/pr102860.f90: Use -mcpu=power9 instead of
            -mcpu=power10.


Thread overview: 20+ messages
2021-10-20 17:45 [Bug target/102860] New: [12 regression] libgomp.fortran/simd2.f90 ICEs after r12-4526 seurer at gcc dot gnu.org
2021-10-20 18:19 ` [Bug target/102860] " aldyh at gcc dot gnu.org
2021-10-20 19:36 ` pinskia at gcc dot gnu.org
2021-10-21  6:55 ` [Bug middle-end/102860] " rguenth at gcc dot gnu.org
2021-10-26  7:09 ` rguenth at gcc dot gnu.org
2021-10-26 18:13 ` seurer at gcc dot gnu.org
2021-12-15  7:05 ` luoxhu at gcc dot gnu.org
2021-12-15  7:24 ` luoxhu at gcc dot gnu.org
2022-01-17 13:13 ` rguenth at gcc dot gnu.org
2022-01-18 17:21 ` jakub at gcc dot gnu.org
2022-01-18 18:45 ` jakub at gcc dot gnu.org
2022-01-19  7:27 ` rguenther at suse dot de
2022-01-19 10:01 ` jakub at gcc dot gnu.org
2022-01-19 10:02 ` jakub at gcc dot gnu.org
2022-01-19 14:05 ` cvs-commit at gcc dot gnu.org
2022-01-19 14:06 ` jakub at gcc dot gnu.org
2022-01-24  9:21 ` cvs-commit at gcc dot gnu.org
2022-05-10  8:23 ` cvs-commit at gcc dot gnu.org
2022-05-11  6:23 ` cvs-commit at gcc dot gnu.org
2022-05-11 18:39 ` cvs-commit at gcc dot gnu.org
