public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer
@ 2021-01-04 17:01 martin@mpa-garching.mpg.de
  2021-01-04 18:59 ` [Bug tree-optimization/98516] " martin@mpa-garching.mpg.de
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: martin@mpa-garching.mpg.de @ 2021-01-04 17:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

            Bug ID: 98516
           Summary: Wrong code generated by tree vectorizer
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: martin@mpa-garching.mpg.de
  Target Milestone: ---

Created attachment 49879
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49879&action=edit
Code to reproduce the problem

The attached code is a distilled test case from an FFT library, which works
fine with current released GCC versions, but produces incorrect results with
current trunk when tree optimization is switched on via -O3:

martin@debian:~/codes/ducc$ g++ -I src/ -std=c++17 -O3 -march=native -ffast-math bug.cc
martin@debian:~/codes/ducc$ ./a.out
(0.362978,0.601326) (0.362155,0.18782) (1.63193,-0.0779749) (1.26662,0.327246)
(-1.0024,1.03302) 

The third complex number in the result line is wrong.
When disabling tree vectorization (or when using a released GCC version), the
correct answer is produced:

martin@debian:~/codes/ducc$ g++ -I src/ -std=c++17 -O3 -march=native -ffast-math bug.cc -fno-tree-vectorize
martin@debian:~/codes/ducc$ ./a.out
(0.362978,0.601326) (0.362155,0.18782) (0.380433,0.228703) (1.26662,0.327246)
(-1.0024,1.03302) 

My gcc version is

martin@debian:~/codes/ducc$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/home/martin/codes/umaster/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/martin/codes/gccgit/configure --disable-bootstrap
--disable-multilib --prefix=/home/martin/codes/umaster
--enable-languages=c++,fortran --enable-target=all --enable-checking=release
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20210103 (experimental) [master revision
37d0bb1f5b5:d78f978936b:3335c9f954f8939403eabb5ad3d8739be9984f81] (GCC) 

I have tried to narrow down this failure further, but without success so far.
It's quite possible that the mistake is on my side, but using the sanitizers
and valgrind I have not been able to find anything.

Maybe a git bisect could locate the commit that introduced the change in
behaviour.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98516] Wrong code generated by tree vectorizer
  2021-01-04 17:01 [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer martin@mpa-garching.mpg.de
@ 2021-01-04 18:59 ` martin@mpa-garching.mpg.de
  2021-01-05 10:30 ` [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853 marxin at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: martin@mpa-garching.mpg.de @ 2021-01-04 18:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

--- Comment #1 from Martin Reinecke <martin@mpa-garching.mpg.de> ---
Minimal set of flags to trigger the problem seems to be

g++ -std=c++17 -O1 -ftree-vectorize -fno-signed-zeros bug.cc

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853
  2021-01-04 17:01 [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer martin@mpa-garching.mpg.de
  2021-01-04 18:59 ` [Bug tree-optimization/98516] " martin@mpa-garching.mpg.de
@ 2021-01-05 10:30 ` marxin at gcc dot gnu.org
  2021-01-05 10:37 ` marxin at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-01-05 10:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-01-05
             Status|UNCONFIRMED                 |NEW
            Summary|Wrong code generated by     |[11 Regression] Wrong code
                   |tree vectorizer             |generated by tree
                   |                            |vectorizer since
                   |                            |r11-3823-g126ed72b9f48f853
                 CC|                            |marxin at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #2 from Martin Liška <marxin at gcc dot gnu.org> ---
Thanks, started with r11-3823-g126ed72b9f48f853.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853
  2021-01-04 17:01 [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer martin@mpa-garching.mpg.de
  2021-01-04 18:59 ` [Bug tree-optimization/98516] " martin@mpa-garching.mpg.de
  2021-01-05 10:30 ` [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853 marxin at gcc dot gnu.org
@ 2021-01-05 10:37 ` marxin at gcc dot gnu.org
  2021-01-05 11:28 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: marxin at gcc dot gnu.org @ 2021-01-05 10:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

--- Comment #3 from Martin Liška <marxin at gcc dot gnu.org> ---
Minimal options:

g++ pr98516.c -std=c++17 -O1 -ftree-slp-vectorize -fno-signed-zeros -fdbg-cnt=vect_slp:2

while this one is fine:

g++ pr98516.c -std=c++17 -O1 -ftree-slp-vectorize -fno-signed-zeros -fdbg-cnt=vect_slp:1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853
  2021-01-04 17:01 [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer martin@mpa-garching.mpg.de
                   ` (2 preceding siblings ...)
  2021-01-05 10:37 ` marxin at gcc dot gnu.org
@ 2021-01-05 11:28 ` rguenth at gcc dot gnu.org
  2021-01-05 14:33 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-05 11:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
           Priority|P3                          |P1
   Target Milestone|---                         |11.0

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Mine.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853
  2021-01-04 17:01 [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer martin@mpa-garching.mpg.de
                   ` (3 preceding siblings ...)
  2021-01-05 11:28 ` rguenth at gcc dot gnu.org
@ 2021-01-05 14:33 ` rguenth at gcc dot gnu.org
  2021-01-05 14:41 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-05 14:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Meh, it's very hard to spot the actual problem :/

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index d8a2ceb0fa1..dee360307d0 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -5058,8 +5059,7 @@ vect_slp_region (vec<basic_block> bbs, vec<data_reference_p> datarefs,
        bb_vinfo->shared->check_datarefs ();
       bb_vinfo->vector_mode = next_vector_mode;

-      if (vect_slp_analyze_bb_1 (bb_vinfo, n_stmts, fatal, dataref_groups)
-         && dbg_cnt (vect_slp))
+      if (vect_slp_analyze_bb_1 (bb_vinfo, n_stmts, fatal, dataref_groups))
        {
          if (dump_enabled_p ())
            {
@@ -5090,6 +5090,9 @@ vect_slp_region (vec<basic_block> bbs, vec<data_reference_p> datarefs,
                  continue;
                }

+             if (!dbg_cnt (vect_slp))
+               continue;
+
              if (!vectorized && dump_enabled_p ())
                dump_printf_loc (MSG_NOTE, vect_location,
                                 "Basic block will be vectorized "

helps to narrow down the bogus vectorization; -fdbg-cnt=vect_slp:2:2 triggers
it, but the SLP region is still quite big.

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index d8a2ceb0fa1..dee360307d0 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3310,6 +3310,7 @@ vect_optimize_slp (vec_info *vinfo)
   auto_vec<int> leafs;
   vect_slp_build_vertices (vinfo, vertices, leafs);

+#if 0
   struct graph *slpg = new_graph (vertices.length ());
   FOR_EACH_VEC_ELT (vertices, i, node)
     {
@@ -3619,7 +3620,7 @@ vect_optimize_slp (vec_info *vinfo)
   while (!perms.is_empty ())
     perms.pop ().release ();
   free_graph (slpg);
-
+#endif

   /* Now elide load permutations that are not necessary.  */
   for (i = 0; i < leafs.length (); ++i)

avoids the miscompilation.  The key transform we're doing is
eliding load permutations that swap real/imag parts and instead
adjusting the lane permutation of a blend created for plus/minus ops,
which is where I think the bug is.  We're changing

t.C:80:7: note: node 0x4204018 (max_nunits=2, refcnt=1)
t.C:80:7: note: op: VEC_PERM_EXPR
t.C:80:7: note:         stmt 0 _37 = _35 - _36;
t.C:80:7: note:         stmt 1 _34 = _32 + _33;
t.C:80:7: note:         lane permutation { 0[0] 1[1] }
t.C:80:7: note:         children 0x42045f0 0x4204678

to

t.C:80:7: note: node 0x4207018 (max_nunits=2, refcnt=1)
t.C:80:7: note: op: VEC_PERM_EXPR
t.C:80:7: note:         stmt 0 _37 = _35 - _36;
t.C:80:7: note:         stmt 1 _34 = _32 + _33;
t.C:80:7: note:         lane permutation { 1[1] 0[0] }
t.C:80:7: note:         children 0x42075f0 0x4207678

but that's not what is necessary.  We have permuted the lanes of the
children, but permuting the blend will not materialize properly;
instead we need to generate { 0[1] 1[0] }, I think.
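
The three lane permutations discussed above can be compared numerically with a small stand-alone model. This is only a sketch and not GCC code: `struct vec2`, `struct lane_sel`, `blend` and `eval_model` are hypothetical names, and it assumes that an entry `c[l]` in the dump means "take lane l of child c", with child 0 being the minus node and child 1 the plus node. The input values 1..4 are chosen for illustration.

```c
#include <assert.h>

/* Two-lane vector standing in for a two-element double vector.  */
struct vec2 { double lane[2]; };

/* One entry "child[lane]" of a lane permutation (hypothetical model type).  */
struct lane_sel { int child; int lane; };

/* Lane-wise subtraction, modelling the minus child of the blend.  */
static struct vec2 vsub (struct vec2 a, struct vec2 b)
{
  struct vec2 r = { { a.lane[0] - b.lane[0], a.lane[1] - b.lane[1] } };
  return r;
}

/* Lane-wise addition, modelling the plus child of the blend.  */
static struct vec2 vadd (struct vec2 a, struct vec2 b)
{
  struct vec2 r = { { a.lane[0] + b.lane[0], a.lane[1] + b.lane[1] } };
  return r;
}

/* result.lane[i] = children[sel[i].child].lane[sel[i].lane]  */
static struct vec2 blend (const struct vec2 children[2],
                          const struct lane_sel sel[2])
{
  struct vec2 r;
  for (int i = 0; i < 2; ++i)
    {
      assert (sel[i].child >= 0 && sel[i].child < 2);
      assert (sel[i].lane >= 0 && sel[i].lane < 2);
      r.lane[i] = children[sel[i].child].lane[sel[i].lane];
    }
  return r;
}

/* Evaluate the blend for a = { 1, 2, 3, 4 }.  If swapped_loads is nonzero
   the loads are { a[1], a[0] } and { a[3], a[2] } (load permutation kept),
   otherwise they are in memory order (load permutation elided).  */
static struct vec2 eval_model (int swapped_loads,
                               struct lane_sel s0, struct lane_sel s1)
{
  const double a[4] = { 1., 2., 3., 4. };
  struct vec2 lo, hi, children[2];
  struct lane_sel sel[2];

  lo.lane[0] = swapped_loads ? a[1] : a[0];
  lo.lane[1] = swapped_loads ? a[0] : a[1];
  hi.lane[0] = swapped_loads ? a[3] : a[2];
  hi.lane[1] = swapped_loads ? a[2] : a[3];

  children[0] = vsub (lo, hi);
  children[1] = vadd (lo, hi);
  sel[0] = s0;
  sel[1] = s1;
  return blend (children, sel);
}
```

Under these assumptions, swapped loads with { 0[0] 1[1] } give { a1 - a3, a0 + a2 } = { -2, 4 }. With the load permutation elided, { 1[1] 0[0] } gives { a1 + a3, a0 - a2 } = { 6, -2 }, while { 0[1] 1[0] } restores { -2, 4 }.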

I'm trying to create a simpler C testcase now.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853
  2021-01-04 17:01 [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer martin@mpa-garching.mpg.de
                   ` (4 preceding siblings ...)
  2021-01-05 14:33 ` rguenth at gcc dot gnu.org
@ 2021-01-05 14:41 ` rguenth at gcc dot gnu.org
  2021-01-05 16:41 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-05 14:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
/* { dg-do run } */

double a[4], b[2];

void __attribute__((noipa))
foo ()
{
  double a0 = a[0];
  double a1 = a[1];
  double a2 = a[2];
  double a3 = a[3];
  b[0] = a1 - a3;
  b[1] = a0 + a2;
}

int main()
{
  a[0] = 1.;
  a[1] = 2.;
  a[2] = 3.;
  a[3] = 4.;
  foo ();
  if (b[0] != -2 || b[1] != 4)
    __builtin_abort ();
  return 0;
}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853
  2021-01-04 17:01 [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer martin@mpa-garching.mpg.de
                   ` (5 preceding siblings ...)
  2021-01-05 14:41 ` rguenth at gcc dot gnu.org
@ 2021-01-05 16:41 ` cvs-commit at gcc dot gnu.org
  2021-01-05 16:41 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-01-05 16:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:33a63257701c8d94ee375e32ff1837c989d8ded6

commit r11-6478-g33a63257701c8d94ee375e32ff1837c989d8ded6
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Jan 5 16:17:15 2021 +0100

    tree-optimization/98516 - fix SLP permute opt materialization

    When materializing on a VEC_PERM node we have to permute the
    incoming vectors, not the outgoing one.

    2021-01-05  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/98516
            * tree-vect-slp.c (vect_optimize_slp): Permute the incoming
            lanes when materializing on a VEC_PERM node.
            (vectorizable_slp_permutation): Dump the permute properly.

            * gcc.dg/vect/bb-slp-pr98516-1.c: New testcase.
            * gcc.dg/vect/bb-slp-pr98516-2.c: Likewise.
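
One way to read "permute the incoming vectors, not the outgoing one" is sketched below. This is an illustration under stated assumptions, not the code from tree-vect-slp.c: `struct lane_sel`, `permute_outgoing` and `permute_incoming` are hypothetical names, and a lane permutation is modelled as an array of (child, lane) pairs. For the two-lane swap in this bug the direction in which `perm` is applied does not matter; for general permutations it would, and the sketch does not try to match GCC's convention.

```c
#include <assert.h>

/* One entry "child[lane]" of a lane permutation (hypothetical model type).  */
struct lane_sel { int child; int lane; };

/* Reorder the outgoing lanes of the blend: out[i] = in[perm[i]].
   In this model, this corresponds to the behaviour before the fix.  */
static void permute_outgoing (const int *perm, const struct lane_sel *in,
                              struct lane_sel *out, int n)
{
  for (int i = 0; i < n; ++i)
    {
      assert (perm[i] >= 0 && perm[i] < n);
      out[i] = in[perm[i]];
    }
}

/* Keep the outgoing order, but remap which lane of each child is read:
   out[i] = { in[i].child, perm[in[i].lane] }.  */
static void permute_incoming (const int *perm, const struct lane_sel *in,
                              struct lane_sel *out, int n)
{
  for (int i = 0; i < n; ++i)
    {
      assert (in[i].lane >= 0 && in[i].lane < n);
      out[i].child = in[i].child;
      out[i].lane = perm[in[i].lane];
    }
}
```

Applied to { 0[0] 1[1] } with the swap perm = { 1, 0 }, `permute_outgoing` yields { 1[1] 0[0] }, the permutation shown in the dump from comment #5, whereas `permute_incoming` yields { 0[1] 1[0] }, the one suggested there as correct.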

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853
  2021-01-04 17:01 [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer martin@mpa-garching.mpg.de
                   ` (6 preceding siblings ...)
  2021-01-05 16:41 ` cvs-commit at gcc dot gnu.org
@ 2021-01-05 16:41 ` rguenth at gcc dot gnu.org
  2021-01-05 18:06 ` martin@mpa-garching.mpg.de
  2021-01-06  7:41 ` rguenther at suse dot de
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-05 16:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853
  2021-01-04 17:01 [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer martin@mpa-garching.mpg.de
                   ` (7 preceding siblings ...)
  2021-01-05 16:41 ` rguenth at gcc dot gnu.org
@ 2021-01-05 18:06 ` martin@mpa-garching.mpg.de
  2021-01-06  7:41 ` rguenther at suse dot de
  9 siblings, 0 replies; 11+ messages in thread
From: martin@mpa-garching.mpg.de @ 2021-01-05 18:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

--- Comment #9 from Martin Reinecke <martin@mpa-garching.mpg.de> ---
Thanks, this fixes the reduced test case for me as well!

Unfortunately there seems to be more where this one came from, since my
comprehensive test suite still fails ... I'll try to produce test cases and
open another bug report.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98516] [11 Regression] Wrong code generated by tree vectorizer since r11-3823-g126ed72b9f48f853
  2021-01-04 17:01 [Bug tree-optimization/98516] New: Wrong code generated by tree vectorizer martin@mpa-garching.mpg.de
                   ` (8 preceding siblings ...)
  2021-01-05 18:06 ` martin@mpa-garching.mpg.de
@ 2021-01-06  7:41 ` rguenther at suse dot de
  9 siblings, 0 replies; 11+ messages in thread
From: rguenther at suse dot de @ 2021-01-06  7:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516

--- Comment #10 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 5 Jan 2021, martin@mpa-garching.mpg.de wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98516
> 
> --- Comment #9 from Martin Reinecke <martin@mpa-garching.mpg.de> ---
> Thanks, this fixes the reduced test case for me as well!
> 
> Unfortunately there seems to be more where this one came from, since my
> comprehensive test suite still fails ... I'll try to produce test cases and
> open another bug report.

Thanks, that's appreciated!

^ permalink raw reply	[flat|nested] 11+ messages in thread
