public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu
@ 2013-12-24 17:14 su at cs dot ucdavis.edu
  2013-12-24 18:10 ` [Bug tree-optimization/59594] [4.9 Regression] " hjl.tools at gmail dot com
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: su at cs dot ucdavis.edu @ 2013-12-24 17:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

            Bug ID: 59594
           Summary: wrong code (by tree vectorizer) at -O3 on
                    x86_64-linux-gnu
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: su at cs dot ucdavis.edu

The current gcc trunk miscompiles the following testcase on x86_64-linux at -O3
in both 32-bit and 64-bit modes.  This is a regression from 4.8.x.  

It looks like a bug in the tree vectorizer as it goes away with
-fno-tree-vectorize. 

$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-unknown-linux-gnu/4.9.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-trunk/configure --prefix=/usr/local/gcc-trunk
--enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
gcc version 4.9.0 20131224 (experimental) [trunk revision 206194] (GCC) 
$ 
$ gcc-trunk -O2 small.c; a.out
0
$ gcc-trunk -O3 -fno-tree-vectorize small.c; a.out
0
$ gcc-trunk -O3 small.c; a.out
1
$ 


-------------------------------


int printf (const char *, ...);

int a;
static int b[7];

int
main ()
{
  for (a = 5; a >= 0; a--)
    {
      b[a + 1] = b[a];
      b[a] = 1;
    }
  printf ("%d\n", b[1]);
  return 0;
}


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu
  2013-12-24 17:14 [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
@ 2013-12-24 18:10 ` hjl.tools at gmail dot com
  2013-12-24 18:37 ` hjl.tools at gmail dot com
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: hjl.tools at gmail dot com @ 2013-12-24 18:10 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-12-24
   Target Milestone|---                         |4.9.0
            Summary|wrong code (by tree         |[4.9 Regression] wrong code
                   |vectorizer) at -O3 on       |(by tree vectorizer) at -O3
                   |x86_64-linux-gnu            |on x86_64-linux-gnu
     Ever confirmed|0                           |1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu
  2013-12-24 17:14 [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
  2013-12-24 18:10 ` [Bug tree-optimization/59594] [4.9 Regression] " hjl.tools at gmail dot com
@ 2013-12-24 18:37 ` hjl.tools at gmail dot com
  2014-01-13 10:19 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: hjl.tools at gmail dot com @ 2013-12-24 18:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenther at suse dot de

--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
It is caused by r204062.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu
  2013-12-24 17:14 [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
  2013-12-24 18:10 ` [Bug tree-optimization/59594] [4.9 Regression] " hjl.tools at gmail dot com
  2013-12-24 18:37 ` hjl.tools at gmail dot com
@ 2014-01-13 10:19 ` jakub at gcc dot gnu.org
  2014-01-13 13:47 ` hjl.tools at gmail dot com
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-01-13 10:19 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com,
                   |                            |jakub at gcc dot gnu.org,
                   |                            |meibf at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #1)
> It is caused by r204062.

That is only a part of the story, this testcase seems to be interesting.

Starting with r204062 until r204560 this has been broken because of an ldist
bug.
r204561 fixed that again and the testcase worked until r206147.
r206148 (aka negative step vectorization support) started to ICE on this.
With r206178 it stopped ICEing and works again with -O3 -mtune=generic, not
sure if that has been intentional to change the generic tuning with that patch.
 H.J.?
Anyway, with -O3 -mtune=core-avx2 e.g. r206178 still ICEs and r206179 (which
fixed the ICE) starts the miscompilation.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu
  2013-12-24 17:14 [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (2 preceding siblings ...)
  2014-01-13 10:19 ` jakub at gcc dot gnu.org
@ 2014-01-13 13:47 ` hjl.tools at gmail dot com
  2014-01-13 14:02 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: hjl.tools at gmail dot com @ 2014-01-13 13:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Jakub Jelinek from comment #2)
> (In reply to H.J. Lu from comment #1)
> > It is caused by r204062.
> 
> That is only a part of the story, this testcase seems to be interesting.
> 
> Starting with r204062 until r204560 this has been broken because of an ldist
> bug.
> r204561 fixed that again and the testcase worked until r206147.
> r206148 (aka negative step vectorization support) started to ICE on this.
> With r206178 it stopped ICEing and works again with -O3 -mtune=generic, not
> sure if that has been intentional to change the generic tuning with that
> patch.  H.J.?

There was a typo in r206178 and was fixed by r206180:

http://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ff385162ee38caf3100ba4a0f682241c3b0d681d

Can you try r206180 on this?

> Anyway, with -O3 -mtune=core-avx2 e.g. r206178 still ICEs and r206179 (which
> fixed the ICE) starts the miscompilation.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu
  2013-12-24 17:14 [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (3 preceding siblings ...)
  2014-01-13 13:47 ` hjl.tools at gmail dot com
@ 2014-01-13 14:02 ` jakub at gcc dot gnu.org
  2014-01-22 13:22 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-01-13 14:02 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
You're right, r206180 fixed this (i.e. that it FAILs at runtime even for
-mtune=generic).  In any case, sounds like this is a problem in determination
of the aliasing, we should have refused to vectorize this (of course unless we
are smart enough to find out all the b[a] = 1; stores but the b[0] = 1; is
redundant, but I suppose the vectorizer isn't the right place to do that
optimization.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu
  2013-12-24 17:14 [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (4 preceding siblings ...)
  2014-01-13 14:02 ` jakub at gcc dot gnu.org
@ 2014-01-22 13:22 ` jakub at gcc dot gnu.org
  2014-01-24 10:03 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-01-22 13:22 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 31919
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31919&action=edit
gcc49-pr59594.patch

Untested patch for discussion.  The reason why we (incorrectly) vectorize the
testcase is that we ignore the data dependency, on the testcase both the b[a]
read vs. b[a+1] store and b[a] store vs. b[a+1] store DDRs have dist 1 and
DDR_REVERSED_P set and we ignore those.  Now on say:
int printf (const char *, ...);

int a;
static int b[1024];

int
main ()
{
  for (a = 0; a <= 512; a++)
    {
      b[a - 1] = b[a];
      b[a] = 1;
    }
  printf ("%d\n", b[1]);
  return 0;
}
only the b[a] read vs. b[a-1] store is dist 1 DDR_REVERSED_P and b[a] store vs.
b[a-1] store is dist 1 !DDR_REVERSED_P, thus we don't vectorize it (correctly).
Unfortunately not ignoring dist > 0 && DDR_REVERSED_P ddrs for negative step
regresses the testcase I've attached, where there is a write after read ddr and
it works properly with the current check.  While the attached patch keeps that
testcase (no-vfa-vect-depend*.c) working and fixes the test (pr59594.c), the
conditions are piled completely randomly, I'm afraid I don't know why it is so,
if for the DDR_REVERSED_P continue it matters whether step is positive or
negative, or if that is irrelevant and all the write after write DDR_REVERSED_P
ddrs need to be checked normally (abs (dist) >= *max_vf), or if say only write
after read should be treated as the code treats it right now and even read
after write is problematic.  The DDR_REVERSED_P stuff has been added in 2007
for PR32377, see e.g. http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01067.html

Richard, any ideas?


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu
  2013-12-24 17:14 [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (5 preceding siblings ...)
  2014-01-22 13:22 ` jakub at gcc dot gnu.org
@ 2014-01-24 10:03 ` jakub at gcc dot gnu.org
  2014-01-29  9:28 ` jakub at gcc dot gnu.org
  2014-01-29 11:04 ` jakub at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-01-24 10:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
On:
#define N 1024
int ia[N + 1];
int ib[N + 1];

void
f1 (void)
{
  int i;
  for (i = 0; i < N; i++)
    {
      ia[i + 1] = 1;
      ib[i] = ia[i];
    }
}

void
f2 (void)
{
  int i;
  for (i = 0; i < N; i++)
    {
      ia[i] = 1;
      ib[i] = ia[i + 1];
    }
}

void
f3 (void)
{
  int i;
  for (i = N - 1; i >= 0; i--)
    {
      ia[i + 1] = 1;
      ib[i] = ia[i];
    }
}

void
f4 (void)
{
  int i;
  for (i = N - 1; i >= 0; i--)
    {
      ia[i] = 1;
      ib[i] = ia[i + 1];
    }
}

we properly vectorize f2 and f3 where the write/read DDR is DDR_REVERSED_P and
not f1/f4.

On
#define N 1024
int ia[N + 1];
int ib[N + 1];

void
f1 (void)
{
  int i;
  for (i = 0; i < N; i++)
    {
      ia[i + 1] = 1;
      ia[i] = 2;
    }
}

void
f2 (void)
{
  int i;
  for (i = 0; i < N; i++)
    {
      ia[i] = 1;
      ia[i + 1] = 2;
    }
}

void
f3 (void)
{
  int i;
  for (i = N - 1; i >= 0; i--)
    {
      ia[i + 1] = 1;
      ia[i] = 2;
    }
}

void
f4 (void)
{
  int i;
  for (i = N - 1; i >= 0; i--)
    {
      ia[i] = 1;
      ia[i + 1] = 2;
    }
}

we don't vectorize f1 and f2, in both cases for the write/write DDR
DDR_REVERSED_P is false, and vectorize f3/f4, where DDR_REVERSED_P is true in
both cases.  f2 and f3 shouldn't be vectorizable (at least not as is, when we'd
be trying to vectorize the two stores just by putting a vector store at that
position), f1 and f4 can.  So, this leads me to believe that for write/write we
don't have a way to differentiate between the bad and good cases using dist > 0
&& DDR_REVERSED_P test.  In that case, I'd think best would be to not ignore
dist > 0 && DDR_REVERSED_P (ddr)
ddrs if (!DR_IS_READ (dra) && !DR_IS_READ (drb)).


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu
  2013-12-24 17:14 [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (6 preceding siblings ...)
  2014-01-24 10:03 ` jakub at gcc dot gnu.org
@ 2014-01-29  9:28 ` jakub at gcc dot gnu.org
  2014-01-29 11:04 ` jakub at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-01-29  9:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Author: jakub
Date: Wed Jan 29 09:27:43 2014
New Revision: 207225

URL: http://gcc.gnu.org/viewcvs?rev=207225&root=gcc&view=rev
Log:
    PR tree-optimization/59594
    * tree-vect-data-refs.c (vect_analyze_data_ref_accesses): Sort
    a copy of the datarefs vector rather than the vector itself.

    * gcc.dg/vect/no-vfa-vect-depend-2.c: New test.
    * gcc.dg/vect/no-vfa-vect-depend-3.c: New test.
    * gcc.dg/vect/pr59594.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c
    trunk/gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-3.c
    trunk/gcc/testsuite/gcc.dg/vect/pr59594.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-data-refs.c


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug tree-optimization/59594] [4.9 Regression] wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu
  2013-12-24 17:14 [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
                   ` (7 preceding siblings ...)
  2014-01-29  9:28 ` jakub at gcc dot gnu.org
@ 2014-01-29 11:04 ` jakub at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2014-01-29 11:04 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59594

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-01-29 11:04 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-12-24 17:14 [Bug tree-optimization/59594] New: wrong code (by tree vectorizer) at -O3 on x86_64-linux-gnu su at cs dot ucdavis.edu
2013-12-24 18:10 ` [Bug tree-optimization/59594] [4.9 Regression] " hjl.tools at gmail dot com
2013-12-24 18:37 ` hjl.tools at gmail dot com
2014-01-13 10:19 ` jakub at gcc dot gnu.org
2014-01-13 13:47 ` hjl.tools at gmail dot com
2014-01-13 14:02 ` jakub at gcc dot gnu.org
2014-01-22 13:22 ` jakub at gcc dot gnu.org
2014-01-24 10:03 ` jakub at gcc dot gnu.org
2014-01-29  9:28 ` jakub at gcc dot gnu.org
2014-01-29 11:04 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).