public inbox for gcc-bugs@sourceware.org
* [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32.
@ 2014-09-03  0:16 doug.gilmore at imgtec dot com
  2014-09-03  9:02 ` [Bug tree-optimization/63148] " rguenth at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: doug.gilmore at imgtec dot com @ 2014-09-03  0:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

            Bug ID: 63148
           Summary: r187042 causes auto-vectorization failure for X86 for
                    -m32.
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: doug.gilmore at imgtec dot com

Created attachment 33440
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33440&action=edit
test example

I noticed that MultiSource/Benchmarks/TSVC/LoopRestructuring-{flt,dbl}
from LLVM test-suite fail on X86 -m32 and I was able to bisect the
failure to commit r187042.

I attached a stripped-down example.

Before the revision, if we compile with -fdump-tree-vect-details,
we see that a loop-carried dependency is recorded:

(compute_affine_dependence
  stmt_a: D.1748_9 = global_data.b[D.1747_8];
  stmt_b: global_data.b[i.0_2] = D.1750_11;
(subscript_dependence_tester 
(analyze_overlapping_iterations 
  (chrec_a = {0, +, 1}_5)
  (chrec_b = {1, +, 1}_5)
(analyze_siv_subscript 
(analyze_subscript_affine_affine 
  (overlaps_a = [1 + 1 * x_1]
)
  (overlaps_b = [0 + 1 * x_1]
)
)
)
  (overlap_iterations_a = [1 + 1 * x_1]
)
  (overlap_iterations_b = [0 + 1 * x_1]
)
)
(analyze_overlapping_iterations 
  (chrec_a = 2816)
  (chrec_b = 2816)
  (overlap_iterations_a = [0]
)
  (overlap_iterations_b = [0]
)
)
(build_classic_dist_vector
  dist_vector = (  1 
  )
)
)
)

which results in the loop not being vectorized because of the memory
recurrence.
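To illustrate why a distance-1 recurrence blocks vectorization, here is a
minimal sketch. It is not the attached testcase (which is not reproduced in
this thread); the function names, the loop body and the 2-wide "vector" width
are assumptions chosen for illustration only.

```c
#include <stddef.h>

/* Sequential semantics: each b[i] reads the b[i-1] that was written
   one iteration earlier (dependence distance 1). */
void recurrence_scalar(double *b, const double *a, size_t n)
{
  for (size_t i = 1; i < n; i++)
    b[i] = b[i - 1] + a[i];
}

/* Hypothetical model of a 2-wide vectorization that ignores the
   dependence: both lanes load their inputs before either lane stores,
   so lane 1 sees a stale b[i]. */
void recurrence_naive_2wide(double *b, const double *a, size_t n)
{
  size_t i = 1;
  for (; i + 1 < n; i += 2)
    {
      double in0 = b[i - 1];
      double in1 = b[i];          /* stale: lane 0 has not stored yet */
      b[i] = in0 + a[i];
      b[i + 1] = in1 + a[i + 1];
    }
  /* Scalar tail for a leftover iteration. */
  for (; i < n; i++)
    b[i] = b[i - 1] + a[i];
}
```

Running both versions on the same input gives different contents of b, which
is the kind of wrong code that results when the dependence is not recorded.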

After the change the dependency is not recorded:

(compute_affine_dependence
  stmt_a: D.1748_9 = global_data.b[D.1747_8];
  stmt_b: global_data.b[i.0_2] = D.1750_11;
(subscript_dependence_tester 
(analyze_overlapping_iterations 
  (chrec_a = {536870912, +, 1}_5)
  (chrec_b = {1, +, 1}_5)
(analyze_siv_subscript 
(analyze_subscript_affine_affine 
  (overlaps_a = no dependence
)
  (overlaps_b = no dependence
)
)
)
  (overlap_iterations_a = no dependence
)
  (overlap_iterations_b = no dependence
)
)
(dependence classified: scev_known)
)

This causes the loop to be incorrectly vectorized.

Note that when compiled with -m64 the loop is actually vectorized,
but it is determined that versioning is needed:

45: dependence distance == 0 between global_data.a[D.1767_2] and
global_data.a[D.1767_2]
45: versioning for alias required: can't determine dependence between
global_data.a[D.1767_2] and *D.1776_10
...
58: LOOP VECTORIZED.
s221_extract.c:40: note: vectorized 5 loops in function.
Merging blocks 2 and 41
Removing basic block 5
...

and the incorrectly vectorized code is removed.
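For readers unfamiliar with "versioning for alias", the idea can be sketched
roughly as follows. This is a hand-written approximation, not what GCC emits;
the helper names and the overlap test are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical runtime check: do the two n-element ranges overlap? */
static int ranges_overlap(const double *p, const double *q, size_t n)
{
  uintptr_t ps = (uintptr_t) p, qs = (uintptr_t) q;
  uintptr_t bytes = (uintptr_t) (n * sizeof (double));
  return ps < qs + bytes && qs < ps + bytes;
}

/* Returns 1 if the version intended for vectorization ran,
   0 if the scalar fallback ran. */
int add_versioned(double *dst, const double *src, size_t n)
{
  if (!ranges_overlap(dst, src, n))
    {
      /* No overlap: iterations are independent, so this copy of the
         loop is the one a vectorizer could safely transform. */
      for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
      return 1;
    }
  /* Possible overlap: keep strict sequential order. */
  for (size_t i = 0; i < n; i++)
    dst[i] += src[i];
  return 0;
}
```

The compiler keeps both copies of the loop and picks one at run time, which
is why a statically undecidable dependence does not have to prevent
vectorization outright.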



* [Bug tree-optimization/63148] r187042 causes auto-vectorization failure for X86 for -m32.
  2014-09-03  0:16 [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32 doug.gilmore at imgtec dot com
@ 2014-09-03  9:02 ` rguenth at gcc dot gnu.org
  2014-09-04  7:39 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-09-03  9:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
This has been fixed on the 4.8 branch already; I think this is a duplicate of
PR60276.

*** This bug has been marked as a duplicate of bug 60276 ***



* [Bug tree-optimization/63148] r187042 causes auto-vectorization failure for X86 for -m32.
  2014-09-03  0:16 [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32 doug.gilmore at imgtec dot com
  2014-09-03  9:02 ` [Bug tree-optimization/63148] " rguenth at gcc dot gnu.org
@ 2014-09-04  7:39 ` rguenth at gcc dot gnu.org
  2014-09-04  7:41 ` [Bug tree-optimization/63148] [4.8/4.9/5 Regression] " rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-09-04  7:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |ASSIGNED
   Last reconfirmed|                            |2014-09-04
         Resolution|DUPLICATE                   |---
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Indeed, sorry for the mistake (I tried a compiler configured with
-march=i586 for -m32).



* [Bug tree-optimization/63148] [4.8/4.9/5 Regression] r187042 causes auto-vectorization failure for X86 for -m32.
  2014-09-03  0:16 [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32 doug.gilmore at imgtec dot com
  2014-09-03  9:02 ` [Bug tree-optimization/63148] " rguenth at gcc dot gnu.org
  2014-09-04  7:39 ` rguenth at gcc dot gnu.org
@ 2014-09-04  7:41 ` rguenth at gcc dot gnu.org
  2014-09-04  8:37 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-09-04  7:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
      Known to work|                            |4.7.3
   Target Milestone|---                         |4.8.4
            Summary|r187042 causes              |[4.8/4.9/5 Regression]
                   |auto-vectorization failure  |r187042 causes
                   |for X86 for -m32.           |auto-vectorization failure
                   |                            |for X86 for -m32.
      Known to fail|                            |4.8.0, 4.8.3, 4.9.1, 5.0

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
I'll investigate.



* [Bug tree-optimization/63148] [4.8/4.9/5 Regression] r187042 causes auto-vectorization failure for X86 for -m32.
  2014-09-03  0:16 [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32 doug.gilmore at imgtec dot com
                   ` (2 preceding siblings ...)
  2014-09-04  7:41 ` [Bug tree-optimization/63148] [4.8/4.9/5 Regression] " rguenth at gcc dot gnu.org
@ 2014-09-04  8:37 ` rguenth at gcc dot gnu.org
  2014-09-05  1:41 ` doug.gilmore at imgtec dot com
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-09-04  8:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
The input to the vectorizer is already bogus:

  _12 = i.0_5 + 536870911;
  _13 = global_data.b[_12];

the issue seems to be that 'sizetype' is used to index the array:

  *(double * const) &global_data.a[(sizetype) i]
    = *(double * const) &global_data.a[(sizetype) i]
      + *(double * const) &global_data.c[(sizetype) i]
        * *(double * const) &global_data.d[(sizetype) i];


Ok, so it's one of the suspicious transforms in fold-const.c (I removed all
these sorts of transforms from GIMPLE already...).  try_move_mult_to_index.

Bah.

It's never correct (for later data-dependence analysis) to "reconstruct"
ARRAY_REFs from pointer arithmetic.  Here we fold
(ssizetype) (((sizetype) i + 536870911) * 8) to
&global_data.b[(sizetype) i + 536870911].  But that's not the same, because
data-dependence analysis doesn't interpret the array index as something that
only ends up in the address computation, which multiplies the index by 8
again and thus correctly arrives at i * 8 + -8U.  That is, you can't simply
strip an unsigned multiplication this way.
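The arithmetic can be checked with a small sketch. It assumes a 32-bit
unsigned sizetype and 8-byte elements, as in the -m32 case; the function
names are hypothetical and only model the wrapping arithmetic, not GCC's
folding code.

```c
#include <stdint.h>

/* Byte offset as computed in a 32-bit unsigned sizetype:
   index * 8, wrapping modulo 2^32. */
uint32_t byte_offset32(uint32_t index)
{
  return index * UINT32_C(8);
}

/* The index left in the ARRAY_REF after the fold described above.
   536870911 == 2^29 - 1, i.e. (2^32 - 8) / 8. */
uint32_t folded_index32(uint32_t i)
{
  return i + UINT32_C(536870911);
}
```

For i = 5 the folded index is 536870916, which is nowhere near 4, yet
multiplying it by 8 wraps around to the same byte offset as index 4. The
address is right, but an analysis that looks at the index alone sees
{536870912, +, 1} instead of {0, +, 1} and concludes there is no dependence.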

For 64bit we seem to be lucky and we retain

  (sizetype) ((long unsigned int) i * 8) + 18446744073709551608

so we didn't move the multiplication out.  That is because
fold_plusminus_mult_expr only handles signed HWI and 18446744073709551608
is too large for a signed HWI.
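A quick check of that claim, assuming a 64-bit HOST_WIDE_INT (modelled here
with int64_t); the helper name is hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does an unsigned 64-bit constant fit in a
   signed 64-bit host-wide integer? */
bool fits_signed_hwi64(uint64_t value)
{
  return value <= (uint64_t) INT64_MAX;
}
```

18446744073709551608 is 2^64 - 8, i.e. -8 reinterpreted as unsigned, so it
does not fit, whereas the 32-bit constants involved in the -m32 case do.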

So maybe we can apply a less invasive fix here by restricting both
to signed values, or to unsigned values with no sign bit set.

Doing that fixes the testcase but also ends up with mixed pointer/array
accesses, which dependence analysis cannot handle, so we get versioning
for aliasing.  If OTOH we disable the offending transform to an array
access, we get only pointer-based accesses; data dependence analysis then
fails correctly and we don't get anything vectorized here.



* [Bug tree-optimization/63148] [4.8/4.9/5 Regression] r187042 causes auto-vectorization failure for X86 for -m32.
  2014-09-03  0:16 [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32 doug.gilmore at imgtec dot com
                   ` (3 preceding siblings ...)
  2014-09-04  8:37 ` rguenth at gcc dot gnu.org
@ 2014-09-05  1:41 ` doug.gilmore at imgtec dot com
  2014-09-05  8:07 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: doug.gilmore at imgtec dot com @ 2014-09-05  1:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

--- Comment #6 from Doug Gilmore <doug.gilmore at imgtec dot com> ---
> The input to the vectorizer is already bogus:
>
>   _12 = i.0_5 + 536870911;
>   _13 = global_data.b[_12];

Note that the GIMPLE output generated by the front end
is already problematic:

Before r187042:
  D.1747 = i.0 + -1;
With r187042:
  D.1747 = i.0 + 536870911;
Any idea what the intent is of the changes in r187042 that transform
signed constants to unsigned ones?  To me, that is the problematic issue.



* [Bug tree-optimization/63148] [4.8/4.9/5 Regression] r187042 causes auto-vectorization failure for X86 for -m32.
  2014-09-03  0:16 [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32 doug.gilmore at imgtec dot com
                   ` (4 preceding siblings ...)
  2014-09-05  1:41 ` doug.gilmore at imgtec dot com
@ 2014-09-05  8:07 ` rguenth at gcc dot gnu.org
  2014-09-05  8:24 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-09-05  8:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|rguenther at suse dot de           |

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Doug Gilmore from comment #6)
> > The input to the vectorizer is already bogus:
> >
> >   _12 = i.0_5 + 536870911;
> >   _13 = global_data.b[_12];
> 
> Note that gimple out generated by the front end
> is already problematic:
> 
> Before r187042:
>   D.1747 = i.0 + -1;
> With r187042:
>   D.1747 = i.0 + 536870911;
> Any idea what the intent of the changes in r187042 that transform
> signed to unsigned constants?  To me, that is the problematic issue.

Well, before r187042 the constants had an unsigned type but were
sign-extended (and only constants were!).  This has caused similar
issues elsewhere.  Now constants are consistent with their types,
but we run into a different issue: because POINTER_PLUS_EXPR forces
the offset to be 'sizetype' (which is unsigned), we lose information
when translating C array[index] as *(&array + index * element_size).
So we can't go "back" to array[index] by dividing the pointer offset
by the element_size, because we have no idea whether the offset is really
signed or not (and even then the index may be obfuscated by the
programmer, so you can't really go back to array[index] from pointer
arithmetic).
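A small sketch of why the division is ambiguous, again assuming a 32-bit
sizetype and 8-byte elements; the function names are hypothetical:

```c
#include <stdint.h>

/* Recover an "index" by dividing an unsigned 32-bit byte offset
   by the element size, as a fold working on sizetype would. */
uint32_t index_from_unsigned_offset(uint32_t byte_offset, uint32_t element_size)
{
  return byte_offset / element_size;
}

/* The same recovery if the offset were known to be signed. */
int32_t index_from_signed_offset(int32_t byte_offset, int32_t element_size)
{
  return byte_offset / element_size;
}
```

An offset of -8 bytes divides to -1 when treated as signed, but to 536870911
when treated as unsigned. That is exactly the change from "i.0 + -1" to
"i.0 + 536870911" seen in the dumps, and the bit pattern of the offset alone
does not say which reading was intended.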



* [Bug tree-optimization/63148] [4.8/4.9/5 Regression] r187042 causes auto-vectorization failure for X86 for -m32.
  2014-09-03  0:16 [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32 doug.gilmore at imgtec dot com
                   ` (5 preceding siblings ...)
  2014-09-05  8:07 ` rguenth at gcc dot gnu.org
@ 2014-09-05  8:24 ` rguenth at gcc dot gnu.org
  2014-09-05  8:34 ` rguenth at gcc dot gnu.org
  2014-09-06  4:47 ` [Bug tree-optimization/63148] [4.8/4.9 " doug.gilmore at imgtec dot com
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-09-05  8:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Fri Sep  5 08:23:32 2014
New Revision: 214941

URL: https://gcc.gnu.org/viewcvs?rev=214941&root=gcc&view=rev
Log:
2014-09-05  Richard Biener  <rguenther@suse.de>

    PR middle-end/63148
    * fold-const.c (try_move_mult_to_index): Remove.
    (fold_binary_loc): Do not call it.
    * tree-data-ref.c (dr_analyze_indices): Strip conversions
    from the base object again.

    c-family/
    * c-format.c (check_format_arg): Properly handle
    effectively signed POINTER_PLUS_EXPR offset.

    * gcc.dg/vect/pr63148.c: New testcase.
    * c-c++-common/pr19807-1.c: Likewise.
    * g++.dg/tree-ssa/pr19807.C: Adjust.
    * g++.dg/tree-ssa/tmmti-2.C: Remove.

Added:
    trunk/gcc/testsuite/c-c++-common/pr19807-1.c
    trunk/gcc/testsuite/gcc.dg/vect/pr63148.c
Removed:
    trunk/gcc/testsuite/g++.dg/tree-ssa/tmmti-2.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/c-family/ChangeLog
    trunk/gcc/c-family/c-format.c
    trunk/gcc/fold-const.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/g++.dg/tree-ssa/pr19807.C
    trunk/gcc/tree-data-ref.c



* [Bug tree-optimization/63148] [4.8/4.9/5 Regression] r187042 causes auto-vectorization failure for X86 for -m32.
  2014-09-03  0:16 [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32 doug.gilmore at imgtec dot com
                   ` (6 preceding siblings ...)
  2014-09-05  8:24 ` rguenth at gcc dot gnu.org
@ 2014-09-05  8:34 ` rguenth at gcc dot gnu.org
  2014-09-06  4:47 ` [Bug tree-optimization/63148] [4.8/4.9 " doug.gilmore at imgtec dot com
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2014-09-05  8:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk the "right" way; still pondering the best (i.e. least
intrusive) way to fix it on the branches.



* [Bug tree-optimization/63148] [4.8/4.9 Regression] r187042 causes auto-vectorization failure for X86 for -m32.
  2014-09-03  0:16 [Bug c/63148] New: r187042 causes auto-vectorization failure for X86 for -m32 doug.gilmore at imgtec dot com
                   ` (7 preceding siblings ...)
  2014-09-05  8:34 ` rguenth at gcc dot gnu.org
@ 2014-09-06  4:47 ` doug.gilmore at imgtec dot com
  8 siblings, 0 replies; 10+ messages in thread
From: doug.gilmore at imgtec dot com @ 2014-09-06  4:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63148

Doug Gilmore <doug.gilmore at imgtec dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from Doug Gilmore <doug.gilmore at imgtec dot com> ---
Verified that my test examples are now working on both X86 -m32
and MIPS32 -mmsa (patch is under review).

Thanks!

Doug


