public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
@ 2020-12-10  8:58 krebbel at gcc dot gnu.org
  2020-12-10  8:58 ` [Bug tree-optimization/98221] " krebbel at gcc dot gnu.org
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: krebbel at gcc dot gnu.org @ 2020-12-10  8:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

            Bug ID: 98221
           Summary: [11 regression] Wrong unpack operation emitted in
                    tree-ssa-forwprop.c
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: krebbel at gcc dot gnu.org
  Target Milestone: ---

Created attachment 49728
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49728&action=edit
Fix

The vec-abi-varargs-1.c testcase on IBM Z currently fails.

While adding an SI mode vector to a DI mode vector the first is unpacked using:

  _28 = BIT_INSERT_EXPR <{ 0, 0, 0, 0 }, _2, 0>;
  _34 = [vec_unpack_lo_expr] _28;

However, on big endian targets lo refers to the right hand side of the vector -
in this case the zeroes.


This appears to be triggered with that patch:


commit 78307657cf9675bc4aa2e77561c823834714b4c8                                 
Author: Richard Biener <rguenther@suse.de>                                      
Date:   Thu Nov 28 12:22:04 2019 +0000                                          

    re PR tree-optimization/92645 (Hand written vector code is 450 times slower
when compiled with GCC compared to Clang)                                       

    2019-11-28  Richard Biener  <rguenther@suse.de>                             

            PR tree-optimization/92645                                          
            * tree-ssa-forwprop.c (get_bit_field_ref_def): Also handle          
            conversions inside a mode class.  Remove restriction on             
            preserving the element size.                                        
            (simplify_vector_constructor): Deal with the above and for          
            identity permutes also try using VEC_UNPACK_[FLOAT_]LO_EXPR         
            and VEC_PACK_TRUNC_EXPR.                                            

            * gcc.target/i386/pr92645-4.c: New testcase.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98221] [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
  2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
@ 2020-12-10  8:58 ` krebbel at gcc dot gnu.org
  2020-12-10 10:21 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: krebbel at gcc dot gnu.org @ 2020-12-10  8:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

Andreas Krebbel <krebbel at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
           Keywords|                            |wrong-code
             Target|                            |s390x

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98221] [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
  2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
  2020-12-10  8:58 ` [Bug tree-optimization/98221] " krebbel at gcc dot gnu.org
@ 2020-12-10 10:21 ` rguenth at gcc dot gnu.org
  2020-12-10 10:33 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-12-10 10:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98221] [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
  2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
  2020-12-10  8:58 ` [Bug tree-optimization/98221] " krebbel at gcc dot gnu.org
  2020-12-10 10:21 ` rguenth at gcc dot gnu.org
@ 2020-12-10 10:33 ` rguenth at gcc dot gnu.org
  2020-12-10 10:33 ` [Bug tree-optimization/98221] [10/11 " rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-12-10 10:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
But all GIMPLE operates on "GIMPLE lane order" so this is a defect in how the
backend handles those tree codes at expansion time?

You'll find no mention of BYTES_BIG_ENDIAN in the vectorizer (ok, you find
exactly one - but other UNPACK_LO/HI uses in other places).

>From looking at tree-vect-generic.c VEC_UNPACK_{,FIX_TRUNC,FLOAT}_{LO,HI}_EXPR
are subject to this while supportable_widening_operation adds some more
LO,HI but also one EVEN,ODD case.

(Ab-)using BYTES_BIG_ENDIAN is also a bit odd, if we insist on preserving
this endian dependency on GIMPLE it should be a VECTOR_LANES_BIG_ENDIAN
or so.  But it doesn't seem to be consistently implemented.

The patch looks "obvious" if the semantics are dependent on BYTES_BIG_ENDIAN
but I see that nowhere documented and it's contrary to my expectation of
GIMPLE semantics (GIMPLE semantics assumes memory order).

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98221] [10/11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
  2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2020-12-10 10:33 ` rguenth at gcc dot gnu.org
@ 2020-12-10 10:33 ` rguenth at gcc dot gnu.org
  2020-12-10 11:27 ` rsandifo at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-12-10 10:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[11 regression] Wrong       |[10/11 regression] Wrong
                   |unpack operation emitted in |unpack operation emitted in
                   |tree-ssa-forwprop.c         |tree-ssa-forwprop.c
   Target Milestone|11.0                        |10.3

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98221] [10/11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
  2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2020-12-10 10:33 ` [Bug tree-optimization/98221] [10/11 " rguenth at gcc dot gnu.org
@ 2020-12-10 11:27 ` rsandifo at gcc dot gnu.org
  2020-12-10 11:50 ` krebbel at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rsandifo at gcc dot gnu.org @ 2020-12-10 11:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

--- Comment #2 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> But all GIMPLE operates on "GIMPLE lane order" so this is a defect in how
> the backend handles those tree codes at expansion time?
I can never remember which way round it is and always have
to look it up, but: for better or (probably) worse, I think the hi/lo
classification applies to the high/low parts of registers (MSB/LSB)
rather than high/low lanes.

Personally I'd prefer if it was the other way around, but that's
not going to be easy to change.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98221] [10/11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
  2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2020-12-10 11:27 ` rsandifo at gcc dot gnu.org
@ 2020-12-10 11:50 ` krebbel at gcc dot gnu.org
  2020-12-10 12:00 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: krebbel at gcc dot gnu.org @ 2020-12-10 11:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

--- Comment #3 from Andreas Krebbel <krebbel at gcc dot gnu.org> ---
tree-vect-loop-manip.c: vect_maybe_permute_loop_masks also emits
VEC_UNPACKS_HI/LO dependent on BYTES_BIG_ENDIAN.

What is the expectation wrt the meaning of hi/lo in RTL standard names? I
couldn't find it clearly documented for this either. Well, for things like
'smulm3_highpart' we say it is about the 'most significant half' but I don't
see anything for the vector hi/lo.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98221] [10/11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
  2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2020-12-10 11:50 ` krebbel at gcc dot gnu.org
@ 2020-12-10 12:00 ` rguenth at gcc dot gnu.org
  2021-01-11 10:47 ` [Bug tree-optimization/98221] [10 " rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-12-10 12:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andreas Krebbel from comment #3)
> tree-vect-loop-manip.c: vect_maybe_permute_loop_masks also emits
> VEC_UNPACKS_HI/LO dependent on BYTES_BIG_ENDIAN.
> 
> What is the expectation wrt the meaning of hi/lo in RTL standard names? I
> couldn't find it clearly documented for this either. Well, for things like
> 'smulm3_highpart' we say it is about the 'most significant half' but I don't
> see anything for the vector hi/lo.

It would be nice to clarify this indeed.  Like for ppc64le with "big-endian"
vector ops the only chance is to have the target emulate litte-endian
vector ops given the 1:1 match on BYTES_BIG_ENDIAN.

Note in supportable_widening_operation we also swap

    case VEC_WIDEN_MULT_EVEN_EXPR:
      /* Support the recursion induced just above.  */
      c1 = VEC_WIDEN_MULT_EVEN_EXPR;
      c2 = VEC_WIDEN_MULT_ODD_EXPR;
      break;

and I wonder if we can build a wrong-code testcase from that (unless even/odd
interpretation also depends on BYTES_BIG_ENDIAN).  s390 does seem to implement
those at least.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98221] [10 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
  2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2020-12-10 12:00 ` rguenth at gcc dot gnu.org
@ 2021-01-11 10:47 ` rguenth at gcc dot gnu.org
  2021-01-12  9:51 ` cvs-commit at gcc dot gnu.org
  2021-01-12  9:52 ` rguenth at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-11 10:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
            Summary|[10/11 regression] Wrong    |[10 regression] Wrong
                   |unpack operation emitted in |unpack operation emitted in
                   |tree-ssa-forwprop.c         |tree-ssa-forwprop.c
      Known to work|                            |11.0
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-01-11
             Status|UNCONFIRMED                 |ASSIGNED

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk sofar.

commit 300a3ce5c5695eb1a7c0476e9d1b45420a463248 (HEAD -> trunk, origin/master,
o
rigin/HEAD)
Author: Andreas Krebbel <krebbel@gcc.gnu.org>
Date:   Mon Jan 11 10:59:43 2021 +0100

    tree-optimization/98221 - fix wrong unpack operation used for big-endian

    The vec-abi-varargs-1.c testcase on IBM Z currently fails.

    While adding an SI mode vector to a DI mode vector the first is unpacked
using:

      _28 = BIT_INSERT_EXPR <{ 0, 0, 0, 0 }, _2, 0>;
      _34 = [vec_unpack_lo_expr] _28;

    However, on big endian targets lo refers to the right hand side of the
vector - in this case the zeroes.

    2021-01-11  Andreas Krebbel  <krebbel@linux.ibm.com>

            * tree-ssa-forwprop.c (simplify_vector_constructor): For
            big-endian, use UNPACK[_FLOAT]_HI.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98221] [10 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
  2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2021-01-11 10:47 ` [Bug tree-optimization/98221] [10 " rguenth at gcc dot gnu.org
@ 2021-01-12  9:51 ` cvs-commit at gcc dot gnu.org
  2021-01-12  9:52 ` rguenth at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-01-12  9:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by Richard Biener
<rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:ecab0d9106a317073f5891f161f031078d835b2f

commit r10-9253-gecab0d9106a317073f5891f161f031078d835b2f
Author: Andreas Krebbel <krebbel@gcc.gnu.org>
Date:   Mon Jan 11 10:59:43 2021 +0100

    tree-optimization/98221 - fix wrong unpack operation used for big-endian

    The vec-abi-varargs-1.c testcase on IBM Z currently fails.

    While adding an SI mode vector to a DI mode vector the first is unpacked
using:

      _28 = BIT_INSERT_EXPR <{ 0, 0, 0, 0 }, _2, 0>;
      _34 = [vec_unpack_lo_expr] _28;

    However, on big endian targets lo refers to the right hand side of the
vector - in this case the zeroes.

    2021-01-11  Andreas Krebbel  <krebbel@linux.ibm.com>

            PR tree-optimization/98221
            * tree-ssa-forwprop.c (simplify_vector_constructor): For
            big-endian, use UNPACK[_FLOAT]_HI.

    (cherry picked from commit 300a3ce5c5695eb1a7c0476e9d1b45420a463248)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [Bug tree-optimization/98221] [10 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c
  2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2021-01-12  9:51 ` cvs-commit at gcc dot gnu.org
@ 2021-01-12  9:52 ` rguenth at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-01-12  9:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98221

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |10.2.1
      Known to fail|                            |10.2.0
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-01-12  9:52 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-10  8:58 [Bug tree-optimization/98221] New: [11 regression] Wrong unpack operation emitted in tree-ssa-forwprop.c krebbel at gcc dot gnu.org
2020-12-10  8:58 ` [Bug tree-optimization/98221] " krebbel at gcc dot gnu.org
2020-12-10 10:21 ` rguenth at gcc dot gnu.org
2020-12-10 10:33 ` rguenth at gcc dot gnu.org
2020-12-10 10:33 ` [Bug tree-optimization/98221] [10/11 " rguenth at gcc dot gnu.org
2020-12-10 11:27 ` rsandifo at gcc dot gnu.org
2020-12-10 11:50 ` krebbel at gcc dot gnu.org
2020-12-10 12:00 ` rguenth at gcc dot gnu.org
2021-01-11 10:47 ` [Bug tree-optimization/98221] [10 " rguenth at gcc dot gnu.org
2021-01-12  9:51 ` cvs-commit at gcc dot gnu.org
2021-01-12  9:52 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).