public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/104658] New: Inefficient vectorization using mask CTORs
@ 2022-02-23 10:25 rguenth at gcc dot gnu.org
  2022-02-23 10:26 ` [Bug tree-optimization/104658] " rguenth at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-23 10:25 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104658

            Bug ID: 104658
           Summary: Inefficient vectorization using mask CTORs
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

Originally observed as part of PR101636: we sometimes mix mask and non-mask
vector defs, which results in external defs for SLP nodes of mask type.  These
are not code-generated efficiently, resulting in

     <signed-boolean:1> _135 = _49 ? -1 : 0;
     _144 = { _135, .... }

and quite awful bit-insert code.  The costing does not anticipate this, and
the appropriate fix is in the bool pattern recognition, which should have
prepared the IL to avoid the situation.
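
To make the "bit-insert code" above concrete, here is a rough scalar sketch in C of what building a mask lane by lane amounts to. It assumes the mask is held as one bit per lane in an integer (as with single-bit `<signed-boolean:1>` elements); the 8-lane width and the name `mask_from_scalars` are illustrative assumptions, not taken from the report or from GCC's output.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: assemble an 8-lane bit mask from scalar bools,
   one lane at a time.  Each iteration clears and re-inserts a single
   bit, which is the per-element work a mask constructor built from
   scalar defs implies. */
uint8_t mask_from_scalars(const bool lanes[8])
{
    uint8_t mask = 0;
    for (int i = 0; i < 8; i++) {
        /* Scalar analogue of `_135 = _49 ? -1 : 0` for a 1-bit element. */
        uint8_t bit = lanes[i] ? 1u : 0u;
        /* Insert the bit at position i. */
        mask = (uint8_t)((mask & ~(1u << i)) | (uint8_t)(bit << i));
    }
    return mask;
}
```

A vector comparison would instead produce all lanes of the mask in one operation, which is why a mask assembled from scalars is the expensive path.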


* [Bug tree-optimization/104658] Inefficient vectorization using mask CTORs
  2022-02-23 10:25 [Bug tree-optimization/104658] New: Inefficient vectorization using mask CTORs rguenth at gcc dot gnu.org
@ 2022-02-23 10:26 ` rguenth at gcc dot gnu.org
  2022-05-04 13:12 ` cvs-commit at gcc dot gnu.org
  2022-05-04 13:15 ` rguenth at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-23 10:26 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104658

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-02-23
             Status|UNCONFIRMED                 |ASSIGNED
           Keywords|                            |missed-optimization
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
I have a patch for this.


* [Bug tree-optimization/104658] Inefficient vectorization using mask CTORs
  2022-02-23 10:25 [Bug tree-optimization/104658] New: Inefficient vectorization using mask CTORs rguenth at gcc dot gnu.org
  2022-02-23 10:26 ` [Bug tree-optimization/104658] " rguenth at gcc dot gnu.org
@ 2022-05-04 13:12 ` cvs-commit at gcc dot gnu.org
  2022-05-04 13:15 ` rguenth at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-05-04 13:12 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104658

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:eca04dc8555f5fae462fbd16386da9aaf38a0711

commit r13-111-geca04dc8555f5fae462fbd16386da9aaf38a0711
Author: Richard Biener <rguenther@suse.de>
Date:   Tue Feb 22 16:02:27 2022 +0100

    tree-optimization/104658 - avoid mixing mask & non-mask vector defs

    When pattern recognition fails to sanitize all defs of a
    mask-producing operation and the respective def is external or
    constant, we end up trying to produce a VECTOR_BOOLEAN_TYPE_P
    constructor, which in turn ends up exposing stmts like

      <signed-boolean:1> _135 = _49 ? -1 : 0;

    which aren't handled well in followup SLP and generate awful code.

    We rely heavily on pattern recognition to sanitize mask vs.
    data uses of bools, but that fails here, which means we should
    also fail vectorization.  That avoids ICEing because of such stmts,
    and it also avoids generating weird code that makes the
    vectorization unprofitable.

    The following patch simply disallows external VECTOR_BOOLEAN_TYPE_P
    defs and arranges for the promote-to-external code to instead promote
    mask uses to extern (that's just a short-cut here).

    I've also looked at aarch64 with SVE and a fixed vector length
    for the gcc.target/i386/pr101636.c testcase.  I see similar vectorization
    (using <signed-boolean:4>) there, but it's hard to decide whether the
    old, the new or no vectorization is better for this.  The code
    generated with traditional integer masks isn't as awkward, but we
    still get the != 0 promotion done for each scalar element, which
    doesn't look intended - this operation should be visible upfront.

    That also means some cases will now become missed optimizations
    that need to be fixed by bool pattern recognition.  But that can
    possibly be delayed to GCC 13.

    2022-02-22  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/104658
            * tree-vect-slp.cc (vect_slp_convert_to_external): Do not
            create VECTOR_BOOLEAN_TYPE_P extern defs.  Reset the vector
            type on nodes we promote.
            (vectorizable_bb_reduc_epilogue): Deal with externalized
            root.
            * tree-vect-stmts.cc (vect_maybe_update_slp_op_vectype): Do
            not allow VECTOR_BOOLEAN_TYPE_P extern defs.

            * gcc.target/i386/pr104658.c: New testcase.
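
As a rough illustration of the situation the commit message describes, the C sketch below combines scalar bools defined outside the straight-line group (`e0` .. `e3`) with comparisons computed inside it, and uses the result as a select condition in each lane. This is a hypothetical example written for this note: it is not the contents of gcc.target/i386/pr104658.c, the function name `select4` is invented, and whether GCC SLP-vectorizes it or hits this code path depends on target and flags.

```c
#include <stdbool.h>

/* Hypothetical 4-lane straight-line group.  If the four statements were
   SLP-vectorized, the comparisons a[i] < b[i] would naturally form a
   mask vector, while { e0, e1, e2, e3 } would have to be built from
   scalars defined elsewhere, i.e. an external def of mask type. */
void select4(int *out, const int *a, const int *b,
             bool e0, bool e1, bool e2, bool e3)
{
    out[0] = (e0 & (a[0] < b[0])) ? a[0] : b[0];
    out[1] = (e1 & (a[1] < b[1])) ? a[1] : b[1];
    out[2] = (e2 & (a[2] < b[2])) ? a[2] : b[2];
    out[3] = (e3 & (a[3] < b[3])) ? a[3] : b[3];
}
```

The scalar semantics are unaffected by the fix; per the commit message, the change only stops the vectorizer from creating external VECTOR_BOOLEAN_TYPE_P defs.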


* [Bug tree-optimization/104658] Inefficient vectorization using mask CTORs
  2022-02-23 10:25 [Bug tree-optimization/104658] New: Inefficient vectorization using mask CTORs rguenth at gcc dot gnu.org
  2022-02-23 10:26 ` [Bug tree-optimization/104658] " rguenth at gcc dot gnu.org
  2022-05-04 13:12 ` cvs-commit at gcc dot gnu.org
@ 2022-05-04 13:15 ` rguenth at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-05-04 13:15 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104658

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |13.0
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.

