public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/103941] New: uavgv2qi3_ceil is not used
@ 2022-01-07 15:31 ubizjak at gmail dot com
  2022-01-10 10:53 ` [Bug tree-optimization/103941] uavgv2qi3_ceil is not used (SLP costing and patterns vs live stmts) rguenth at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2022-01-07 15:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941

            Bug ID: 103941
           Summary: uavgv2qi3_ceil is not used
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ubizjak at gmail dot com
  Target Milestone: ---

The following testcase:

unsigned char ur[16], ua[16], ub[16];

void avgu_v2qi (void)
{
  int i;

  for (i = 0; i < 2; i++)
    ur[i] = (ua[i] + ub[i] + 1) >> 1;
}

does not vectorize on x86_64-linux-gnu with -O2 -ftree-vectorize.
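
For context, here is a minimal sketch of what the loop body computes. The expression is a rounding-up unsigned average, evaluated in int after promotion. The helper names are hypothetical, and the 8-bit formula is a well-known bit identity rather than anything taken from GCC. The exhaustive check suggests why a byte-wise rounding-average operation can stand in for the widened arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding-up average as the testcase writes it: both operands are
   promoted to int, so the "+ 1" cannot overflow.  */
uint8_t avg_ceil_wide (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a + b + 1) >> 1);
}

/* The same value computed without ever needing more than 8 bits:
   a | b equals (a & b) + (a ^ b), and subtracting floor((a ^ b) / 2)
   leaves (a & b) + ceil((a ^ b) / 2).  */
uint8_t avg_ceil_narrow (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a | b) - ((a ^ b) >> 1));
}

/* Count disagreements over all 65536 input pairs; 0 means the two
   formulations are identical.  */
unsigned count_mismatches (void)
{
  unsigned bad = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      if (avg_ceil_wide ((uint8_t) a, (uint8_t) b)
          != avg_ceil_narrow ((uint8_t) a, (uint8_t) b))
        bad++;
  return bad;
}

/* Aborts if the identity fails for any input pair.  */
void self_check (void)
{
  assert (count_mismatches () == 0);
}
```

Calling self_check () from a small main is enough to run the comparison.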


* [Bug tree-optimization/103941] uavgv2qi3_ceil is not used (SLP costing and patterns vs live stmts)
From: rguenth at gcc dot gnu.org @ 2022-01-10 10:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-*-* i?86-*-*
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
   Last reconfirmed|                            |2022-01-10
             Blocks|                            |53947
            Summary|uavgv2qi3_ceil is not used  |uavgv2qi3_ceil is not used
                   |                            |(SLP costing and patterns
                   |                            |vs live stmts)
           Keywords|                            |missed-optimization

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
t.c:8:11: note: Costing subgraph:
t.c:8:11: note: node 0x409a000 (max_nunits=2, refcnt=1)
t.c:8:11: note: op template: ur[0] = _23;
t.c:8:11: note:         stmt 0 ur[0] = _23;
t.c:8:11: note:         stmt 1 ur[1] = _35;
t.c:8:11: note:         children 0x409a088
t.c:8:11: note: node 0x409a088 (max_nunits=2, refcnt=1)
t.c:8:11: note: op template: patt_58 = (unsigned char) patt_56;
t.c:8:11: note:         stmt 0 patt_58 = (unsigned char) patt_56;
t.c:8:11: note:         stmt 1 patt_71 = (unsigned char) patt_69;
t.c:8:11: note:         children 0x409a110
t.c:8:11: note: node 0x409a110 (max_nunits=2, refcnt=1)
t.c:8:11: note: op template: patt_56 = .AVG_CEIL (_16, _18);
t.c:8:11: note:         stmt 0 patt_56 = .AVG_CEIL (_16, _18);
t.c:8:11: note:         stmt 1 patt_69 = .AVG_CEIL (_28, _30);
t.c:8:11: note:         children 0x409a220 0x409a198
t.c:8:11: note: node 0x409a220 (max_nunits=2, refcnt=1)
t.c:8:11: note: op template: _16 = ua[0];
t.c:8:11: note:         stmt 0 _16 = ua[0];
t.c:8:11: note:         stmt 1 _28 = ua[1];
t.c:8:11: note: node 0x409a198 (max_nunits=2, refcnt=1)
t.c:8:11: note: op template: _18 = ub[0];
t.c:8:11: note:         stmt 0 _18 = ub[0];
t.c:8:11: note:         stmt 1 _30 = ub[1];
t.c:8:11: note: Cost model analysis:
_23 1 times scalar_store costs 12 in body
_35 1 times scalar_store costs 12 in body
(unsigned char) _22 1 times scalar_stmt costs 4 in body
(unsigned char) _34 1 times scalar_stmt costs 4 in body
ua[0] 1 times vector_load costs 12 in body
ub[0] 1 times vector_load costs 12 in body
.AVG_CEIL (_16, _18) 1 times vector_stmt costs 4 in body
_23 1 times vector_store costs 12 in body
ua[0] 1 times vec_to_scalar costs 4 in epilogue
ua[1] 1 times vec_to_scalar costs 4 in epilogue
ub[0] 1 times vec_to_scalar costs 4 in epilogue
ub[1] 1 times vec_to_scalar costs 4 in epilogue
t.c:8:11: note: Cost model analysis for part in loop 0:
  Vector cost: 56
  Scalar cost: 32
t.c:8:11: missed: not vectorized: vectorization is not profitable.

It looks like the scalar costing is off: the scalar loads from ua and ub are
considered live, which is where the four vec_to_scalar entries in the epilogue
come from.  Possibly an artifact of patterns.

It's vectorized fine with -fno-vect-cost-model.

I will have a look, though possibly not for GCC 12.
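
To make the arithmetic in the dump explicit, the sketch below re-adds the listed entries. The struct and function names are hypothetical. The numbers are copied from the "Cost model analysis" lines above and nothing else is assumed about GCC's cost model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cost_entry
{
  const char *what;   /* statement and cost kind, as printed in the dump */
  int cost;
  bool lane_extract;  /* true for the vec_to_scalar epilogue entries */
};

const struct cost_entry scalar_side[] = {
  { "_23 scalar_store", 12, false },
  { "_35 scalar_store", 12, false },
  { "(unsigned char) _22 scalar_stmt", 4, false },
  { "(unsigned char) _34 scalar_stmt", 4, false },
};

const struct cost_entry vector_side[] = {
  { "ua[0] vector_load", 12, false },
  { "ub[0] vector_load", 12, false },
  { ".AVG_CEIL (_16, _18) vector_stmt", 4, false },
  { "_23 vector_store", 12, false },
  { "ua[0] vec_to_scalar", 4, true },
  { "ua[1] vec_to_scalar", 4, true },
  { "ub[0] vec_to_scalar", 4, true },
  { "ub[1] vec_to_scalar", 4, true },
};

#define COUNT(a) (sizeof (a) / sizeof ((a)[0]))

/* Sum the entries, optionally leaving out the lane extracts.  */
int sum_costs (const struct cost_entry *e, size_t n, bool skip_extracts)
{
  int total = 0;
  for (size_t i = 0; i < n; i++)
    if (!(skip_extracts && e[i].lane_extract))
      total += e[i].cost;
  return total;
}

int scalar_total (void)
{
  return sum_costs (scalar_side, COUNT (scalar_side), false);
}

int vector_total (void)
{
  return sum_costs (vector_side, COUNT (vector_side), false);
}

int vector_total_without_extracts (void)
{
  return sum_costs (vector_side, COUNT (vector_side), true);
}

/* Aborts unless the sums match the totals reported in the dump.  */
void check_against_dump (void)
{
  assert (scalar_total () == 32);
  assert (vector_total () == 56);
}
```

Dropping the four extracts alone gives a vector cost of 40, which is still above 32. The scalar list contains only the two stores and the two conversions, so the scalar side also appears to be under-counted.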


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/103941] uavgv2qi3_ceil is not used (SLP costing and patterns vs live stmts)
From: rguenth at gcc dot gnu.org @ 2022-01-10 12:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think I've seen this before: the use in the conversion is elided in the
vector path by recognizing a pattern of a pattern.  That leaves it outside the
SLP tree, still marked SLP_TYPE (..) = loop_vect, which fools the live
computation.

vect_detect_hybrid_slp now does this in a more correct way but the original
worklist seeding has to be done differently for BB SLP.


* [Bug tree-optimization/103941] uavgv2qi3_ceil is not used (SLP costing and patterns vs live stmts)
From: cvs-commit at gcc dot gnu.org @ 2022-01-12 19:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:cb46559cea1d554cef1138db5bfbdd0647ffbc0d

commit r12-6535-gcb46559cea1d554cef1138db5bfbdd0647ffbc0d
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Wed Jan 12 20:57:12 2022 +0100

    testsuite: Compile gcc.target/i386/pr103861-3.c with -fno-vect-cost-model [PR103941]

    2022-01-12  Uroš Bizjak  <ubizjak@gmail.com>

    gcc/testsuite/ChangeLog:

            PR target/103941
            * gcc.target/i386/pr103861-3.c (dg-options): Add
            -fno-vect-cost-model.


* [Bug tree-optimization/103941] uavgv2qi3_ceil is not used (SLP costing and patterns vs live stmts)
From: rguenth at gcc dot gnu.org @ 2022-01-26 12:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Another testcase where this occurs:

void foo (int *c, float *x, float *y)
{
  c[0] = x[0] < y[0];
  c[1] = x[1] < y[1];
  c[2] = x[2] < y[2];
  c[3] = x[3] < y[3];
}
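
When comparing builds with and without the cost model, a small reference check of the expected results can be handy. The sketch below repeats foo so that it is self-contained. The checking helpers and the input values are hypothetical additions, not part of the report.

```c
#include <assert.h>

/* The testcase from this comment, repeated verbatim.  */
void foo (int *c, float *x, float *y)
{
  c[0] = x[0] < y[0];
  c[1] = x[1] < y[1];
  c[2] = x[2] < y[2];
  c[3] = x[3] < y[3];
}

/* Returns 1 if every lane holds the expected 0/1 comparison result
   for one fixed input, 0 otherwise.  */
int foo_lanes_ok (void)
{
  float x[4] = { 1.0f, 5.0f, -2.0f, 3.0f };
  float y[4] = { 2.0f, 5.0f, -3.0f, 4.0f };
  const int expected[4] = { 1, 0, 0, 1 };
  int c[4];

  foo (c, x, y);
  for (int i = 0; i < 4; i++)
    if (c[i] != expected[i])
      return 0;
  return 1;
}

/* Aborts if any lane is wrong.  */
void check_foo (void)
{
  assert (foo_lanes_ok ());
}
```

The result must be the same whether or not the four comparisons end up vectorized.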


* [Bug tree-optimization/103941] uavgv2qi3_ceil is not used (SLP costing and patterns vs live stmts)
From: cvs-commit at gcc dot gnu.org @ 2022-04-19 14:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:353434b65ef7972172597d232ae17022d9a57244

commit r12-8195-g353434b65ef7972172597d232ae17022d9a57244
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Apr 13 13:49:45 2022 +0200

    tree-optimization/104010 - fix SLP scalar costing with patterns

    When doing BB vectorization the scalar cost compute is derailed
    by patterns, causing lanes to be considered live and thus not
    costed on the scalar side.  For the testcase in PR104010 this
    prevents vectorization which was done by GCC 11.  PR103941
    shows similar cases of missed optimizations that are fixed by
    this patch.

    2022-04-13  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/104010
            PR tree-optimization/103941
            * tree-vect-slp.cc (vect_bb_slp_scalar_cost): When
            we run into stmts in patterns continue walking those
            for uses outside of the vectorized region instead of
            marking the lane live.

            * gcc.target/i386/pr103941-1.c: New testcase.
            * gcc.target/i386/pr103941-2.c: Likewise.
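
    The idea in the ChangeLog entry can be illustrated with a toy model. This is
    not the code in tree-vect-slp.cc: the struct, its fields, and both functions
    are hypothetical, and the statement chain is a simplified stand-in for one
    lane. The old rule treats any non-vectorized use as keeping the lane live.
    The new rule looks through uses that a pattern has absorbed and only reports
    a lane live when a use truly escapes the vectorized region.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_USES 4

struct stmt
{
  bool vectorized;           /* covered by the SLP tree */
  bool absorbed_by_pattern;  /* scalar stmt whose work a pattern stmt took over */
  int n_uses;
  int uses[MAX_USES];        /* indices of stmts that use this stmt's result */
};

/* Old rule: any use that is not vectorized makes the lane live.  */
bool live_old (const struct stmt *s, int def)
{
  for (int i = 0; i < s[def].n_uses; i++)
    if (!s[s[def].uses[i]].vectorized)
      return true;
  return false;
}

/* New rule: keep walking through pattern-absorbed uses; only a use that
   is neither vectorized nor absorbed makes the lane live.  Assumes the
   use graph is acyclic.  */
bool live_new (const struct stmt *s, int def)
{
  for (int i = 0; i < s[def].n_uses; i++)
    {
      int u = s[def].uses[i];
      if (s[u].vectorized)
        continue;
      if (!s[u].absorbed_by_pattern || live_new (s, u))
        return true;
    }
  return false;
}

/* 0: load (vectorized) -> 1: widening conversion (absorbed)
   -> 2: pattern root (vectorized) -> 3: store (vectorized).  */
const struct stmt chain[] = {
  { .vectorized = true, .n_uses = 1, .uses = { 1 } },
  { .absorbed_by_pattern = true, .n_uses = 1, .uses = { 2 } },
  { .vectorized = true, .n_uses = 1, .uses = { 3 } },
  { .vectorized = true },
};

/* Same chain, but the load also feeds stmt 4, which lies outside the
   vectorized region.  */
const struct stmt chain_with_escape[] = {
  { .vectorized = true, .n_uses = 2, .uses = { 1, 4 } },
  { .absorbed_by_pattern = true, .n_uses = 1, .uses = { 2 } },
  { .vectorized = true, .n_uses = 1, .uses = { 3 } },
  { .vectorized = true },
  { .vectorized = false },
};

/* Aborts unless the two rules behave as described above.  */
void check_toy_model (void)
{
  assert (live_old (chain, 0));
  assert (!live_new (chain, 0));
  assert (live_new (chain_with_escape, 0));
}
```

    In this model the load in the first chain is wrongly live under the old rule
    and correctly dead under the new one, while a real outside use still keeps
    the lane live.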


* [Bug tree-optimization/103941] uavgv2qi3_ceil is not used (SLP costing and patterns vs live stmts)
From: rguenth at gcc dot gnu.org @ 2022-04-19 14:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103941

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED
      Known to work|                            |12.0

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk via the PR104010 regression fix.
