public inbox for gcc-bugs@sourceware.org

* [Bug tree-optimization/104479] New: [12 Regression] cond_op is combined without considering single_use
@ 2022-02-10  6:45 crazylht at gmail dot com
  2022-02-10  7:16 ` [Bug tree-optimization/104479] " rguenth at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: crazylht at gmail dot com @ 2022-02-10  6:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104479

            Bug ID: 104479
           Summary: [12 Regression] cond_op is combined without
                    considering single_use
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---
              Host: x86_64-pc-linux-gnu
            Target: x86_64-*-* i?86-*-*

cat test.c

void
mc_weight (unsigned int* __restrict dst, unsigned int* __restrict src,
           int i_width,int i_scale, unsigned int* __restrict y)
{
  for(int x = 0; x < i_width; x++)
    dst[x] =  src[x] >> 3 > 255 ? src[x] >> 3 : y[x];
}

gcc -march=icelake-server -O3


gcc 11.2

        vpsrld  ymm0, YMMWORD PTR [rsi+rax], 3
        vpcmpud k1, ymm0, ymm2, 2
        vmovdqu32       ymm1{k1}, YMMWORD PTR [r8+rax]
        vpcmpud k1, ymm0, ymm2, 6
        vpblendmd       ymm0{k1}, ymm1, ymm0
        vmovdqu YMMWORD PTR [rcx+rax], ymm0

gcc 12

        vmovdqu ymm1, YMMWORD PTR [rsi+rax]
        vpsrld  ymm2, ymm1, 3
        vpcmpud k1, ymm2, ymm3, 2
        vmovdqu32       ymm0{k1}, YMMWORD PTR [r8+rax]
        vpcmpud k1, ymm2, ymm3, 6
        vmovdqa ymm2, ymm0
        vpsrld  ymm2{k1}, ymm1, 3
        vmovdqu YMMWORD PTR [rcx+rax], ymm2

This is caused by the following patterns in match.pd:

---------------cut----------------
(for uncond_op (UNCOND_BINARY)
     cond_op (COND_BINARY)
 (simplify
  (vec_cond @0 (view_convert? (uncond_op@4 @1 @2)) @3)
  (with { tree op_type = TREE_TYPE (@4); }
   (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
        && is_truth_type_for (op_type, TREE_TYPE (@0)))
    (view_convert (cond_op @0 @1 @2 (view_convert:op_type @3))))))
 (simplify
  (vec_cond @0 @1 (view_convert? (uncond_op@4 @2 @3)))
  (with { tree op_type = TREE_TYPE (@4); }
   (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), op_type)
        && is_truth_type_for (op_type, TREE_TYPE (@0)))
    (view_convert (cond_op (bit_not @0) @2 @3 (view_convert:op_type @1)))))))
---------------end-------------------

uncond_op + vec_cond is combined into cond_op without considering that the
result of uncond_op may have other uses, which causes suboptimal codegen.
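
To make the extra work concrete, here is a minimal lane-wise C model of the
two instruction sequences above. It is only a sketch, not compiler output:
the names lane_blend, lane_cond_shift and models_agree are made up for
illustration, and each function models a single 32-bit lane of the ymm
vectors.

```c
#include <stdint.h>

/* One lane of the gcc 11.2 sequence: the shift result is computed once
   and reused by both the compare and the blend.  */
uint32_t
lane_blend (uint32_t src, uint32_t y)
{
  uint32_t t = src >> 3;        /* vpsrld */
  int keep_shift = t > 255;     /* vpcmpud with predicate 6 (unsigned >) */
  return keep_shift ? t : y;    /* vpblendmd selects the existing t */
}

/* One lane of the gcc 12 sequence: the compare still needs the
   unconditional shift, and the masked shift computes it again.  */
uint32_t
lane_cond_shift (uint32_t src, uint32_t y)
{
  uint32_t t = src >> 3;        /* vpsrld, still needed for the compare */
  int keep_shift = t > 255;     /* vpcmpud with predicate 6 (unsigned >) */
  uint32_t r = y;               /* vmovdqa copy of the fallback value */
  if (keep_shift)
    r = src >> 3;               /* masked vpsrld: second shift of src */
  return r;
}

/* Return 1 if both models produce the same value for every lane.  */
int
models_agree (const uint32_t *src, const uint32_t *y, int n)
{
  for (int i = 0; i < n; i++)
    if (lane_blend (src[i], y[i]) != lane_cond_shift (src[i], y[i]))
      return 0;
  return 1;
}
```

Both models return the same values; the second one simply performs the shift
twice and needs an extra register copy, which matches the additional
vmovdqu/vmovdqa instructions in the gcc 12 listing.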

* [Bug tree-optimization/104479] [12 Regression] cond_op is combined without considering single_use
  2022-02-10  6:45 [Bug tree-optimization/104479] New: [12 Regression] cond_op is combined without considering single_use crazylht at gmail dot com
@ 2022-02-10  7:16 ` rguenth at gcc dot gnu.org
  2022-02-10  8:30 ` crazylht at gmail dot com
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-10  7:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104479

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
                 CC|                            |rguenth at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-02-10

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed.  When uncond_op is expensive (there's *div amongst them) that's
definitely unwanted.  OTOH when it is cheap then combining will reduce
latency.
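
As a rough illustration of the expensive case, consider a hypothetical
variant of the reporter's loop that uses a division instead of a shift. The
function name div_select and its parameters are assumptions made for this
sketch, and whether GCC vectorizes the loop or forms a conditional division
for it depends on the target and flags, which was not checked here.

```c
/* Hypothetical variant of mc_weight: the quotient q feeds both the
   comparison and the selected value, so it is not single-use.  Folding
   the division into a conditional operation would leave the original
   division in place for the comparison and add a second one.  */
void
div_select (float *restrict dst, const float *restrict a,
            const float *restrict b, const float *restrict y, int n)
{
  for (int i = 0; i < n; i++)
    {
      float q = a[i] / b[i];           /* uncond_op with two uses */
      dst[i] = q > 255.0f ? q : y[i];  /* compare and select both read q */
    }
}
```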

GIMPLE wise it's a neutral transform if uncond_op is not single-use unless
we need two v_c_es.

In the assembly the difference is a masked vpsrld vs. a masked vpblendmd.
It's not entirely clear why one should be slower than the other (but yes,
blends are usually very cheap and also not resource constrained).

* [Bug tree-optimization/104479] [12 Regression] cond_op is combined without considering single_use
  2022-02-10  6:45 [Bug tree-optimization/104479] New: [12 Regression] cond_op is combined without considering single_use crazylht at gmail dot com
  2022-02-10  7:16 ` [Bug tree-optimization/104479] " rguenth at gcc dot gnu.org
@ 2022-02-10  8:30 ` crazylht at gmail dot com
  2022-02-10  8:32 ` rguenther at suse dot de
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: crazylht at gmail dot com @ 2022-02-10  8:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104479

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> Confirmed.  When uncond_op is expensive (there's *div amongst them) that's
> definitely unwanted.  OTOH when it is cheap then combining will reduce
> latency.
> 
> GIMPLE wise it's a neutral transform if uncond_op is not single-use unless
> we need two v_c_es.

We can leave it to rtl combine/fwprop which will consider rtx_cost for them.

* [Bug tree-optimization/104479] [12 Regression] cond_op is combined without considering single_use
  2022-02-10  6:45 [Bug tree-optimization/104479] New: [12 Regression] cond_op is combined without considering single_use crazylht at gmail dot com
  2022-02-10  7:16 ` [Bug tree-optimization/104479] " rguenth at gcc dot gnu.org
  2022-02-10  8:30 ` crazylht at gmail dot com
@ 2022-02-10  8:32 ` rguenther at suse dot de
  2022-02-11  7:48 ` cvs-commit at gcc dot gnu.org
  2022-02-11  7:52 ` crazylht at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: rguenther at suse dot de @ 2022-02-10  8:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104479

--- Comment #3 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 10 Feb 2022, crazylht at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104479
> 
> --- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
> (In reply to Richard Biener from comment #1)
> > Confirmed.  When uncond_op is expensive (there's *div amongst them) that's
> > definitely unwanted.  OTOH when it is cheap then combining will reduce
> > latency.
> > 
> > GIMPLE wise it's a neutral transform if uncond_op is not single-use unless
> > we need two v_c_es.
> 
> We can leave it to rtl combine/fwprop which will consider rtx_cost for them.

That certainly makes sense for the !single_use case.

* [Bug tree-optimization/104479] [12 Regression] cond_op is combined without considering single_use
  2022-02-10  6:45 [Bug tree-optimization/104479] New: [12 Regression] cond_op is combined without considering single_use crazylht at gmail dot com
                   ` (2 preceding siblings ...)
  2022-02-10  8:32 ` rguenther at suse dot de
@ 2022-02-11  7:48 ` cvs-commit at gcc dot gnu.org
  2022-02-11  7:52 ` crazylht at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-02-11  7:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104479

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:165947fecf4d78c7effb0f1ee15e6942d8dce4ea

commit r12-7193-g165947fecf4d78c7effb0f1ee15e6942d8dce4ea
Author: liuhongt <hongtao.liu@intel.com>
Date:   Thu Feb 10 15:42:13 2022 +0800

    Add single_use to simplification (uncond_op + vec_cond -> cond_op).

    gcc/ChangeLog:

            PR tree-optimization/104479
            * match.pd (uncond_op + vec_cond -> cond_op): Add single_use
            for the dest of uncond_op.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr104479.c: New test.
            * gcc.target/i386/cond_op_shift_w-1.c: Adjust testcase.
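
For readers unfamiliar with the testsuite, a regression test for this kind
of fix could look roughly like the sketch below. It is not the committed
gcc.target/i386/pr104479.c: the options, the dump being scanned and the
.COND_SHR pattern are assumptions, and only the loop is taken from the
original report.

```c
/* Hypothetical testcase sketch for PR tree-optimization/104479;
   NOT the committed gcc.target/i386/pr104479.c.  */
/* { dg-do compile } */
/* { dg-options "-march=icelake-server -O3 -fdump-tree-optimized" } */
/* Assumption: with the single_use check in place, the shift with two
   uses is no longer folded into a conditional internal function.  */
/* { dg-final { scan-tree-dump-not {\.COND_SHR} "optimized" } } */

void
mc_weight (unsigned int *__restrict dst, unsigned int *__restrict src,
           int i_width, int i_scale, unsigned int *__restrict y)
{
  /* The shift result is used by both the comparison and the select.  */
  for (int x = 0; x < i_width; x++)
    dst[x] = src[x] >> 3 > 255 ? src[x] >> 3 : y[x];
}
```

Outside the GCC testsuite the dg- comments are inert, so the file also
compiles as ordinary C.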

* [Bug tree-optimization/104479] [12 Regression] cond_op is combined without considering single_use
  2022-02-10  6:45 [Bug tree-optimization/104479] New: [12 Regression] cond_op is combined without considering single_use crazylht at gmail dot com
                   ` (3 preceding siblings ...)
  2022-02-11  7:48 ` cvs-commit at gcc dot gnu.org
@ 2022-02-11  7:52 ` crazylht at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: crazylht at gmail dot com @ 2022-02-11  7:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104479

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed.
