public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/95102] New: missed if-conversion
@ 2020-05-13  8:58 rguenth at gcc dot gnu.org
  2020-05-13  9:11 ` [Bug rtl-optimization/95102] " rguenth at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-05-13  8:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95102

            Bug ID: 95102
           Summary: missed if-conversion
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

If you rewrite gcc.target/i386/pr54855-9.c to a form GIMPLE looks like after
some PRE you end up with

typedef float vec __attribute__((vector_size(16)));

vec
foo (vec x, float a)
{
  if (!(x[0] < a))
    x[0] = a;
  return x;
}

which is no longer recognized as the same and emits

foo:
.LFB0:
        .cfi_startproc
        comiss  %xmm0, %xmm1
        ja      .L2
        movss   %xmm1, %xmm0
.L2:
        ret

instead of

foo:
.LFB1:  
        .cfi_startproc
        minss   %xmm1, %xmm0
        ret

this is because RTL if-conversion does not recognize

    7: r86:SF=vec_select(r84:V4SF,parallel)
    8: flags:CCFP=cmp(r85:SF,r86:SF)
      REG_DEAD r86:SF
    9: pc={(flags:CCFP>0)?L14:pc}
      REG_DEAD flags:CCFP
      REG_BR_PROB 536870916

   10: NOTE_INSN_BASIC_BLOCK 3
   12: r84:V4SF=vec_merge(vec_duplicate(r85:SF),r84:V4SF,0x1)
      REG_DEAD r85:SF

   14: L14:
   15: NOTE_INSN_BASIC_BLOCK 4
   20: xmm0:V4SF=r84:V4SF

the form it does recognize is

    8: r82:SF=vec_select(r84:V4SF,parallel)
    9: flags:CCFP=cmp(r85:SF,r82:SF)
   10: pc={(flags:CCFP>0)?L28:pc}
      REG_DEAD flags:CCFP
      REG_BR_PROB 536870916

   28: L28:
   14: NOTE_INSN_BASIC_BLOCK 3
    5: r85:SF=r82:SF
      REG_DEAD r82:SF

   15: L15:
   16: NOTE_INSN_BASIC_BLOCK 4
   18: r87:V4SF=vec_merge(vec_duplicate(r85:SF),r84:V4SF,0x1)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/95102] missed if-conversion
  2020-05-13  8:58 [Bug rtl-optimization/95102] New: missed if-conversion rguenth at gcc dot gnu.org
@ 2020-05-13  9:11 ` rguenth at gcc dot gnu.org
  2021-07-20  1:05 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-05-13  9:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95102

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
OK, so one reason is that

  if (!can_conditionally_move_p (x_mode))
    return FALSE;

returns false for E_V4SFmode on x86.  min/max detection is based
on fp_cmov expansion for scalar FP on x86 though (with its own
problems, see PR95083).

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/95102] missed if-conversion
  2020-05-13  8:58 [Bug rtl-optimization/95102] New: missed if-conversion rguenth at gcc dot gnu.org
  2020-05-13  9:11 ` [Bug rtl-optimization/95102] " rguenth at gcc dot gnu.org
@ 2021-07-20  1:05 ` pinskia at gcc dot gnu.org
  2021-07-20  1:08 ` pinskia at gcc dot gnu.org
  2021-07-20  7:27 ` rguenth at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-20  1:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95102

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-07-20
           Severity|normal                      |enhancement

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
One thing that should be done:
  if (_1 < a_4(D))
    goto <bb 4>; [50.00%]
  else
    goto <bb 3>; [50.00%]

  <bb 3> [local count: 536870913]:
  x_5 = BIT_INSERT_EXPR <x_3(D), a_4(D), 0>;

  <bb 4> [local count: 1073741824]:
  # x_2 = PHI <x_3(D)(2), x_5(3)>


Be converted to:
  _1 = BIT_FIELD_REF <x_3(D), 32, 0>;
  nt_6 = BIT_FIELD_REF <x_3(D), 32, 0>;
  if (_1 < a_4(D))
    goto <bb 4>; [50.00%]
  else
    goto <bb 3>; [50.00%]

  <bb 3> [local count: 536870913]:

  <bb 4> [local count: 1073741824]:
  # nt_7 = PHI <nt_6(2), a_4(D)(3)>
  x_2 = BIT_INSERT_EXPR <x_3(D), nt_7, 0>;

And then normal phiopt would convert it to:
_1 = BIT_FIELD_REF <x_3(D), 32, 0>;
nt_7 = MIN <_1, a_4(D);
x_2 = BIT_INSERT_EXPR <x_3(D), nt_7, 0>;

But this would require a new part of tree-ssa-phiopt which does this, this
can't be done in match as far as I can tell; I don't know how profitable this
transfomration is either.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/95102] missed if-conversion
  2020-05-13  8:58 [Bug rtl-optimization/95102] New: missed if-conversion rguenth at gcc dot gnu.org
  2020-05-13  9:11 ` [Bug rtl-optimization/95102] " rguenth at gcc dot gnu.org
  2021-07-20  1:05 ` pinskia at gcc dot gnu.org
@ 2021-07-20  1:08 ` pinskia at gcc dot gnu.org
  2021-07-20  7:27 ` rguenth at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-20  1:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95102

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #0)
> If you rewrite gcc.target/i386/pr54855-9.c to a form GIMPLE looks like after
> some PRE you end up with

I ran into this same thing when I was working on improving other areas too:
FAIL: gcc.target/i386/pr54855-8.c scan-assembler-not movsd
FAIL: gcc.target/i386/pr54855-8.c scan-assembler-times maxsd 1
FAIL: gcc.target/i386/pr54855-9.c scan-assembler-not movss
FAIL: gcc.target/i386/pr54855-9.c scan-assembler-times minss 1
:)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug rtl-optimization/95102] missed if-conversion
  2020-05-13  8:58 [Bug rtl-optimization/95102] New: missed if-conversion rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-07-20  1:08 ` pinskia at gcc dot gnu.org
@ 2021-07-20  7:27 ` rguenth at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-07-20  7:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95102

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> One thing that should be done:
>   if (_1 < a_4(D))
>     goto <bb 4>; [50.00%]
>   else
>     goto <bb 3>; [50.00%]
> 
>   <bb 3> [local count: 536870913]:
>   x_5 = BIT_INSERT_EXPR <x_3(D), a_4(D), 0>;
> 
>   <bb 4> [local count: 1073741824]:
>   # x_2 = PHI <x_3(D)(2), x_5(3)>
> 
> 
> Be converted to:
>   _1 = BIT_FIELD_REF <x_3(D), 32, 0>;
>   nt_6 = BIT_FIELD_REF <x_3(D), 32, 0>;
>   if (_1 < a_4(D))
>     goto <bb 4>; [50.00%]
>   else
>     goto <bb 3>; [50.00%]
> 
>   <bb 3> [local count: 536870913]:
> 
>   <bb 4> [local count: 1073741824]:
>   # nt_7 = PHI <nt_6(2), a_4(D)(3)>
>   x_2 = BIT_INSERT_EXPR <x_3(D), nt_7, 0>;
> 
> And then normal phiopt would convert it to:
> _1 = BIT_FIELD_REF <x_3(D), 32, 0>;
> nt_7 = MIN <_1, a_4(D);
> x_2 = BIT_INSERT_EXPR <x_3(D), nt_7, 0>;
> 
> But this would require a new part of tree-ssa-phiopt which does this, this
> can't be done in match as far as I can tell; I don't know how profitable
> this transfomration is either.

A BIT_INSERT can be quite costly so I'm unsure whether this pays off in
general (without the MIN/MAX detection followup optimization opportunity).
But yes, when constraining it to the case that we insert at the position
we're having one operand of the controlling condition tested it might work.

And sure, why would it not work with match.pd?  You can write

(simplify
  (cond (lt:c (BIT_FIELD_REF@4 @0 @1 @2) @3) (BIT_INSERT_EXPR @0 @@3 @2) @0)
  (if (compare_tree_int (@1, TYPE_PRECISION (TREE_TYPE (@3))))
    ...

the precision thing is required in case @3 is constant.  You could then
transform it to

    (bit_insert_expr @0 (cond (lt:c @4 @3) @3 @4) @2)

Note that this will produce a COND_EXPR and not retain the conditional form.
phi-opt could in theory scan the sequence to be inserted for those and
match it up with the existing condition to retain the non-if-converted
control flow but I'm not sure if that's worth the hassle ;)

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-07-20  7:27 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-13  8:58 [Bug rtl-optimization/95102] New: missed if-conversion rguenth at gcc dot gnu.org
2020-05-13  9:11 ` [Bug rtl-optimization/95102] " rguenth at gcc dot gnu.org
2021-07-20  1:05 ` pinskia at gcc dot gnu.org
2021-07-20  1:08 ` pinskia at gcc dot gnu.org
2021-07-20  7:27 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).