public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/50984] New: Boolean return value expression clears register too often
@ 2011-11-03 18:09 drepper.fsp at gmail dot com
  2011-11-03 18:11 ` [Bug target/50984] " pinskia at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: drepper.fsp at gmail dot com @ 2011-11-03 18:09 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50984

             Bug #: 50984
           Summary: Boolean return value expression clears register too
                    often
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: drepper.fsp@gmail.com
            Target: x86_64-linux


Compiling this code with the current HEAD gcc (4.5 behaves the same way; I
tried that as well) produces less-than-optimal code:

int
f(int a, int b)
{
  return a & 8 && b & 4;
}

For x86-64 I see this asm code:
    xorl    %eax, %eax
    andl    $8, %edi
    je    .L2
    xorl    %eax, %eax      <----- Unnecessary !!!
    andl    $4, %esi
    setne    %al
.L2:
    rep
    ret

The compiler should realize that the second xor is unnecessary: %eax was
already cleared by the first xor, and nothing on the fall-through path writes
to it.
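
As a rough illustration, the control flow one would hope for can be written
out in C with a single clear of the result. This is only a sketch: the names
f_one_clear and result are made up here and do not come from GCC or from the
report, and the comments map each statement to the assembly quoted above.

```c
/* Hypothetical C rendering of the desired code shape (illustrative names). */
int
f_one_clear(int a, int b)
{
  int result = 0;            /* the one xorl %eax, %eax that is needed */
  if (a & 8)                 /* andl $8, %edi ; je .L2 */
    result = (b & 4) != 0;   /* andl $4, %esi ; setne %al (upper bits already 0) */
  return result;
}
```

setne only writes the low byte of the register, which is why one clear is
needed at all; a second one adds nothing because the register is still zero
when the branch falls through.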



* [Bug target/50984] Boolean return value expression clears register too often
From: pinskia at gcc dot gnu.org @ 2011-11-03 18:11 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50984

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
          Component|tree-optimization           |target

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> 2011-11-03 18:11:22 UTC ---
IIRC this is a target issue.



* [Bug rtl-optimization/50984] Boolean return value expression clears register too often
From: ubizjak at gmail dot com @ 2011-11-03 20:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50984

Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011-11-03
          Component|target                      |rtl-optimization
     Ever Confirmed|0                           |1

--- Comment #2 from Uros Bizjak <ubizjak at gmail dot com> 2011-11-03 20:01:14 UTC ---
(In reply to comment #1)
> IIRC this is a target issue.

Partially true....

Current tree generates:

    xorl    %eax, %eax    # 50    *movdi_xor    [length = 2]
    andl    $8, %edi    # 55    *andsi_2/1    [length = 3]
    je    .L2    # 10    *jcc_1    [length = 2]
    xorl    %eax, %eax    # 46    *movsi_xor    [length = 2]
    andl    $4, %esi    # 54    *andsi_2/1    [length = 3]
    setne    %al    # 48    *setcc_qi_slp    [length = 3]
.L2:
    rep    # 56    simple_return_internal_long    [length = 2]
    ret

The first XOR is in fact a load of zero in DImode; the second XOR is a load of
zero in SImode. This all happens in the peephole2 pass, which converts

    5 ax:SI=0
      REG_EQUAL: 0
  ...
   40 ax:QI=flags:CCZ!=0
      REG_DEAD: flags:CCZ
   41 ax:SI=zero_extend(ax:QI)

to

   50 {ax:DI=0;clobber flags:CC;}

  ...
   46 {ax:SI=0;clobber flags:CC;}
   49 {flags:CCZ=cmp(si:QI&0x4,0);si:QI=si:QI&0x4;}
   48 strict_low_part=flags:CCZ!=0

We can perform both clears in SImode, as with the following patch:

--cut here--
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md    (revision 180840)
+++ config/i386/i386.md    (working copy)
@@ -17331,7 +17331,7 @@
    && peep2_regno_dead_p (0, FLAGS_REG)"
   [(parallel [(set (match_dup 0) (const_int 0))
           (clobber (reg:CC FLAGS_REG))])]
-  "operands[0] = gen_lowpart (word_mode, operands[0]);")
+  "operands[0] = gen_lowpart (SImode, operands[0]);")

 (define_peephole2
   [(set (strict_low_part (match_operand 0 "register_operand" ""))
--cut here--

This results in the same assembly, but the _.202r.peephole2 dump is now:

   50 {ax:SI=0;clobber flags:CC;}
  ...
   46 {ax:SI=0;clobber flags:CC;}
   49 {flags:CCZ=cmp(si:QI&0x4,0);si:QI=si:QI&0x4;}
   48 strict_low_part=flags:CCZ!=0

I'd expect the CE3 pass that follows the peephole2 pass to eliminate (insn 46),
but for some reason this doesn't happen.

Confirmed as rtl-optimization issue.



* [Bug rtl-optimization/50984] Boolean return value expression clears register too often
From: svfuerst at gmail dot com @ 2011-11-05 17:05 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50984

Steven Fuerst <svfuerst at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |svfuerst at gmail dot com

--- Comment #3 from Steven Fuerst <svfuerst at gmail dot com> 2011-11-05 17:05:03 UTC ---
It can be done with zero clears of %eax via the magic of the lea instruction:

    andl    $8, %edi
    andl    $4, %esi
    leal    (%edi, %esi, 2), %eax
    shr    $4, %eax
    ret

However, if the first condition is usually false, this method will be slower. 
On the other hand, this version can be easily manipulated to have less register
pressure.  (So it may always be better when the function is inevitably
inlined.)
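
The arithmetic behind the lea sequence above can be checked with a small C
model. This is a sketch, not compiler output: f_lea and the local names are
invented for illustration, and unsigned is used so the shift is a logical one
like shr.

```c
/* Hypothetical C model of the and/and/lea/shr sequence (illustrative names). */
unsigned
f_lea(unsigned a, unsigned b)
{
  unsigned x = a & 8;        /* andl $8, %edi : x is 0 or 8 */
  unsigned y = b & 4;        /* andl $4, %esi : y is 0 or 4 */
  unsigned sum = x + 2 * y;  /* leal (%edi,%esi,2), %eax : 0, 8, or 16 */
  return sum >> 4;           /* shr $4, %eax : 1 only when sum == 16 */
}
```

The sum reaches 16 only when both bits are set, so the shift leaves exactly
the boolean result without any branch or register clear.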



* [Bug rtl-optimization/50984] Boolean return value expression clears register too often
From: pinskia at gcc dot gnu.org @ 2021-08-18  5:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50984

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
      Known to fail|                            |4.7.4

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So after r0-118261 (this just happens to have that side effect, but it is a
good side effect), uncprop can happen and we can produce:
  _3 = a_2(D) & 8;
  if (_3 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  _5 = b_4(D) & 4;
  _1 = _5 != 0;
  _6 = (int) _1;

  <bb 4>:
  # prephitmp_8 = PHI <_3(2), _6(3)>

That is, the PHI now takes _3 on the edge from bb 2 rather than the constant 0,
and that allows the extra xor to be removed.
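
Read back as C, the GIMPLE above looks roughly like the sketch below. The
function name f_uncprop and the temporaries t3, t5, and result are
illustrative stand-ins for the SSA names _3, _5, and prephitmp_8; this is a
hand translation, not something GCC emits.

```c
/* Hypothetical C reading of the post-uncprop GIMPLE (illustrative names). */
int
f_uncprop(int a, int b)
{
  int t3 = a & 8;            /* _3 = a_2(D) & 8 */
  int result;
  if (t3 != 0)
    {
      int t5 = b & 4;        /* _5 = b_4(D) & 4 */
      result = (t5 != 0);    /* _6 = (int) (_5 != 0) */
    }
  else
    result = t3;             /* PHI takes _3, known to be 0 on this edge */
  return result;
}
```

On the else path the value of t3 is already zero, so reusing it gives the
same result as assigning a fresh constant 0.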

As mentioned, we should do better here, though. The best form is:
unsigned
f1(unsigned a, unsigned b)
{
  return (((a & 8)>>3) & ((b & 4)>>2));
}
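
A quick way to convince oneself that f1 computes the same value as the
original f is to compare them over every combination of the low four bits,
which covers both tested bits. The helper count_mismatches is hypothetical
and exists only for this check; parentheses are added to f for readability.

```c
/* Original function from the report, with explicit parentheses. */
static int
f(int a, int b)
{
  return (a & 8) && (b & 4);
}

/* Branch-free form from comment #4. */
static unsigned
f1(unsigned a, unsigned b)
{
  return (((a & 8) >> 3) & ((b & 4) >> 2));
}

/* Hypothetical helper: counts inputs in 0..15 where f and f1 disagree. */
int
count_mismatches(void)
{
  int mismatches = 0;
  for (unsigned a = 0; a < 16; a++)
    for (unsigned b = 0; b < 16; b++)
      if ((unsigned) f((int) a, (int) b) != f1(a, b))
        mismatches++;
  return mismatches;
}
```

Only bit 3 of a and bit 2 of b influence either function, so agreement on
this range implies agreement on all inputs.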
