public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/53087] New: Poor code for conversion from _Bool to int
@ 2012-04-23 16:36 steven at gcc dot gnu.org
  2012-04-23 16:42 ` [Bug target/53087] " steven at gcc dot gnu.org
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: steven at gcc dot gnu.org @ 2012-04-23 16:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

             Bug #: 53087
           Summary: Poor code for conversion from _Bool to int
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: steven@gcc.gnu.org


Consider this test case:

#define ONEUL (1UL)

int
foo (long unsigned int a)
{
  _Bool b;
  long unsigned int cst, csui;

  if (a > 27) goto return_zero;
/*cst = 217579583UL;*/
  cst = (ONEUL <<  0) | (ONEUL <<  1) | (ONEUL <<  2) |
        (ONEUL <<  3) | (ONEUL <<  4) | (ONEUL <<  5) |
        (ONEUL << 19) | (ONEUL << 20) | (ONEUL << 21) |
        (ONEUL << 22) | (ONEUL << 23) | (ONEUL << 26) |
        (ONEUL << 27);
  csui = (ONEUL << a);
  b = ((csui & cst) != 0);
  if (b)
    return 1;
  else
    return 0;

return_zero:
   return 0;
}


On powerpc64, with GCC trunk r186580, the code generated for this example is:

foo:
        .quad   .L.foo,.TOC.@tocbase,0
        .previous
        .type   foo, @function
.L.foo:
        cmpldi 7,3,27
        bgt 7,.L3
        li 10,1
        lis 9,0xcf8
        sld 3,10,3
        ori 9,9,63
        and. 10,3,9
        mfcr 9
        rlwinm 9,9,3,1
        xori 3,9,1
        blr
        .p2align 4,,15
.L3:
.L2:
        li 3,0
        blr



The poor code results from PHI-OPT which converts away the if() statement. The
.149t.optimized dump looks like this:

;; Function foo (foo, funcdef_no=0, decl_uid=1996, cgraph_uid=0)

foo (long unsigned int a)
{
  _Bool D.2013;
  long unsigned int csui;
  int D.2008;
  long unsigned int D.2005;
  int D.2004;

<bb 2>:
  if (a_2(D) > 27)
    goto <bb 4> (return_zero);
  else
    goto <bb 3>;

<bb 3>:
  D.2004_3 = (int) a_2(D);
  csui_4 = 1 << D.2004_3;
  D.2005_5 = csui_4 & 217579583;
  D.2013_7 = D.2005_5 != 0;
  D.2008_8 = (int) D.2013_7;

  # D.2008_1 = PHI <D.2008_8(3), 0(2)>
return_zero:
  return D.2008_1;

}


The last statement in <bb 3> and the PHI are expanded as follows (compiled with
-fno-tree-ter to make it easier to see as what RTL each statement expanded to):

;; D.2013_7 = D.2005_5 != 0;

(insn 15 14 16 (set (reg:CC 134)
        (compare:CC (reg:DI 123 [ D.2005 ])
            (const_int 0 [0]))) t.c:16 -1
     (nil))

(insn 16 15 17 (set (reg:DI 135)
        (eq:DI (reg:CC 134)
            (const_int 0 [0]))) t.c:16 -1
     (nil))

(insn 17 16 18 (set (reg:SI 133)
        (subreg:SI (reg:DI 135) 4)) t.c:16 -1
     (nil))

(insn 18 17 19 (set (reg:QI 132)
        (subreg:QI (reg:SI 133) 3)) t.c:16 -1
     (nil))

(insn 19 18 20 (set (reg:SI 136)
        (xor:SI (subreg:SI (reg:QI 132) 0)
            (const_int 1 [0x1]))) t.c:16 -1
     (nil))

(insn 20 19 0 (set (reg:DI 124 [ D.2013+-7 ])
        (zero_extend:DI (subreg:QI (reg:SI 136) 3))) t.c:16 -1
     (nil))

;; D.2008_8 = (int) D.2013_7;

(insn 21 20 0 (set (reg:DI 120 [ D.2008+-4 ])
        (sign_extend:DI (subreg/s/u:SI (reg:DI 124 [ D.2013+-7 ]) 4))) t.c:17
-1
     (nil))



This is a target problem, because i686, x86_64, mips, and arm all generate much
better code for this example, for example:

MIPS:
foo:
        .frame  $sp,0,$31               # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x00000000,0
        .fmask  0x00000000,0
        .set    noreorder
        .set    nomacro
        sltu    $2,$4,28
        beq     $2,$0,$L3
        nop

        li      $2,1                    # 0x1
        sll     $4,$2,$4
        li      $2,217579520                    # 0xcf80000
        addiu   $2,$2,63
        and     $2,$4,$2
        j       $31
        sltu    $2,$0,$2

$L3:
$L2 = .
        j       $31
        move    $2,$0



ARM:
foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        cmp     r0, #27
        bhi     .L3
        mov     r3, #1
        mov     r0, r3, asl r0
        ldr     r3, .L5
        and     r3, r0, r3
        adds    r0, r3, #0
        movne   r0, #1
        bx      lr
.L3:
.L2:
        mov     r0, #0
        bx      lr
.L6:
        .align  2
.L5:
        .word   217579583


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] Poor code for conversion from _Bool to int
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
@ 2012-04-23 16:42 ` steven at gcc dot gnu.org
  2012-04-23 17:13 ` steven at gcc dot gnu.org
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: steven at gcc dot gnu.org @ 2012-04-23 16:42 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |powerpc-unknown-linux-gnu
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-04-23
     Ever Confirmed|0                           |1
      Known to fail|                            |4.8.0

--- Comment #1 from Steven Bosscher <steven at gcc dot gnu.org> 2012-04-23 16:40:30 UTC ---
This code generation problem is important because the GIMPLE switch lowering
transformations (as ported from stmt.c) introduce this kind of code a lot (e.g.
for the gimple.[ch] predicates).


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] Poor code for conversion from _Bool to int
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
  2012-04-23 16:42 ` [Bug target/53087] " steven at gcc dot gnu.org
@ 2012-04-23 17:13 ` steven at gcc dot gnu.org
  2012-04-23 20:12 ` [Bug target/53087] [powerpc] Poor code from cstore expander steven at gcc dot gnu.org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: steven at gcc dot gnu.org @ 2012-04-23 17:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

--- Comment #2 from Steven Bosscher <steven at gcc dot gnu.org> 2012-04-23 17:13:28 UTC ---
Expected code:

foo:
        .quad   .L.foo,.TOC.@tocbase,0
        .previous
        .type   foo, @function
.L.foo:
        lwz 9,0(3)
        cmplwi 7,9,27
        bgt 7,.L4
        li 8,1
        lis 10,0xcf8
        sld 9,8,9
        ori 10,10,63
        and. 8,9,10
        li 3,1
        bnelr 0
.L4:
        li 3,0
        blr


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
  2012-04-23 16:42 ` [Bug target/53087] " steven at gcc dot gnu.org
  2012-04-23 17:13 ` steven at gcc dot gnu.org
@ 2012-04-23 20:12 ` steven at gcc dot gnu.org
  2012-04-23 20:13 ` steven at gcc dot gnu.org
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: steven at gcc dot gnu.org @ 2012-04-23 20:12 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Poor code for conversion    |[powerpc] Poor code from
                   |from _Bool to int           |cstore expander

--- Comment #3 from Steven Bosscher <steven at gcc dot gnu.org> 2012-04-23 20:12:37 UTC ---
The first time round to rs6000.md:cstore<mode>4, the insn isn't generated
because code==NE. The second time, some insns are emitted:

Breakpoint 11, emit_store_flag_1 (target=0x0, code=EQ, op0=0xfffb5f092c0,
op1=0xfffb5f60470, mode=DImode, unsignedp=0, normalizep=1, target_mode=QImode)
    at ../../trunk/gcc/expmed.c:5363
5363              do_pending_stack_adjust ();
(gdb) next
5364              tem = emit_cstore (target, icode, code, mode, compare_mode,
(gdb) 
5366              if (tem)
(gdb) p tem
$67 = (rtx) 0xfffb5f094a0
(gdb) p debug_rtx(tem)
(reg:QI 132)
$68 = void
(gdb) p debug_rtx_list (get_last_insn(), -7)
(insn 15 14 16 (set (reg:CC 134)
        (compare:CC (reg:DI 123 [ D.2005 ])
            (const_int 0 [0]))) t.c:16 -1
     (nil))

(insn 16 15 17 (set (reg:DI 135)
        (eq:DI (reg:CC 134)
            (const_int 0 [0]))) t.c:16 -1
     (nil))

(insn 17 16 18 (set (reg:SI 133)
        (subreg:SI (reg:DI 135) 4)) t.c:16 -1
     (nil))

(insn 18 17 0 (set (reg:QI 132)
        (subreg:QI (reg:SI 133) 3)) t.c:16 -1
     (nil))

$69 = void
(gdb) next
5367                return tem;
(gdb) 
5381    }
(gdb) 
emit_store_flag (target=0xfffb5f09460, code=NE, op0=0xfffb5f092c0,
op1=0xfffb5f60470, mode=DImode, unsignedp=1, normalizep=1) at
../../trunk/gcc/expmed.c:5578
5578              if (tem != 0)
(gdb) p tem
$70 = (rtx) 0xfffb5f094a0
(gdb) l
5573                   && rtx_cost (trueval, XOR, 1,
5574                                optimize_insn_for_speed_p ()) == 0)
5575            {
5576              tem = emit_store_flag_1 (subtarget, rcode, op0, op1, mode, 0,
5577                                       normalizep, target_mode);
5578              if (tem != 0)
5579                tem = expand_binop (target_mode, xor_optab, tem, trueval,
target,
5580                                    INTVAL (trueval) >= 0, OPTAB_WIDEN);
5581            }


So the problem is not the _Bool->int conversion but the cstore for
";; D.2013_7 = D.2005_5 != 0;"


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2012-04-23 20:12 ` [Bug target/53087] [powerpc] Poor code from cstore expander steven at gcc dot gnu.org
@ 2012-04-23 20:13 ` steven at gcc dot gnu.org
  2012-04-23 20:24 ` dje at gcc dot gnu.org
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: steven at gcc dot gnu.org @ 2012-04-23 20:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

--- Comment #4 from Steven Bosscher <steven at gcc dot gnu.org> 2012-04-23 20:13:29 UTC ---
Smaller test case:

_Bool
foo (long unsigned int a)
{
  return (((1L << a) & 217579583UL) != 0);
}

==>

       .file   "t.c"
        .section        ".toc","aw"
        .section        ".text"
        .align 2
        .p2align 4,,15
        .globl foo
        .section        ".opd","aw"
        .align 3
foo:
        .quad   .L.foo,.TOC.@tocbase,0
        .previous
        .type   foo, @function
.L.foo:
        li 10,1
        lis 9,0xcf8
        sld 3,10,3
        ori 9,9,63
        and. 10,3,9
        mfcr 3
        rlwinm 3,3,3,1
        xori 3,3,1
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0
        .size   foo,.-.L.foo
        .ident  "GCC: (GNU) 4.8.0 20120418 (experimental) [trunk revision
186580]"


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2012-04-23 20:13 ` steven at gcc dot gnu.org
@ 2012-04-23 20:24 ` dje at gcc dot gnu.org
  2012-04-24  0:41 ` dje at gcc dot gnu.org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: dje at gcc dot gnu.org @ 2012-04-23 20:24 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

David Edelsohn <dje at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dje at gcc dot gnu.org

--- Comment #5 from David Edelsohn <dje at gcc dot gnu.org> 2012-04-23 20:23:42 UTC ---
XLC produces:

        addi       r0,r0,1
        addis      r4,r0,3320
        slw        r0,r0,r3
        addi       r3,r4,63
        and        r0,r0,r3
        cntlzw     r3,r0
        addi       r0,r3,-32
        rlwinm     r3,r0,1,31,31
        blr


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2012-04-23 20:24 ` dje at gcc dot gnu.org
@ 2012-04-24  0:41 ` dje at gcc dot gnu.org
  2012-04-25  5:27 ` amodra at gmail dot com
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: dje at gcc dot gnu.org @ 2012-04-24  0:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

--- Comment #6 from David Edelsohn <dje at gcc dot gnu.org> 2012-04-24 00:40:28 UTC ---
GCC 4.1 produced:

         lis 9,0xcf8
         li 0,1
         ori 9,9,63
         slw 0,0,3
         and 0,0,9
         neg 0,0
         srwi 3,0,31
         blr

The branch code is better than the code using mfcr, but the straight-line code
from XLC or GCC avoiding CR is better than both.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2012-04-24  0:41 ` dje at gcc dot gnu.org
@ 2012-04-25  5:27 ` amodra at gmail dot com
  2012-04-25 12:59 ` amodra at gmail dot com
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: amodra at gmail dot com @ 2012-04-25  5:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

--- Comment #7 from Alan Modra <amodra at gmail dot com> 2012-04-25 05:26:28 UTC ---
Some more data points.  The testcase in #1 produces

gcc-4.3.6
    cmpldi 7,3,27
    mr 9,3
    li 3,0
    bgtlr 7
    lis 0,0xcf8
    ori 0,0,63
    srd 0,0,9
    rldicl 3,0,0,63
    blr

gcc-4.4.7
    cmpldi 7,3,27
    li 0,0
    bgt 7,.L3
    lis 0,0xcf8
    ori 0,0,63
    srd 0,0,3
    rldicl 0,0,0,63
.L3:
    mr 3,0
    blr

gcc-4.5.0
    cmpldi 7,3,27
    li 0,0
    bgt 7,.L2
    li 0,1
    sld 3,0,3
    lis 0,0xcf8
    ori 0,0,63
    and. 9,3,0
    mfcr 0
    rlwinm 0,0,3,1
    xori 0,0,1
    extsw 0,0
.L2:
    mr 3,0
    blr


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2012-04-25  5:27 ` amodra at gmail dot com
@ 2012-04-25 12:59 ` amodra at gmail dot com
  2012-04-25 20:02 ` bonzini at gnu dot org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: amodra at gmail dot com @ 2012-04-25 12:59 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

Alan Modra <amodra at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bonzini at gnu dot org

--- Comment #8 from Alan Modra <amodra at gmail dot com> 2012-04-25 12:58:17 UTC ---
The regression appeared with r149032

2009-06-28  Paolo Bonzini  <bonzini@gnu.org>

    * expr.c (expand_expr_real_1): Just use do_store_flag.
    (do_store_flag): Drop support for TRUTH_NOT_EXPR.  Use
    emit_store_flag_force.
    * expmed.c (emit_store_flag_force): Copy here trick
    previously in expand_expr_real_1.  Try reversing the comparison.
    (emit_store_flag_1): Work if target is NULL.
    (emit_store_flag): Work if target is NULL, using the result mode
    from the comparison.  Use split_comparison, restructure final part
    to simplify conditionals.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2012-04-25 12:59 ` amodra at gmail dot com
@ 2012-04-25 20:02 ` bonzini at gnu dot org
  2012-04-26 15:53 ` bonzini at gnu dot org
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: bonzini at gnu dot org @ 2012-04-25 20:02 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

--- Comment #9 from Paolo Bonzini <bonzini at gnu dot org> 2012-04-25 20:00:57 UTC ---
The handling of this code sequence in fold-const changed back and forth many
times, and this is likely the reason why GCC 4.1 produced straight-line code
while GCC 4.3 produced branchy code.

I think the best fix is to add support for expanding "x != 0" using the cntlzw
trick---either in the cstore expander or in emit_store_flag.

In fact, emit_store_flag already has code for a similar trick:

      /* For EQ or NE, one way to do the comparison is to apply an operation
         that converts the operand into a positive number if it is nonzero
         or zero if it was originally zero.  Then, for EQ, we subtract 1 and
         for NE we negate.  This puts the result in the sign bit.  Then we
         normalize with a shift, if needed.

         Two operations that can do the above actions are ABS and FFS, so try
         them.  If that doesn't work, and MODE is smaller than a full word,
         we can use zero-extension to the wider mode (an unsigned conversion)
         as the operation.  */

So another thing to do is to investigate why this doesn't work.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2012-04-25 20:02 ` bonzini at gnu dot org
@ 2012-04-26 15:53 ` bonzini at gnu dot org
  2012-09-26  5:19 ` segher at gcc dot gnu.org
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: bonzini at gnu dot org @ 2012-04-26 15:53 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

Paolo Bonzini <bonzini at gnu dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
         AssignedTo|unassigned at gcc dot       |bonzini at gnu dot org
                   |gnu.org                     |

--- Comment #10 from Paolo Bonzini <bonzini at gnu dot org> 2012-04-26 15:52:32 UTC ---
Created attachment 27248
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27248
tentative patch

Any change in expmed.c is not enough to avoid the cr-setting instruction
"and.", because we get that instruction by design:

   For SNE, we would prefer that the xor/abs sequence be used for integers.
   For SEQ, likewise, except that comparisons with zero should be done
   with an scc insns.  However, due to the order that combine see the
   resulting insns, we must, in fact, allow SEQ for integers.

The question is whether it is really true that "For SEQ, likewise, except that
comparisons with zero should be done with an scc insns".  In general, I think
no.  I tried disabling scc altogether on this testcase:

  int eq1_phi (unsigned long a) { return a ? 0 : 1; }

and Steven's smaller testcase.  They compile respectively to:

eq1_phi:
    sradi 0,3,63
    xor 3,0,3
    subf 3,0,3
    addi 3,3,-1
    srdi 3,3,63
    blr
foo:
    li 9,1
    lis 0,0xcf8
    sld 3,9,3
    ori 0,0,63
    and 0,3,0
    neg 3,0
    srdi 3,3,63
    blr

In both cases the code is produced by the "abs"-based trick in emit_store_flag.
 The trick produces bad code in the first case, because combine finds nothing
to simplify---unlike Steven's case, where abs effectively goes away completely.
 Since many current PPC processors lack the abs/nabs instructions and do
implement clz/ctz, I tried teaching expmed.c the cntlz trick of comment 5. 
This improves eq1_phi noticeably and produces one instruction more than above
(as expected, this is roughly the same code as XLC):

eq1_phi:
    cntlzd 3,3
    srdi 3,3,6
    blr
foo:
    li 9,1
    lis 0,0xcf8
    sld 3,9,3
    ori 0,0,63
    and 3,3,0
    cntlzd 3,3
    addi 3,3,-64
    srdi 3,3,63
    blr

Overall, I think it is preferable to get the slight pessimization of "foo" and
get better code when combine cannot optimize away the ABS.  It would anyway fix
the bug, and the slight pessimization can still be offset in other ways, as
suggested below.

Hence I'm attaching a patch that teaches expmed to use cntlz, removes
abs<mode>_nopower patterns, and makes cstoresi4 do the same thing for EQ and
NE.  I will add testcases and bootstrap/regtest on x86 before submitting, but
of course in the meanwhile it would help a lot if somebody can
bootstrap/regtest it on ppc and/or ppc64.


On top of this, one thing to look at is making switch expansion better.  For
example one way to solve this bug in a target-independent way could be to
expand switch statements to jump if (mask << a) < 0 (with a bit-reversed mask).
<0 and >=0 are a bit easier to convert to sCC than ==0 and !=0.  On SPARC the
bit-reversed mask can even be cheaper to set up (a single mov covers 14-bit
immediates, a single sethi covers 22-bit immediates as long as the lowest 10
are zero); on PowerPC and ARM it should be roughly the same.  I don't know
about Thumb.

Another thing to do is to use a smaller type if possible, in this case int
instead of long.  This is because loading 64-bit constants can be expensive.
This is especially true with the bit-reversed mask, but also in general; on
PowerPC, loading 0xCF80003F in DImode is one instruction more than 0xCF80003F
in SImode, because of a rldicl instruction needed to clear the high word.

This two improvements give for the same input bits as comment 0:

   int bar (unsigned long a) { return (((int)0xFC001F30 << a) < 0); }

which compiles to even better code:

bar:
    lis 0,0xfc00
    ori 0,0,7984
    slw 3,0,3
    srwi 3,3,31
    blr

or on x86:

    movl    %edi, %ecx
    movl    $-67100880, %eax
    sall    %cl, %eax
    shrl    $31, %eax
    ret

Finally, switch expansion could also pick a different jump direction (!= vs ==
for the current code, < vs. >= with the above suggestion) to minimize the
number of bits set in the mask.  On PPC for example, 0x0CF8FFFF is one
instruction more expensive than 0xF3070000.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2012-04-26 15:53 ` bonzini at gnu dot org
@ 2012-09-26  5:19 ` segher at gcc dot gnu.org
  2012-09-26  5:42 ` segher at gcc dot gnu.org
  2012-11-08 23:02 ` steven at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: segher at gcc dot gnu.org @ 2012-09-26  5:19 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

--- Comment #11 from Segher Boessenkool <segher at gcc dot gnu.org> 2012-09-26 05:18:49 UTC ---
Author: segher
Date: Wed Sep 26 05:18:43 2012
New Revision: 191752

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=191752
Log:
gcc/
    PR target/51274
    PR target/53087
    * config/rs6000/rs6000.md (ne0si): Remove unnecessary
    earlyclobber.  Merge with...
    (ne0di): ... to...
    (ne0_<mode>): New.
    (plus_ne0si): Merge with...
    (plus_ne0di): ... to...
    (plus_ne0_<mode>): New.
    (compare_plus_ne0si): Merge with...
    (compare_plus_ne0di)... to...
    (compare_plus_ne0_<mode>): New.
    (compare_plus_ne0_<mode>_1): New.
    (plus_ne0si_compare): Merge with...
    (plus_ne0di_compare)... to...
    (plus_ne0_<mode>_compare): New.

gcc/testsuite/
    PR target/51274
    PR target/53087
    * gcc.target/powerpc/ppc-ne0-1.c: New.

Added:
    trunk/gcc/testsuite/gcc.target/powerpc/ppc-ne0-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2012-09-26  5:19 ` segher at gcc dot gnu.org
@ 2012-09-26  5:42 ` segher at gcc dot gnu.org
  2012-11-08 23:02 ` steven at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: segher at gcc dot gnu.org @ 2012-09-26  5:42 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #12 from Segher Boessenkool <segher at gcc dot gnu.org> 2012-09-26 05:42:26 UTC ---
With the abs patterns gone, and the ne0 patterns fixed, mainline
now generates

        li 10,1
        lis 9,0xcf8
        ori 9,9,63
        sld 3,10,3
        and 3,3,9
        addic 9,3,-1
        subfe 3,9,3
        blr

which is the expected code.  It would be nice if we could get
the 4.1 code back, but that's a separate problem (and not a
target problem I think).


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53087] [powerpc] Poor code from cstore expander
  2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2012-09-26  5:42 ` segher at gcc dot gnu.org
@ 2012-11-08 23:02 ` steven at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: steven at gcc dot gnu.org @ 2012-11-08 23:02 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53087

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #13 from Steven Bosscher <steven at gcc dot gnu.org> 2012-11-08 23:01:54 UTC ---
Fixed according to comment #12.


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-11-08 23:02 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-23 16:36 [Bug target/53087] New: Poor code for conversion from _Bool to int steven at gcc dot gnu.org
2012-04-23 16:42 ` [Bug target/53087] " steven at gcc dot gnu.org
2012-04-23 17:13 ` steven at gcc dot gnu.org
2012-04-23 20:12 ` [Bug target/53087] [powerpc] Poor code from cstore expander steven at gcc dot gnu.org
2012-04-23 20:13 ` steven at gcc dot gnu.org
2012-04-23 20:24 ` dje at gcc dot gnu.org
2012-04-24  0:41 ` dje at gcc dot gnu.org
2012-04-25  5:27 ` amodra at gmail dot com
2012-04-25 12:59 ` amodra at gmail dot com
2012-04-25 20:02 ` bonzini at gnu dot org
2012-04-26 15:53 ` bonzini at gnu dot org
2012-09-26  5:19 ` segher at gcc dot gnu.org
2012-09-26  5:42 ` segher at gcc dot gnu.org
2012-11-08 23:02 ` steven at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).