public inbox for gcc-bugs@sourceware.org
* [Bug target/51244] New: SH Target: Inefficient conditional branch
@ 2011-11-20 20:29 oleg.endo@t-online.de
  2011-11-22 23:36 ` [Bug target/51244] " kkojima at gcc dot gnu.org
                   ` (87 more replies)
  0 siblings, 88 replies; 89+ messages in thread
From: oleg.endo@t-online.de @ 2011-11-20 20:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

             Bug #: 51244
           Summary: SH Target: Inefficient conditional branch
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: oleg.endo@t-online.de
                CC: kkojima@gcc.gnu.org
            Target: sh*-*-*


Created attachment 25869
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25869
Examples

It seems that the condition inversion sometimes gets confused, resulting in
unnecessary code bloat.  The attached examples were compiled with -Os but the
same problem happens also at -O2 and -O3.

sh-elf-gcc -v
Using built-in specs.
COLLECT_GCC=sh-elf-gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/sh-elf/4.7.0/lto-wrapper
Target: sh-elf
Configured with: ../gcc-trunk/configure --target=sh-elf --prefix=/usr/local
--enable-languages=c,c++ --enable-multilib --disable-libssp --disable-nls
--disable-werror --enable-lto --with-newlib --with-gnu-as --with-gnu-ld
--with-system-zlib
Thread model: single
gcc version 4.7.0 20111119 (experimental) (GCC)



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
@ 2011-11-22 23:36 ` kkojima at gcc dot gnu.org
  2011-12-27 22:03 ` oleg.endo@t-online.de
                   ` (86 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2011-11-22 23:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #1 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2011-11-22 22:33:43 UTC ---
>  return (a != b || a != c) ? b : c;

test_func_0_NG and test_func_1_NG cases are related to the target
implementation of cstoresi4.
The middle end expands a complex conditional jump to cstores and
simple conditional jumps.  For expression a != b, SH's cstoresi4
implementation uses sh.c:sh_emit_compare_and_set which generates
cmp/eq and movnegt insn, because we have no cmp/ne insn.  Then we've
got the sequence

  mov #-1,rn
  negc rn,rm
  tst #255,rm

which is essentially T_reg = T_reg.  Usually combine catches such
situations, but negc might be too complex for combine.
For this case, replacing the current movnegt expander with an insn, a splitter
and a peephole, something like

(define_insn "movnegt"
  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
    (plus:SI (reg:SI T_REG) (const_int -1)))
   (clobber (match_scratch:SI 1 "=&r"))
   (clobber (reg:SI T_REG))]
  ""
  "#"
 [(set_attr "length" "4")])

(define_split
  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
    (plus:SI (reg:SI T_REG) (const_int -1)))
   (clobber (match_scratch:SI 1 "=&r"))
   (clobber (reg:SI T_REG))]
  "reload_completed"
  [(set (match_dup 1) (const_int -1))
   (parallel [(set (match_dup 0)
           (neg:SI (plus:SI (reg:SI T_REG)
                    (match_dup 1))))
          (set (reg:SI T_REG)
           (ne:SI (ior:SI (reg:SI T_REG) (match_dup 1))
              (const_int 0)))])]
  "")

(define_peephole2
  [(set (match_operand:SI 1 "" "") (const_int -1))
   (parallel [(set (match_operand:SI 0 "" "")
           (neg:SI (plus:SI (reg:SI T_REG)
                    (match_dup 1))))
          (set (reg:SI T_REG)
           (ne:SI (ior:SI (reg:SI T_REG) (match_dup 1))
              (const_int 0)))])
   (set (reg:SI T_REG)
    (eq:SI (match_operand:QI 3 "" "") (const_int 0)))]
  "REGNO (operands[3]) == REGNO (operands[0])
   && peep2_reg_dead_p (3, operands[0])
   && peep2_reg_dead_p (3, operands[1])"
  [(const_int 0)]
  "")

the above useless sequence could be removed, though we will miss
the chance that the -1 can be CSE-ed when the cstore value is
used.  This will cause slightly worse code for a loop like

int
foo (int *a, int x, int n)
{
  int i;
  int count = 0;

  for (i = 0; i < n; i++)
    count += (*(a + i) != x);

  return count;
}

though it may be relatively rare.
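
To illustrate why the mov/negc/tst sequence quoted earlier amounts to
T_reg = T_reg, here is a minimal C sketch.  It assumes the usual SH
semantics (negc computes 0 - rn - T and leaves the borrow in T, tst
sets T when the masked value is zero).  The struct and the function
names are hypothetical and exist only for this illustration; they are
not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one result register plus the SH T bit. */
struct sh_state {
  uint32_t r;   /* destination register of negc */
  unsigned t;   /* T bit, 0 or 1 */
};

/* negc rn,rm: rm = 0 - rn - T, T = borrow out of the subtraction. */
static struct sh_state model_negc (uint32_t rn, unsigned t)
{
  struct sh_state s;
  s.r = 0u - rn - t;
  s.t = (rn != 0 || t != 0);   /* a borrow occurs whenever rn + T > 0 */
  return s;
}

/* tst #imm,r0: T = ((r0 & imm) == 0). */
static unsigned model_tst_imm (uint32_t r0, uint32_t imm)
{
  return (r0 & imm) == 0;
}

/* mov #-1,rn ; negc rn,rm ; tst #255,rm
   Returns the T bit after the three instructions. */
unsigned t_after_sequence (unsigned t_in)
{
  struct sh_state s = model_negc (0xFFFFFFFFu, t_in);   /* rm = 1 - T */
  return model_tst_imm (s.r, 255);                      /* T = (rm == 0) */
}
```

With rn = -1 the negc result is 1 - T, so the following tst puts the
original T value back into T, which is why the whole sequence can be
dropped when rn and rm are dead afterwards.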

BTW, OT, (a != b || a != c) ? b : c could be reduced to b, I think.
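
The reduction holds because the condition is false only when a == b
and a == c, in which case b == c and returning c is the same as
returning b.  A small brute-force sketch of that argument follows; the
function names and the value range are made up for illustration.

```c
#include <assert.h>

/* The expression from the original test case. */
int select_original (int a, int b, int c)
{
  return (a != b || a != c) ? b : c;
}

/* Returns 1 if select_original (a, b, c) == b for every combination
   of values in [-limit, limit], and 0 otherwise. */
int reduces_to_b (int limit)
{
  for (int a = -limit; a <= limit; a++)
    for (int b = -limit; b <= limit; b++)
      for (int c = -limit; c <= limit; c++)
        if (select_original (a, b, c) != b)
          return 0;
  return 1;
}
```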

>  return a >= 0 && b >= 0 ? c : d;

x >= 0 is expanded to the sequence like

  ra = not x
  rb = -31
  rc = ra >> (neg rb)
  T = (rc == 0)
  conditional jump

and combine tries to simplify it.  combine simplifies b >= 0
successfully into shll and bt but fails to simplify a >= 0.
It seems that combine doesn't do constant propagation well and
misses the constant -31.  In this case, a peephole like

(define_peephole2
  [(set (match_operand:SI 0 "arith_reg_dest" "")
    (not:SI (match_operand:SI 1 "arith_reg_operand" "")))
   (set (match_operand:SI 2 "arith_reg_dest" "") (const_int -31))
   (set (match_operand:SI 3 "arith_reg_dest" "")
    (lshiftrt:SI (match_dup 0) (neg:SI (match_dup 2))))
   (set (reg:SI T_REG)
    (eq:SI (match_operand:QI 4 "arith_reg_operand" "")
           (const_int 0)))
   (set (pc)
    (if_then_else (match_operator 5 "comparison_operator"
            [(reg:SI T_REG) (const_int 0)])
              (label_ref (match_operand 6 "" ""))
              (pc)))]
  "REGNO (operands[3]) == REGNO (operands[4])
   && peep2_reg_dead_p (4, operands[0])
   && (peep2_reg_dead_p (4, operands[3])
       || rtx_equal_p (operands[2], operands[3]))
   && peep2_regno_dead_p (5, T_REG)"
  [(set (match_dup 2) (const_int -31))
   (set (reg:SI T_REG) (ge:SI (match_dup 1) (const_int 0)))
   (set (pc)
    (if_then_else (match_op_dup 7 [(reg:SI T_REG) (const_int 0)])
              (label_ref (match_dup 6))
              (pc)))]
  "
{
  operands[7] = gen_rtx_fmt_ee (reverse_condition (GET_CODE (operands[5])),
                GET_MODE (operands[5]),
                XEXP (operands[5], 0), XEXP (operands[5], 1));
}")

will be a workaround.  It isn't ideal, but better than nothing.
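
As a sanity check of the expansion described above, the following C
sketch models the not / shift-by-31 / compare-with-zero sequence and
compares it with a direct x >= 0 test, which is what the replacement
pattern computes before the branch condition is reversed.  The function
names are hypothetical, and the shift is assumed to be a logical right
shift by 31, as the lshiftrt in the peephole suggests.

```c
#include <assert.h>
#include <stdint.h>

/* ra = not x; rb = -31; rc = ra >> (neg rb), as a logical shift.
   Yields 1 when x >= 0 and 0 when x < 0. */
unsigned nonneg_via_shift (int32_t x)
{
  uint32_t ra = ~(uint32_t) x;
  return ra >> 31;
}

/* T = (rc == 0): in the expanded sequence T is set when x is negative. */
unsigned t_bit_expanded (int32_t x)
{
  return nonneg_via_shift (x) == 0;
}

/* T = (x >= 0): the comparison the replacement sequence uses.  It is
   the inverse of t_bit_expanded, hence the reversed branch condition. */
unsigned t_bit_ge_zero (int32_t x)
{
  return x >= 0;
}
```

For every x the two T values are complementary, so reversing the
branch condition keeps the control flow unchanged.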

>  return a == b ? test_sub0 (a, b) : test_sub1 (a, b);
>  return a != b ? test_sub0 (a, b) : test_sub1 (a, b);

This case is interesting.  At -Os, two calls are converted into
one computed goto.  A bit surprisingly, the conversion is done
as a side effect of combine-stack-adjustments pass.  That pass
calls

  cleanup_cfg (flag_crossjumping ? CLEANUP_CROSSJUMP : 0);

and the cross jumping optimization merges two calls.
With -Os -fno-delayed-branch, the OK case is compiled to

test_func_3_OK:
        mov     r4,r1
        cmp/eq  r5,r1
        mov.l   .L4,r0
        bf      .L3
        mov     r1,r5
        mov.l   .L5,r0
        bra     .L3
        nop
.L3:
        jmp     @r0
        nop

and the NG case

test_func_3_NG:
        mov     r4,r1
        cmp/eq  r5,r1
        bt      .L2
        mov.l   .L4,r0
        bra     .L3
        nop
.L2:
        mov.l   .L5,r0
        mov     r1,r5
.L3:
        jmp     @r0
        nop

Yep, the former is lucky.  I guess that the latter requires
basic block reordering for the further simplification, though
I've found a comment

  /* Don't reorder blocks when optimizing for size because extra jump insns may
     be created; also barrier may create extra padding.

     More correctly we should have a block reordering mode that tried to
     minimize the combined size of all the jumps.  This would more or less
     automatically remove extra jumps, but would also try to use more short
     jumps instead of long jumps.  */
  if (!optimize_function_for_speed_p (cfun))
    return false;

in bb-reorder.c.



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
  2011-11-22 23:36 ` [Bug target/51244] " kkojima at gcc dot gnu.org
@ 2011-12-27 22:03 ` oleg.endo@t-online.de
  2011-12-27 23:17 ` oleg.endo@t-online.de
                   ` (85 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: oleg.endo@t-online.de @ 2011-12-27 22:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #2 from Oleg Endo <oleg.endo@t-online.de> 2011-12-27 21:26:33 UTC ---
(In reply to comment #1)
> 
> BTW, OT, (a != b || a != c) ? b : c could be reduced to b, I think.
> 

Yes, very much so.
It is reduced to "return b" for -m2, -m2e, -m2a, -m3, -m3e
but not for -m1 and -m4*.

The correct test function should be rather:

int test_func_0_NG (int a, int b, int c, int d)
{
  return (a != b || a != d) ? b : c;
}

which is actually OK for all variants except -m1 and -m4*:

    cmp/eq    r5,r4    ! 11    cmpeqsi_t/3    [length = 2]
    bf.s    .L6    ! 12    branch_false    [length = 2]
    cmp/eq    r7,r5    ! 14    cmpeqsi_t/3    [length = 2]
    bf    .L6    ! 15    branch_false    [length = 2]
    mov    r6,r5    ! 8    movsi_i/2    [length = 2]
.L6:
    rts        ! 42    *return_i    [length = 2]
    mov    r5,r0    ! 23    movsi_i/2    [length = 2]



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
  2011-11-22 23:36 ` [Bug target/51244] " kkojima at gcc dot gnu.org
  2011-12-27 22:03 ` oleg.endo@t-online.de
@ 2011-12-27 23:17 ` oleg.endo@t-online.de
  2011-12-28  0:42 ` oleg.endo@t-online.de
                   ` (84 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: oleg.endo@t-online.de @ 2011-12-27 23:17 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #3 from Oleg Endo <oleg.endo@t-online.de> 2011-12-27 22:43:11 UTC ---
Created attachment 26191
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26191
Proposed patch to improve some of the issues.

(In reply to comment #1)
> 
> [...]
> 
>   mov #-1,rn
>   negc rn,rm
>   tst #255,rm
> 
> which is essentially T_reg = T_reg.  Usually combine catches such
> situation, but negc might be too complex for combine.
> For this case, replacing current movnegt expander by insn, splitter
> and peephole something like
> 
> [...]
>
> the above useless sequence could be removed, though we will miss
> the chance that the -1 can be CSE-ed when the cstore value is
> used.  This will cause a bit worse code for the loop like
> 
> int
> foo (int *a, int x, int n)
> {
>   int i;
>   int count = 0;
> 
>   for (i = 0; i < n; i++)
>     count += (*(a + i) != x);
> 
>   return count;
> }
> 

Thanks for your ideas and comments.  They were really useful.

The attached patch removes the useless sequence and still allows the -1
constant to be CSE-ed for such cases as the example function above.

I haven't run all tests on it yet, but CSiBE shows average code size reduction
of approx. -0.1% for -m4* with some code size increases in some files.
Would something like that be OK for stage 3?



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (2 preceding siblings ...)
  2011-12-27 23:17 ` oleg.endo@t-online.de
@ 2011-12-28  0:42 ` oleg.endo@t-online.de
  2011-12-28  4:57 ` oleg.endo@t-online.de
                   ` (83 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: oleg.endo@t-online.de @ 2011-12-28  0:42 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #4 from Oleg Endo <oleg.endo@t-online.de> 2011-12-27 23:17:03 UTC ---
(In reply to comment #1)
> 
> >  return a >= 0 && b >= 0 ? c : d;
> 
> x >= 0 is expanded to the sequence like
> 
>   ra = not x
>   rb = -31
>   rc = ra >> (neg rb)
>   T = (rc == 0)
>   conditional jump
> 
> and combine tries to simplify it.  combine simplifies b >= 0
> successfully into shll and bt but fails to simplify a >= 0.
> It seems that combine doesn't do constant propagation well and
> misses the constant -31.

Another simpler fail:

int test_func_22_NG (int a, int b, int c, int d)
{
  return a >= 0;
}

becomes:
        not     r4,r0   ! 9    one_cmplsi2    [length = 2]
        mov     #-31,r1 ! 12    movsi_ie/3    [length = 2]
        rts             ! 31    *return_i    [length = 2]
        shld    r1,r0   ! 13    lshrsi3_d    [length = 2]

which could be:
        cmp/pz    r4
        rts
        movt    r0

From what I could observe, this is caused by the various shift insns, which
lead combine to this result.  For example, the shll, branch sequence that
is used instead of cmp/pz, branch is caused by the ashlsi_c insn, which
defines a lt:SI comparison.  Although that is correct, using cmp/pz could
be better, since it does not modify the reg, and on SH4 it is an MT group
insn.  The ashlsi_c insn / lt:SI picking can be avoided by adjusting the
rtx costs, for instance (just tried it out briefly).

I think a peephole in this case could fix some of the symptoms but not the
actual cause.  I'll see if I can come up with something that works without a
peephole, even though all the shift stuff looks a bit suspicious ;)



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (3 preceding siblings ...)
  2011-12-28  0:42 ` oleg.endo@t-online.de
@ 2011-12-28  4:57 ` oleg.endo@t-online.de
  2011-12-28 16:07 ` oleg.endo@t-online.de
                   ` (82 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: oleg.endo@t-online.de @ 2011-12-28  4:57 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #5 from Oleg Endo <oleg.endo@t-online.de> 2011-12-28 02:44:05 UTC ---
(In reply to comment #2)
> (In reply to comment #1)
> > 
> > BTW, OT, (a != b || a != c) ? b : c could be reduced to b, I think.
> > 
> 
> Yes, very much so.
> It is reduced to "return b" for -m2, -m2e, -m2a, -m3, -m3e
> but not for -m1 and -m4*.

This seems to be due to the following in sh.h:

#define BRANCH_COST(speed_p, predictable_p) \
    (TARGET_SH5 ? 1 : ! TARGET_SH2 || TARGET_HARD_SH4 ? 2 : 1)
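
Because ?: binds more loosely than ||, the macro groups as
TARGET_SH5 ? 1 : ((! TARGET_SH2 || TARGET_HARD_SH4) ? 2 : 1).  The
sketch below spells that out with plain int parameters standing in for
the target macros.  The flag combinations in the comment are an
assumption about how the -m options set them (in particular that
TARGET_SH2 is also true for SH3 and SH4 variants), not something taken
from sh.h.

```c
#include <assert.h>

/* Same expression as the macro, with the implicit grouping made
   explicit.  The three parameters are hypothetical stand-ins for
   TARGET_SH2, TARGET_HARD_SH4 and TARGET_SH5.

   Assumed flag combinations:
     -m1:  sh2 = 0, hard_sh4 = 0  -> cost 2
     -m2:  sh2 = 1, hard_sh4 = 0  -> cost 1
     -m4:  sh2 = 1, hard_sh4 = 1  -> cost 2  */
int branch_cost (int target_sh2, int target_hard_sh4, int target_sh5)
{
  return target_sh5 ? 1 : ((!target_sh2 || target_hard_sh4) ? 2 : 1);
}
```

Under these assumptions the cost is 2 exactly for -m1 and -m4*, which
matches the variants where the expression was not reduced to "return b".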



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (4 preceding siblings ...)
  2011-12-28  4:57 ` oleg.endo@t-online.de
@ 2011-12-28 16:07 ` oleg.endo@t-online.de
  2011-12-28 22:30 ` kkojima at gcc dot gnu.org
                   ` (81 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: oleg.endo@t-online.de @ 2011-12-28 16:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #6 from Oleg Endo <oleg.endo@t-online.de> 2011-12-28 15:59:35 UTC ---
(In reply to comment #3)
> Created attachment 26191
> Proposed patch to improve some of the issues.
> 
> The attached patch removes the useless sequence and still allows the -1
> constant to be CSE-ed for such cases as the example function above.
> 
> I haven't run all tests on it yet, but CSiBE shows average code size reduction
> of approx. -0.1% for -m4* with some code size increases in some files.

Some of the code size increases are caused by the ifcvt.c pass which tries to
transform sequences like:

int test_func_6 (int a, int b, int c)
{
  if (a == 16)
    c = 0;
  return b + c;
}

into branch-free code like:
        mov     r4,r0   ! 45    movsi_ie/2    [length = 2]
        cmp/eq  #16,r0  ! 9     cmpeqsi_t/2    [length = 2]
        mov     #-1,r0  ! 34    movsi_ie/3    [length = 2]
        negc    r0,r0   ! 38    *negc    [length = 2]
        neg     r0,r0   ! 36    negsi2    [length = 2]
        and     r6,r0   ! 37    *andsi3_compact/2    [length = 2]
        rts             ! 48    *return_i    [length = 2]
        add     r5,r0   ! 14    *addsi3_compact    [length = 2]

instead of the more compact (and on SH4 most likely better):
        mov    r4,r0   ! 41    movsi_ie/2    [length = 2]
        cmp/eq    #16,r0  ! 9    cmpeqsi_t/2    [length = 2]
        bf    0f      ! 34    *movsicc_t_true/2    [length = 4]
        mov    #0,r6
0:
        add    r5,r6   ! 14    *addsi3_compact    [length = 2]
        rts             ! 44    *return_i    [length = 2]
        mov    r6,r0   ! 19    movsi_ie/2    [length = 2]

This particular case is handled in noce_try_store_flag_mask, which does the
transformation if BRANCH_COST >= 2, which is true for -m4.  I guess before the
patch ifcvt didn't realize that this transformation can be applied.

I've tried setting BRANCH_COST to 1, which avoids this transformation but
increases overall code size a bit.
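
At the C level, the branch-free code above corresponds roughly to
masking c with an all-ones or all-zeros value derived from the
comparison.  The following sketch shows the two forms side by side.
The function names are hypothetical, and the comments map the steps to
the negc / neg / and instructions in the listing only approximately.

```c
#include <assert.h>

/* The source form, compiled with a conditional branch. */
int with_branch (int a, int b, int c)
{
  if (a == 16)
    c = 0;
  return b + c;
}

/* The store-flag-mask form. */
int branch_free (int a, int b, int c)
{
  int keep = (a != 16);   /* mov #-1 / negc: 1 - T, with T = (a == 16) */
  int mask = -keep;       /* neg: 0 or all ones */
  return b + (c & mask);  /* and, add */
}
```

Both functions return the same value for all inputs where b + c does
not overflow; the branch-free form trades the branch for three extra
ALU instructions, which is the code size increase observed with
BRANCH_COST >= 2.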



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (5 preceding siblings ...)
  2011-12-28 16:07 ` oleg.endo@t-online.de
@ 2011-12-28 22:30 ` kkojima at gcc dot gnu.org
  2011-12-30 22:18 ` oleg.endo@t-online.de
                   ` (80 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2011-12-28 22:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #7 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2011-12-28 22:25:48 UTC ---
(In reply to comment #3)
> I haven't run all tests on it yet, but CSiBE shows average code size reduction
> of approx. -0.1% for -m4* with some code size increases in some files.
> Would something like that be OK for stage 3?

Looks good, though not appropriate for stage 3, I think.



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (6 preceding siblings ...)
  2011-12-28 22:30 ` kkojima at gcc dot gnu.org
@ 2011-12-30 22:18 ` oleg.endo@t-online.de
  2012-02-26 23:36 ` olegendo at gcc dot gnu.org
                   ` (79 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: oleg.endo@t-online.de @ 2011-12-30 22:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #8 from Oleg Endo <oleg.endo@t-online.de> 2011-12-30 21:21:14 UTC ---
(In reply to comment #7)
> (In reply to comment #3)
> > I haven't run all tests on it yet, but CSiBE shows average code size reduction
> > of approx. -0.1% for -m4* with some code size increases in some files.
> > Would something like that be OK for stage 3?
> 
> Looks good, though not appropriate for stage 3, I think.

The patch passed the testsuite without new failures.  I'll queue it up for
stage 1.



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (7 preceding siblings ...)
  2011-12-30 22:18 ` oleg.endo@t-online.de
@ 2012-02-26 23:36 ` olegendo at gcc dot gnu.org
  2012-03-02 21:57 ` olegendo at gcc dot gnu.org
                   ` (78 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-02-26 23:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

olegendo at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-02-26
                 CC|                            |olegendo at gcc dot gnu.org
         AssignedTo|unassigned at gcc dot       |olegendo at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (8 preceding siblings ...)
  2012-02-26 23:36 ` olegendo at gcc dot gnu.org
@ 2012-03-02 21:57 ` olegendo at gcc dot gnu.org
  2012-03-03 12:32 ` olegendo at gcc dot gnu.org
                   ` (77 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-02 21:57 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #26191|0                           |1
        is obsolete|                            |

--- Comment #9 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-02 21:56:38 UTC ---
Created attachment 26812
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26812
Proposed patch

I've tested this patch again against rev 184764 (GCC 4.7) with

make -k check RUNTESTFLAGS="--target_board=sh-sim\{
-m2/-ml,-m2/-mb,-m2a-single/-mb,-m4-single/-ml,
-m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb}"

Surprisingly, it fixes the following libstdc++ tests.

For all sub targets:
23_containers/forward_list/requirements/exception/basic.cc
23_containers/forward_list/requirements/exception/propagation_consistent.cc
23_containers/list/requirements/exception/basic.cc
23_containers/list/requirements/exception/propagation_consistent.cc
23_containers/multiset/requirements/exception/basic.cc
23_containers/multiset/requirements/exception/propagation_consistent.cc
23_containers/unordered_map/requirements/exception/propagation_consistent.cc
23_containers/unordered_multimap/requirements/exception/basic.cc
23_containers/unordered_multiset/requirements/exception/basic.cc
23_containers/unordered_multiset/requirements/exception/propagation_consistent.cc
23_containers/unordered_set/requirements/exception/propagation_consistent.cc
ext/pb_ds/regression/list_update_map_rand.cc
ext/pb_ds/regression/list_update_set_rand.cc

For -m4a-single and -m4-single (-ml and -mb):
23_containers/forward_list/requirements/exception/basic.cc
23_containers/forward_list/requirements/exception/propagation_consistent.cc
23_containers/list/requirements/exception/basic.cc
23_containers/list/requirements/exception/propagation_consistent.cc
23_containers/multiset/requirements/exception/basic.cc
23_containers/multiset/requirements/exception/propagation_consistent.cc

However, it also introduces two new failures.

For all sub targets:
FAIL: 21_strings/basic_string/cons/char/6.cc execution test

For -m4a-single and -m4-single (-ml and -mb):
FAIL: 22_locale/ctype/is/char/3.cc execution test

I'm looking into what is happening in the two cases.



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (9 preceding siblings ...)
  2012-03-02 21:57 ` olegendo at gcc dot gnu.org
@ 2012-03-03 12:32 ` olegendo at gcc dot gnu.org
  2012-03-04 17:25 ` olegendo at gcc dot gnu.org
                   ` (76 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-03 12:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #10 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-03 12:32:29 UTC ---
(In reply to comment #9)
> Created attachment 26812
> Proposed patch
> 
> I've tested this patch again against rev 184764 (GCC 4.7) with
> 
> make -k check RUNTESTFLAGS="--target_board=sh-sim\{
> -m2/-ml,-m2/-mb,-m2a-single/-mb,-m4-single/-ml,
> -m4-single/-mb,-m4a-single/-ml,-m4a-single/-mb}"
> 
> Surprisingly, it fixes the following libstdc++ tests.
> 

That was a false alarm.  I've messed up the test results somehow.
The libstdc++ test case fixes have nothing to do with the patch, but rather
with the difference between rev 184764 and rev 184829.  Sorry for any confusion.

> 
> However, it also introduces two new failures.
> 
> For all sub targets:
> FAIL: 21_strings/basic_string/cons/char/6.cc execution test
> 
> For -m4a-single and -m4-single (-ml and -mb):
> FAIL: 22_locale/ctype/is/char/3.cc execution test
> 
> I'm looking into what is happening in the two cases.

It seems that when building newlib something gets messed up related to delayed
branches.  Building newlib with -fno-delayed-branch seems to make the failures
go away.



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (10 preceding siblings ...)
  2012-03-03 12:32 ` olegendo at gcc dot gnu.org
@ 2012-03-04 17:25 ` olegendo at gcc dot gnu.org
  2012-03-05 23:13 ` olegendo at gcc dot gnu.org
                   ` (75 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-04 17:25 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #26812|0                           |1
        is obsolete|                            |

--- Comment #11 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-04 17:24:44 UTC ---
Created attachment 26822
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26822
Proposed patch

This patch should be better now.
However, I'm not sure how well this will work with SH64 due to the (arbitrary)
TARGET_SH1 conditions in the insns.



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (11 preceding siblings ...)
  2012-03-04 17:25 ` olegendo at gcc dot gnu.org
@ 2012-03-05 23:13 ` olegendo at gcc dot gnu.org
  2012-03-05 23:38 ` olegendo at gcc dot gnu.org
                   ` (74 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-05 23:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #12 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-05 23:12:27 UTC ---
Author: olegendo
Date: Mon Mar  5 23:12:20 2012
New Revision: 184966

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184966
Log:
    PR target/51244
    * config/sh/sh.c (sh_expand_t_scc): Remove SH2A special case
    and use unified expansion logic.
    * config/sh/sh.md (xorsi3_movrt): Rename to movrt.  Move
    closer to the existing movt insn.
    (negc): Rename insn to *negc.  Add new expander.
    (movnegt): Use xor pattern for T bit negation.  Reserve helper
    constant for negc pattern.
    (*movnegt): New insn and splitter.

    PR target/51244
    * gcc.target/sh/pr51244-1.c: New.
    * gcc.target/sh/pr51244-2.c: New.
    * gcc.target/sh/pr51244-3.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr48596.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-2.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (12 preceding siblings ...)
  2012-03-05 23:13 ` olegendo at gcc dot gnu.org
@ 2012-03-05 23:38 ` olegendo at gcc dot gnu.org
  2012-03-06  8:28 ` olegendo at gcc dot gnu.org
                   ` (73 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-05 23:38 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #13 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-05 23:37:35 UTC ---
On Tue, 2012-03-06 at 08:13 +0900, Kaz Kojima wrote:

> I've tested your latest patch on sh4-unknown-linux-gnu with trunk
> revision 184872.  It looks like some new failures are popping up:
> 
> New tests that FAIL:
> 
> 22_locale/ctype/is/char/3.cc execution test
> 27_io/basic_filebuf/underflow/wchar_t/9178.cc execution test
> gfortran.dg/widechar_intrinsics_6.f90  -Os  execution test
> 
> Perhaps these failures are the ones you've mentioned in #10 in
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
> 
> Could you double check?

Of course!  Doing so now...



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (13 preceding siblings ...)
  2012-03-05 23:38 ` olegendo at gcc dot gnu.org
@ 2012-03-06  8:28 ` olegendo at gcc dot gnu.org
  2012-03-06  8:50 ` kkojima at gcc dot gnu.org
                   ` (72 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-06  8:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #14 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-06 08:26:06 UTC ---
(In reply to comment #13)
> On Tue, 2012-03-06 at 08:13 +0900, Kaz Kojima wrote:
> 
> > I've tested your latest patch on sh4-unknown-linux-gnu with trunk
> > revision 184872.  It looks like some new failures are popping up:
> > 
> > New tests that FAIL:
> > 
> > 22_locale/ctype/is/char/3.cc execution test
> > 27_io/basic_filebuf/underflow/wchar_t/9178.cc execution test
> > gfortran.dg/widechar_intrinsics_6.f90  -Os  execution test
> > 
> > Perhaps these failures are the ones you've mentioned in #10 in
> > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244
> > 
> > Could you double check?
> 
> Of course!  Doing so now...

I've run the testsuite on rev 184966 (without fortran though), but the failures
that you've mentioned did not show up.  Usually when I rebuild the whole
toolchain including newlib, I have C/CPP/CXXFLAGS_FOR_TARGET set to '-Os
-mpretend-cmove'.  This time I removed those, but the results seem to be the
same.  Could you also please try again?  This is suspicious...



* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (14 preceding siblings ...)
  2012-03-06  8:28 ` olegendo at gcc dot gnu.org
@ 2012-03-06  8:50 ` kkojima at gcc dot gnu.org
  2012-03-06  9:48 ` olegendo at gcc dot gnu.org
                   ` (71 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-06  8:50 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #15 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-06 08:49:27 UTC ---
(In reply to comment #14)
> I've run the testsuite on rev 184966 (without fortran though), but the failures
> that you've mentioned did not show up.  Usually when I rebuild the whole
> toolchain including newlib, I have C/CPP/CXXFLAGS_FOR_TARGET set to '-Os
> -mpretend-cmove'.  This time I removed those, but the results seem to be the
> same.  Could you also please try again?  This is suspicious...

I've seen the same failures on sh4-unknown-linux-gnu for trunk rev 184971.
After backing out the r184966 changes, they went away.  Weird.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (15 preceding siblings ...)
  2012-03-06  8:50 ` kkojima at gcc dot gnu.org
@ 2012-03-06  9:48 ` olegendo at gcc dot gnu.org
  2012-03-06 10:36 ` kkojima at gcc dot gnu.org
                   ` (70 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-06  9:48 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #16 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-06 09:48:31 UTC ---
(In reply to comment #15)
> I've seen the same failures on sh4-unknown-linux-gnu for trunk rev 184971.
> After backing out the r184966 changes, they went away.  Weird.

Can we keep the r184966 changes anyway?  I will keep an eye on these failures
and see whether I can reproduce them.  If you have some time, could you please
send me the intermediate .i and .s files of the failing and passing versions of
the '22_locale/ctype/is/char/3.cc' test case?


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (16 preceding siblings ...)
  2012-03-06  9:48 ` olegendo at gcc dot gnu.org
@ 2012-03-06 10:36 ` kkojima at gcc dot gnu.org
  2012-03-06 10:38 ` kkojima at gcc dot gnu.org
                   ` (69 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-06 10:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #17 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-06 10:36:01 UTC ---
Created attachment 26837
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26837
preprocessed file ctype_configure_char.i


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (17 preceding siblings ...)
  2012-03-06 10:36 ` kkojima at gcc dot gnu.org
@ 2012-03-06 10:38 ` kkojima at gcc dot gnu.org
  2012-03-06 10:39 ` kkojima at gcc dot gnu.org
                   ` (68 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-06 10:38 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #18 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-06 10:37:13 UTC ---
Created attachment 26838
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26838
worked .s file ctype_configure_char_good.s


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (18 preceding siblings ...)
  2012-03-06 10:38 ` kkojima at gcc dot gnu.org
@ 2012-03-06 10:39 ` kkojima at gcc dot gnu.org
  2012-03-06 10:40 ` kkojima at gcc dot gnu.org
                   ` (67 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-06 10:39 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #19 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-06 10:38:22 UTC ---
Created attachment 26839
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26839
unworked .s file ctype_configure_char_bad.s


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (19 preceding siblings ...)
  2012-03-06 10:39 ` kkojima at gcc dot gnu.org
@ 2012-03-06 10:40 ` kkojima at gcc dot gnu.org
  2012-03-06 11:30 ` olegendo at gcc dot gnu.org
                   ` (66 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-06 10:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #20 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-06 10:40:31 UTC ---
(In reply to comment #16)
> Can we keep the r184966 changes anyway?  I will keep an eye on these failures
> and see whether I can reproduce them.  If you have some time, could you please
> send me the intermediate .i and .s files of the failing and passing versions of
> the '22_locale/ctype/is/char/3.cc' test case?

I've confirmed that 22_locale/ctype/is/char/3.cc doesn't fail
when linked with a libstdc++.a that was built with the compiler
without the r184966 changes.  The .s files for 3.cc are the same with
both compilers.  It looks like the problematic object is
libstdc++-v3/src/c++98/ctype_configure_char.o, because the error
went away when I replaced it with another one.  I've attached the .i and
.s files for that file.  The options used are

COLLECT_GCC_OPTIONS='-shared-libgcc' '-B' '/exp/ldroot/dodes/xsh-gcc/./gcc'
'-nostdinc++'
'-L/exp/ldroot/dodes/xsh-gcc-orig/sh4-unknown-linux-gnu/libstdc++-v3/src'
'-L/exp/ldroot/dodes/xsh-gcc-orig/sh4-unknown-linux-gnu/libstdc++-v3/src/.libs'
'-B' '/usr/local/sh4-unknown-linux-gnu/bin/' '-B'
'/usr/local/sh4-unknown-linux-gnu/lib/' '-isystem'
'/usr/local/sh4-unknown-linux-gnu/include' '-isystem'
'/usr/local/sh4-unknown-linux-gnu/sys-include' '-I'
'/exp/ldroot/dodes/ORIG/trunk/libstdc++-v3/../libgcc' '-I'
'/exp/ldroot/dodes/xsh-gcc-orig/sh4-unknown-linux-gnu/libstdc++-v3/include/sh4-unknown-linux-gnu'
'-I'
'/exp/ldroot/dodes/xsh-gcc-orig/sh4-unknown-linux-gnu/libstdc++-v3/include'
'-I' '/exp/ldroot/dodes/ORIG/trunk/libstdc++-v3/libsupc++'
'-fno-implicit-templates' '-Wall' '-Wextra' '-Wwrite-strings' '-Wcast-qual'
'-Wabi' '-fdiagnostics-show-location=once' '-ffunction-sections'
'-fdata-sections' '-frandom-seed=ctype_configure_char.lo' '-g' '-O2' '-D'
'_GNU_SOURCE' '-S' '-fPIC' '-D' 'PIC' '-o'


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (20 preceding siblings ...)
  2012-03-06 10:40 ` kkojima at gcc dot gnu.org
@ 2012-03-06 11:30 ` olegendo at gcc dot gnu.org
  2012-03-06 23:43 ` olegendo at gcc dot gnu.org
                   ` (65 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-06 11:30 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #21 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-06 11:29:17 UTC ---
(In reply to comment #20)

> I've confirmed that 22_locale/ctype/is/char/3.cc doesn't fail
> when linked with a libstdc++.a that was built with the compiler
> without the r184966 changes.  The .s files for 3.cc are the same with
> both compilers.  It looks like the problematic object is
> libstdc++-v3/src/c++98/ctype_configure_char.o, because the error
> went away when I replaced it with another one.  I've attached the .i and
> .s files for that file.  The options used are [...]

Cool.  Thanks a lot!  I think I know what the problem is now.  Looking into
it...


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (21 preceding siblings ...)
  2012-03-06 11:30 ` olegendo at gcc dot gnu.org
@ 2012-03-06 23:43 ` olegendo at gcc dot gnu.org
  2012-03-08  1:26 ` olegendo at gcc dot gnu.org
                   ` (64 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-06 23:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #22 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-06 23:42:15 UTC ---
This is a reduced test case:

int test (volatile int* a, int b, int c)
{
  a[1] = b != 0;

  if (b == 0)
    a[10] = c;

  return b == 0;
}

with '-O2 -m4-single -mb' it gets compiled to:

        tst     r5,r5       ! b == 0 -> T
        mov     #-1,r1
        negc    r1,r1       ! b != 0 -> T, r1
        mov.l   r1,@(4,r4)
        bf      .L2         ! branch if (b == 0)
        mov.l   r6,@(40,r4)
.L2:
        tst     r5,r5
        rts    
        movt    r0

This happens because the 'movnegt' expander does not state that the T bit is
modified, so the first CSE pass optimizes away the 'b == 0' test before the
branch.  I'm trying to come up with some alternative approaches...
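
To check a given compiler build against the reduced test case, a small
self-checking harness along the following lines could be used.  This is only
a sketch: the helper names run_case and check_reduced_test, the buffer size
and the sentinel value are made up for illustration, and the expected values
follow from the C semantics of test () itself, not from any particular
compiler output.

```c
#include <assert.h>

/* Reduced test case from above, unchanged.  */
int test (volatile int* a, int b, int c)
{
  a[1] = b != 0;

  if (b == 0)
    a[10] = c;

  return b == 0;
}

/* Hypothetical helper (not part of the GCC testsuite): runs test () on a
   freshly initialized buffer and checks every observable effect.
   Returns 1 if all effects match the C semantics, 0 otherwise.  */
int run_case (int b, int c)
{
  volatile int buf[16];
  int i;
  int r;

  for (i = 0; i < 16; ++i)
    buf[i] = -42;   /* sentinel; assumed never to be passed as c */

  r = test (buf, b, c);

  if (r != (b == 0))                    /* return value */
    return 0;
  if (buf[1] != (b != 0))               /* unconditional store */
    return 0;
  if (buf[10] != (b == 0 ? c : -42))    /* conditional store */
    return 0;
  return 1;
}

/* Hypothetical driver: covers the taken and the not-taken branch.  */
void check_reduced_test (void)
{
  assert (run_case (0, 7));
  assert (run_case (5, 7));
  assert (run_case (-1, 0));
}
```

A miscompiled test () of the kind shown above would be expected to trip the
return value check or the conditional store check for at least one value of b.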


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (22 preceding siblings ...)
  2012-03-06 23:43 ` olegendo at gcc dot gnu.org
@ 2012-03-08  1:26 ` olegendo at gcc dot gnu.org
  2012-03-08 11:12 ` kkojima at gcc dot gnu.org
                   ` (63 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-08  1:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #23 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-08 01:25:21 UTC ---
Created attachment 26853
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26853
Patch for the patch

The attached patch seems to fix the problem.
GCC (C, C++) and the CSiBE set compile with it.  Now doing the full testsuite...

Kaz, if you have some time, could you please try it out in your setup, too?

A thing that bugs me regarding the attached patch is the big/little endian
subreg copy pasta in the patterns *negnegt, *movtt, *movt_qi.  Isn't there a
way to avoid that?


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (23 preceding siblings ...)
  2012-03-08  1:26 ` olegendo at gcc dot gnu.org
@ 2012-03-08 11:12 ` kkojima at gcc dot gnu.org
  2012-03-08 11:15 ` kkojima at gcc dot gnu.org
                   ` (62 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-08 11:12 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #24 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-08 11:11:32 UTC ---
(In reply to comment #23)
> Kaz, if you have some time, could you please try it out in your setup, too?

On trunk revision 185088, for sh4-unknown-linux-gnu, the result of
compare_tests is:

New tests that FAIL:

gfortran.dg/associated_4.f90  -O1  execution test
gfortran.dg/forall_4.f90  -O3 -fomit-frame-pointer  execution test
gfortran.dg/forall_4.f90  -O3 -fomit-frame-pointer -funroll-all-loops
-finline-functions  execution test
gfortran.dg/forall_4.f90  -O3 -fomit-frame-pointer -funroll-loops  execution
test
gfortran.dg/forall_4.f90  -O3 -g  execution test

Old tests that failed, that have disappeared: (Eeek!)

22_locale/ctype/is/char/3.cc execution test
27_io/basic_filebuf/underflow/wchar_t/9178.cc execution test
gfortran.dg/widechar_intrinsics_6.f90  -Os  execution test


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (24 preceding siblings ...)
  2012-03-08 11:12 ` kkojima at gcc dot gnu.org
@ 2012-03-08 11:15 ` kkojima at gcc dot gnu.org
  2012-03-08 11:17 ` kkojima at gcc dot gnu.org
                   ` (61 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-08 11:15 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #25 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-08 11:13:39 UTC ---
Created attachment 26854
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26854
worked .s file associated_4_good.s


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (25 preceding siblings ...)
  2012-03-08 11:15 ` kkojima at gcc dot gnu.org
@ 2012-03-08 11:17 ` kkojima at gcc dot gnu.org
  2012-03-09  0:27 ` olegendo at gcc dot gnu.org
                   ` (60 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-08 11:17 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #26 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-08 11:16:39 UTC ---
Created attachment 26855
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26855
unworked .s file associated_4_bad.s

I've attached .s files against gfortran.dg/associated_4.f90 -O1 with
patched/unpatched compilers.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (26 preceding siblings ...)
  2012-03-08 11:17 ` kkojima at gcc dot gnu.org
@ 2012-03-09  0:27 ` olegendo at gcc dot gnu.org
  2012-03-09  1:45 ` kkojima at gcc dot gnu.org
                   ` (59 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-09  0:27 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #26853|0                           |1
        is obsolete|                            |

--- Comment #27 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-09 00:26:39 UTC ---
Created attachment 26858
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26858
Patch for the patch


> Old tests that failed, that have disappeared: (Eeek!)
>
> 22_locale/ctype/is/char/3.cc execution test
> 27_io/basic_filebuf/underflow/wchar_t/9178.cc execution test
> gfortran.dg/widechar_intrinsics_6.f90  -Os  execution test

That was a feature ;)

> I've attached .s files against gfortran.dg/associated_4.f90 -O1 with
> patched/unpatched compilers.

I'm sorry, I got the definition of the negc opcode wrong in the movrt_negc
pattern.  In this particular case negc always leaves the T bit at '1' instead
of inverting it.  Funnily enough, in C/C++ code the compiler never actually
tried to re-use the T bit after the negc, but in Fortran code it did.  And
that's what went wrong.
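
The T bit behavior described above can be reproduced with a small C model of
the 'mov #-1,r1; negc r1,r1' sequence.  This is a sketch based on the usual
description of negc (Rn = 0 - Rm - T, with T receiving the borrow); the names
negc_model, movrt_negc_model and check_negc_model are made up for
illustration and do not appear in sh.md.

```c
#include <assert.h>
#include <stdint.h>

/* Result of one negc instruction: destination register and new T bit.  */
struct negc_result
{
  uint32_t rn;
  unsigned int t;
};

/* Model of SH 'negc Rm,Rn': Rn = 0 - Rm - T, T = borrow.  */
struct negc_result negc_model (uint32_t rm, unsigned int t_in)
{
  struct negc_result res;
  uint32_t temp = 0u - rm;

  res.rn = temp - t_in;
  res.t = (temp != 0u) ? 1u : 0u;   /* borrow from 0 - Rm */
  if (temp < res.rn)                /* borrow from temp - T */
    res.t = 1u;
  return res;
}

/* The 'mov #-1,r1; negc r1,r1' sequence for a given incoming T bit.  */
struct negc_result movrt_negc_model (unsigned int t_in)
{
  return negc_model (0xFFFFFFFFu, t_in);
}

/* Hypothetical self-check of the model.  */
void check_negc_model (void)
{
  /* The register result is the inverted incoming T bit ...  */
  assert (movrt_negc_model (0).rn == 1u);
  assert (movrt_negc_model (1).rn == 0u);
  /* ... but the outgoing T bit is 1 in both cases, not inverted.  */
  assert (movrt_negc_model (0).t == 1u);
  assert (movrt_negc_model (1).t == 1u);
}
```

With Rm = -1 the first subtraction always borrows, so under this model the
outgoing T bit does not depend on the incoming one.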

I'm now testing the attached patch for C/C++ ...


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (27 preceding siblings ...)
  2012-03-09  0:27 ` olegendo at gcc dot gnu.org
@ 2012-03-09  1:45 ` kkojima at gcc dot gnu.org
  2012-03-09  8:41 ` kkojima at gcc dot gnu.org
                   ` (58 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-09  1:45 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #28 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-09 01:44:52 UTC ---
(In reply to comment #27)
> Created attachment 26858 [details]
> Patch for the patch

It looks like all the Fortran regressions have gone away.  I'll run the full
tests on sh4-unknown-linux-gnu.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (28 preceding siblings ...)
  2012-03-09  1:45 ` kkojima at gcc dot gnu.org
@ 2012-03-09  8:41 ` kkojima at gcc dot gnu.org
  2012-03-09 10:02 ` olegendo at gcc dot gnu.org
                   ` (57 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-09  8:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #29 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-09 08:40:32 UTC ---
(In reply to comment #28)
The regtest on sh4-unknown-linux-gnu has completed successfully.
Oleg, your patch is pre-approved.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (29 preceding siblings ...)
  2012-03-09  8:41 ` kkojima at gcc dot gnu.org
@ 2012-03-09 10:02 ` olegendo at gcc dot gnu.org
  2012-03-09 10:37 ` kkojima at gcc dot gnu.org
                   ` (56 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-09 10:02 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #30 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-09 10:02:25 UTC ---
(In reply to comment #29)
> (In reply to comment #28)
> The regtest on sh4-unknown-linux-gnu has completed successfully.
> Oleg, your patch is pre-approved.

Thanks a lot!
Could you please attach the testsuite summary of your setup?  I'd like to
compare them to my results (in particular the libstdc++ tests).
I'm now getting effects similar to those in comment #9 again, where a bunch of
libstdc++ failures disappear, and this time one new failure appears:
FAIL: 21_strings/basic_string/cons/char/6.cc execution test

This is weird...


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (30 preceding siblings ...)
  2012-03-09 10:02 ` olegendo at gcc dot gnu.org
@ 2012-03-09 10:37 ` kkojima at gcc dot gnu.org
  2012-03-11 13:18 ` olegendo at gcc dot gnu.org
                   ` (55 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-09 10:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #31 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-09 10:36:31 UTC ---
Created attachment 26859
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26859
A test result

Test result on sh4-unknown-linux-gnu [trunk revision 185088].


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (31 preceding siblings ...)
  2012-03-09 10:37 ` kkojima at gcc dot gnu.org
@ 2012-03-11 13:18 ` olegendo at gcc dot gnu.org
  2012-03-15  8:11 ` kkojima at gcc dot gnu.org
                   ` (54 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-11 13:18 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #32 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-11 13:18:12 UTC ---
Author: olegendo
Date: Sun Mar 11 13:18:08 2012
New Revision: 185192

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185192
Log:
    PR target/51244
    * config/sh/sh.md (movnegt): Expand into respective insns immediately.
    Use movrt_negc instead of negc pattern for non-SH2A.
    (*movnegt): Remove.
    (*movrt_negc, *negnegt, *movtt, *movt_qi): New insns and splits.

    PR target/51244
    * gcc.target/sh/pr51244-1.c: Fix thinkos.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/sh/pr51244-1.c


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (32 preceding siblings ...)
  2012-03-11 13:18 ` olegendo at gcc dot gnu.org
@ 2012-03-15  8:11 ` kkojima at gcc dot gnu.org
  2012-03-20  1:46 ` olegendo at gcc dot gnu.org
                   ` (53 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-15  8:11 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #33 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-15 07:52:21 UTC ---
(In reply to comment #31)
> Created attachment 26859 [details]
> testresult on sh4-unknown-linux-gnu [trunk revision 185088].

FYI, looking into the libstdc++ failures for sh4-unknown-linux-gnu,
it seems that the call insn was swapped ahead of the prologue frame insns,
which confuses the unwinder.  -fno-delayed-branch also stops
that swapping for these failing cases.  The patch below works for me.

    * config/sh/sh.c (sh_expand_prologue): Emit blockage at the end
    of prologue for unwinder and profiler.

--- ORIG/trunk/gcc/config/sh/sh.c    2012-03-06 10:28:32.000000000 +0900
+++ trunk/gcc/config/sh/sh.c    2012-03-14 20:22:15.000000000 +0900
@@ -7234,6 +7234,13 @@ sh_expand_prologue (void)
       emit_insn (gen_shcompact_incoming_args ());
     }

+  /* If we are profiling, make sure no instructions are scheduled before
+     the call to mcount.  Similarly if some call instructions are swapped
+     before frame related insns, it'll make unwinder confused because
+     currently SH has no unwind info for function epilogues.  */
+  if (crtl->profile || flag_exceptions || flag_unwind_tables)
+    emit_insn (gen_blockage ());
+
   if (flag_stack_usage_info)
     current_function_static_stack_size = stack_usage;
 }


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (33 preceding siblings ...)
  2012-03-15  8:11 ` kkojima at gcc dot gnu.org
@ 2012-03-20  1:46 ` olegendo at gcc dot gnu.org
  2012-03-20  2:33 ` kkojima at gcc dot gnu.org
                   ` (52 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-20  1:46 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #34 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-20 01:04:19 UTC ---
(In reply to comment #33)
> FYI, looking into the libstdc++ failures for sh4-unknown-linux-gnu,
> it seems that the call insn was swapped ahead of the prologue frame insns,
> which confuses the unwinder.  -fno-delayed-branch also stops
> that swapping for these failing cases.  The patch below works for me.
> [...]

Interesting, thanks!  I'll also test your patch and send it around, OK?

I'm a bit confused... was the issue caused by my patches for this PR, or by
something else?


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (34 preceding siblings ...)
  2012-03-20  1:46 ` olegendo at gcc dot gnu.org
@ 2012-03-20  2:33 ` kkojima at gcc dot gnu.org
  2012-03-20 20:41 ` olegendo at gcc dot gnu.org
                   ` (51 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-03-20  2:33 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #35 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-03-20 01:45:14 UTC ---
(In reply to comment #34)
> Interesting, thanks!  I'll also test your patch and send it around, OK?

OK, thanks!

> I'm a bit confused... was the issue caused by my patches for this PR, or by
> something else?

I guess that it was caused by some other change but stayed latent for a while.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (35 preceding siblings ...)
  2012-03-20  2:33 ` kkojima at gcc dot gnu.org
@ 2012-03-20 20:41 ` olegendo at gcc dot gnu.org
  2012-05-07 20:53 ` olegendo at gcc dot gnu.org
                   ` (50 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-03-20 20:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #36 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-03-20 20:33:30 UTC ---
I have created a new PR 52642 for the libstdc++ failures.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (36 preceding siblings ...)
  2012-03-20 20:41 ` olegendo at gcc dot gnu.org
@ 2012-05-07 20:53 ` olegendo at gcc dot gnu.org
  2012-05-08 21:43 ` olegendo at gcc dot gnu.org
                   ` (49 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-05-07 20:53 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #37 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-05-07 20:50:31 UTC ---
Created attachment 27336
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27336
Supplementary patch

As of rev 187217, the pr51244-1.c target testcase fails at least for m4*.
The attached patch adds some 'branch_true' and 'branch_false' subreg variants
which combine tries to use.  This seems to fix the problem.
I still would like to know whether there is a better way of handling the little
/ big endian subreg offsets in the patterns without doing copy-pasta.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (37 preceding siblings ...)
  2012-05-07 20:53 ` olegendo at gcc dot gnu.org
@ 2012-05-08 21:43 ` olegendo at gcc dot gnu.org
  2012-06-30 12:01 ` olegendo at gcc dot gnu.org
                   ` (48 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-05-08 21:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #38 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-05-08 21:36:35 UTC ---
Author: olegendo
Date: Tue May  8 21:36:30 2012
New Revision: 187298

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=187298
Log:
    PR target/51244
    * config/sh/sh.md (*branch_true, *branch_false): New insns.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (38 preceding siblings ...)
  2012-05-08 21:43 ` olegendo at gcc dot gnu.org
@ 2012-06-30 12:01 ` olegendo at gcc dot gnu.org
  2012-07-02 19:24 ` olegendo at gcc dot gnu.org
                   ` (47 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-06-30 12:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #39 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-06-30 12:00:38 UTC ---
Created attachment 27724
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27724
Another patch

I have noticed that the branch_true and branch_false insns also require
some subreg variations to work properly.  Otherwise redundant movt and tst
insns are generated.

I'm now testing the attached patch, which fixes those issues.

In addition to that the subreg 0 / subreg 3 copy-pasta has been removed by
introducing t-bit predicates.
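
For context, the 'subreg 0' and 'subreg 3' variants differ only in the byte
offset at which the low-order part of a wider value is found: offset 0 on
little endian, and (presumably, for a one-byte piece of a four-byte value)
offset 3 on big endian.  A minimal C sketch of that rule follows.  The names
lowpart_byte_offset and check_lowpart_byte_offset are made up for
illustration; this is not GCC's own code.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the low-order PART_SIZE bytes inside a value that is
   WHOLE_SIZE bytes wide.  Assumes part_size <= whole_size.  */
size_t lowpart_byte_offset (size_t whole_size, size_t part_size,
                            int big_endian)
{
  return big_endian ? whole_size - part_size : 0;
}

/* Hypothetical self-check: one-byte piece of a four-byte value.  */
void check_lowpart_byte_offset (void)
{
  assert (lowpart_byte_offset (4, 1, 0) == 0);   /* little endian */
  assert (lowpart_byte_offset (4, 1, 1) == 3);   /* big endian */
}
```

A predicate that accepts either offset depending on the target endianness
lets a single pattern cover both cases, which is the kind of folding described
above.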


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (39 preceding siblings ...)
  2012-06-30 12:01 ` olegendo at gcc dot gnu.org
@ 2012-07-02 19:24 ` olegendo at gcc dot gnu.org
  2012-07-08 15:03 ` olegendo at gcc dot gnu.org
                   ` (46 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-02 19:24 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #40 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-02 19:24:03 UTC ---
Author: olegendo
Date: Mon Jul  2 19:23:56 2012
New Revision: 189177

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189177
Log:
    PR target/51244
    * config/sh/predicates.md (t_reg_operand, negt_reg_operand): New
    predicates.
    * config/sh/sh-protos.h (get_t_reg_rtx): New prototype.
    * config/sh/sh.c (get_t_reg_rtx): New function.  Use it when invoking
    gen_branch_true and gen_branch_false.
    * config/sh/sh.md: Use get_t_reg_rtx when invoking gen_branch_true and
    gen_branch_false.
    (branch_true, branch_false): Use t_reg_operand predicate.
    (*branch_true, *branch_false): Delete.
    (movt): Use t_reg_operand predicate.
    (*negnegt): Use negt_reg_operand predicate and fold little and big
    endian variants.
    (*movtt): Use t_reg_operand and fold little and big endian variants.
    (*movt_qi): Delete.

    PR target/51244
    * gcc.target/sh/pr51244-1.c: Check that movt insn is not generated.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh-protos.h
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/sh/pr51244-1.c


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (40 preceding siblings ...)
  2012-07-02 19:24 ` olegendo at gcc dot gnu.org
@ 2012-07-08 15:03 ` olegendo at gcc dot gnu.org
  2012-07-23 22:58 ` olegendo at gcc dot gnu.org
                   ` (45 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-08 15:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #41 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-08 15:03:26 UTC ---
Author: olegendo
Date: Sun Jul  8 15:03:21 2012
New Revision: 189360

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189360
Log:
    PR target/51244
    * config/sh/sh.md (*branch_true_eq, *branch_false_ne, nott): New insns.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (41 preceding siblings ...)
  2012-07-08 15:03 ` olegendo at gcc dot gnu.org
@ 2012-07-23 22:58 ` olegendo at gcc dot gnu.org
  2012-07-23 23:29 ` olegendo at gcc dot gnu.org
                   ` (44 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-23 22:58 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #42 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-23 22:57:42 UTC ---
Author: olegendo
Date: Mon Jul 23 22:57:36 2012
New Revision: 189797

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189797
Log:
    PR target/51244
    * config/sh/predicates.md (general_movsrc_operand,
    general_movdst_operand): Reject T_REG.
    * config/sh/sh.md (*extendqisi2_compact_reg, *extendhisi2_compact_reg,
    movsi_i, movsi_ie, movsi_i_lowpart, *movqi_reg_reg, *movhi_reg_reg):
    Remove T_REG alternatives.
    (*negtstsi): New insn.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh.md


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (42 preceding siblings ...)
  2012-07-23 22:58 ` olegendo at gcc dot gnu.org
@ 2012-07-23 23:29 ` olegendo at gcc dot gnu.org
  2012-07-26  0:20 ` olegendo at gcc dot gnu.org
                   ` (43 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-23 23:29 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #43 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-23 23:29:02 UTC ---
I have noticed that on SH the CANONICALIZE_COMPARISON macro is not defined,
although it seems to be useful for the combine pass.

Another thing that I'd like to try out is using zero-displacement branches to
implement conditional execution patterns and see how it performs.
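
To illustrate the kind of rewrite a comparison canonicalization hook could do, here is a minimal sketch in C.  It is not GCC code: the enum, the struct and the function names are hypothetical and exist only for this example.  The idea is to turn comparisons against 1 or -1 into equivalent comparisons against 0, which SH can do without loading a constant (tst Rn,Rn, cmp/pz, cmp/pl).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical comparison codes for this sketch; these are not GCC's rtx codes. */
enum cmp_code { CMP_EQ, CMP_NE, CMP_GT, CMP_GE, CMP_LT, CMP_LE, CMP_GEU, CMP_LTU };

/* A comparison "x <code> rhs" against a constant right-hand side. */
struct cmp { enum cmp_code code; int rhs; };

/* Rewrite comparisons against 1 or -1 into equivalent comparisons against 0.
   Anything else is returned unchanged. */
static struct cmp canonicalize(enum cmp_code code, int rhs)
{
  struct cmp c = { code, rhs };

  if (rhs == -1 && code == CMP_GT)
    c = (struct cmp){ CMP_GE, 0 };   /* signed   x >  -1  ->  x >= 0 */
  else if (rhs == -1 && code == CMP_LE)
    c = (struct cmp){ CMP_LT, 0 };   /* signed   x <= -1  ->  x <  0 */
  else if (rhs == 1 && code == CMP_GE)
    c = (struct cmp){ CMP_GT, 0 };   /* signed   x >=  1  ->  x >  0 */
  else if (rhs == 1 && code == CMP_LT)
    c = (struct cmp){ CMP_LE, 0 };   /* signed   x <   1  ->  x <= 0 */
  else if (rhs == 1 && code == CMP_GEU)
    c = (struct cmp){ CMP_NE, 0 };   /* unsigned x >=  1  ->  x != 0 */
  else if (rhs == 1 && code == CMP_LTU)
    c = (struct cmp){ CMP_EQ, 0 };   /* unsigned x <   1  ->  x == 0 */
  return c;
}

/* Evaluate "x <code> rhs"; the *U codes compare as unsigned. */
static bool eval(enum cmp_code code, int rhs, int x)
{
  switch (code) {
  case CMP_EQ:  return x == rhs;
  case CMP_NE:  return x != rhs;
  case CMP_GT:  return x >  rhs;
  case CMP_GE:  return x >= rhs;
  case CMP_LT:  return x <  rhs;
  case CMP_LE:  return x <= rhs;
  case CMP_GEU: return (unsigned)x >= (unsigned)rhs;
  case CMP_LTU: return (unsigned)x <  (unsigned)rhs;
  }
  assert(!"unknown comparison code");
  return false;
}

/* True if the canonical form gives the same answer as the original for x. */
static bool same_result(enum cmp_code code, int rhs, int x)
{
  struct cmp c = canonicalize(code, rhs);
  return eval(c.code, c.rhs, x) == eval(code, rhs, x);
}
```

For example, canonicalize (CMP_GT, -1) yields { CMP_GE, 0 }, and same_result can be used to spot-check that a rewrite does not change the outcome for a given x.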


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (43 preceding siblings ...)
  2012-07-23 23:29 ` olegendo at gcc dot gnu.org
@ 2012-07-26  0:20 ` olegendo at gcc dot gnu.org
  2012-07-30  6:46 ` olegendo at gcc dot gnu.org
                   ` (42 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-26  0:20 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #44 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-26 00:20:05 UTC ---
Author: olegendo
Date: Thu Jul 26 00:19:58 2012
New Revision: 189877

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189877
Log:
    PR target/51244
    * config/sh/sh.opt (mzdcbranch): New option.
    * doc/invoke.texi: Document it.
    * config/sh/sh.md (negsi_cond): Use TARGET_ZDCBRANCH as condition
    instead of TARGET_HARD_SH4.
    * config/sh/sh.c (sh_option_override): Set TARGET_ZDCBRANCH as default
    for TARGET_HARD_SH4.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/config/sh/sh.opt
    trunk/gcc/doc/invoke.texi


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (44 preceding siblings ...)
  2012-07-26  0:20 ` olegendo at gcc dot gnu.org
@ 2012-07-30  6:46 ` olegendo at gcc dot gnu.org
  2012-08-09 15:55 ` olegendo at gcc dot gnu.org
                   ` (41 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-30  6:46 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #45 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-07-30 06:46:40 UTC ---
Author: olegendo
Date: Mon Jul 30 06:46:36 2012
New Revision: 189953

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=189953
Log:
    PR target/51244
    * config/sh/sh.md (mov_neg_si_t): Move to Scc instructions section.
    Use t_reg_operand predicate.  Add split for negated case.
    (ashrsi2_31): Pass get_t_reg_rtx to gen_mov_neg_si_t.
    * config/sh/sh.c (expand_ashiftrt): Likewise.

    PR target/51244
    * gcc.target/sh/pr51244-4.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr51244-4.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (45 preceding siblings ...)
  2012-07-30  6:46 ` olegendo at gcc dot gnu.org
@ 2012-08-09 15:55 ` olegendo at gcc dot gnu.org
  2012-08-12 22:47 ` olegendo at gcc dot gnu.org
                   ` (40 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-09 15:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #46 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-09 15:55:23 UTC ---
Author: olegendo
Date: Thu Aug  9 15:55:18 2012
New Revision: 190258

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190258
Log:
    PR target/51244
    * config/sh/sh.md: Add negc extu sequence peephole.
    (movrt, movnegt, movrt_negc, nott): Use t_reg_operand predicate.
    (*movrt_negc): New insn.
    * config/sh/sync.md (atomic_test_and_set): Pass gen_t_reg_rtx to
    gen_movnegt.
    * config/sh/sh.c (expand_cbranchsi4, sh_emit_scc_to_t,
    sh_emit_compare_and_branch, sh_emit_compare_and_set): Use get_t_reg_rtx.
    (sh_expand_t_scc): Pass gen_t_reg_rtx to gen_movnegt.

    PR target/51244
    * gcc.target/sh/pr51244-5: New.
    * gcc.target/sh/pr51244-6: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr51244-5.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-6.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/config/sh/sync.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (46 preceding siblings ...)
  2012-08-09 15:55 ` olegendo at gcc dot gnu.org
@ 2012-08-12 22:47 ` olegendo at gcc dot gnu.org
  2012-08-20 20:51 ` olegendo at gcc dot gnu.org
                   ` (39 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-12 22:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #47 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-12 22:47:21 UTC ---
Author: olegendo
Date: Sun Aug 12 22:47:15 2012
New Revision: 190331

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190331
Log:
    PR target/51244
    * config/sh/sh.md: Add splits for inverted compare and branch
    opportunities.
    (*cmpeqsi_t): New insn.
    (cmpgtsi_t, cmpgesi_t): Swap r and N alternatives.
    (cmpgeusi_t): Use satisfies_constraint_Z.  Emit sett insn in
    replacement insn list and not in the preparation statements.
    (clrt, sett): Add mt_group attribute.

    PR target/51244
    * gcc.target/sh/pr51244-7.c: New.
    * gcc.target/sh/pr51244-8.c: New.
    * gcc.target/sh/pr51244-9.c: New.
    * gcc.target/sh/pr51244-10.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr51244-10.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-7.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-8.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-9.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (47 preceding siblings ...)
  2012-08-12 22:47 ` olegendo at gcc dot gnu.org
@ 2012-08-20 20:51 ` olegendo at gcc dot gnu.org
  2012-08-30 22:54 ` olegendo at gcc dot gnu.org
                   ` (38 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-20 20:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #48 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-20 20:51:12 UTC ---
Author: olegendo
Date: Mon Aug 20 20:51:06 2012
New Revision: 190544

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190544
Log:
    PR target/51244
    * config/sh/sh.md (*cset_zero): New insns.

    PR target/51244
    * gcc.target/sh/pr51244-11.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr51244-11.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (48 preceding siblings ...)
  2012-08-20 20:51 ` olegendo at gcc dot gnu.org
@ 2012-08-30 22:54 ` olegendo at gcc dot gnu.org
  2012-08-31 10:55 ` kkojima at gcc dot gnu.org
                   ` (37 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-30 22:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #49 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-30 22:54:23 UTC ---
Kaz, if you have some time, could you please gather some CSiBE runtime numbers
for '-mpretend-cmove' and without it?

I've compared the result-size of the CSiBE set and with -mpretend-cmove there's
a total decrease of 948 bytes, with a few opposite cases.

My idea was to obsolete the -mpretend-cmove option, and instead tie its
behavior to the new option -mzdcbranch, which generally is supposed to control
any kind of zero-displacement-branch handling.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (49 preceding siblings ...)
  2012-08-30 22:54 ` olegendo at gcc dot gnu.org
@ 2012-08-31 10:55 ` kkojima at gcc dot gnu.org
  2012-08-31 15:50 ` olegendo at gcc dot gnu.org
                   ` (36 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2012-08-31 10:55 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #50 from Kazumoto Kojima <kkojima at gcc dot gnu.org> 2012-08-31 10:54:44 UTC ---
(In reply to comment #49)
> Kaz, if you have some time, could you please gather some CSiBE runtime numbers
> for '-mpretend-cmove' and without it?

Here is the runtime result with -O2:

test                     no         cmove    ratio(%)

bzip2-1.0.2 bzip2.d      10.9767    11.07    -0.84312
bzip2-1.0.2 bzip2recover 4.70333    4.69333   0.213068
bzip2-1.0.2 bzip2.c      43.0867    43.73    -1.47115
compiler vam.fib         2.02667    2.00667   0.996678
compiler vam.fact        1.91333    1.89333   1.05634
compiler vam.test2       0.256667   0.266667 -3.75
flex-2.5.31 flex         13.18      13.02     1.22888
jikespg-1.3 jikespg      1.61667    1.6       1.04167
jpeg-6b jpegtran2        4.65       4.61      0.867679
jpeg-6b djpeg2           2.33       2.28667   1.89504
jpeg-6b djpeg1           2.29333    2.24667   2.07715
jpeg-6b cjpeg2           3.01333    2.99667   0.556174
jpeg-6b djpeg0           0.336667   0.35     -3.80952
jpeg-6b cjpeg0           0.476667   0.486667 -2.05479
jpeg-6b cjpeg1           3.06333    2.99667   2.22469
jpeg-6b jpegtran0        0.263333   0.27     -2.46914
jpeg-6b jpegtran1        1.9        1.86667   1.78571
libpng-1.2.5 png2pnm0    0.986667   0.963333  2.42215
libpng-1.2.5 pnm2png1    44.6333    45.6333  -2.19138
libpng-1.2.5 pnm2png0    7.93667    8.09333  -1.93575
libpng-1.2.5 png2pnm1    6.73       6.75     -0.296296
teem-1.6.0-src dehex0    1.67       1.66333   0.400802
teem-1.6.0-src dehex1    10.96      10.9367   0.21335
teem-1.6.0-src enhex1    41.1767    40.5733   1.48702
teem-1.6.0-src enhex0    6.18333    6.31     -2.0074
zlib-1.1.4 minigzip0     46.4867    46.2533   0.504468
zlib-1.1.4 minigzip      5.52333    5.50333   0.363416


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (50 preceding siblings ...)
  2012-08-31 10:55 ` kkojima at gcc dot gnu.org
@ 2012-08-31 15:50 ` olegendo at gcc dot gnu.org
  2012-09-04  8:03 ` olegendo at gcc dot gnu.org
                   ` (35 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-31 15:50 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #51 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-31 15:50:35 UTC ---
(In reply to comment #50)

Thanks!
Hmm .. difficult.  
There seem to be 17 improvements and 10 dis-improvements, but the
dis-improvements seem heavier.  The improvement avg is 1.1% and the
dis-improvements avg is -2.1%.  I don't know .. maybe this should wait a bit
more.
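
The counts and averages above can be reproduced from the ratio column of comment #50.  The following is a small sketch in C, assuming that ratio(%) is (no - cmove) / cmove * 100, which is what the table values suggest (the small deviations look like rounding of the printed timings).  The struct and function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Ratio column of the table in comment #50, in table order. */
static const double csibe_ratios[] = {
  -0.84312,   0.213068, -1.47115,  0.996678,  1.05634,  -3.75,
   1.22888,   1.04167,   0.867679, 1.89504,   2.07715,   0.556174,
  -3.80952,  -2.05479,   2.22469, -2.46914,   1.78571,   2.42215,
  -2.19138,  -1.93575,  -0.296296, 0.400802,  0.21335,   1.48702,
  -2.0074,    0.504468,  0.363416
};
#define CSIBE_N (sizeof csibe_ratios / sizeof csibe_ratios[0])

/* Hypothetical result type for this sketch. */
struct summary {
  int better;          /* entries with a positive ratio (cmove is faster) */
  int worse;           /* entries with a negative ratio (cmove is slower) */
  double avg_better;   /* mean of the positive ratios, in percent */
  double avg_worse;    /* mean of the negative ratios, in percent */
};

/* Assumed definition of the ratio column: positive means that the run
   with -mpretend-cmove took less time. */
static double ratio_percent(double t_without, double t_with)
{
  assert(t_with > 0.0);
  return (t_without - t_with) / t_with * 100.0;
}

/* Count improvements and regressions and average each group separately. */
static struct summary summarize(const double *ratios, size_t n)
{
  struct summary s = { 0, 0, 0.0, 0.0 };

  assert(ratios != NULL);
  for (size_t i = 0; i < n; i++) {
    if (ratios[i] > 0.0) {
      s.better++;
      s.avg_better += ratios[i];
    } else if (ratios[i] < 0.0) {
      s.worse++;
      s.avg_worse += ratios[i];
    }
  }
  if (s.better > 0)
    s.avg_better /= s.better;
  if (s.worse > 0)
    s.avg_worse /= s.worse;
  return s;
}
```

Calling summarize (csibe_ratios, CSIBE_N) gives the two counts and the two group averages.  Note that these are unweighted means over the test programs, so a short-running test counts as much as a long-running one.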


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] SH Target: Inefficient conditional branch
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (51 preceding siblings ...)
  2012-08-31 15:50 ` olegendo at gcc dot gnu.org
@ 2012-09-04  8:03 ` olegendo at gcc dot gnu.org
  2012-09-23 21:36 ` [Bug target/51244] [SH] Inefficient conditional branch and code around T bit olegendo at gcc dot gnu.org
                   ` (34 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-04  8:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #52 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-04 08:03:08 UTC ---
Author: olegendo
Date: Tue Sep  4 08:03:01 2012
New Revision: 190909

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190909
Log:
    PR target/51244
    * config/sh/sh.c (prepare_cbranch_operands): Pull out comparison
    canonicalization code into...
    * (sh_canonicalize_comparison): This new function.
    * config/sh/sh-protos.h: Declare it.
    * config/sh/sh.h: Use it in new macro CANONICALIZE_COMPARISON.
    * config/sh/sh.md (cbranchsi4): Remove TARGET_CBRANCHDI4 check and
    always invoke expand_cbranchsi4.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh-protos.h
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.h
    trunk/gcc/config/sh/sh.md


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (52 preceding siblings ...)
  2012-09-04  8:03 ` olegendo at gcc dot gnu.org
@ 2012-09-23 21:36 ` olegendo at gcc dot gnu.org
  2012-09-23 21:42 ` olegendo at gcc dot gnu.org
                   ` (33 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-23 21:36 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|SH Target: Inefficient      |[SH] Inefficient
                   |conditional branch          |conditional branch and code
                   |                            |around T bit


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (53 preceding siblings ...)
  2012-09-23 21:36 ` [Bug target/51244] [SH] Inefficient conditional branch and code around T bit olegendo at gcc dot gnu.org
@ 2012-09-23 21:42 ` olegendo at gcc dot gnu.org
  2012-10-03 21:39 ` olegendo at gcc dot gnu.org
                   ` (32 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-23 21:42 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #53 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-09-23 21:41:55 UTC ---
Another case that seems to go awry:

int test_1 (int a, int b, int c, int* d)
{
  bool x = a == 0;
  d[2] = !x;

  return x ? b : c;
}

-O2 -m4:
        tst     r4,r4
        mov     #1,r1
        movt    r0
        xor     r0,r1
        tst     r0,r0
        bt/s    .L5
        mov.l   r1,@(8,r7)
        mov     r5,r6
.L5:
        rts
        mov     r6,r0

This should be something like:
        tst     r4,r4
        movt    r0
        xor     #1,r0
        bf/s    .L5
        mov.l   r0,@(8,r7)
        mov     r5,r6
.L5:
        rts
        mov     r6,r0


-O2 -m2a:
        tst     r4,r4
        movt    r0
        mov     #1,r1
        xor     r0,r1
        mov.l   r1,@(8,r7)
        tst     r0,r0
        bf      .L6
        mov     r6,r0
        rts/n
    .align 1
.L6:
        rts
        mov     r5,r0

This should be:
        tst     r4,r4
        movrt   r1
        mov.l   r1,@(8,r7)
        bt      .L6
        mov     r6,r0
        rts/n
    .align 1
.L6:
        rts
        mov     r5,r0
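
To make the intent of the suggested -m2a sequence explicit, here is a C model of it next to the original test case.  It is only a sketch: the T bit is modelled as a plain int, and the names test_1_tbit_model and agrees are made up for this example.  The point is that a single tst result can feed both the store (via movrt) and the conditional branch, so no second tst is needed.

```c
#include <assert.h>
#include <stdbool.h>

/* The test case from this comment, unchanged. */
static int test_1(int a, int b, int c, int *d)
{
  bool x = a == 0;
  d[2] = !x;

  return x ? b : c;
}

/* Model of the suggested -m2a sequence, with the T bit as a plain int. */
static int test_1_tbit_model(int a, int b, int c, int *d)
{
  assert(d != 0);
  int t  = (a == 0);   /* tst   r4,r4      : T = (a == 0)            */
  int r1 = t ^ 1;      /* movrt r1         : r1 = !T, T is unchanged */
  d[2] = r1;           /* mov.l r1,@(8,r7) : d[2] = !x               */
  return t ? b : c;    /* bt    .L6        : branch on the same T    */
}

/* Returns 1 if both versions return the same value and store the same d[2]. */
static int agrees(int a, int b, int c)
{
  int d_ref[3] = { -1, -1, -1 };
  int d_mod[3] = { -1, -1, -1 };
  int r_ref = test_1(a, b, c, d_ref);
  int r_mod = test_1_tbit_model(a, b, c, d_mod);

  return r_ref == r_mod && d_ref[2] == d_mod[2];
}
```

For example, agrees (0, 10, 20) and agrees (7, 10, 20) cover the two branch directions.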


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (54 preceding siblings ...)
  2012-09-23 21:42 ` olegendo at gcc dot gnu.org
@ 2012-10-03 21:39 ` olegendo at gcc dot gnu.org
  2012-10-12  0:41 ` olegendo at gcc dot gnu.org
                   ` (31 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-10-03 21:39 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #54 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-10-03 21:39:22 UTC ---
Author: olegendo
Date: Wed Oct  3 21:39:18 2012
New Revision: 192052

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=192052
Log:
    PR target/51244
    * config/sh/sh.md (*mov_t_msb_neg): New insn and two accompanying
    unnamed split patterns.

    PR target/51244
    * gcc.target/sh/pr51244-12.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr51244-12.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (55 preceding siblings ...)
  2012-10-03 21:39 ` olegendo at gcc dot gnu.org
@ 2012-10-12  0:41 ` olegendo at gcc dot gnu.org
  2012-10-15 22:08 ` olegendo at gcc dot gnu.org
                   ` (30 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-10-12  0:41 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #55 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-10-12 00:41:31 UTC ---
Author: olegendo
Date: Fri Oct 12 00:41:23 2012
New Revision: 192387

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=192387
Log:
    PR target/51244
    * config/sh/sh.md (negsi_cond, negdi_cond, stack_protect_test): Remove
    get_t_reg_rtx when invoking gen_branch_true or gen_branch_false.
    (*zero_extend<mode>si2_compact): Convert to insn_and_split.  Convert
    zero extensions of T bit stores to reg moves in splitter.  Remove
    obsolete unnamed peephole2 that caught zero extensions after negc T bit
    stores.
    (*branch_true_eq, *branch_false_ne): Delete.
    (branch_true, branch_false): Convert insn to expander.  Move actual
    insn logic to...
    (*cbranch_t): ...this new insn_and_split.  Try to find preceding
    redundant T bit stores and tests and combine them with the conditional
    branch if possible in the splitter.
    (movrt_xor, *movt_movrt): New insn_and_split.
    * config/sh/predicates.md (cbranch_treg_value): New predicate.
    * config/sh/sh-protos.h (sh_eval_treg_value): Forward declare...
    * config/sh/sh.c (sh_eval_treg_value): ...this new function.
    (expand_cbranchsi4, expand_cbranchdi4): Remove get_t_reg_rtx
    when invoking gen_branch_true or gen_branch_false.

    PR target/51244
    * gcc.target/sh/pr51244-13.c: New.
    * gcc.target/sh/pr51244-14.c: New.
    * gcc.target/sh/pr51244-15.c: New.
    * gcc.target/sh/pr51244-16.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr51244-13.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-14.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-15.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-16.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh-protos.h
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (56 preceding siblings ...)
  2012-10-12  0:41 ` olegendo at gcc dot gnu.org
@ 2012-10-15 22:08 ` olegendo at gcc dot gnu.org
  2012-11-03 12:01 ` olegendo at gcc dot gnu.org
                   ` (29 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-10-15 22:08 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #56 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-10-15 22:08:14 UTC ---
Author: olegendo
Date: Mon Oct 15 22:08:07 2012
New Revision: 192481

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=192481
Log:
    PR target/51244
    * config/sh/sh-protos.h (set_of_reg): New struct.
    (sh_find_set_of_reg, sh_is_logical_t_store_expr,
    sh_try_omit_signzero_extend):  Declare...
    * config/sh/sh.c (sh_find_set_of_reg, sh_is_logical_t_store_expr,
    sh_try_omit_signzero_extend): ...these new functions.
    * config/sh/sh.md (*logical_op_t): New insn_and_split.
    (*zero_extend<mode>si2_compact): Use sh_try_omit_signzero_extend
    in splitter.
    (*extend<mode>si2_compact_reg): Convert to insn_and_split.
    Use sh_try_omit_signzero_extend in splitter.
    (*mov<mode>_reg_reg): Disallow t_reg_operand as operand 1.
    (*cbranch_t): Rewrite combine part in splitter using new
    sh_find_set_of_reg function.

    PR target/51244
    * gcc.target/sh/pr51244-17.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr51244-17.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh-protos.h
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (57 preceding siblings ...)
  2012-10-15 22:08 ` olegendo at gcc dot gnu.org
@ 2012-11-03 12:01 ` olegendo at gcc dot gnu.org
  2013-07-18 16:11 ` laurent.alfonsi at st dot com
                   ` (28 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-11-03 12:01 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #57 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-11-03 12:01:05 UTC ---
Author: olegendo
Date: Sat Nov  3 12:01:01 2012
New Revision: 193119

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193119
Log:
    PR target/51244
    * config/sh/sh.md (*cbranch_t): Allow splitting after reload.
    Allow going beyond current basic block before reload when looking for
    the reg set insn.
    * config/sh/sh.c (sh_find_set_of_reg): Don't stop at labels.

    PR target/51244
    * gcc.target/sh/pr51244-18.c: New.
    * gcc.target/sh/pr51244-19.c: New.


Added:
    trunk/gcc/testsuite/gcc.target/sh/pr51244-18.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-19.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (58 preceding siblings ...)
  2012-11-03 12:01 ` olegendo at gcc dot gnu.org
@ 2013-07-18 16:11 ` laurent.alfonsi at st dot com
  2013-07-18 16:12 ` laurent.alfonsi at st dot com
                   ` (27 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: laurent.alfonsi at st dot com @ 2013-07-18 16:11 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

Laurent Aflonsi <laurent.alfonsi at st dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |laurent.alfonsi at st dot com

--- Comment #58 from Laurent Aflonsi <laurent.alfonsi at st dot com> ---
Created attachment 30524
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30524&action=edit
functional regression


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (59 preceding siblings ...)
  2013-07-18 16:11 ` laurent.alfonsi at st dot com
@ 2013-07-18 16:12 ` laurent.alfonsi at st dot com
  2013-07-20 14:38 ` olegendo at gcc dot gnu.org
                   ` (26 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: laurent.alfonsi at st dot com @ 2013-07-18 16:12 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #59 from Laurent Aflonsi <laurent.alfonsi at st dot com> ---
I have a functional regression due to this improvement when we are compiling
the enclosed example in -O2.
 $ sh-superh-elf-gcc -O2 pr51244-20-main.c pr51244-20.c
 $ sh-superh-elf-run a.out
 FAIL

Thus, the code is transformed from :
  _get_request:
    mov.l    @(12,r4),r1
    tst    r1,r1
    bt    .L2
    mov.l    @(4,r4),r2
    tst    r2,r2
    mov    #-1,r2
    negc    r2,r2
  .L3:
    tst    r2,r2
    bt/s    .L11
    mov    #-100,r0
    mov    #1,r2
    [...]

to : 
  _get_request:
    mov.l    @(12,r4),r1
    tst    r1,r1
    bt    .L2
    mov.l    @(4,r4),r2
    tst    r2,r2
    mov    #-1,r2
    negc    r2,r2
  .L3:
    bf/s    .L11
    mov    #-100,r0
        mov    #1,r2
        [...]

With the inputs encoded in the main function, we are supposed to follow the
simplest flow (no jump), but when this optimization is enabled, we jump to
L11 due to the bt -> bf transformation.

Could you please look at it ?

Thanks
Laurent


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (60 preceding siblings ...)
  2013-07-18 16:12 ` laurent.alfonsi at st dot com
@ 2013-07-20 14:38 ` olegendo at gcc dot gnu.org
  2013-07-23  8:21 ` laurent.alfonsi at st dot com
                   ` (25 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-07-20 14:38 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #60 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Laurent Aflonsi from comment #59)
> I have a functional regression due to this improvement when we are compiling
> the enclosed example in -O2.
>  $ sh-superh-elf-gcc -O2 pr51244-20-main.c pr51244-20.c
>  $ sh-superh-elf-run a.out
>  FAIL
> 
> Thus, the code is transformed from :
>   _get_request:
> 	mov.l	@(12,r4),r1
> 	tst	r1,r1
> 	bt	.L2
> 	mov.l	@(4,r4),r2
> 	tst	r2,r2
> 	mov	#-1,r2
>  	negc	r2,r2
>   .L3:
> 	tst	r2,r2
> 	bt/s	.L11
> 	mov	#-100,r0
>         mov	#1,r2
>         [...]
> 
> to : 
>   _get_request:
> 	mov.l	@(12,r4),r1
> 	tst	r1,r1
> 	bt	.L2
> 	mov.l	@(4,r4),r2
> 	tst	r2,r2
> 	mov	#-1,r2
> 	negc	r2,r2
>   .L3:
> 	bf/s	.L11
> 	mov	#-100,r0
>         mov	#1,r2
>         [...]
> 
> With the inputs encoded in the main function, we are supposed to follow the
> simplest flow (no jump), but when this optimization is enabled, we are
> jumping to L11 due to the bt -> bf transformation.

The idea was that sequences such as
  tst r2,r2
  mov #-1,r2
  negc r2,r2
  tst r2,r2
  bt  ...

should be folded to
  tst r2,r2
  bt  ...

... if r2 is dead afterwards (which it seems to be).  I guess I failed to
handle some cases where the tested register is in a loop or can be reached by
some other basic block.  I'll check out the details.
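
The equivalence behind this fold can be checked with a small C model of the
instructions involved.  This is only a sketch: the 'cpu' struct and the helper
names are hypothetical, and negc is modeled as Rn = 0 - Rm - T with T receiving
the borrow.  It covers just the straight-line case, where the first tst is the
only way to reach the second one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one general register plus the T bit. */
typedef struct { uint32_t r2; unsigned t; } cpu;

/* tst Rn,Rn: T = ((Rn & Rn) == 0) */
static void tst(cpu *c) { c->t = (c->r2 == 0); }

/* mov #imm,Rn: sign-extended immediate, T unchanged */
static void mov_imm(cpu *c, int32_t imm) { c->r2 = (uint32_t)imm; }

/* negc Rm,Rn with Rm == Rn: Rn = 0 - Rm - T, T = borrow */
static void negc(cpu *c) {
    uint64_t sub = (uint64_t)c->r2 + c->t;
    c->r2 = (uint32_t)(0u - (uint32_t)sub);
    c->t = (sub != 0);
}

/* tst; mov #-1; negc; tst; bt  -> returns 1 if the bt is taken */
static int long_sequence_branches(uint32_t x) {
    cpu c = { x, 0 };
    tst(&c);          /* T = (x == 0) */
    mov_imm(&c, -1);
    negc(&c);         /* r2 = 1 - T, i.e. (x != 0) */
    tst(&c);          /* T = (r2 == 0), i.e. (x == 0) again */
    return (int)c.t;
}

/* tst; bt  -> returns 1 if the bt is taken */
static int folded_sequence_branches(uint32_t x) {
    cpu c = { x, 0 };
    tst(&c);
    return (int)c.t;
}

/* Compare both sequences on a few sample register values. */
static void check_fold(void) {
    const uint32_t samples[] = { 0u, 1u, 2u, 0x80000000u, 0xffffffffu };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(long_sequence_branches(samples[i])
               == folded_sequence_branches(samples[i]));
}
```

Calling check_fold () passes for the sampled values.  The model says nothing
about the case where a second block jumps in between the negc and the second
tst.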


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (61 preceding siblings ...)
  2013-07-20 14:38 ` olegendo at gcc dot gnu.org
@ 2013-07-23  8:21 ` laurent.alfonsi at st dot com
  2013-07-27 19:28 ` olegendo at gcc dot gnu.org
                   ` (24 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: laurent.alfonsi at st dot com @ 2013-07-23  8:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #61 from Laurent Aflonsi <laurent.alfonsi at st dot com> ---
Yes that's the point. L3 can be reached by another block (L2):

    tst    r2,r2
    mov    #-1,r2
    negc    r2,r2
.L3:
    tst    r2,r2
    bt/s    .L11
        [...]
.L2:
    mov.l    @r4,r2
    tst    r2,r2
    bra    .L3
    movt    r2

The movt (L2) and the tst (L3) are both removed. That is consistent for that
path, because it is preceded by the tst r2,r2.
But it makes the first path inconsistent, because L3 can also be reached from
the very first block. I have written a first fix, which is too restrictive
("pr25869-19.c scan-assembler-not movt" is failing):

--- ./gcc/gcc/config/sh/sh.md.orig
+++ ./gcc/gcc/config/sh/sh.md
@@ -8523,7 +8523,8 @@
           T bit.  Notice that some T bit stores such as negc also modify
           the T bit.  */
        if (modified_between_p (get_t_reg_rtx (), s1.insn, testing_insn)
-           || modified_in_p (get_t_reg_rtx (), s1.insn))
+           || modified_in_p (get_t_reg_rtx (), s1.insn)
+           || !no_labels_between_p(s1.insn, testing_insn))
          operands[2] = NULL_RTX;

        break;

The idea would be to check if "s1.insn block dominates testing_insn block",
but I don't know how to write it at this stage.
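
One way to picture the dominance test is on a toy CFG.  The sketch below is not
GCC code: the block names, the 'preds' table and 'compute_dominators' are
hypothetical, and a real implementation would presumably rely on the
compiler's own dominance information instead.  It runs the textbook iterative
bit-set algorithm on the blocks involved here, where both the negc block and
the movt block lead to the block holding the tst.

```c
#include <assert.h>
#include <stdint.h>

enum { ENTRY = 0, NEGC_BB = 1, MOVT_BB = 2, TEST_BB = 3, NUM_BB = 4 };

/* preds[b] is a bit mask of the predecessors of block b. */
static const uint32_t preds[NUM_BB] = {
    [ENTRY]   = 0,
    [NEGC_BB] = 1u << ENTRY,
    [MOVT_BB] = 1u << ENTRY,
    [TEST_BB] = (1u << NEGC_BB) | (1u << MOVT_BB),
};

/* dom[b] becomes the bit mask of blocks that dominate b. */
static void compute_dominators(uint32_t dom[NUM_BB]) {
    const uint32_t all = (1u << NUM_BB) - 1;
    dom[ENTRY] = 1u << ENTRY;
    for (int b = 1; b < NUM_BB; b++)
        dom[b] = all;

    /* Iterate until no dominator set changes any more. */
    for (int changed = 1; changed; ) {
        changed = 0;
        for (int b = 1; b < NUM_BB; b++) {
            uint32_t meet = all;
            for (int p = 0; p < NUM_BB; p++)
                if (preds[b] & (1u << p))
                    meet &= dom[p];
            uint32_t next = meet | (1u << b);
            if (next != dom[b]) {
                dom[b] = next;
                changed = 1;
            }
        }
    }
}

/* 1 if block a dominates block b, 0 otherwise. */
static int dominates(int a, int b) {
    uint32_t dom[NUM_BB];
    compute_dominators(dom);
    return (int)((dom[b] >> a) & 1u);
}
```

On this CFG dominates (NEGC_BB, TEST_BB) is 0, so the negc block alone cannot
be taken as the source of the tested register.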

More generally, I'm surprised to see this optimization at the mapping level.
Isn't this a generic problem that should be handled at the RTL dead code
elimination stage on the T bit register?

Thanks,
Laurent


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (62 preceding siblings ...)
  2013-07-23  8:21 ` laurent.alfonsi at st dot com
@ 2013-07-27 19:28 ` olegendo at gcc dot gnu.org
  2013-07-28  8:51 ` olegendo at gcc dot gnu.org
                   ` (23 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-07-27 19:28 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #62 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Laurent Aflonsi from comment #61)
> 
> More generally, I'm surprised to see that optimization at mapping level,
> isn't this a generic problem that should be handled at rtl dead code
> elimination stage on the T bit register ?

Actually, it is a kind of generic case.  Dead code elimination would not do
this kind of logic folding.  Usually this kind of stuff is handled by the
combine pass, which can figure out some redundant operations or operations that
cancel each other out.  However, combine's logic is also limited and the
overall T bit handling is a bit shaky.  That's why I introduced the additional
elimination handling that is done in the split pass after the combine pass on
insns that combine didn't catch.  I didn't want to introduce another RTL pass
just for this, and touching the combine pass also didn't seem attractive since
all the other backends depend on its behavior.

Maybe it would be better to switch T_REG from SImode to BImode, which reflects
reality.  This should be relatively straightforward to do.

Another idea would be to try out using CCmode.  There are some additional
optimizations done on CCmode.  However, this is a bigger change.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (63 preceding siblings ...)
  2013-07-27 19:28 ` olegendo at gcc dot gnu.org
@ 2013-07-28  8:51 ` olegendo at gcc dot gnu.org
  2013-07-28 12:26 ` olegendo at gcc dot gnu.org
                   ` (22 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-07-28  8:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #63 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Created attachment 30566
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30566&action=edit
Reduced test

(In reply to Laurent Aflonsi from comment #58)
> Created attachment 30524 [details]
> functional regression

This is a stripped down test case.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (64 preceding siblings ...)
  2013-07-28  8:51 ` olegendo at gcc dot gnu.org
@ 2013-07-28 12:26 ` olegendo at gcc dot gnu.org
  2013-07-31 21:46 ` olegendo at gcc dot gnu.org
                   ` (21 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-07-28 12:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #64 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Laurent Aflonsi from comment #61)
> 
> The movt(L2) and the tst(L3) are both removed, and that's coherent for that
> run path, because it is preceded by the tst r2,r2.
> But that makes the first path incoherent because L3 can be reached by the
> very first block. I have written a first fix, too restrictive ("pr25869-19.c
> scan-assembler-not movt" is failing) :
> 
> --- ./gcc/gcc/config/sh/sh.md.orig
> +++ ./gcc/gcc/config/sh/sh.md
> @@ -8523,7 +8523,8 @@
>            T bit.  Notice that some T bit stores such as negc also modify
>            the T bit.  */
>         if (modified_between_p (get_t_reg_rtx (), s1.insn, testing_insn)
> -           || modified_in_p (get_t_reg_rtx (), s1.insn))
> +           || modified_in_p (get_t_reg_rtx (), s1.insn)
> +           || !no_labels_between_p(s1.insn, testing_insn))
>           operands[2] = NULL_RTX;
>  
>         break;
> 
> The idea would be to check if "s1.insn block dominates testing_insn block",
> but I don't know how to write it at this stage.

The proper way would be to find all basic blocks that set the tested reg.  With
the reduced test case, just right before the split1 pass there are two basic
blocks that set reg 167 which is then tested for '== 0' before the conditional
branch:

(note 13 12 14 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
<...>
(insn 15 14 16 3 (set (reg:SI 147 t)
        (eq:SI (reg:SI 173 [ MEM[(int *)q_3(D) + 4B] ])
            (const_int 0 [0]))) sh_tmp.cpp:84 17 {cmpeqsi_t}
     (expr_list:REG_DEAD (reg:SI 173 [ MEM[(int *)q_3(D) + 4B] ])
        (nil)))

(insn 16 15 17 3 (set (reg:SI 175)
        (const_int -1 [0xffffffffffffffff])) sh_tmp.cpp:84 250 {movsi_ie}
     (nil))
(note 17 16 18 3 NOTE_INSN_DELETED)
(insn 18 17 71 3 (parallel [
            (set (reg:SI 167 [ D.1424 ])
                (xor:SI (reg:SI 147 t)
                    (const_int 1 [0x1])))
            (set (reg:SI 147 t)
                (const_int 1 [0x1]))
            (use (reg:SI 175))
        ]) sh_tmp.cpp:84 394 {movrt_negc}
     (expr_list:REG_DEAD (reg:SI 175)
        (expr_list:REG_UNUSED (reg:SI 147 t)
            (nil))))
(jump_insn 71 18 72 3 (set (pc)
        (label_ref 27)) -1
     (nil)
 -> 27)
(barrier 72 71 21)


(code_label 21 72 22 4 2 "" [1 uses])
(note 22 21 23 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
<...>
(insn 24 23 26 4 (set (reg:SI 147 t)
        (eq:SI (reg:SI 177 [ *q_3(D) ])
            (const_int 0 [0]))) sh_tmp.cpp:85 17 {cmpeqsi_t}
     (expr_list:REG_DEAD (reg:SI 177 [ *q_3(D) ])
        (nil)))
(insn 26 24 27 4 (set (reg:SI 167 [ D.1424 ])
        (reg:SI 147 t)) sh_tmp.cpp:85 392 {movt}
     (expr_list:REG_DEAD (reg:SI 147 t)
        (nil)))



(code_label 27 26 28 5 3 "" [1 uses])
(note 28 27 29 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 29 28 30 5 (set (reg:SI 147 t)
        (eq:SI (reg:SI 167 [ D.1424 ])
            (const_int 0 [0]))) sh_tmp.cpp:91 17 {cmpeqsi_t}
     (expr_list:REG_DEAD (reg:SI 167 [ D.1424 ])
        (nil)))
(jump_insn 30 29 31 5 (set (pc)
        (if_then_else (ne (reg:SI 147 t)
                (const_int 0 [0]))
            (label_ref:SI 50)
            (pc))) sh_tmp.cpp:91 295 {*cbranch_t}
     (expr_list:REG_DEAD (reg:SI 147 t)
        (expr_list:REG_BR_PROB (const_int 400 [0x190])
            (nil)))
 -> 50)


Here it starts walking up the insns from insn 29 [bb 5] and finds insn 26 [bb
4], but it should also check [bb 3].
The question then is, what to do with the collected basic blocks.  Ideally it
should look at all the T bit paths in every basic block and try to eliminate
redundant T bit flipping in each basic block so that in this case [bb 5] can
start with the conditional branch.
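
The backward search described above could look roughly like the following.  It
is a sketch on a made-up CFG representation, not the GCC data structures:
'preds', 'sets_reg' and 'collect_setters' are hypothetical names, and the block
numbers merely mirror the dump.  Starting from the testing block, it walks
predecessor edges and stops on each path at the first block that sets the
tested register.  It assumes the testing block does not set the register
itself before the test.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_BB = 6 };

/* Hypothetical CFG mirroring the dump: bb 2 branches to bb 3 or bb 4,
   and both of those reach bb 5, which holds the test and the branch.  */
static const uint32_t preds[NUM_BB] = {
    [3] = 1u << 2,
    [4] = 1u << 2,
    [5] = (1u << 3) | (1u << 4),
};

/* Blocks that contain a set of the tested register (reg 167 in the dump). */
static const uint32_t sets_reg = (1u << 3) | (1u << 4);

/* Return a bit mask of every block whose set of the tested register can
   reach the start of block 'bb'.  */
static uint32_t collect_setters(int bb) {
    uint32_t found = 0, visited = 0;
    uint32_t worklist = preds[bb];

    while (worklist != 0) {
        /* Take the lowest numbered block off the worklist. */
        int b = 0;
        while (!(worklist & (1u << b)))
            b++;
        worklist &= ~(1u << b);

        if (visited & (1u << b))
            continue;
        visited |= 1u << b;

        if (sets_reg & (1u << b))
            found |= 1u << b;       /* stop here: this set reaches bb */
        else
            worklist |= preds[b];   /* keep walking upwards */
    }
    return found;
}
```

For the testing block, collect_setters (5) reports both bb 3 and bb 4, which is
the information the single upward walk from insn 29 misses.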

Then this ...
        mov.l   @(4,r4),r1
        tst     r1,r1   // T = @(4,r4) == 0
        mov     #-1,r1
        negc    r1,r1   // r1 = @(4,r4) != 0
.L3:
        tst     r1,r1   // T = @(4,r4) == 0
        bt/s    .L5
        mov     #1,r1
        cmp/hi  r1,r5
        bf/s    .L9
        mov     #0,r0
        rts
        nop
.L2:
        mov.l   @r4,r1
        tst     r1,r1   // T = @(r4) == 0
        bra     .L3
        movt    r1      // r1 = @(r4) == 0


would be simplified to this:

        mov.l   @(4,r4),r1
        tst     r1,r1   // T = @(4,r4) == 0
.L3:
        bt/s    .L5
        mov     #1,r1
        cmp/hi  r1,r5
        bf/s    .L9
        mov     #0,r0
        rts
        nop
.L2:
        mov.l   @r4,r1
        bra     .L3
        tst     r1,r1   // T = @(r4) == 0


Maybe if BImode was used for the T bit, combine could do better at folding T
bit flipping.  However, it would not do cross BB analysis, so I think it's
pointless to try out BImode.
I'm not sure whether there is already something in the compiler that could do
this kind of optimization.  According to my observations it should happen after
the combine pass and before register allocation to get useful results.

Until then I think the following should be applied to 4.9 and 4.8, even if it
causes some of the T bit test cases to fail.

Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md    (revision 201282)
+++ gcc/config/sh/sh.md    (working copy)
@@ -8489,15 +8489,30 @@
       continue;
     }

-    /* It's only safe to remove the testing insn if the T bit is not
-       modified between the testing insn and the insn that stores the
-       T bit.  Notice that some T bit stores such as negc also modify
-       the T bit.  */
-    if (modified_between_p (get_t_reg_rtx (), s1.insn, testing_insn)
-        || modified_in_p (get_t_reg_rtx (), s1.insn))
-      operands[2] = NULL_RTX;
+      /* It's only safe to remove the testing insn if the T bit is not
+     modified between the testing insn and the insn that stores the
+     T bit.  Notice that some T bit stores such as negc also modify
+     the T bit.  */
+      if (modified_between_p (get_t_reg_rtx (), s1.insn, testing_insn)
+      || modified_in_p (get_t_reg_rtx (), s1.insn)
+      || !no_labels_between_p (s1.insn, testing_insn))
+    operands[2] = NULL_RTX;
+      else
+    {
+      /* If the insn that sets the tested reg has a REG_DEAD note on
+         the T bit remove that note since we're extending the usage
+         of the T bit.  */
+      for (rtx n = REG_NOTES (s1.insn); n != NULL_RTX; )
+        {
+          rtx nn = XEXP (n, 1);
+          if (REG_NOTE_KIND (n) == REG_DEAD
+          && t_reg_operand (XEXP (n, 0), VOIDmode))
+          remove_note (s1.insn, n);
+          n = nn;
+        }
+    }

-    break;
+      break;
     }

   if (operands[2] == NULL_RTX)


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (65 preceding siblings ...)
  2013-07-28 12:26 ` olegendo at gcc dot gnu.org
@ 2013-07-31 21:46 ` olegendo at gcc dot gnu.org
  2013-08-23  0:13 ` olegendo at gcc dot gnu.org
                   ` (20 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-07-31 21:46 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #65 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #64)
> 
> would be simplified to this:
> 
>         mov.l   @(4,r4),r1
>         tst     r1,r1   // T = @(4,r4) == 0
> .L3:
>         bt/s    .L5
>         mov     #1,r1
>         cmp/hi  r1,r5
>         bf/s    .L9
>         mov     #0,r0
>         rts
>         nop
> .L2:
>         mov.l   @r4,r1
>         bra     .L3
>         tst     r1,r1   // T = @(r4) == 0

Sorry, I got confused.  The above is wrong.  One of the T bit inversions can't
be eliminated in this case.
It should be:

        mov.l   @(4,r4),r1
.L3:
        tst     r1,r1
        bt/s    .L5
        mov     #1,r1
        cmp/hi  r1,r5
        bf/s    .L9
        mov     #0,r0
        rts
        nop
.L2:
        mov.l   @r4,r1
        tst     r1,r1
        bra     .L3
        movt    r1


Or SH2A:
        mov.l   @(4,r4),r1
        tst     r1,r1
.L3:
        bt/s    .L5
        mov     #1,r1
        cmp/hi  r1,r5
        bf/s    .L9
        mov     #0,r0
        rts
        nop
.L2:
        mov.l   @r4,r1
        tst     r1,r1
        bra     .L3
        nott

However, my original 'optimized' asm snippet is valid if the reduced test case
is changed to:

static inline int
blk_oversized_queue (int* q)
{
  if (q[2])
    return q[1] == 0;   // instead of != 0
  return q[0] == 0;
}
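
Why the changed source makes a difference can be seen by writing down, per
path, what the T bit holds after the comparison and what value the function
returns.  The following is only an illustrative sketch: 'original_variant'
assumes that the reduced test returns q[1] != 0 on the first path (as the
comment in the snippet above suggests), and 't_after_compare' is a
hypothetical helper that models 'tst r1,r1'.

```c
#include <assert.h>

/* Models 'tst r1,r1' on the loaded value: T = (value == 0). */
static int t_after_compare(int value) { return value == 0; }

/* Assumed original form of the reduced test: the two paths return
   results of opposite polarity with respect to the T bit.  */
static int original_variant(const int *q) {
    if (q[2])
        return q[1] != 0;   /* result is !T, needs an inversion */
    return q[0] == 0;       /* result is T */
}

/* Modified form from the text: both paths return T directly.  */
static int modified_variant(const int *q) {
    if (q[2])
        return q[1] == 0;
    return q[0] == 0;
}

/* 1 if the returned value equals the T bit left by the compare on the
   path that 'q' selects, 0 if it is the inverse.  */
static int result_matches_t(int (*fn)(const int *), const int *q) {
    int t = q[2] ? t_after_compare(q[1]) : t_after_compare(q[0]);
    return fn(q) == t;
}
```

With the modified variant the result matches the T bit on both paths, so the
join block can branch on T as it is.  With the assumed original variant the
first path delivers the inverse, so one inversion has to stay.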

The current trunk version eliminates the movt/tst insns and produces correct
code by accident.  It can be simplified even more:

        mov.l   @(4,r4),r1
.L3:
        tst     r1,r1
        bt/s    .L5
        mov     #1,r1
        cmp/hi  r1,r5
        bf/s    .L9
        mov     #0,r0
        rts
        nop
.L2:
        bra     .L3
        mov.l   @r4,r1

I'm trying to come up with a patch that implements T bit tracing in order to
handle those scenarios.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (66 preceding siblings ...)
  2013-07-31 21:46 ` olegendo at gcc dot gnu.org
@ 2013-08-23  0:13 ` olegendo at gcc dot gnu.org
  2013-08-23  0:25 ` kkojima at gcc dot gnu.org
                   ` (19 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-08-23  0:13 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #66 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Created attachment 30689
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30689&action=edit
WIP RTL pass

Just wanted to give an update on the issue.

I've been writing an SH specific RTL pass that handles those multiple BB cases
as a replacement for the splitter in the *cbranch_t pattern.
Basically it tries to combine comparisons and T bit cstores before cbranches
across multiple blocks.

There are still quite a few open issues and some copy-pasted code to be folded,
but the pass can already eliminate the redundant insns in the test cases
mentioned before.  Moreover, it also optimizes DImode comparisons and can make
better use of SH2A's nott instruction.  In order to get good results, the pass
has to be run twice.

I've developed this against rev. 201282, so it also needs some adaptation for
the recent changes to the passes code on trunk.

Kaz, the "WIP status" aside, would you be OK with something like that?


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (67 preceding siblings ...)
  2013-08-23  0:13 ` olegendo at gcc dot gnu.org
@ 2013-08-23  0:25 ` kkojima at gcc dot gnu.org
  2013-09-24 22:43 ` olegendo at gcc dot gnu.org
                   ` (18 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: kkojima at gcc dot gnu.org @ 2013-08-23  0:25 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #67 from Kazumoto Kojima <kkojima at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #66)
> Kaz, the "WIP status" aside, would you be OK with something like that?

Yep.  Sounds good to me.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (68 preceding siblings ...)
  2013-08-23  0:25 ` kkojima at gcc dot gnu.org
@ 2013-09-24 22:43 ` olegendo at gcc dot gnu.org
  2013-10-03 22:50 ` olegendo at gcc dot gnu.org
                   ` (17 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-09-24 22:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #30689|0                           |1
        is obsolete|                            |

--- Comment #68 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Created attachment 30889
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30889&action=edit
RTL pass

An updated patch that adds an SH specific RTL pass against current trunk (rev
202873), not fully tested.

CSiBE results for '-m2a-single -O2' and '-m4-single -mpretend-cmove -O2' look
OK.  There are only 2 cases in the set that actually got worse:

linux-2.4.23-pre3-testplatform/fs/lockd/host.s (nlm_lookup_host):

before:
.L142:
    bt    .L60
    mov.l    @(20,r11),r6
    cmp/eq    r6,r10
    bf    .L58
    add    r1,r13

after:
.L142:
    bt    .L60
    mov.l    @(20,r11),r6
    mov    r10,r5
    cmp/eq    r6,r5
    bf    .L58
    add    r1,r13


linux-2.4.23-pre3-testplatform/net/ipv4/igmp.s (add_grec):

before:
.L459:
    bt    .L294
    mov.l    @(24,r13),r1
    tst    r1,r1
    bt/s    .L295
    add    #64,r1
    mov    r13,r2
    add    #64,r2
    mov.l    @(36,r1),r1
    mov.l    @(32,r2),r2
    sub    r2,r1
    mov    #11,r2
    cmp/hs    r1,r2
.L296:
    bf/s    .L294
    mov    r13,r4
    mov.l    .L408,r0
    jsr    @r0
    mov    #0,r13

after:
.L459:
    bt    .L294
    mov.l    @(24,r13),r1
    tst    r1,r1
    bt    .L295
    add    #64,r1
    mov    r13,r2
    add    #64,r2
    mov.l    @(36,r1),r1
    mov.l    @(32,r2),r2
    sub    r2,r1
    mov    #11,r2
    cmp/hs    r1,r2
    movt    r1
.L296:
    tst    r1,r1
    bt/s    .L294
    mov    r13,r4
    mov.l    .L408,r0
    jsr    @r0
    mov    #0,r13


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (69 preceding siblings ...)
  2013-09-24 22:43 ` olegendo at gcc dot gnu.org
@ 2013-10-03 22:50 ` olegendo at gcc dot gnu.org
  2013-10-12 20:47 ` olegendo at gcc dot gnu.org
                   ` (16 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-10-03 22:50 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #30889|0                           |1
        is obsolete|                            |

--- Comment #69 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Created attachment 30953
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30953&action=edit
RTL pass

(In reply to Oleg Endo from comment #68)
> Created attachment 30889 [details]
> RTL pass
> 
> An updated patch that adds an SH specific RTL pass against current trunk
> (rev 202873), not fully tested.
> 
> CSiBE for '-m2a-single -O2' and '-m4-single -mpretend-cmove -O2' look OK. 
> There are only 2 cases that got actually worse in the set:
> 
> 
> linux-2.4.23-pre3-testplatform/net/ipv4/igmp.s (add_grec):
> 
> before:
> .L459:
> 	bt	.L294
> 	mov.l	@(24,r13),r1
> 	tst	r1,r1
> 	bt/s	.L295
> 	add	#64,r1
> 	mov	r13,r2
> 	add	#64,r2
> 	mov.l	@(36,r1),r1
> 	mov.l	@(32,r2),r2
> 	sub	r2,r1
> 	mov	#11,r2
> 	cmp/hs	r1,r2
> .L296:
> 	bf/s	.L294
> 	mov	r13,r4
> 	mov.l	.L408,r0
> 	jsr	@r0
> 	mov	#0,r13
> 
> after:
> .L459:
> 	bt	.L294
> 	mov.l	@(24,r13),r1
> 	tst	r1,r1
> 	bt	.L295
> 	add	#64,r1
> 	mov	r13,r2
> 	add	#64,r2
> 	mov.l	@(36,r1),r1
> 	mov.l	@(32,r2),r2
> 	sub	r2,r1
> 	mov	#11,r2
> 	cmp/hs	r1,r2
> 	movt	r1
> .L296:
> 	tst	r1,r1
> 	bt/s	.L294
> 	mov	r13,r4
> 	mov.l	.L408,r0
> 	jsr	@r0
> 	mov	#0,r13


That case didn't get worse, it actually improved.  The 'before' code is wrong
code, due to a missed BB that sets the tested 'r1' reg to '1'.

Testing the previous version of the RTL pass (attachment 30889) against trunk
rev 202876 revealed a defect in the function 'trace_reg_uses'.  The attached
updated version fixes this.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (70 preceding siblings ...)
  2013-10-03 22:50 ` olegendo at gcc dot gnu.org
@ 2013-10-12 20:47 ` olegendo at gcc dot gnu.org
  2013-10-12 21:26 ` olegendo at gcc dot gnu.org
                   ` (15 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-10-12 20:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #70 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Sat Oct 12 20:47:22 2013
New Revision: 203492

URL: http://gcc.gnu.org/viewcvs?rev=203492&root=gcc&view=rev
Log:
    PR target/51244
    * config/sh/sh_treg_combine.cc: New SH specific RTL pass.
    * config.gcc (SH extra_objs): Add sh_ifcvt.o.
    * config/sh/t-sh (sh_treg_combine.o): New entry.
    * config/sh/sh.c (sh_fixed_condition_code_regs): New function that
    implements the target hook TARGET_FIXED_CONDITION_CODE_REGS.
    (register_sh_passes): New function.  Register sh_treg_combine pass.
    (sh_option_override): Invoke it.
    (sh_canonicalize_comparison): Handle op0_preserve_value.
    * sh.md (*cbranch_t): Do not try to optimize missed test and branch
    opportunities.  Canonicalize branch condition.
    (nott): Allow only if pseudos can be created for non-SH2A.

    PR target/51244
    * gcc.dg/torture/pr51244-21.c: New.
    * gcc.target/sh/pr51244-20.c: New.
    * gcc.target/sh/pr51244-20-sh2a.c: New.


Added:
    trunk/gcc/config/sh/sh_treg_combine.cc
    trunk/gcc/testsuite/gcc.dg/torture/pr51244-21.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-20-sh2a.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-20.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config.gcc
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/config/sh/t-sh
    trunk/gcc/testsuite/ChangeLog



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (71 preceding siblings ...)
  2013-10-12 20:47 ` olegendo at gcc dot gnu.org
@ 2013-10-12 21:26 ` olegendo at gcc dot gnu.org
  2013-12-05 17:54 ` olegendo at gcc dot gnu.org
                   ` (14 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-10-12 21:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #71 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #70)
> Author: olegendo
> Date: Sat Oct 12 20:47:22 2013
> New Revision: 203492
> 

The issue raised in comment #59 has been fixed on 4.9.
There are some open issues though, which I will try to address in follow-up
patches:

* The helper functions in sh_treg_combine.cc should go into a separate .h + .cc
file.  This would allow re-using them in other places and eliminate the similar
function 'sh_find_set_of_reg' in sh.c

* The RTL pass does the treg combine only when there is a conditional branch. 
It should also handle conditional move insns (-mpretend-cmove).

* The function 'try_combine_comparisons' in sh_treg_combine.cc always introduces
reg-reg copies.  In some cases (DImode comparisons in particular), these
reg-reg moves don't get eliminated afterwards before register allocation.  The
function should check whether creating new pseudos can be avoided by re-using
existing regs.


The sh_treg_combine RTL pass could probably be backported to 4.8, but that
seems too intrusive.  Something like the patch in comment #64 should do
instead, although rather than checking 'no_labels_between_p' it would probably
be better to check whether the basic block with the conditional branch has
only one predecessor.



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (72 preceding siblings ...)
  2013-10-12 21:26 ` olegendo at gcc dot gnu.org
@ 2013-12-05 17:54 ` olegendo at gcc dot gnu.org
  2013-12-06 10:47 ` olegendo at gcc dot gnu.org
                   ` (13 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-12-05 17:54 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #72 from Oleg Endo <olegendo at gcc dot gnu.org> ---
The original test case in PR 59343 is an interesting one with regard to T bit
optimizations (or the lack thereof):

void validate_number (char **numbertext)
{
  char *ptr = *numbertext;
  int valid = (ptr != 0) && (*ptr);

  for ( ; valid && *ptr; ++ptr)
    valid = (*ptr >= '0');

  if (!valid)
    *numbertext = 0;
}

With -Os -m4 -mb it is compiled to:

_validate_number:
        mov.l   @r4,r2    // [bb 2]
        tst     r2,r2
        bt/s    .L2
        mov     #0,r1


        mov.b   @r2,r1    // [bb 3]
        tst     r1,r1
        mov     #-1,r1
        negc    r1,r1

.L2:                      // [bb 4]
        mov     #47,r3

.L3:                      // [bb 5]
        tst     r1,r1
        bt      .L4

        mov.b   @r2+,r1   // [bb 6]
        tst     r1,r1
        bt/s    .L8

        cmp/gt  r3,r1     // [bb 7]

        bra     .L3
        movt    r1

.L4:
        mov.l   r1,@r4    // [bb 8]
.L8:
        rts
        nop


The basic block starting at .L3 ([bb 5]) has three different r1 inputs, from
[bb 2], [bb 3] and [bb 7].  When sh_treg_combine tries to trace r1 starting in
[bb 5], it gives up early:

tracing (reg/v:SI 1 r1 [orig:185 valid ] [185])

[bb 5]
set of reg not found.  empty BB?

[bb 4]
set of reg not found (cstore)
set not found - aborting trace

Instead it should skip [bb 4], as that block modifies neither r1 nor the T bit,
and check [bb 3] and [bb 2].  Because the setcc insns are not the same in
[bb 2], [bb 3] and [bb 7], it would try to eliminate the cstores.  However, in
[bb 2] there is no real cstore but a constant load, which can be replaced with
a clrt or sett insn, depending on the constant.  The resulting code could be
something like:

        mov.l   @r4,r2
        mov     #0,r1
        tst     r2,r2
        bt/s    .L2     // (*)
        clrt

        mov.b   @r2,r1
        tst     r1,r1
        movt    r1
        tst     r1,r1    // T = !T
.L2:
        mov     #47,r3
.L3:
        bf      .L4

        mov.b   @r2+,r1
        tst     r1,r1
        bt/s    .L8
        bra     .L3
        cmp/gt  r3,r1
.L4:
        mov.l   r1,@r4
.L8:
        rts
        nop

(*) The clrt insn actually has to be inserted before the conditional branch,
which is impossible as it modifies the branch condition.  Putting it into the
delay slot however is OK, which is usually done by the DBR pass.  A special
"branch and set/clear T" pseudo insn would be required (requires SH2+) which
produces the sequence above.  A more complicated way would be to create new
basic blocks.

The basic block reordering or similar RTL pass and the clrt/sett optimization
pass should then be able to simplify the code further to:

        mov.l   @r4,r2
        tst     r2,r2
        bf/s    .L4
        mov     #0,r1

        mov.b   @r2,r1
        tst     r1,r1
        bt/s    .L4
        mov     #47,r3
.L3:
        mov.b   @r2+,r1
        tst     r1,r1
        bt/s    .L8
        cmp/gt  r3,r1
        bt      .L3
.L4:
        mov.l   r1,@r4
.L8:
        rts
        nop



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (73 preceding siblings ...)
  2013-12-05 17:54 ` olegendo at gcc dot gnu.org
@ 2013-12-06 10:47 ` olegendo at gcc dot gnu.org
  2014-05-10 20:19 ` olegendo at gcc dot gnu.org
                   ` (12 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-12-06 10:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #73 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Fri Dec  6 10:46:53 2013
New Revision: 205734

URL: http://gcc.gnu.org/viewcvs?rev=205734&root=gcc&view=rev
Log:
    PR target/51244
    PR target/59343
    * config/sh/sh.md (*cbranch_t): Check that there are no labels between
    the s1 insn and the testing insn.  Remove REG_DEAD note from s1 insn.

    PR target/51244
    PR target/59343
    * gcc.target/sh/pr51244-19.c: Adjust test case.


Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/config/sh/sh.md
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_8-branch/gcc/testsuite/gcc.target/sh/pr51244-19.c



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (74 preceding siblings ...)
  2013-12-06 10:47 ` olegendo at gcc dot gnu.org
@ 2014-05-10 20:19 ` olegendo at gcc dot gnu.org
  2014-05-16 22:55 ` olegendo at gcc dot gnu.org
                   ` (11 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-05-10 20:19 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #74 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Sat May 10 20:19:06 2014
New Revision: 210301

URL: http://gcc.gnu.org/viewcvs?rev=210301&root=gcc&view=rev
Log:
gcc/testsuite/
    PR target/51244
    * gcc.target/sh/pr51244-14.c: Relax criteria for passing the test.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/sh/pr51244-14.c



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (75 preceding siblings ...)
  2014-05-10 20:19 ` olegendo at gcc dot gnu.org
@ 2014-05-16 22:55 ` olegendo at gcc dot gnu.org
  2014-09-13 18:48 ` olegendo at gcc dot gnu.org
                   ` (10 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-05-16 22:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #75 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Fri May 16 22:54:32 2014
New Revision: 210535

URL: http://gcc.gnu.org/viewcvs?rev=210535&root=gcc&view=rev
Log:
gcc/
    PR target/51244
    * config/sh/sh.c (sh_eval_treg_value): Handle t_reg_operand and
    negt_reg_operand cases.
    * config/sh/sh.md (*cset_zero): Likewise by using cbranch_treg_value
    predicate.
    * config/sh/predicates.md (cbranch_treg_value): Simplify.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (76 preceding siblings ...)
  2014-05-16 22:55 ` olegendo at gcc dot gnu.org
@ 2014-09-13 18:48 ` olegendo at gcc dot gnu.org
  2014-11-22 15:07 ` olegendo at gcc dot gnu.org
                   ` (9 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-09-13 18:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #76 from Oleg Endo <olegendo at gcc dot gnu.org> ---
When compiling the libgcc divsc3 from PR 55212 with "-O2 -m2 -ml" (on the
sh-lra branch), sequences like the following are generated:

        tst     r0,r0
        subc    r0,r0     ! r0: T == 0 -> 0x00000000, T == 1 -> 0xFFFFFFFF
        not     r0,r0     ! r0: T == 0 -> 0xFFFFFFFF, T == 1 -> 0x00000000
        and     #1,r0     ! r0: T == 0 -> 1, T == 1 -> 0

which can be done better as:

        tst     r0,r0
        mov     #-1,r0
        negc    r0,r0

or:

        tst     r0,r0
        movt    r0
        xor     #1,r0

and on SH2A:

        tst     r0,r0
        movrt   r0


combine is looking for the following patterns:

Failed to match this instruction:
(set (reg:SI 296 [ D.1371 ])
    (and:SI (not:SI (reg:SI 147 t))
        (const_int 1 [0x1])))

Failed to match this instruction:
(set (reg:SI 147 t)
    (and:SI (reg:SI 147 t)
        (const_int 1 [0x1])))

(and:SI (reg:SI T_REG) (const_int 1)) is effectively a T -> T nop move which is
supposed to be handled by the "*movtt" insn.  Maybe the case above and the
original eq:SI case in "*movtt" should be added to the t_reg_operand predicate.
Then the "*movtt" pattern could be simplified to:

(define_insn_and_split "*movtt"
  [(set (reg:SI T_REG) (match_operand 0 "t_reg_operand"))] ...



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (77 preceding siblings ...)
  2014-09-13 18:48 ` olegendo at gcc dot gnu.org
@ 2014-11-22 15:07 ` olegendo at gcc dot gnu.org
  2014-11-22 15:50 ` olegendo at gcc dot gnu.org
                   ` (8 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-11-22 15:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #77 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Sat Nov 22 15:06:34 2014
New Revision: 217968

URL: https://gcc.gnu.org/viewcvs?rev=217968&root=gcc&view=rev
Log:
gcc/
    PR target/63986
    PR target/51244
    * config/sh/sh.c (sh_is_logical_t_store_expr,
    sh_try_omit_signzero_extend): Use rtx_insn* for insn argument.
    (sh_split_movrt_negc_to_movt_xor): New function.
    (sh_find_set_of_reg): Move to ...
    * config/sh/sh-protos.h (sh_find_set_of_reg): ... here and convert
    to template function.
    (set_of_reg): Use rtx_insn* for insn member.
    (sh_is_logical_t_store_expr, sh_try_omit_signzero_extend): Use
    rtx_insn* for insn argument.
    * config/sh/sh.md (movrt_negc, *movrt_negc): Split into movt-xor
    sequence using new sh_split_movrt_negc_to_movt_xor function.
    (movrt_xor): Allow also for SH2A.
    (*movt_movrt): Delete insns and splits.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh-protos.h
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (78 preceding siblings ...)
  2014-11-22 15:07 ` olegendo at gcc dot gnu.org
@ 2014-11-22 15:50 ` olegendo at gcc dot gnu.org
  2014-11-22 16:08 ` olegendo at gcc dot gnu.org
                   ` (7 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-11-22 15:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #78 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Sat Nov 22 15:50:10 2014
New Revision: 217969

URL: https://gcc.gnu.org/viewcvs?rev=217969&root=gcc&view=rev
Log:
gcc/
    PR target/63783
    PR target/51244
    * config/sh/sh_treg_combine.cc (sh_treg_combine::make_not_reg_insn):
    Do not emit bitwise not insn.  Emit logical not insn sequence instead.
    Adjust related comments throughout the file.

gcc/testsuite/
    PR target/63783
    PR target/51244
    * gcc.target/sh/torture/pr63783-1.c: New.
    * gcc.target/sh/torture/pr63783-2.c: New.
    * gcc.target/sh/pr51244-20.c: Adjust.
    * gcc.target/sh/pr51244-20-sh2a.c: Adjust.

Added:
    trunk/gcc/testsuite/gcc.target/sh/torture/pr63783-1.c
    trunk/gcc/testsuite/gcc.target/sh/torture/pr63783-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh_treg_combine.cc
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/sh/pr51244-20-sh2a.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-20.c



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (79 preceding siblings ...)
  2014-11-22 15:50 ` olegendo at gcc dot gnu.org
@ 2014-11-22 16:08 ` olegendo at gcc dot gnu.org
  2014-12-01  6:50 ` olegendo at gcc dot gnu.org
                   ` (6 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-11-22 16:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #79 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Sat Nov 22 16:07:25 2014
New Revision: 217970

URL: https://gcc.gnu.org/viewcvs?rev=217970&root=gcc&view=rev
Log:
gcc/
    Backport from mainline
    2014-11-22  Oleg Endo  <olegendo@gcc.gnu.org>

    PR target/63783
    PR target/51244
    * config/sh/sh_treg_combine.cc (sh_treg_combine::make_not_reg_insn):
    Do not emit bitwise not insn.  Emit logical not insn sequence instead.
    Adjust related comments throughout the file.

gcc/testsuite/
    Backport from mainline
    2014-11-22  Oleg Endo  <olegendo@gcc.gnu.org>

    PR target/63783
    PR target/51244
    * gcc.target/sh/torture/pr63783-1.c: New.
    * gcc.target/sh/torture/pr63783-2.c: New.
    * gcc.target/sh/pr51244-20.c: Adjust.
    * gcc.target/sh/pr51244-20-sh2a.c: Adjust.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/gcc.target/sh/torture/pr63783-1.c
    branches/gcc-4_9-branch/gcc/testsuite/gcc.target/sh/torture/pr63783-2.c
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/config/sh/sh_treg_combine.cc
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_9-branch/gcc/testsuite/gcc.target/sh/pr51244-20-sh2a.c
    branches/gcc-4_9-branch/gcc/testsuite/gcc.target/sh/pr51244-20.c



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (80 preceding siblings ...)
  2014-11-22 16:08 ` olegendo at gcc dot gnu.org
@ 2014-12-01  6:50 ` olegendo at gcc dot gnu.org
  2014-12-17 22:53 ` olegendo at gcc dot gnu.org
                   ` (5 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-01  6:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #80 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Mon Dec  1 06:50:06 2014
New Revision: 218200

URL: https://gcc.gnu.org/viewcvs?rev=218200&root=gcc&view=rev
Log:
gcc/
    PR target/63986
    PR target/51244
    * config/sh/sh.c (sh_unspec_insn_p,
    sh_insn_operands_modified_between_p): New functions.
    (sh_split_movrt_negc_to_movt_xor): Do not delete insn if its operands
    are modified or if it has side effects, may trap or is volatile.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.c



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (81 preceding siblings ...)
  2014-12-01  6:50 ` olegendo at gcc dot gnu.org
@ 2014-12-17 22:53 ` olegendo at gcc dot gnu.org
  2014-12-17 23:08 ` olegendo at gcc dot gnu.org
                   ` (4 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-17 22:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #81 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Wed Dec 17 22:52:21 2014
New Revision: 218847

URL: https://gcc.gnu.org/viewcvs?rev=218847&root=gcc&view=rev
Log:
gcc/
    PR target/51244
    * config/sh/sh_treg_combine.cc (sh_treg_combine::try_optimize_cbranch):
    Combine ccreg inversion and cbranch into inverted cbranch.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh_treg_combine.cc



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (82 preceding siblings ...)
  2014-12-17 22:53 ` olegendo at gcc dot gnu.org
@ 2014-12-17 23:08 ` olegendo at gcc dot gnu.org
  2014-12-17 23:15 ` olegendo at gcc dot gnu.org
                   ` (3 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-17 23:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #82 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Wed Dec 17 23:08:14 2014
New Revision: 218850

URL: https://gcc.gnu.org/viewcvs?rev=218850&root=gcc&view=rev
Log:
gcc/
    PR target/51244
    * config/sh/sh_treg_combine.cc (is_conditional_insn): New function.
    (cbranch_trace): Add member rtx* condition_rtx_in_insn, initialize it
    accordingly in constructor.
    (cbranch_trace::branch_condition_rtx_ref): New function.
    (cbranch_trace::branch_condition_rtx): Use branch_condition_rtx_ref.
    (sh_treg_combine::try_invert_branch_condition): Invert condition rtx
    in insn using reversed_comparison_code and validate_change instead of
    invert_jump_1.
    (sh_treg_combine::execute): Look for conditional insns in basic blocks
    in addition to conditional branches.
    * config/sh/sh.md (*movsicc_div0s): Remove combine patterns.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md
    trunk/gcc/config/sh/sh_treg_combine.cc



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (83 preceding siblings ...)
  2014-12-17 23:08 ` olegendo at gcc dot gnu.org
@ 2014-12-17 23:15 ` olegendo at gcc dot gnu.org
  2014-12-24 21:56 ` olegendo at gcc dot gnu.org
                   ` (2 subsequent siblings)
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-17 23:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #83 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #71)
> 
> * The RTL pass does the treg combine only when there is a conditional
> branch.  It should also handle conditional move insns (-mpretend-cmove).
> 

It does now.  It also handles nott cbranch sequences by inverting the branch
condition and deleting the nott insn.



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (84 preceding siblings ...)
  2014-12-17 23:15 ` olegendo at gcc dot gnu.org
@ 2014-12-24 21:56 ` olegendo at gcc dot gnu.org
  2015-01-24 13:06 ` olegendo at gcc dot gnu.org
  2015-03-01 19:16 ` olegendo at gcc dot gnu.org
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-24 21:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #84 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Wed Dec 24 21:55:59 2014
New Revision: 219062

URL: https://gcc.gnu.org/viewcvs?rev=219062&root=gcc&view=rev
Log:
gcc/
    PR target/51244
    * config/sh/sh.md (*mov_t_msb_neg): Convert split into insn_and_split.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/sh.md



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (85 preceding siblings ...)
  2014-12-24 21:56 ` olegendo at gcc dot gnu.org
@ 2015-01-24 13:06 ` olegendo at gcc dot gnu.org
  2015-03-01 19:16 ` olegendo at gcc dot gnu.org
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-01-24 13:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #85 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Sat Jan 24 13:04:53 2015
New Revision: 220081

URL: https://gcc.gnu.org/viewcvs?rev=220081&root=gcc&view=rev
Log:
gcc/
    PR target/49263
    PR target/53987
    PR target/64345
    PR target/59533
    PR target/52933
    PR target/54236
    PR target/51244
    * config/sh/sh-protos.h
    (sh_extending_set_of_reg::can_use_as_unextended_reg,
    sh_extending_set_of_reg::use_as_unextended_reg,
    sh_is_nott_insn, sh_movt_set_dest, sh_movrt_set_dest, sh_is_movt_insn,
    sh_is_movrt_insn, sh_insn_operands_modified_between_p,
    sh_reg_dead_or_unused_after_insn, sh_in_recog_treg_set_expr,
    sh_recog_treg_set_expr, sh_split_treg_set_expr): New functions.
    (sh_treg_insns): New class.
    * config/sh/sh.c (TARGET_LEGITIMATE_COMBINED_INSN): Define target hook.
    (scope_counter): New class.
    (sh_legitimate_combined_insn, sh_is_nott_insn, sh_movt_set_dest,
    sh_movrt_set_dest, sh_reg_dead_or_unused_after_insn,
    sh_extending_set_of_reg::can_use_as_unextended_reg,
    sh_extending_set_of_reg::use_as_unextended_reg, sh_recog_treg_set_expr,
    sh_in_recog_treg_set_expr, sh_try_split_insn_simple,
    sh_split_treg_set_expr): New functions.
    (addsubcosts): Handle treg_set_expr.
    (sh_rtx_costs): Handle IF_THEN_ELSE and ZERO_EXTRACT.
    (sh_rtx_costs): Use arith_reg_operand in SIGN_EXTEND and ZERO_EXTEND.
    (sh_rtx_costs): Handle additional bit test patterns in EQ and AND cases.
    (sh_insn_operands_modified_between_p): Make non-static.
    * config/sh/predicates.md (zero_extend_movu_operand): Allow
    simple_mem_operand in addition to displacement_mem_operand.
    (zero_extend_operand): Don't allow zero_extend_movu_operand.
    (treg_set_expr, treg_set_expr_not_const01,
    arith_reg_or_treg_set_expr): New predicates.
    * config/sh/sh.md (tstsi_t): Use arith_reg_operand and
    arith_or_int_operand instead of logical_operand.  Convert to
    insn_and_split.  Try to optimize constant operand in splitter.
    (tsthi_t, tstqi_t): Fold into *tst<mode>_t.  Convert to insn_and_split.
    (*tstqi_t_zero): Delete.
    (*tst<mode>_t_subregs): Add !sh_in_recog_treg_set_expr split condition.
    (tstsi_t_and_not): Delete.
    (tst<mode>_t_zero_extract_eq): Rename to *tst<mode>_t_zero_extract.
    Convert to insn_and_split.
    (unnamed split, tstsi_t_zero_extract_xor,
    tstsi_t_zero_extract_subreg_xor_little,
    tstsi_t_zero_extract_subreg_xor_big): Delete.
    (*tstsi_t_shift_mask): New insn_and_split.
    (cmpeqsi_t, cmpgesi_t): Add new split for const_int 0 operands and try
    to recombine with surrounding insns when splitting.
    (*negtstsi): Add !sh_in_recog_treg_set_expr condition.
    (cmp_div0s_0, cmp_div0s_1, *cmp_div0s_0, *cmp_div0s_1): Rewrite as ...
    (cmp_div0s, *cmp_div0s_1, *cmp_div0s_2, *cmp_div0s_3, *cmp_div0s_4,
    *cmp_div0s_5, *cmp_div0s_6): ... these new insn_and_split patterns.
    (*cbranch_div0s): Delete.
    (*addc): Convert to insn_and_split.  Use treg_set_expr as 3rd operand.
    Try to recombine with surrounding insns when splitting.  Add operand
    order variants.
    (*addc_t_r, *addc_r_t): Use treg_set_expr_not_const01.
    (*addc_r_r_1, *addc_r_lsb, *addc_r_r_lsb, *addc_r_lsb_r, *addc_r_msb,
    *addc_r_r_msb, *addc_2r_msb): Delete.
    (*addc_2r_lsb): Rename to *addc_2r_t.  Use treg_set_expr.  Add operand
    order variant.
    (*addc_negreg_t): New insn_and_split.
    (*subc): Convert to insn_and_split.  Use treg_set_expr as 3rd operand.
    Try to recombine with surrounding insns when splitting.
    Add operand order variants.  
    (*subc_negt_reg, *subc_negreg_t, *reg_lsb_t, *reg_msb_t): New
    insn_and_split patterns.
    (*rotcr): Use arith_reg_or_treg_set_expr.  Try to recombine with
    surrounding insns when splitting.
    (unnamed rotcr split): Use arith_reg_or_treg_set_expr.
    (*rotcl): Likewise.  Add zero_extract variant.
    (*ashrsi2_31): New insn_and_split.
    (*negc): Convert to insn_and_split.  Use treg_set_expr.
    (*zero_extend<mode>si2_disp_mem): Update comment.
    (movrt_negc, *movrt_negc, nott): Add !sh_in_recog_treg_set_expr split
    condition.
    (*mov_t_msb_neg, mov_neg_si_t): Use treg_set_expr.  Try to recombine
    with surrounding insns when splitting.
    (any_treg_expr_to_reg): New insn_and_split.
    (*neg_zero_extract_0, *neg_zero_extract_1, *neg_zero_extract_2,
    *neg_zero_extract_3, *neg_zero_extract_4, *neg_zero_extract_5,
    *neg_zero_extract_6, *zero_extract_0, *zero_extract_1,
    *zero_extract_2): New single bit zero extract patterns.
    (bld_reg, *bld_regqi): Fold into bld<mode>_reg.
    (*get_thread_pointersi, store_gbr, *mov<mode>_gbr_load,
    *mov<mode>_gbr_load, *mov<mode>_gbr_load, *mov<mode>_gbr_load,
    *movdi_gbr_load): Use arith_reg_dest instead of register_operand for
    set destination.
    (set_thread_pointersi, load_gbr): Use arith_reg_operand instead of
    register_operand for set source.

gcc/testsuite/
    PR target/49263
    PR target/53987
    PR target/64345
    PR target/59533
    PR target/52933
    PR target/54236
    PR target/51244
    * gcc.target/sh/pr64345-1.c: New.
    * gcc.target/sh/pr64345-2.c: New.
    * gcc.target/sh/pr59533-1.c: New.
    * gcc.target/sh/pr49263.c: Adjust matching of expected insns.
    * gcc.target/sh/pr52933-2.c: Likewise.
    * gcc.target/sh/pr54089-1.c: Likewise.
    * gcc.target/sh/pr54236-1.c: Likewise.
    * gcc.target/sh/pr51244-20-sh2a.c: Likewise.
    * gcc.target/sh/pr49263-1.c: Remove xfails.
    * gcc.target/sh/pr49263-2.c: Likewise.
    * gcc.target/sh/pr49263-3.c: Likewise.
    * gcc.target/sh/pr53987-1.c: Likewise.
    * gcc.target/sh/pr52933-1.c: Adjust matching of expected insns.
    (test_24, test_25, test_26, test_27, test_28, test_29, test_30): New.
    * gcc.target/sh/pr51244-12.c: Adjust matching of expected insns.
    (test05, test06, test07, test08, test09, test10, test11, test12): New.
    * gcc.target/sh/pr54236-3.c: Adjust matching of expected insns.
    (test_002, test_003, test_004, test_005, test_006, test_007, test_008,
    test_009): New.
    * gcc.target/sh/pr51244-4.c: Adjust matching of expected insns.
    (test_02): New.

Added:
    trunk/gcc/testsuite/gcc.target/sh/pr59533-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr64345-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr64345-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh-protos.h
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/sh/pr49263-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr49263-2.c
    trunk/gcc/testsuite/gcc.target/sh/pr49263-3.c
    trunk/gcc/testsuite/gcc.target/sh/pr49263.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-12.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-20-sh2a.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-4.c
    trunk/gcc/testsuite/gcc.target/sh/pr52933-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr52933-2.c
    trunk/gcc/testsuite/gcc.target/sh/pr53987-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr54089-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr54236-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr54236-3.c



* [Bug target/51244] [SH] Inefficient conditional branch and code around T bit
  2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
                   ` (86 preceding siblings ...)
  2015-01-24 13:06 ` olegendo at gcc dot gnu.org
@ 2015-03-01 19:16 ` olegendo at gcc dot gnu.org
  87 siblings, 0 replies; 89+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-03-01 19:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #86 from Oleg Endo <olegendo at gcc dot gnu.org> ---
I'd like to close this PR as fixed because it's getting too long.  I'll try to
pull out the remaining issues into individual new PRs.



end of thread, other threads:[~2015-03-01 19:16 UTC | newest]

Thread overview: 89+ messages
2011-11-20 20:29 [Bug target/51244] New: SH Target: Inefficient conditional branch oleg.endo@t-online.de
2011-11-22 23:36 ` [Bug target/51244] " kkojima at gcc dot gnu.org
2011-12-27 22:03 ` oleg.endo@t-online.de
2011-12-27 23:17 ` oleg.endo@t-online.de
2011-12-28  0:42 ` oleg.endo@t-online.de
2011-12-28  4:57 ` oleg.endo@t-online.de
2011-12-28 16:07 ` oleg.endo@t-online.de
2011-12-28 22:30 ` kkojima at gcc dot gnu.org
2011-12-30 22:18 ` oleg.endo@t-online.de
2012-02-26 23:36 ` olegendo at gcc dot gnu.org
2012-03-02 21:57 ` olegendo at gcc dot gnu.org
2012-03-03 12:32 ` olegendo at gcc dot gnu.org
2012-03-04 17:25 ` olegendo at gcc dot gnu.org
2012-03-05 23:13 ` olegendo at gcc dot gnu.org
2012-03-05 23:38 ` olegendo at gcc dot gnu.org
2012-03-06  8:28 ` olegendo at gcc dot gnu.org
2012-03-06  8:50 ` kkojima at gcc dot gnu.org
2012-03-06  9:48 ` olegendo at gcc dot gnu.org
2012-03-06 10:36 ` kkojima at gcc dot gnu.org
2012-03-06 10:38 ` kkojima at gcc dot gnu.org
2012-03-06 10:39 ` kkojima at gcc dot gnu.org
2012-03-06 10:40 ` kkojima at gcc dot gnu.org
2012-03-06 11:30 ` olegendo at gcc dot gnu.org
2012-03-06 23:43 ` olegendo at gcc dot gnu.org
2012-03-08  1:26 ` olegendo at gcc dot gnu.org
2012-03-08 11:12 ` kkojima at gcc dot gnu.org
2012-03-08 11:15 ` kkojima at gcc dot gnu.org
2012-03-08 11:17 ` kkojima at gcc dot gnu.org
2012-03-09  0:27 ` olegendo at gcc dot gnu.org
2012-03-09  1:45 ` kkojima at gcc dot gnu.org
2012-03-09  8:41 ` kkojima at gcc dot gnu.org
2012-03-09 10:02 ` olegendo at gcc dot gnu.org
2012-03-09 10:37 ` kkojima at gcc dot gnu.org
2012-03-11 13:18 ` olegendo at gcc dot gnu.org
2012-03-15  8:11 ` kkojima at gcc dot gnu.org
2012-03-20  1:46 ` olegendo at gcc dot gnu.org
2012-03-20  2:33 ` kkojima at gcc dot gnu.org
2012-03-20 20:41 ` olegendo at gcc dot gnu.org
2012-05-07 20:53 ` olegendo at gcc dot gnu.org
2012-05-08 21:43 ` olegendo at gcc dot gnu.org
2012-06-30 12:01 ` olegendo at gcc dot gnu.org
2012-07-02 19:24 ` olegendo at gcc dot gnu.org
2012-07-08 15:03 ` olegendo at gcc dot gnu.org
2012-07-23 22:58 ` olegendo at gcc dot gnu.org
2012-07-23 23:29 ` olegendo at gcc dot gnu.org
2012-07-26  0:20 ` olegendo at gcc dot gnu.org
2012-07-30  6:46 ` olegendo at gcc dot gnu.org
2012-08-09 15:55 ` olegendo at gcc dot gnu.org
2012-08-12 22:47 ` olegendo at gcc dot gnu.org
2012-08-20 20:51 ` olegendo at gcc dot gnu.org
2012-08-30 22:54 ` olegendo at gcc dot gnu.org
2012-08-31 10:55 ` kkojima at gcc dot gnu.org
2012-08-31 15:50 ` olegendo at gcc dot gnu.org
2012-09-04  8:03 ` olegendo at gcc dot gnu.org
2012-09-23 21:36 ` [Bug target/51244] [SH] Inefficient conditional branch and code around T bit olegendo at gcc dot gnu.org
2012-09-23 21:42 ` olegendo at gcc dot gnu.org
2012-10-03 21:39 ` olegendo at gcc dot gnu.org
2012-10-12  0:41 ` olegendo at gcc dot gnu.org
2012-10-15 22:08 ` olegendo at gcc dot gnu.org
2012-11-03 12:01 ` olegendo at gcc dot gnu.org
2013-07-18 16:11 ` laurent.alfonsi at st dot com
2013-07-18 16:12 ` laurent.alfonsi at st dot com
2013-07-20 14:38 ` olegendo at gcc dot gnu.org
2013-07-23  8:21 ` laurent.alfonsi at st dot com
2013-07-27 19:28 ` olegendo at gcc dot gnu.org
2013-07-28  8:51 ` olegendo at gcc dot gnu.org
2013-07-28 12:26 ` olegendo at gcc dot gnu.org
2013-07-31 21:46 ` olegendo at gcc dot gnu.org
2013-08-23  0:13 ` olegendo at gcc dot gnu.org
2013-08-23  0:25 ` kkojima at gcc dot gnu.org
2013-09-24 22:43 ` olegendo at gcc dot gnu.org
2013-10-03 22:50 ` olegendo at gcc dot gnu.org
2013-10-12 20:47 ` olegendo at gcc dot gnu.org
2013-10-12 21:26 ` olegendo at gcc dot gnu.org
2013-12-05 17:54 ` olegendo at gcc dot gnu.org
2013-12-06 10:47 ` olegendo at gcc dot gnu.org
2014-05-10 20:19 ` olegendo at gcc dot gnu.org
2014-05-16 22:55 ` olegendo at gcc dot gnu.org
2014-09-13 18:48 ` olegendo at gcc dot gnu.org
2014-11-22 15:07 ` olegendo at gcc dot gnu.org
2014-11-22 15:50 ` olegendo at gcc dot gnu.org
2014-11-22 16:08 ` olegendo at gcc dot gnu.org
2014-12-01  6:50 ` olegendo at gcc dot gnu.org
2014-12-17 22:53 ` olegendo at gcc dot gnu.org
2014-12-17 23:08 ` olegendo at gcc dot gnu.org
2014-12-17 23:15 ` olegendo at gcc dot gnu.org
2014-12-24 21:56 ` olegendo at gcc dot gnu.org
2015-01-24 13:06 ` olegendo at gcc dot gnu.org
2015-03-01 19:16 ` olegendo at gcc dot gnu.org
