public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq
@ 2012-07-16 23:50 olegendo at gcc dot gnu.org
  2012-08-31 18:14 ` [Bug target/53987] " olegendo at gcc dot gnu.org
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-07-16 23:50 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

             Bug #: 53987
           Summary: [SH] Unnecessary zero-extension before cmp/eq
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: olegendo@gcc.gnu.org
            Target: sh*-*-*


The following function:

int test_00 (unsigned char* a, unsigned char* b, int c, int d)
{
  if (*a == *b)
    return c;
  return d;
}

gets compiled to:

        mov.b   @r4,r1
        mov.b   @r5,r2
        extu.b  r1,r1
        extu.b  r2,r2
        cmp/eq  r2,r1
        bt      .L4
        mov     r7,r6
.L4:
        rts
        mov     r6,r0

Obviously the zero-extensions can be omitted.
The following combine patterns can be used to eliminate the zero extensions

Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md    (revision 189550)
+++ gcc/config/sh/sh.md    (working copy)
@@ -759,6 +759,30 @@
     cmp/eq    %1,%0"
    [(set_attr "type" "mt_group")])

+(define_insn_and_split "*cmpeqsi_t"
+  [(set (reg:SI T_REG)
+    (eq:SI (zero_extend:SI (match_operand:QI 0 "arith_reg_operand" "r"))
+           (match_operand:SI 1 "arith_reg_operand" "r")))]
+  "TARGET_SH1"
+{
+  gcc_unreachable ();
+  return "#";
+}
+  "&& can_create_pseudo_p ()"
+  [(set (match_dup 2) (zero_extend:SI (match_dup 0)))
+   (set (reg:SI T_REG) (eq:SI (match_dup 1) (match_dup 2)))]
+{
+  operands[2] = gen_reg_rtx (SImode);
+})
+
+(define_insn "*cmpeqsi_t"
+  [(set (reg:SI T_REG)
+    (eq:SI (sign_extend:SI (match_operand:QI 0 "arith_reg_operand" "r"))
+           (sign_extend:SI (match_operand:QI 1 "arith_reg_operand" "r"))))]
+  "TARGET_SH1"
+  "cmp/eq    %1,%0"
+   [(set_attr "type" "mt_group")])
+
 (define_insn "cmpgtsi_t"
   [(set (reg:SI T_REG)
     (gt:SI (match_operand:SI 0 "arith_reg_operand" "r,r")


A quick look at the CSiBE result-size set (-m4-single -O2 -pretend-cmove) shows
a few code size decreases and one increase, where the following code sequence
is generated in unrarlib.s:

.L397:
        mov.l   .L467,r3        ! 167    movsi_ie/1
        mov.b   @(2,r9),r0      ! 165    *movqi_load_mem_disp/1
        mov.w   @r9,r6          ! 169    *extendhisi2_compact_snd
        mov     r0,r7       !<< ! 513    *movqi_reg_reg/1
        mov.b   r0,@(2,r3)      ! 172    *movqi_store_mem_disp04/1
        mov.w   @(4,r9),r0      ! 514    *movhi_load_mem_disp/1
        extu.b  r7,r7       !<< ! 441    *zero_extendqisi2_compact
        mov.w   r6,@r3          ! 170    *movhi/4
        cmp/eq  r7,r8           ! 442    cmpeqsi_t/3

This seems to be the consequence of the splitter, which changes the insn
order...


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extension before cmp/eq
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
@ 2012-08-31 18:14 ` olegendo at gcc dot gnu.org
  2013-10-29 21:58 ` olegendo at gcc dot gnu.org
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-08-31 18:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-08-31 18:14:28 UTC ---
On a second thought, it might be not safe to omit zero/sign extensions if
values are compared after calculations, where the regs hold values < SImode.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extension before cmp/eq
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
  2012-08-31 18:14 ` [Bug target/53987] " olegendo at gcc dot gnu.org
@ 2013-10-29 21:58 ` olegendo at gcc dot gnu.org
  2013-12-16 19:07 ` olegendo at gcc dot gnu.org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-10-29 21:58 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> ---
As of rev 204180 (4.9) this problem still exists.
As far as I understand, the actual root of the problem is that the 'unsigned
char' mem loads into regs are neither sign nor zero extended.  So combine sees
the following:

(insn 11 6 12 2 (set (reg:QI 169 [ *a_3(D) ])
        (mem:QI (reg:SI 4 r4 [ a ]) [0 *a_3(D)+0 S1 A8])) {*movqi}
     (expr_list:REG_DEAD (reg:SI 4 r4 [ a ])
        (nil)))
(insn 12 11 13 2 (set (reg:SI 168 [ *a_3(D)+-3 ])
        (zero_extend:SI (reg:QI 169 [ *a_3(D) ]))) {*zero_extendqisi2_compact}
     (expr_list:REG_DEAD (reg:QI 169 [ *a_3(D) ])
        (nil)))
(insn 13 12 14 2 (set (reg:QI 171 [ *b_5(D) ])
        (mem:QI (reg:SI 5 r5 [ b ]) [0 *b_5(D)+0 S1 A8])) {*movqi}
     (expr_list:REG_DEAD (reg:SI 5 r5 [ b ])
        (nil)))
(insn 14 13 15 2 (set (reg:SI 170 [ *b_5(D)+-3 ])
        (zero_extend:SI (reg:QI 171 [ *b_5(D) ]))) {*zero_extendqisi2_compact}
     (expr_list:REG_DEAD (reg:QI 171 [ *b_5(D) ])
        (nil)))
(insn 15 14 16 2 (set (reg:SI 147 t)
        (eq:SI (reg:SI 168 [ *a_3(D)+-3 ])
            (reg:SI 170 [ *b_5(D)+-3 ]))) {cmpeqsi_t}
     (expr_list:REG_DEAD (reg:SI 170 [ *b_5(D)+-3 ])
        (expr_list:REG_DEAD (reg:SI 168 [ *a_3(D)+-3 ])
            (nil))))

On the other hand, signed char mem loads are expanded as sign extending.
Since LOAD_EXTEND_OP tells that any mem loads but SImode are sign extending,
one could expect that all mem loads will be automatically expanded as such,
which is not the case.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extension before cmp/eq
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
  2012-08-31 18:14 ` [Bug target/53987] " olegendo at gcc dot gnu.org
  2013-10-29 21:58 ` olegendo at gcc dot gnu.org
@ 2013-12-16 19:07 ` olegendo at gcc dot gnu.org
  2014-12-17 21:09 ` olegendo at gcc dot gnu.org
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-12-16 19:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #2)
> As of rev 204180 (4.9) this problem still exists.
> As far as I understand, the actual root of the problem is that the 'unsigned
> char' mem loads into regs are neither sign nor zero extended.

I've tried doing the following to enforce sign extension of memory loads <
SImode:

Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md    (revision 205971)
+++ gcc/config/sh/sh.md    (working copy)
@@ -5958,7 +5958,18 @@

 (define_expand "zero_extend<mode>si2"
   [(set (match_operand:SI 0 "arith_reg_dest")
-    (zero_extend:SI (match_operand:QIHI 1 "zero_extend_operand")))])
+    (zero_extend:SI (match_operand:QIHI 1 "general_extend_operand")))]
+  ""
+{
+  if (!zero_extend_operand (operands[1], <MODE>mode))
+    {
+      rtx tmp = gen_reg_rtx (SImode);
+      emit_insn (gen_extend<mode>si2 (tmp, operands[1]));
+      emit_insn (gen_zero_extend<mode>si2 (operands[0],
+                       gen_lowpart (<MODE>mode, tmp)));
+      DONE;
+    }
+})

 (define_insn_and_split "*zero_extend<mode>si2_compact"
   [(set (match_operand:SI 0 "arith_reg_dest" "=r")

However, this doesn't fix the problem.
According to CSiBE (-m4 -ml -O2 -mpretend-cmove) there are a few cases where
register allocation is a bit better, but there are also some code size
increases (e.g. interference with the tst #imm,r0 patterns).  There's a code
size decrease of 228 bytes on the whole set.

Nevertheless, having the explicit sign_extend mem loads could be useful.  For
example knowing that a mem load sign extends the cmpeq insn could be hoisted
above the extension insns before register allocation.
On SH2A it's probably better to not allow zero extending mem loads in the
expander and defer the movu.{b|w} insn selection until the combine pass. 
Otherwise the original test case will always use zero extending mem loads, even
though sign extending ones would suffice (16 bit insns vs 32 bit insns).


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extension before cmp/eq
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2013-12-16 19:07 ` olegendo at gcc dot gnu.org
@ 2014-12-17 21:09 ` olegendo at gcc dot gnu.org
  2014-12-21 16:54 ` [Bug target/53987] [SH] Unnecessary zero-extensions olegendo at gcc dot gnu.org
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-17 21:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #4 from Oleg Endo <olegendo at gcc dot gnu.org> ---
It seems that converting unsigned values to signed values, i.e. replacing
zero-extensions with sign-extensions and recombining sign-extensions with loads
could make sense in general.

For example, the following code from CSiBE
mpeg2dec-0.3.1/libmpeg2/motion_comp.s contains sequences like:

_MC_put_x_8_c:
    .align 2
.L24:
        mov.b   @r5,r0
        sett
        extu.b  r0,r1
        mov.b   @(1,r5),r0
        extu.b  r0,r0
        addc    r1,r0
        shar    r0
        mov.b   r0,@r4
        sett
        mov.b   @(1,r5),r0
        extu.b  r0,r3
        mov.b   @(2,r5),r0
        extu.b  r0,r1
        mov     r3,r0
        addc    r1,r0
        shar    r0
        mov.b   r0,@(1,r4)
        ....

Here effectively only 8 bit values are calculated.  The zero-extensions can be
omitted, since the higher bits do not influence the result of the lowest 8 bits
and the higher bits are discarded after the 8 bit stores.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extensions
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2014-12-17 21:09 ` olegendo at gcc dot gnu.org
@ 2014-12-21 16:54 ` olegendo at gcc dot gnu.org
  2014-12-22 22:27 ` olegendo at gcc dot gnu.org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-21 16:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #5 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #4)
> It seems that converting unsigned values to signed values, i.e. replacing
> zero-extensions with sign-extensions and recombining sign-extensions with
> loads could make sense in general.
> 
> For example, the following code from CSiBE
> mpeg2dec-0.3.1/libmpeg2/motion_comp.s contains sequences like:
> 
> _MC_put_x_8_c:
> 	.align 2
> .L24:
>         mov.b   @r5,r0
>         sett
>         extu.b  r0,r1
>         mov.b   @(1,r5),r0
>         extu.b  r0,r0
>         addc    r1,r0
>         shar    r0
>         mov.b   r0,@r4
>         sett
>         mov.b   @(1,r5),r0
>         extu.b  r0,r3
>         mov.b   @(2,r5),r0
>         extu.b  r0,r1
>         mov     r3,r0
>         addc    r1,r0
>         shar    r0
>         mov.b   r0,@(1,r4)
>         ....
> 
> Here effectively only 8 bit values are calculated.  The zero-extensions can
> be omitted, since the higher bits do not influence the result of the lowest
> 8 bits and the higher bits are discarded after the 8 bit stores.

Sorry, bad example.  The zero-extensions can't be emitted in this case because
of the right shift.  In the following example bits > 15 are don't care


void test_0 (unsigned short* x, int y, int z)
{
  x[11] = ((x[0] + x[1] + 1) << 1) + ((x[2] + x[3] + 1) << 2) - y;
}

.. and there are no zero-extensions in this case.  Using function arguments
instead:

void test_1 (unsigned short* x,
             unsigned short a, unsigned short b, unsigned short c,
             unsigned short d, unsigned short e)
{
  x[11] = ((a + b + 1) << 1) + ((c + d + 1) << 2) - e;
}


... brings back the zero extensions.  This is probably because of the SH ABI
(PR 52441), but in this case the extensions are really not needed.
>From gcc-bugs-return-471510-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org Sun Dec 21 17:24:32 2014
Return-Path: <gcc-bugs-return-471510-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Delivered-To: listarch-gcc-bugs@gcc.gnu.org
Received: (qmail 11221 invoked by alias); 21 Dec 2014 17:24:30 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Delivered-To: mailing list gcc-bugs@gcc.gnu.org
Received: (qmail 11163 invoked by uid 48); 21 Dec 2014 17:24:20 -0000
From: "olegendo at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/64366] Segmentation fault in remove_pseudos
Date: Sun, 21 Dec 2014 17:24:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords: ice-on-valid-code, ra
X-Bugzilla-Severity: normal
X-Bugzilla-Who: olegendo at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_status cf_reconfirmed_on everconfirmed
Message-ID: <bug-64366-4-lMlgyruJXg@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-64366-4@http.gcc.gnu.org/bugzilla/>
References: <bug-64366-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-12/txt/msg02517.txt.bz2
Content-length: 573

https://gcc.gnu.org/bugzilla/show_bug.cgi?idd366

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-12-21
     Ever confirmed|0                           |1

--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> ---
I get the same here when trying to compile the CSiBE set for -m4-single -O2
-mpretend-cmove and LRA enabled.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extensions
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2014-12-21 16:54 ` [Bug target/53987] [SH] Unnecessary zero-extensions olegendo at gcc dot gnu.org
@ 2014-12-22 22:27 ` olegendo at gcc dot gnu.org
  2014-12-25  1:11 ` olegendo at gcc dot gnu.org
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-22 22:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #6 from Oleg Endo <olegendo at gcc dot gnu.org> ---
In the following example function

bool stack_is_background_free( unsigned short* _data, const size_t _size){

  size_t num_pixels_above_128 = 0;
  for(size_t index = 0;index< _size;++index){
    if(_data[index] > 128)
      num_pixels_above_128++;
  }

  return num_pixels_above_128 == _size;
}


unsigned 16 bit values are compared against an 8 bit constant, which results in
the following code:

        mov.w   @r4+,r1
        extu.w  r1,r1
        cmp/hi  r7,r1
        mov     #0,r1
        addc    r1,r3
        dt      r2
        bf      .L4

In this case, the zero extension can be omitted, since the comparison result
will always be 'true' if any bits >= 8 in the 16 bit value are set:

0x8000 (sign extend) -> 0xFFFF8000 (unsigned) > 0x80 = 1
0x8000 (zero extend) -> 0x00008000 (unsigned) > 0x80 = 1


The same would also apply for 8 bit values:

int test (unsigned char* x)
{
  return x[0] >= 127;
}

compiles to:
        mov.b   @r4,r1
        mov     #126,r2
        extu.b  r1,r1
        cmp/hi  r2,r1
        rts
        movt    r0

0x80 (sign extend) -> 0xFFFFFF80 (unsigned) > 0x7E = 1
0x80 (zero extend) -> 0x00000080 (unsigned) > 0x7E = 1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extensions
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2014-12-22 22:27 ` olegendo at gcc dot gnu.org
@ 2014-12-25  1:11 ` olegendo at gcc dot gnu.org
  2014-12-30 17:26 ` olegendo at gcc dot gnu.org
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-25  1:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #7 from Oleg Endo <olegendo at gcc dot gnu.org> ---
In bzip2-1.0.2/blocksort.s there is code such as:

.L294:
        mov.b   @(4,r11),r0
        extu.b  r0,r11
        mov.b   @(4,r3),r0
        extu.b  r0,r3
        cmp/eq  r3,r11
        bf      .L314
        mov.w   @(8,r2),r0
        extu.w  r0,r11
        mov.w   @(8,r1),r0
        extu.w  r0,r3
        cmp/eq  r3,r11
        bf      .L314
        mov     r4,r13
        add     r6,r13
        mov.b   @(5,r13),r0
        mov     r5,r3
        add     r6,r3
        extu.b  r0,r12
        mov.b   @(5,r3),r0
        extu.b  r0,r11
        cmp/eq  r11,r12
        bf      .L313
        mov.w   @(10,r2),r0
        extu.w  r0,r12
        mov.w   @(10,r1),r0
        extu.w  r0,r11
        cmp/eq  r11,r12
        bf      .L313
        mov.b   @(6,r13),r0
        extu.b  r0,r12
        mov.b   @(6,r3),r0
        extu.b  r0,r11
        cmp/eq  r11,r12
        bf      .L313
        mov.w   @(12,r2),r0
        extu.w  r0,r12
        mov.w   @(12,r1),r0
        extu.w  r0,r11
        cmp/eq  r11,r12
        bt      .L300
.L313:
        cmp/hi  r11,r12
        bra     .L308
        movt    r0

.L300:
        mov.b   @(7,r13),r0
        extu.b  r0,r11
        mov.b   @(7,r3),r0
        extu.b  r0,r3
        cmp/eq  r3,r11
        bt      .L301
.L314:
        cmp/hi  r3,r11
        bra     .L308
        movt    r0

In the BB at L294 the zero extensions can be omitted before the cmp/eq insns. 
Since the zero extended values are then compared using cmp/hi the zero
extensions must be inserted there again.  Sinking zero extensions for such
cases will reduce the code size.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extensions
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2014-12-25  1:11 ` olegendo at gcc dot gnu.org
@ 2014-12-30 17:26 ` olegendo at gcc dot gnu.org
  2015-01-15  1:37 ` olegendo at gcc dot gnu.org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2014-12-30 17:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #8 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Tue Dec 30 17:26:18 2014
New Revision: 219110

URL: https://gcc.gnu.org/viewcvs?rev=219110&root=gcc&view=rev
Log:
gcc/testsuite/
    PR target/53987
    * gcc.target/sh/pr53987-1.c: New.

Added:
    trunk/gcc/testsuite/gcc.target/sh/pr53987-1.c
Modified:
    trunk/gcc/testsuite/ChangeLog


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extensions
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2014-12-30 17:26 ` olegendo at gcc dot gnu.org
@ 2015-01-15  1:37 ` olegendo at gcc dot gnu.org
  2015-01-24 13:06 ` olegendo at gcc dot gnu.org
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-01-15  1:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #9 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Another related issue is this example function:

unsigned char h (unsigned char a, int b)
{
  return (unsigned char)(a + b);
}

compiling with -O2 -mrenesas (which allows undefined high bits in the return
value):
        extu.b  r4,r4    << not needed
        mov     r5,r0
        rts
        add     r4,r0

It seems that the fwprop1 pass causes this.
The initial RTL for this looks like this:

(insn 2 5 3 2 (set (reg/v:SI 165 [ a ])
        (zero_extend:SI (reg:QI 4 r4 [ a ]))) sh_tmp.cpp:1550 -1
     (nil))
(insn 3 2 4 2 (set (reg/v:SI 166 [ b ])
        (reg:SI 5 r5 [ b ])) sh_tmp.cpp:1550 -1
     (nil))
(note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)
(insn 7 4 8 2 (set (reg:QI 168)
        (subreg:QI (reg/v:SI 166 [ b ]) 0)) sh_tmp.cpp:1551 -1
     (nil))
(insn 8 7 9 2 (set (reg:SI 169)
        (plus:SI (reg/v:SI 165 [ a ])
            (subreg:SI (reg:QI 168) 0))) sh_tmp.cpp:1551 -1
     (nil))
(insn 9 8 13 2 (set (reg:QI 164 [ <retval> ])
        (subreg:QI (reg:SI 169) 0)) sh_tmp.cpp:1551 -1
     (nil))
(insn 13 9 14 2 (set (reg/i:QI 0 r0)
        (reg:QI 164 [ <retval> ])) sh_tmp.cpp:1552 -1
     (nil))
(insn 14 13 0 2 (use (reg/i:QI 0 r0)) sh_tmp.cpp:1552 -1
     (nil))


Notice that the plus op is expanded as a subreg:SI (reg:QI) and the result is
then taken as a subreg:QI.  This should be OK with the addsi3 pattern. 
However, fwprop1 then propagates the hardregs and eliminates the subregs:

(note 5 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 2 5 3 2 (set (reg/v:SI 165 [ a ])
        (zero_extend:SI (reg:QI 4 r4 [ a ]))) sh_tmp.cpp:1550 218
{*zero_extendqisi2_compact}
     (expr_list:REG_DEAD (reg:QI 4 r4 [ a ])
        (nil)))
(insn 3 2 4 2 (set (reg/v:SI 166 [ b ])
        (reg:SI 5 r5 [ b ])) sh_tmp.cpp:1550 252 {movsi_ie}
     (expr_list:REG_DEAD (reg:SI 5 r5 [ b ])
        (nil)))
(note 4 3 8 2 NOTE_INSN_FUNCTION_BEG)
(insn 8 4 13 2 (set (reg:SI 169)
        (plus:SI (reg/v:SI 165 [ a ])
            (reg/v:SI 166 [ b ]))) sh_tmp.cpp:1551 63 {*addsi3_compact}
     (expr_list:REG_DEAD (reg:QI 168 [ b ])
        (expr_list:REG_DEAD (reg/v:SI 165 [ a ])
            (nil))))
(insn 13 8 14 2 (set (reg/i:QI 0 r0)
        (subreg:QI (reg:SI 169) 0)) sh_tmp.cpp:1552 270 {*movqi}
     (expr_list:REG_DEAD (reg:SI 169)
        (nil)))
(insn 14 13 0 2 (use (reg/i:QI 0 r0)) sh_tmp.cpp:1552 -1
     (nil))

After that, everything is pretty much set and the zero extension will not be
eliminated anymore.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extensions
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2015-01-15  1:37 ` olegendo at gcc dot gnu.org
@ 2015-01-24 13:06 ` olegendo at gcc dot gnu.org
  2015-01-27 23:08 ` olegendo at gcc dot gnu.org
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-01-24 13:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #10 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Author: olegendo
Date: Sat Jan 24 13:04:53 2015
New Revision: 220081

URL: https://gcc.gnu.org/viewcvs?rev=220081&root=gcc&view=rev
Log:
gcc/
    PR target/49263
    PR target/53987
    PR target/64345
    PR target/59533
    PR target/52933
    PR target/54236
    PR target/51244
    * config/sh/sh-protos.h
    (sh_extending_set_of_reg::can_use_as_unextended_reg,
    sh_extending_set_of_reg::use_as_unextended_reg,
    sh_is_nott_insn, sh_movt_set_dest, sh_movrt_set_dest, sh_is_movt_insn,
    sh_is_movrt_insn, sh_insn_operands_modified_between_p,
    sh_reg_dead_or_unused_after_insn, sh_in_recog_treg_set_expr,
    sh_recog_treg_set_expr, sh_split_treg_set_expr): New functions.
    (sh_treg_insns): New class.
    * config/sh/sh.c (TARGET_LEGITIMATE_COMBINED_INSN): Define target hook.
    (scope_counter): New class.
    (sh_legitimate_combined_insn, sh_is_nott_insn, sh_movt_set_dest,
    sh_movrt_set_dest, sh_reg_dead_or_unused_after_insn,
    sh_extending_set_of_reg::can_use_as_unextended_reg,
    sh_extending_set_of_reg::use_as_unextended_reg, sh_recog_treg_set_expr,
    sh_in_recog_treg_set_expr, sh_try_split_insn_simple,
    sh_split_treg_set_expr): New functions.
    (addsubcosts): Handle treg_set_expr.
    (sh_rtx_costs): Handle IF_THEN_ELSE and ZERO_EXTRACT.
    (sh_rtx_costs): Use arith_reg_operand in SIGN_EXTEND and ZERO_EXTEND.
    (sh_rtx_costs): Handle additional bit test patterns in EQ and AND cases.
    (sh_insn_operands_modified_between_p): Make non-static.
    * config/sh/predicates.md (zero_extend_movu_operand): Allow
    simple_mem_operand in addition to displacement_mem_operand.
    (zero_extend_operand): Don't allow zero_extend_movu_operand.
    (treg_set_expr, treg_set_expr_not_const01,
    arith_reg_or_treg_set_expr): New predicates.
    * config/sh/sh.md (tstsi_t): Use arith_reg_operand and
    arith_or_int_operand instead of logical_operand.  Convert to
    insn_and_split.  Try to optimize constant operand in splitter.
    (tsthi_t, tstqi_t): Fold into *tst<mode>_t.  Convert to insn_and_split.
    (*tstqi_t_zero): Delete.
    (*tst<mode>_t_subregs): Add !sh_in_recog_treg_set_expr split condition.
    (tstsi_t_and_not): Delete.
    (tst<mode>_t_zero_extract_eq): Rename to *tst<mode>_t_zero_extract.
    Convert to insn_and_split.
    (unnamed split, tstsi_t_zero_extract_xor,
    tstsi_t_zero_extract_subreg_xor_little,
    tstsi_t_zero_extract_subreg_xor_big): Delete.
    (*tstsi_t_shift_mask): New insn_and_split.
    (cmpeqsi_t, cmpgesi_t): Add new split for const_int 0 operands and try
    to recombine with surrounding insns when splitting.
    (*negtstsi): Add !sh_in_recog_treg_set_expr condition.
    (cmp_div0s_0, cmp_div0s_1, *cmp_div0s_0, *cmp_div0s_1): Rewrite as ...
    (cmp_div0s, *cmp_div0s_1, *cmp_div0s_2, *cmp_div0s_3, *cmp_div0s_4,
    *cmp_div0s_5, *cmp_div0s_6): ... these new insn_and_split patterns.
    (*cbranch_div0s: Delete.
    (*addc): Convert to insn_and_split.  Use treg_set_expr as 3rd operand.
    Try to recombine with surrounding insns when splitting.  Add operand
    order variants.
    (*addc_t_r, *addc_r_t): Use treg_set_expr_not_const01.
    (*addc_r_r_1, *addc_r_lsb, *addc_r_r_lsb, *addc_r_lsb_r, *addc_r_msb,
    *addc_r_r_msb, *addc_2r_msb): Delete.
    (*addc_2r_lsb): Rename to *addc_2r_t.  Use treg_set_expr.  Add operand
    order variant.
    (*addc_negreg_t): New insn_and_split.
    (*subc): Convert to insn_and_split.  Use treg_set_expr as 3rd operand.
    Try to recombine with surrounding insns when splitting.
    Add operand order variants.  
    (*subc_negt_reg, *subc_negreg_t, *reg_lsb_t, *reg_msb_t): New
    insn_and_split patterns.
    (*rotcr): Use arith_reg_or_treg_set_expr.  Try to recombine with
    surrounding insns when splitting.
    (unnamed rotcr split): Use arith_reg_or_treg_set_expr.
    (*rotcl): Likewise.  Add zero_extract variant.
    (*ashrsi2_31): New insn_and_split.
    (*negc): Convert to insn_and_split.  Use treg_set_expr.
    (*zero_extend<mode>si2_disp_mem): Update comment.
    (movrt_negc, *movrt_negc, nott): Add !sh_in_recog_treg_set_expr split
    condition.
    (*mov_t_msb_neg, mov_neg_si_t): Use treg_set_expr.  Try to recombine
    with surrounding insns when splitting.
    (any_treg_expr_to_reg): New insn_and_split.
    (*neg_zero_extract_0, *neg_zero_extract_1, *neg_zero_extract_2,
    *neg_zero_extract_3, *neg_zero_extract_4, *neg_zero_extract_5,
    *neg_zero_extract_6, *zero_extract_0, *zero_extract_1,
    *zero_extract_2): New single bit zero extract patterns.
    (bld_reg, *bld_regqi): Fold into bld<mode>_reg.
    (*get_thread_pointersi, store_gbr, *mov<mode>_gbr_load,
    *mov<mode>_gbr_load, *mov<mode>_gbr_load, *mov<mode>_gbr_load,
    *movdi_gbr_load): Use arith_reg_dest instead of register_operand for
    set destination.
    (set_thread_pointersi, load_gbr): Use arith_reg_operand instead of
    register_operand for set source.

gcc/testsuite/
    PR target/49263
    PR target/53987
    PR target/64345
    PR target/59533
    PR target/52933
    PR target/54236
    PR target/51244
    * gcc.target/sh/pr64345-1.c: New.
    * gcc.target/sh/pr64345-2.c: New.
    * gcc.target/sh/pr59533-1.c: New.
    * gcc.target/sh/pr49263.c: Adjust matching of expected insns.
    * gcc.target/sh/pr52933-2.c: Likewise.
    * gcc.target/sh/pr54089-1.c: Likewise.
    * gcc.target/sh/pr54236-1.c: Likewise.
    * gcc.target/sh/pr51244-20-sh2a.c: Likewise.
    * gcc.target/sh/pr49263-1.c: Remove xfails.
    * gcc.target/sh/pr49263-2.c: Likewise.
    * gcc.target/sh/pr49263-3.c: Likewise.
    * gcc.target/sh/pr53987-1.c: Likewise.
    * gcc.target/sh/pr52933-1.c: Adjust matching of expected insns.
    (test_24, test_25, test_26, test_27, test_28, test_29, test_30): New.
    * gcc.target/sh/pr51244-12.c: Adjust matching of expected insns.
    (test05, test06, test07, test08, test09, test10, test11, test12): New.
    * gcc.target/sh/pr54236-3.c: Adjust matching of expected insns.
    (test_002, test_003, test_004, test_005, test_006, test_007, test_008,
    test_009): New.
    * gcc.target/sh/pr51244-4.c: Adjust matching of expected insns.
    (test_02): New.

Added:
    trunk/gcc/testsuite/gcc.target/sh/pr59533-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr64345-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr64345-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/sh/predicates.md
    trunk/gcc/config/sh/sh-protos.h
    trunk/gcc/config/sh/sh.c
    trunk/gcc/config/sh/sh.md
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/sh/pr49263-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr49263-2.c
    trunk/gcc/testsuite/gcc.target/sh/pr49263-3.c
    trunk/gcc/testsuite/gcc.target/sh/pr49263.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-12.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-20-sh2a.c
    trunk/gcc/testsuite/gcc.target/sh/pr51244-4.c
    trunk/gcc/testsuite/gcc.target/sh/pr52933-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr52933-2.c
    trunk/gcc/testsuite/gcc.target/sh/pr53987-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr54089-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr54236-1.c
    trunk/gcc/testsuite/gcc.target/sh/pr54236-3.c


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extensions
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2015-01-24 13:06 ` olegendo at gcc dot gnu.org
@ 2015-01-27 23:08 ` olegendo at gcc dot gnu.org
  2015-08-01  9:57 ` olegendo at gcc dot gnu.org
  2015-09-09 12:15 ` olegendo at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-01-27 23:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #11 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #9)
> Another related issue is this example function:
> 
> unsigned char h (unsigned char a, int b)
> {
>   return (unsigned char)(a + b);
> }
> 
> It seems that the fwprop1 pass causes this.

I've tried disabling fwprop and matching of combined insns with hard regs in
sh_legitimate_combined_insn.  But it doesn't help in this case.  The problem is
that

(insn 8 7 13 2 (set (reg:SI 169)
        (plus:SI (reg/v:SI 165 [ a ])
                  ^^^^^^^^
            (subreg:SI (reg:QI 168 [ b ]) 0))) sh_tmp.cpp:14 63
{*addsi3_compact}

the reg:SI 165 is zero_extended beforehand and then of course also the
subreg:SI (reg:QI 168) doesn't help it anymore.
If -mrenesas is not specified the example compiles to

        mov     r5,r0
        add     r4,r0
        rts
        extu.b  r0,r0   // zero extension of return value mandated by gnu abi

which doesn't have the problem.  It's initially expanded with two zero_extend
(one on the function arg + one for the return value) and then combine figures
that the first zero_extend can be eliminated.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extensions
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2015-01-27 23:08 ` olegendo at gcc dot gnu.org
@ 2015-08-01  9:57 ` olegendo at gcc dot gnu.org
  2015-09-09 12:15 ` olegendo at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-08-01  9:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #12 from Oleg Endo <olegendo at gcc dot gnu.org> ---
The following case:

int
fun_01 (char* x, char* y, int z)
{
  return ((x[1] & x[2] & x[3]) == 0) + z;
}

results in:
        add     #1,r4
        mov.b   @r4+,r2
        mov     #0,r0
        mov.b   @r4+,r1
        and     r2,r1
        mov.b   @r4+,r2
        extu.b  r1,r1     <<< unnecessary
        tst     r2,r1
        rts     
        addc    r6,r0

The tstsi_t splitter has some code to eliminate some sign/zero extensions but
it doesn't see through the and/or/xor insns.  Maybe it's better to a separate
pass that optimizes sign/zero extensions instead of adding more clutter to the
splitters.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Bug target/53987] [SH] Unnecessary zero-extensions
  2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2015-08-01  9:57 ` olegendo at gcc dot gnu.org
@ 2015-09-09 12:15 ` olegendo at gcc dot gnu.org
  12 siblings, 0 replies; 14+ messages in thread
From: olegendo at gcc dot gnu.org @ 2015-09-09 12:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #13 from Oleg Endo <olegendo at gcc dot gnu.org> ---
attachment 36309 of PR 67506 is another example.  It contains the following
code sequence:

        mov.b   @r2,r0   <<
        extu.b  r0,r0    << 
        tst     #128,r0
        bt/s    .L11
        extu.w  r0,r2    <<
        mov     r1,r2


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-09-09 12:15 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-16 23:50 [Bug target/53987] New: [SH] Unnecessary zero-extension before cmp/eq olegendo at gcc dot gnu.org
2012-08-31 18:14 ` [Bug target/53987] " olegendo at gcc dot gnu.org
2013-10-29 21:58 ` olegendo at gcc dot gnu.org
2013-12-16 19:07 ` olegendo at gcc dot gnu.org
2014-12-17 21:09 ` olegendo at gcc dot gnu.org
2014-12-21 16:54 ` [Bug target/53987] [SH] Unnecessary zero-extensions olegendo at gcc dot gnu.org
2014-12-22 22:27 ` olegendo at gcc dot gnu.org
2014-12-25  1:11 ` olegendo at gcc dot gnu.org
2014-12-30 17:26 ` olegendo at gcc dot gnu.org
2015-01-15  1:37 ` olegendo at gcc dot gnu.org
2015-01-24 13:06 ` olegendo at gcc dot gnu.org
2015-01-27 23:08 ` olegendo at gcc dot gnu.org
2015-08-01  9:57 ` olegendo at gcc dot gnu.org
2015-09-09 12:15 ` olegendo at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).