public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/54673] New: [SH] Unnecessary sign extension of logical operation results
@ 2012-09-22 23:12 olegendo at gcc dot gnu.org
  2012-10-27 13:15 ` [Bug target/54673] " olegendo at gcc dot gnu.org
  2013-02-17 11:38 ` olegendo at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-09-22 23:12 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54673

             Bug #: 54673
           Summary: [SH] Unnecessary sign extension of logical operation
                    results
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: olegendo@gcc.gnu.org
            Target: sh*-*-*


Logical operations already sign extended values as inputs cause additional sign
extensions of intermediate values.  For example:

int test00 (char* a, short* b, int c)
{
  return a[0] ^ a[1] ^ c;
}

becomes:
  mov.b    @(1,r4),r0
  mov.b    @r4,r1
  xor      r1,r0
  exts.b   r0,r0
  rts
  xor      r6,r0


int test01 (char* a, short* b, int c)
{
  return a[0] | a[1] | c;
}

becomes:
  mov.b    @(1,r4),r0
  mov.b    @r4,r1
  or       r1,r0
  exts.b   r0,r0
  rts
  or       r6,r0


int test02 (char* a, short* b, int c)
{
  return a[0] & a[1] & c;
}

becomes:
  mov.b    @(1,r4),r0
  mov.b    @r4,r1
  and      r1,r0
  exts.b   r0,r0
  rts
  and      r6,r0


This seems to be caused by the fact that patterns for xorsi3, iorsi3 and andsi3
allow matching 'subreg' in their operands.  In the combine pass it can be
observed that it tries combinations such as:
Failed to match this instruction:
(set (reg:SI 173 [ D.1867 ])
    (sign_extend:SI (subreg:QI (xor:SI (subreg:SI (mem:QI (plus:SI (reg/v/f:SI
166 [ a ])
                            (const_int 1 [0x1])) [0 MEM[(char *)a_2(D) + 1B]+0
S1 A8]) 0)
                (subreg:SI (reg:QI 171 [ *a_2(D) ]) 0)) 3)))

which doesn't go anywhere.
I have quickly tried out prohibiting subreg in the operands of the
*xorsi3_compact insn and the sign extension (as well as the subreg orgy in
combine) disappeared.  However, I guess that not matching 'subreg' in the
logical patterns will cause problems with DImode logical ops, since they are
split into SImode ops working on subregs.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug target/54673] [SH] Unnecessary sign extension of logical operation results
  2012-09-22 23:12 [Bug target/54673] New: [SH] Unnecessary sign extension of logical operation results olegendo at gcc dot gnu.org
@ 2012-10-27 13:15 ` olegendo at gcc dot gnu.org
  2013-02-17 11:38 ` olegendo at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: olegendo at gcc dot gnu.org @ 2012-10-27 13:15 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54673

Oleg Endo <olegendo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-10-27
         AssignedTo|unassigned at gcc dot       |olegendo at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> 2012-10-27 13:15:12 UTC ---
Created attachment 28542
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28542
WIP patch (little endian only)

I've been trying out a couple of things.  It seems that disallowing subregs in
operands of logical ops introduces other missed optimization opportunities,
such as:

void test02452 (unsigned char* a, int* b, int c)
{
  b[0] = a[0] & 251;
}

where this would expand into something like 'a[0] & -5' and the 'and #imm,r0'
insn would thus not be used.

The actual problem seems to be the original expansion of memory loads, where
sometimes e.g. the movqi pattern is used to load a QImode mem into a QImode reg
without zero/sign extensions.  Thus subsequent uses of the loaded mem do not
know that it's been sign-extended by the load.
The attached patch fixes this in the movqi / movhi expanders and the redundant
extensions disappear.  However, it has an impact on the tst insns and overall
register allocation.  The tst insns cases can be handled by a few additional
patterns, but the register allocation issue leads to a couple of rather huge
code size increases in some of the CSiBE files (-O2 -m4-single -ml
-mpretend-cmove), with the following peaks:

bzip2-1.0.2
  compress       12172 -> 12708       +536 / +4.403549 %

mpeg2dec-0.3.1
  libmpeg2/motion_comp            4860 -> 5020        +160 / +3.292181 %
  libvo/yuv2rgb                   3856 -> 4056        +200 / +5.186722 %


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Bug target/54673] [SH] Unnecessary sign extension of logical operation results
  2012-09-22 23:12 [Bug target/54673] New: [SH] Unnecessary sign extension of logical operation results olegendo at gcc dot gnu.org
  2012-10-27 13:15 ` [Bug target/54673] " olegendo at gcc dot gnu.org
@ 2013-02-17 11:38 ` olegendo at gcc dot gnu.org
  1 sibling, 0 replies; 3+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-02-17 11:38 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54673

--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> 2013-02-17 11:38:21 UTC ---
A related reduced example from CSiBE bzip2.c (compressStream):


typedef struct
{
  int _r[3];
  short _flags;
} FILE;


void BZ2_bzWriteOpen (int* e, FILE* f, int a, int b, int c);

FILE* outputHandleJustInCase;

void
compressStream (FILE *stream, unsigned char* ibuf)
{
  int bzerr;

  if ((stream->_flags & 0x0040) != 0)
    return;

  BZ2_bzWriteOpen (&bzerr, stream, 0, 0, 0);

  while (fgetc (stream) != -1)
    fread (ibuf, 1, 5000, stream);

  if ((stream->_flags & 0x0040) != 0)
    return;

  outputHandleJustInCase = 0;
}


... compiled with -O2 -m4 it contains the following sequence twice:

        mov.w   @(12,r4),r0
        and     #64,r0
        exts.w  r0,r1
        tst     r1,r1

the exts.w can be replaced with a reg copy, which is better for SH2A and SH4*:

        mov.w   @(12,r4),r0
        and     #64,r0
        mov     r0,r1
        tst     r1,r1


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2013-02-17 11:38 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-22 23:12 [Bug target/54673] New: [SH] Unnecessary sign extension of logical operation results olegendo at gcc dot gnu.org
2012-10-27 13:15 ` [Bug target/54673] " olegendo at gcc dot gnu.org
2013-02-17 11:38 ` olegendo at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).