* [Bug target/59239] New: [SH] Improve decrement-and-test insn
@ 2013-11-21 22:00 olegendo at gcc dot gnu.org
  2013-11-21 22:08 ` [Bug target/59239] " olegendo at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-11-21 22:00 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59239

            Bug ID: 59239
           Summary: [SH] Improve decrement-and-test insn
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: olegendo at gcc dot gnu.org
            Target: sh*-*-*

Created attachment 31264
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31264&action=edit
Possible cleanup patch

The decrement-and-test insn seems to rely on the define_peephole.  At least
taking it out shows some missed opportunities in the CSiBE set.
I've tried replacing it with a define_peephole2, but it would still show missed
cases such as:
   mov.l  @(...), Rn
   add    #-1,Rn
   mov.l  Rn,@(...)
   tst    Rn,Rn

Towards the very end of compilation the insns often get reordered to something
like
   mov.l  @(...), Rn
   add    #-1,Rn
   tst    Rn,Rn
   mov.l  Rn,@(...)

and the define_peephole will catch it in the final RTL pass when outputting asm
code.

Attached is a patch that replaces the old define_peephole with a manual insn
combine in cmpeqsi_t.  This catches some more of those cases where the
individual decrement and test insns are interleaved with something else.

There is one weird case in CSiBE in linux-2.4.23-pre3-testplatform/fs/iobuf.c
(alloc_kiobuf_bhs) though, where the individual decrement and test insns end up
in different basic blocks and only at the very end are emitted right next to
each other without a label in between.



* [Bug target/59239] [SH] Improve decrement-and-test insn
From: olegendo at gcc dot gnu.org @ 2013-11-21 22:08 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59239

--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> ---
Created attachment 31265
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31265&action=edit
Reduced test case

> There is one weird case in CSiBE in
> linux-2.4.23-pre3-testplatform/fs/iobuf.c (alloc_kiobuf_bhs) though, where
> the individual decrement and test insns end up in different basic blocks and
> only at the very end are emitted right next to each other without a label in
> between.

Attached is the reduced case of the above.



* [Bug target/59239] [SH] Improve decrement-and-test insn
From: olegendo at gcc dot gnu.org @ 2013-11-21 22:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59239

--- Comment #2 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #0)
> Created attachment 31264 [details]
> Possible cleanup patch

Applying the patch and replacing the "doloop_end" expander with

(define_expand "doloop_end"
  [(use (match_operand:SI 0 "arith_reg_dest"))  ; loop count pseudo
   (use (match_operand 1))]                     ; label
  "TARGET_SH2"
{
  emit_insn (gen_dect (operands[0], operands[0]));
  emit_jump_insn (gen_branch_false (operands[1]));
  DONE;
})

will expose the individual decrement-and-test and cbranch insns early.  This
changes the CSiBE results noticeably.  The decrement-and-test insn is used
less frequently and there are some code size increases, with the worst case
being

src/nrrd/convertNrrd         4868 -> 5240        +372 / +7.641742 %

On the other hand, it causes an overall code size decrease of 3944 bytes on
the whole CSiBE set, with some of the highlights being

teem-1.6.0-src  src/nrrd/tmfKernel  132652 -> 131520  -1132 / -0.853361 %
mpeg2dec-0.3.1  libmpeg2/decode     2820 -> 2620      -200 / -7.092199 %

It looks like fewer registers are used and some loop counter setup calculations
are smaller.
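
For context, here is a minimal C sketch of the kind of counted loop whose back
edge maps onto decrement, test and branch.  The function name and loop shape
are made up for illustration.  Whether GCC actually uses doloop_end or the dt
insn for it depends on the target options and the optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum n ints using a down-counting loop. */
int sum_countdown(const int *p, size_t n)
{
    int sum = 0;

    assert(p != NULL || n == 0);
    if (n == 0)
        return 0;

    do {
        sum += *p++;
    } while (--n != 0);   /* decrement the counter, test for zero, branch back */

    return sum;
}
```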



* [Bug target/59239] [SH] Improve decrement-and-test insn
From: olegendo at gcc dot gnu.org @ 2014-05-10 13:03 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59239

--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #0)
> 
> The decrement-and-test insn seems to rely on the define_peephole.  At least
> taking it out shows some missed opportunities in the CSiBE set.
> I've tried replacing it with a define_peephole2, but it would still show
> missed cases such as:
>    mov.l  @(...), Rn
>    add    #-1,Rn
>    mov.l  Rn,@(...)
>    tst    Rn,Rn

This happens especially when the mems are volatile.  An isolated example:

int test (volatile int* x, int a, int b, int c, int d)
{
  int xx = x[2] - 1;
  x[2] = xx;
  if (xx == 0)
    return a;

  return b + c;
}

Currently compiles to (-O2):
        mov.l   @(8,r4),r1
        add     #-1,r1
        mov.l   r1,@(8,r4)
        tst     r1,r1
        bt      .L5
        mov     r6,r5
        add     r7,r5
.L5:
        rts
        mov     r5,r0

which could be:
        mov.l   @(8,r4),r1
        dt      r1
        mov.l   r1,@(8,r4)
        bt      .L5
        mov     r6,r5
        add     r7,r5
.L5:
        rts
        mov     r5,r0
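
Why the two sequences are interchangeable can be checked with a toy model of
the register and the T bit.  This is only a sketch: the struct and function
names are invented here and have nothing to do with the GCC sources.  It
models "add #-1,Rn" as leaving T alone, "tst Rn,Rn" as T = ((Rn & Rn) == 0)
and "dt Rn" as Rn = Rn - 1, T = (Rn == 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one SH register plus the T bit; not GCC code. */
struct sh_state {
    uint32_t rn;
    bool t;
};

/* add #-1,Rn followed by tst Rn,Rn */
static struct sh_state add_then_tst(struct sh_state s)
{
    s.rn += UINT32_MAX;          /* add #-1,Rn: wraps modulo 2^32, T unchanged */
    s.t = (s.rn & s.rn) == 0;    /* tst Rn,Rn: T = 1 if the AND result is zero */
    return s;
}

/* dt Rn */
static struct sh_state dt(struct sh_state s)
{
    s.rn -= 1;                   /* Rn = Rn - 1 */
    s.t = (s.rn == 0);           /* T = 1 if the new Rn is zero */
    return s;
}

/* True if both sequences leave Rn and T in the same state. */
static bool same_effect(uint32_t rn, bool t_in)
{
    struct sh_state in = { rn, t_in };
    struct sh_state a = add_then_tst(in);
    struct sh_state b = dt(in);
    return a.rn == b.rn && a.t == b.t;
}

/* Spot-check a few values, including the wrap-around at zero. */
static void self_check(void)
{
    for (uint32_t v = 0; v < 16; v++)
        assert(same_effect(v, false) && same_effect(v, true));
    assert(same_effect(UINT32_MAX, false));
}
```

The store between the decrement and the test only reads Rn, so it does not
change this result.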

The problem seems to be that combine will not look across volatile mems when
picking insns for combination.

In this particular case it could be fixed by a manual combine step, i.e. make
the cmpeqsi_t an insn_and_split and, in the split pass after combine, check
whether one of the previous insns is a (plus (reg) (const_int -1)).  If it
matches, replace the insns.
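
The backward scan of such a manual combine step could look roughly like the
following toy model.  It is a sketch under stated assumptions, not the
attached patch: the insn representation, the opcode names and
fuse_dec_and_test are all hypothetical, and real code would work on RTL and
use GCC's own dataflow helpers.  The scan gives up as soon as the register is
redefined, the T bit is read or written, or a label is reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction model; hypothetical, unrelated to GCC's rtx types. */
enum op { OP_LOAD, OP_STORE, OP_ADD_IMM, OP_TST, OP_DT, OP_BRANCH_T,
          OP_LABEL, OP_NOP };

struct insn {
    enum op op;
    int reg;   /* register operand (unused for OP_BRANCH_T, OP_LABEL, OP_NOP) */
    int imm;   /* immediate, only meaningful for OP_ADD_IMM */
};

/* Does this insn overwrite the given register? */
static bool writes_reg(const struct insn *i, int reg)
{
    return (i->op == OP_LOAD || i->op == OP_ADD_IMM || i->op == OP_DT)
           && i->reg == reg;
}

/* Does this insn read or write the T bit? */
static bool touches_t(const struct insn *i)
{
    return i->op == OP_TST || i->op == OP_DT || i->op == OP_BRANCH_T;
}

/* Try to fuse the "tst Rn,Rn" at index tst_idx with an earlier
   "add #-1,Rn".  On success the add becomes a dt, the tst becomes a nop,
   and true is returned.  Otherwise the array is left untouched. */
static bool fuse_dec_and_test(struct insn *insns, size_t tst_idx)
{
    assert(insns != NULL);
    if (insns[tst_idx].op != OP_TST)
        return false;

    int reg = insns[tst_idx].reg;

    /* Walk backwards from the insn just before the tst. */
    for (size_t k = tst_idx; k-- > 0; ) {
        struct insn *prev = &insns[k];

        if (prev->op == OP_ADD_IMM && prev->reg == reg && prev->imm == -1) {
            prev->op = OP_DT;             /* add #-1,Rn -> dt Rn */
            insns[tst_idx].op = OP_NOP;   /* the tst is now redundant */
            return true;
        }

        /* Stop if Rn is redefined, T is used or set, or the block ends. */
        if (writes_reg(prev, reg) || touches_t(prev) || prev->op == OP_LABEL)
            return false;
    }
    return false;
}
```

A store of Rn between the add and the tst only reads the register, so the
scan steps over it, which is the interleaved case from the example above.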

