public inbox for gcc-bugs@sourceware.org
* [Bug target/56508] New: [SH] Add support for rtv/n instruction
@ 2013-03-02 12:59 olegendo at gcc dot gnu.org
  2013-03-10 12:08 ` [Bug target/56508] " olegendo at gcc dot gnu.org
  0 siblings, 1 reply; 2+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-03-02 12:59 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56508

             Bug #: 56508
           Summary: [SH] Add support for rtv/n instruction
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: olegendo@gcc.gnu.org
            Target: sh2a*-*-*


The SH2A supports a variant of the classic 'rts' instruction which can
shorten function returns.  For example, the sequence:

    ... some code, result value in r3 ...
    rts
    mov   r3,r0

on SH2A can be done in one instruction:

    rtv/n r3



* [Bug target/56508] [SH] Add support for rtv/n instruction
  2013-03-02 12:59 [Bug target/56508] New: [SH] Add support for rtv/n instruction olegendo at gcc dot gnu.org
@ 2013-03-10 12:08 ` olegendo at gcc dot gnu.org
  0 siblings, 0 replies; 2+ messages in thread
From: olegendo at gcc dot gnu.org @ 2013-03-10 12:08 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56508

--- Comment #1 from Oleg Endo <olegendo at gcc dot gnu.org> 2013-03-10 12:08:24 UTC ---
I've only looked briefly at how this could be implemented.
As far as I can see, there are two basic cases:

1) 

int test0 (int a, int b)
{
  return a;
}

currently compiles to:

        rts
        mov     r4,r0


In this case there is a reg copy of the return value to 'r0' before the return
insn.  Such patterns usually show RTL that goes something like (after prologue
and epilogue pass):

(insn 13 20 16 2 (set (reg/i:SI 0 r0)
        (reg/v:SI 4 r4 [orig:161 a ] [161])) sh_tmp.cpp:10 244 {movsi_ie}
     (expr_list:REG_DEAD (reg/v:SI 4 r4 [orig:161 a ] [161])
        (nil)))
(insn 16 13 25 2 (use (reg/i:SI 0 r0)) sh_tmp.cpp:10 -1
     (nil))
(note 25 16 26 2 NOTE_INSN_EPILOGUE_BEG)
(jump_insn 26 25 27 2 (return) sh_tmp.cpp:10 372 {*return_i}
     (nil)
 -> return)
(barrier 27 26 23)
(note 23 27 0 NOTE_INSN_DELETED)

This would be the easiest to convert to rtv/n.  E.g. in the split4 pass (after
register allocation, prologue and epilogue generation, etc., but before delayed
branch scheduling), the return insn (*return_i above) could be converted into
rtv/n by walking up the insn list starting from the return insn and looking for
the reg copy insn that sets r0.  If the source of the reg copy is a GP reg, it
can be combined with the return insn as rtv/n.
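
The walk described above can be sketched roughly as follows.  This is only a
toy model, not GCC code: 'struct insn', 'enum insn_kind' and
'find_rtvn_source' are hypothetical names made up for illustration, and the
real pass would work on RTL and would need proper register use analysis.  The
sketch conservatively gives up as soon as it sees anything other than a note
or a use insn between the reg copy and the return.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the insn chain (not the GCC RTL API).  */
enum insn_kind { INSN_COPY, INSN_USE, INSN_NOTE, INSN_OTHER, INSN_RETURN };

struct insn
{
  enum insn_kind kind;
  int dst;  /* destination hard reg number, only meaningful for INSN_COPY */
  int src;  /* source hard reg number, only meaningful for INSN_COPY */
};

#define RETURN_REG   0   /* r0 */
#define LAST_GP_REG 15   /* r0..r15 are the SH general purpose regs */

/* Walk backwards from the return insn at index RET_IDX.  Return the source
   reg of a reg copy into r0 that could be folded into 'rtv/n', or -1 if no
   such copy is found.  */
int
find_rtvn_source (const struct insn *insns, size_t ret_idx)
{
  if (insns[ret_idx].kind != INSN_RETURN)
    return -1;

  for (size_t i = ret_idx; i-- > 0; )
    {
      const struct insn *p = &insns[i];

      /* Notes and use insns do not change any register, skip them.  */
      if (p->kind == INSN_NOTE || p->kind == INSN_USE)
        continue;

      /* A copy from another GP reg into r0 is the candidate we want.  */
      if (p->kind == INSN_COPY && p->dst == RETURN_REG
          && p->src >= 0 && p->src <= LAST_GP_REG && p->src != RETURN_REG)
        return p->src;

      /* Anything else might touch r0 or the source reg, so give up.  */
      return -1;
    }

  return -1;
}
```

For the insn sequence of test0 (copy r4 -> r0, use, note, return) this model
yields 4, i.e. 'rtv/n r4'.  For a sequence where an arithmetic insn sits
directly before the return it yields -1.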


2) 

int test1 (int a, int b)
{
  return a + b;
}

currently compiles to:

        mov     r4,r0
        rts
        add     r5,r0


In this case register allocation already prepares the return value in r0.  This
is a simplified case, so there can be more preceding insns working on the
return value that is allocated to 'r0' early on.  In such cases there will not
be a reg copy to 'r0' before the return insn, so the 'combine approach' from
case 1 won't work here -- it would require additional register use analysis and
register renaming.
The obstacle for this case is that the reg copy and use insns for the return
value are generated during initial RTL expansion based on what the
TARGET_FUNCTION_VALUE hook says, but return insns are expanded after register
allocation.  Thus register allocation can't be influenced.
An option would be to return a pseudo reg in TARGET_FUNCTION_VALUE (for
outgoing == true) instead of a hard reg and to add the reg copy during return
insn expansion.  However, register allocation will then try to avoid using 'r0'
(because of allocation order).  Such cases would then result in:

int test3 (char* a)
{
  return a[0];
}

        mov.b   @r4,r1
        rts
        mov     r1,r0

or with rtv/n applied:

        mov.b   @r4,r1
        rtv/n   r1

although the better way would be to utilize the delay slot of rts:

        rts
        mov.b   @r4,r0


I guess to support this in a generic way (which could also be used by other
targets) the register for the return value should be initially left open (by
returning a pseudo in TARGET_FUNCTION_VALUE) and the register allocator should
be told a preferred register for the return value.  If the return value then
does not end up in the required register, the epilogue expansion would emit the
reg copy insn, or 'rtv/n' on SH2A.
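
The decision the epilogue expansion would have to make can be sketched as
below.  Again this is just a toy model under assumptions: 'enum return_seq' and
'choose_return_seq' are hypothetical names, not existing GCC functions, and
the register the allocator picked is simply passed in as a number.

```c
#include <stdbool.h>

/* Hypothetical classification of the return sequences discussed above.  */
enum return_seq
{
  RETURN_RTS,           /* value already in r0: plain 'rts' */
  RETURN_COPY_AND_RTS,  /* 'rts' with 'mov rN,r0' (e.g. in the delay slot) */
  RETURN_RTVN           /* SH2A only: 'rtv/n rN' */
};

/* VALUE_REG is the hard reg the return value ended up in after register
   allocation.  TARGET_HAS_RTVN says whether the target has 'rtv/n'.  */
enum return_seq
choose_return_seq (int value_reg, bool target_has_rtvn)
{
  /* The preferred register was honored, nothing extra to emit.  */
  if (value_reg == 0)
    return RETURN_RTS;

  /* Otherwise either fold the copy into the return insn or emit it.  */
  return target_has_rtvn ? RETURN_RTVN : RETURN_COPY_AND_RTS;
}
```

With the test3 example, where the value ends up in r1, this would pick
'rtv/n r1' on SH2A and the 'mov r1,r0' reg copy elsewhere.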

