public inbox for gcc-help@gcc.gnu.org
* fuse multiple ops into one new op
From: Cherry Vanc @ 2014-08-01 23:18 UTC
  To: gcc-help

I need to fuse multiple instructions into a single one.
...
r1 = (r1) op1 (const)
...
...
r1 = (r1) op2 (r2)
...
...
r3 = op3 (r1)
...

I defined a peephole2 pattern in my GCC backend .md file. If these
three instructions are contiguous, then I do get my test "testnew"
instruction. If these instructions are far apart, I don't.

(define_peephole2
  [(set (match_operand:DI 0 "register_operand" "")
        (op1:DI (match_dup 0) (match_operand:SI 1 "immediate_operand" "")))
   (set (match_dup 0)
        (op2:DI (match_operand:DI 2 "register_operand" "") (match_dup 0)))
   (set (match_dup 0)
        (sign_extend:DI (op3:SI (match_dup 0))))]
  "TARGET_MYCORE"
  [(set (match_dup 0)
        (sign_extend:DI (op3:SI (op2:SI (op1:SI (match_dup 0) (match_dup 1))
                                        (match_dup 0)))))]
  "")

(define_insn "*testnew"
  [(set (match_operand:DI 0 "register_operand" "=d")
        (sign_extend:DI (op3:SI (op2:SI (op1:SI (match_dup 0)
                                                (match_operand:SI 1 "immediate_operand" "I"))
                                        (match_dup 0)))))]
  "TARGET_MYCORE"
  "testnew 36"
  [(set_attr "mode" "DI")])

How can I fuse multiple instructions that are far apart into the new
single opcode that MYCORE has?

* Re: fuse multiple ops into one new op
From: Marc Glisse @ 2014-08-02  7:02 UTC
  To: Cherry Vanc; +Cc: gcc-help

On Fri, 1 Aug 2014, Cherry Vanc wrote:

> I need to fuse multiple instructions into a single one.
> ...
> r1 = (r1) op1 (const)
> ...
> ...
> r1 = (r1) op2 (r2)
> ...
> ...
> r3 = op3 (r1)
> ...
>
> I defined a peephole2 pattern in my GCC backend .md file. If these
> three instructions are contiguous, then I do get my test "testnew"
> instruction. If these instructions are far apart, I dont.
>
> (define_peephole2
>  [(set (match_operand:DI 0 "register_operand" "")
>    (op1:DI (match_dup 0) (match_operand:SI 1 "immediate_operand" "") ))
>  (set (match_dup 0)
>    (op2:DI (match_operand:DI 2 "register_operand" "") (match_dup 0)))
>  (set (match_dup 0)
>    (sign_extend:DI (op3:SI (match_dup 0))))]
>  "TARGET_MYCORE"
>  [(set (match_dup 0) (sign_extend:DI (op3:SI (op2:SI (op1:SI
> (match_dup 0) (match_dup 1)) (match_dup 0)))))]
>  "")
>
> (define_insn "*testnew"
>  [(set (match_operand:DI 0 "register_operand" "=d")
>        (sign_extend:DI (op3:SI (op2:SI (op1:SI (match_dup 0)
> (match_operand:SI 1 "immediate_operand" "I")) (match_dup 0)))))]
>  "TARGET_MYCORE"
>  "testnew 36"
>  [(set_attr "mode" "DI")])
>
> How can I fuse multiple instructions that are far apart into a new
> single opcode that MYCORE has ?

Hello,

I probably haven't looked closely enough, but could you explain why 
the 'combine' pass isn't already doing what you want?

-- 
Marc Glisse

* Re: fuse multiple ops into one new op
From: Oleg Endo @ 2014-08-02  9:09 UTC
  To: gcc-help; +Cc: Cherry Vanc


On 02 Aug 2014, at 09:02, Marc Glisse <marc.glisse@inria.fr> wrote:

> Hello,
> 
> I probably haven't looked closely enough, but could you explain why the 'combine' pass isn't already doing what you want?

Yes, this kind of thing is usually done by the combine pass.
However, it does not try out all permutations of instructions; it
follows a fixed set of rules.  Thus the patterns in the .md file have
to match combine's expectations.  To see which patterns it tries,
look at the combine RTL pass dump.  From there it should be rather
easy to write down the expected pattern.  Note also that a pattern
will sometimes not be picked if the rtx costs are off.  Again, see
combine's dump for when this happens.

Cheers,
Oleg

* Re: fuse multiple ops into one new op
From: Jeff Law @ 2014-08-02  9:52 UTC
  To: Cherry Vanc, gcc-help

On 08/01/14 17:18, Cherry Vanc wrote:
> I need to fuse multiple instructions into a single one.
> ...
> r1 = (r1) op1 (const)
> ...
> ...
> r1 = (r1) op2 (r2)
> ...
> ...
> r3 = op3 (r1)
> ...
>
> I defined a peephole2 pattern in my GCC backend .md file. If these
> three instructions are contiguous, then I do get my test "testnew"
> instruction. If these instructions are far apart, I dont.
>
> (define_peephole2
>    [(set (match_operand:DI 0 "register_operand" "")
>      (op1:DI (match_dup 0) (match_operand:SI 1 "immediate_operand" "") ))
>    (set (match_dup 0)
>      (op2:DI (match_operand:DI 2 "register_operand" "") (match_dup 0)))
>    (set (match_dup 0)
>      (sign_extend:DI (op3:SI (match_dup 0))))]
>    "TARGET_MYCORE"
>    [(set (match_dup 0) (sign_extend:DI (op3:SI (op2:SI (op1:SI
> (match_dup 0) (match_dup 1)) (match_dup 0)))))]
>    "")
>
> (define_insn "*testnew"
>    [(set (match_operand:DI 0 "register_operand" "=d")
>          (sign_extend:DI (op3:SI (op2:SI (op1:SI (match_dup 0)
> (match_operand:SI 1 "immediate_operand" "I")) (match_dup 0)))))]
>    "TARGET_MYCORE"
>    "testnew 36"
>    [(set_attr "mode" "DI")])
>
> How can I fuse multiple instructions that are far apart into a new
> single opcode that MYCORE has ?
I suspect the problem is that "r1" is set/used multiple times.  That will 
inhibit instruction combination.  If at all possible you really want 
that code to look like:


r4 = (r1) op1 (const)  /* r1 dies */
r5 = (r4) op2 (r2)     /* r4 and r2 die */
r3 = op3 (r5)          /* r5 dies */


Then the combiner will attempt to combine those instructions in the 
obvious ways.  For the combiner you want to use a define_insn pattern.

define_peephole2 is primarily used in cases where there is no obvious 
dataflow between the patterns.
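
As an illustration of the single-set shape described above (a sketch
only, not taken from any real port): the codes ashift, plus and not
merely stand in for op1, op2 and op3, the register numbers are made up,
and the pattern name "*fused_sketch" and its output template are
hypothetical.  The exact canonical form that combine builds should be
read from the combine dump rather than assumed.

```
;; Before combine: three single-set insns linked only through pseudos.
;;   (set (reg:DI 101) (ashift:DI (reg:DI 100) (const_int 2)))   ; op1
;;   (set (reg:DI 102) (plus:DI (reg:DI 101) (reg:DI 99)))       ; op2
;;   (set (reg:DI 103) (not:DI (reg:DI 102)))                    ; op3
;;
;; If 101 and 102 die at their single use, combine substitutes them
;; and tries to recognize one nested set:
;;   (set (reg:DI 103)
;;        (not:DI (plus:DI (ashift:DI (reg:DI 100) (const_int 2))
;;                         (reg:DI 99))))

;; Hypothetical pattern matching that nested set.
(define_insn "*fused_sketch"
  [(set (match_operand:DI 0 "register_operand" "=d")
        (not:DI (plus:DI (ashift:DI (match_operand:DI 1 "register_operand" "d")
                                    (match_operand 2 "const_int_operand" "n"))
                         (match_operand:DI 3 "register_operand" "d"))))]
  "TARGET_MYCORE"
  "fused\t%0,%1,%2,%3"
  [(set_attr "mode" "DI")])
```

Unlike a peephole2, a pattern of this shape does not depend on the
three insns being adjacent, since combine follows the data dependencies
between them rather than their position in the insn stream.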


Jeff


* Re: fuse multiple ops into one new op
From: Cherry Vanc @ 2014-08-05 23:26 UTC
  To: Jeff Law; +Cc: gcc-help

Thanks. I am now using a define_insn based on your inputs :

(define_insn "testnew36"
  [(set (match_operand:DI 0 "register_operand" "")
    (op1:DI (match_operand:DI 1 "register_operand" "")
            (match_operand:SI 2 "immediate_operand" "")))
  (set (match_operand:DI 3 "register_operand" "")
    (op2:DI (match_operand:DI 4 "register_operand" "") (match_dup 0)))
  (set (match_operand:DI 5 "register_operand" "")
    (sign_extend:DI (op3:SI (match_dup 3))))]
  "TARGET_MYCORE"
  "testnew 36"
  [(set_attr "mode" "DI")])

Why doesn't -fdump-rtl-all-all / -fdump-rtl-all generate those .life
and .combine files so that I can take a look at what the combine pass
is doing? -fdump-rtl-combine doesn't produce anything either. MYCORE is
a MIPS adaptation using GCC 4.9.0.


* Re: fuse multiple ops into one new op
From: Cherry Vanc @ 2014-08-05 23:27 UTC
  To: Jeff Law; +Cc: gcc-help

I forgot to mention that this define_insn pattern doesn't work for me.

On Tue, Aug 5, 2014 at 4:26 PM, Cherry Vanc <cherry.vanc@gmail.com> wrote:
> Thanks. I am now using a define_insn based on your inputs :
>
> (define_insn "testnew36"
>   [(set (match_operand:DI 0 "register_operand" "")
>     (op1:DI (match_operand:DI 1 "register_operand" "")
> (match_operand:SI 2 "immediate_operand" "") ))
>   (set (match_operand:DI 3 "register_operand" "")
>     (op2:DI (match_operand:DI 4 "register_operand" "") (match_dup 0)))
>   (set (match_operand:DI 5 "register_operand" "")
>     (sign_extend:DI (op3:SI (match_dup 3))))]
>   "TARGET_MYCORE"
>   "testnew 36"
>   [(set_attr "mode" "DI")])
>
> Why doesnt -fdump-rtl-all-all / -fdump-rtl-all generate those .life
> and .combine files so that I can take a look at the combine pass is
> doing ? dump-rtl-combine doesnt spit anything either. MYCORE is a mips
> adaptation using GCC 4.9.0.

* Re: fuse multiple ops into one new op
From: Marc Glisse @ 2014-08-06  5:51 UTC
  To: Cherry Vanc; +Cc: Jeff Law, gcc-help

On Tue, 5 Aug 2014, Cherry Vanc wrote:

> Thanks. I am now using a define_insn based on your inputs :
>
> (define_insn "testnew36"
>  [(set (match_operand:DI 0 "register_operand" "")
>    (op1:DI (match_operand:DI 1 "register_operand" "")
> (match_operand:SI 2 "immediate_operand" "") ))
>  (set (match_operand:DI 3 "register_operand" "")
>    (op2:DI (match_operand:DI 4 "register_operand" "") (match_dup 0)))
>  (set (match_operand:DI 5 "register_operand" "")
>    (sign_extend:DI (op3:SI (match_dup 3))))]
>  "TARGET_MYCORE"
>  "testnew 36"
>  [(set_attr "mode" "DI")])

Er, no, that's not what was recommended. Your *testnew in the previous 
email was much better.

> Why doesnt -fdump-rtl-all-all / -fdump-rtl-all generate those .life
> and .combine files so that I can take a look at the combine pass is
> doing ? dump-rtl-combine doesnt spit anything either. MYCORE is a mips
> adaptation using GCC 4.9.0.

Are you sure compiling file.c with options -O -da (or any of the options 
you tried) doesn't create file.c.201r.combine (number can vary)? You'll 
need to debug that first then.

-- 
Marc Glisse

* Re: fuse multiple ops into one new op
From: Cherry Vanc @ 2014-08-07 18:35 UTC
  To: gcc-help; +Cc: Jeff Law

Thanks all for your comments. Posting my comments for posterity.

I defined a define_insn pattern as follows and it worked well for me :

  (define_insn "testnew36"
    [(set (match_operand:DI 0 "register_operand" "=d")
          (op3:DI (op2:DI (op1:DI (match_operand:DI 1 "register_operand" "")
                                  (match_operand:SI 2 "immediate_operand" ""))
                          (match_operand:DI 3 "register_operand" ""))))]
    "TARGET_MYCORE"
    "testnew36"
    [(set_attr "mode" "DI")
     (set_attr "length" "4")])

I am still working on a clean way to update the rtx_costs. But once
the insn costs are in place (I put in a somewhat dirty hack for now),
GCC's combine pass does fuse these ops for me. There are a few cases
where the combine pass falters and produces suboptimal patterns.

This happens when (I think) GCC decides that the result of the
op1 + op2 combination is required by a later insn:

(parallel [
        (set (reg:DI 256 [ *_15 ])
            (op3:DI (op2:DI (op1:DI (reg:DI 202 [ D.1563 ])
                        (const_int 4 [0x4]))
                    (reg/v/f:DI 242 [ inbuf ])) [0 *_15+0 S4 A32]))
        (set (reg/f:DI 205 [ D.1566 ])
            (op2:DI (op1:DI (reg:DI 202 [ D.1563 ])
                    (const_int 4 [0x4]))
                (reg/v/f:DI 242 [ inbuf ])))
    ])

The second set in the above parallel expression can be combined
with another define_insn pattern that fuses op1, op2 and "op4" into a
new insn "testnew40". Is there a way to accomplish this? When does
the combine pass create these parallel expressions?

To be clear, I have two define_insn patterns at the moment :

1. testnew36 (fuses op1, op2, and op3)
2. testnew40 (fuses op1, op2, and op4)

So a stream of insns like below :

...
op1
...
op2 (consumes result of op1)
...
op3 (consumes result of op2)
...
op4 (consumes result of op2)
...

gets translated to :

...
testnew36
...
testnew40

