public inbox for gcc@gcc.gnu.org
* How about providing an interface to fusing instructions via scheduling
@ 2021-09-03 10:56 gengqi
  2021-09-03 11:11 ` Kyrylo Tkachov
  0 siblings, 1 reply; 3+ messages in thread
From: gengqi @ 2021-09-03 10:56 UTC (permalink / raw)
  To: gcc

When I was adding a pipeline description to my backend, some instructions
needed to be fused, and I found no suitable interface for implementing my
requirements.

My requirements are:

1. The scheduler should bring any two fusible instructions together; once
paired, the two instructions can sometimes be treated as one when they are
issued.

2. The two instructions work better only when they are immediately adjacent
to each other.

3. An instruction can be fused only once, i.e. if the current instruction
has been fused with the previous one, the next one cannot be fused with the
current one.
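
As an editorial illustration, the adjacency and fuse-once requirements can be stated as a small check over a final instruction order. This is only a sketch under stated assumptions: `Insn`, `count_fused_pairs` and the cmp/branch rule are hypothetical names invented for this example, not GCC interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction; only a kind tag is modelled.
struct Insn {
    std::string kind;
};

using FusePredicate = std::function<bool(const Insn &, const Insn &)>;

// Count the fused pairs in a final instruction order.
// Requirement 2: only immediately adjacent instructions are considered.
// Requirement 3: an instruction that closed a pair cannot open another.
std::size_t count_fused_pairs(const std::vector<Insn> &order,
                              const FusePredicate &can_fuse) {
    std::size_t pairs = 0;
    std::size_t i = 0;
    while (i + 1 < order.size()) {
        if (can_fuse(order[i], order[i + 1])) {
            ++pairs;
            i += 2;  // both instructions are consumed by this pair
        } else {
            ++i;
        }
    }
    return pairs;
}

// Illustrative rule (an assumption): a "cmp" fuses with a following "branch".
inline bool cmp_branch_rule(const Insn &prev, const Insn &curr) {
    return prev.kind == "cmp" && curr.kind == "branch";
}
```

Under this model, the order cmp, add, branch yields no fused pair, while add, cmp, branch yields one. Requirement 1 asks the scheduler to prefer the second order whenever dependencies allow it.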

 

I have looked at numerous interfaces in the “GCC Internals” manual. Each
implements some of my requirements, but none of them covers my needs
completely.

These interfaces are:

-      bool TARGET_SCHED_MACRO_FUSION_PAIR_P (rtx_insn *prev, rtx_insn
*curr)

The name of this interface looks a lot like what I need. In reality, I
found that it only fuses instructions that are already adjacent to each
other and does not schedule them together (does not satisfy 1). It may also
fuse 3 or more instructions (does not satisfy 3).
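
As an editorial illustration of the chaining concern, the toy model below glues each instruction to its predecessor whenever a pair predicate accepts the two. It is a sketch, not GCC code: `glue_adjacent` is a hypothetical name, and instructions are modelled as plain integers.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy model: glue each instruction to its predecessor whenever the pair
// predicate accepts (prev, curr), and return the size of every group.
std::vector<std::size_t> glue_adjacent(
    const std::vector<int> &insns,
    const std::function<bool(int, int)> &pair_p) {
    std::vector<std::size_t> groups;
    for (std::size_t i = 0; i < insns.size(); ++i) {
        if (i > 0 && pair_p(insns[i - 1], insns[i]))
            ++groups.back();      // curr joins the group that prev is in
        else
            groups.push_back(1);  // curr starts a new group
    }
    return groups;
}
```

If the predicate accepts both (A, B) and (B, C), the three instructions end up in one group of size 3. A purely pairwise predicate cannot express "B is already fused" unless the backend tracks that state somewhere else.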

 

-      void TARGET_SCHED_FUSION_PRIORITY (rtx_insn *insn, int max_pri, int
*fusion_pri, int *pri)

This interface is very powerful, but because it processes only one insn at
a time, it does not seem suitable for context-sensitive situations.


-      Use (define_bypass number out_insn_names in_insn_names [guard])

A “bypass” does not guarantee that the instruction being dispatched is
immediately adjacent to the instruction it should fuse with (does not
satisfy 2). Moreover, a bypass only handles instructions with a true
dependence.


-      int TARGET_SCHED_REORDER (FILE *file, int verbose, rtx_insn **ready,
int *n_readyp, int clock) and TARGET_SCHED_REORDER2()

This interface allows free adjustment of the ready instructions, but it is
not easy to get the last scheduled instruction, which needs to be taken
into account for fusion.
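
As an editorial illustration of why the last scheduled instruction matters, here is a toy reorder step. It is a sketch under assumptions: instructions are plain integers, the last element of `ready` is taken to be the one issued next, and `last_issued` and `last_was_fused` are hypothetical state that a backend would have to record by itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy reorder step. Assumption: the LAST element of `ready` is issued next.
// Returns true when a fusion partner of `last_issued` was moved to the back.
bool prefer_fusion_partner(std::vector<int> &ready, int last_issued,
                           bool last_was_fused,
                           const std::function<bool(int, int)> &can_fuse) {
    if (last_was_fused || ready.empty())
        return false;  // requirement 3: the previous insn is already paired
    for (std::size_t i = ready.size(); i-- > 0;) {
        if (can_fuse(last_issued, ready[i])) {
            auto pos = ready.begin() + static_cast<std::ptrdiff_t>(i);
            // Move the partner to the back so that it is issued next;
            // the relative order of the other entries is preserved.
            std::rotate(pos, pos + 1, ready.end());
            return true;
        }
    }
    return false;
}
```

The function needs both `last_issued` and `last_was_fused` as inputs, and those are exactly the pieces of context that the ready list alone does not provide.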

 

-      Use define_peephole2

Since a fused pair behaves somewhat like a single instruction, a peephole
might seem a good choice. But “define_peephole2” does not schedule
instructions either.


In summary, I have not found an interface that does both scheduling and
fusion. Maybe we should enhance one of the above interfaces, or maybe we
should provide a new one. I think it is necessary and beneficial to have an
interface that does both scheduling and fusion.



* RE: How about providing an interface to fusing instructions via scheduling
  2021-09-03 10:56 How about providing an interface to fusing instructions via scheduling gengqi
@ 2021-09-03 11:11 ` Kyrylo Tkachov
  2021-09-06  8:00   ` Reply: " gengqi
  0 siblings, 1 reply; 3+ messages in thread
From: Kyrylo Tkachov @ 2021-09-03 11:11 UTC (permalink / raw)
  To: gengqi; +Cc: gcc

Hi,

> -----Original Message-----
> From: Gcc <gcc-bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf
> Of gengqi via Gcc
> Sent: 03 September 2021 11:56
> To: gcc@gcc.gnu.org
> Subject: How about providing an interface to fusing instructions via
> scheduling
> 
> When I was adding pipeline to my backend, some instructions needed to be
> fused and I found that there was no suitable interface to implement my
> requirements.
> 
> 
> 
> My hope is that
> 
> 1. Do instruction scheduling and combine any two instructions, and
> sometimes
> the two instructions can be treated as 1 when they are issued
> 
> 2. The two instructions only work better when they are immediately adjacent
> to each other
> 
> 3. An instruction can only be fused once, i.e. if the current instruction
> has been fused with the previous one, the next one cannot be fused with the
> current one.
> 
> 
> 
> I have referred to numerous interfaces in the “GCC INTERNALS” which
> implement some of my requirements, but all of which just happen not to
> cover
> my needs completely.

Indeed, there are a few places in GCC that help, but there is no clean catch-all solution.

> 
> 
> 
> These interfaces are:
> 
> -      bool TARGET_SCHED_MACRO_FUSION_PAIR_P (rtx_insn *prev, rtx_insn
> *curr)
> 
> The name of the interface looks a lot like what I need. But in reality I
> found that this interface only fuses instructions that are already adjacent
> to each other and does not do scheduling (not satisfy 1). And this interface
> may fuse 3 or more instructions (not satisfy 3).

Indeed, this interface ensures that instructions that are already adjacent are kept together, but doesn't bring them together from far away.

> 
> 
> 
> -      void TARGET_SCHED_FUSION_PRIORITY (rtx_insn *insn, int max_pri, int
> *fusion_pri, int *pri)
> 
> This interface is very powerful, but with only one insn being processed at a
> time, this interface does not seem to be suitable for context sensitive
> situations.
> 

This is likely more appropriate for your needs. You may want to look at the implementation of this (and related) hooks in the aarch64 backend.
We use it there to bring certain loads and stores together with the intent to form special load/store-pair instructions.
The scheduler brings the insns together, but we rely on post-scheduling peepholes to actually combine the two into a single instruction.
Although it misses opportunities in a few cases, it works pretty well.
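
As an editorial illustration of the idea described above, the sketch below keys each memory access on its own properties and sorts by that key. It is a toy under assumptions, not the aarch64 implementation: `MemInsn`, `FusionKey`, `fusion_priority` and `order_for_fusion` are hypothetical names, dependencies are ignored, and a plain ascending sort stands in for the scheduler.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical memory-access instruction: base register and byte offset.
struct MemInsn {
    int id;
    int base_reg;
    int offset;
};

// Toy analogue of a fusion-priority result for one instruction.
struct FusionKey {
    int fusion_pri;  // equal for instructions that should end up together
    int pri;         // order inside one fusion group
};

// Each instruction is keyed on its own properties only: the same base
// register gives the same fusion key, and the offset orders the group.
FusionKey fusion_priority(const MemInsn &insn) {
    return {insn.base_reg, insn.offset};
}

// Order instructions by (fusion_pri, pri), ignoring dependencies.
std::vector<MemInsn> order_for_fusion(std::vector<MemInsn> insns) {
    std::stable_sort(insns.begin(), insns.end(),
                     [](const MemInsn &a, const MemInsn &b) {
                         FusionKey ka = fusion_priority(a);
                         FusionKey kb = fusion_priority(b);
                         if (ka.fusion_pri != kb.fusion_pri)
                             return ka.fusion_pri < kb.fusion_pri;
                         return ka.pri < kb.pri;
                     });
    return insns;
}
```

Accesses that share a base register become neighbours, in offset order, which is the shape a later peephole would need in order to merge them. The key is computed from one instruction at a time.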

Thanks,
Kyrill




* Reply: How about providing an interface to fusing instructions via scheduling
  2021-09-03 11:11 ` Kyrylo Tkachov
@ 2021-09-06  8:00   ` gengqi
  0 siblings, 0 replies; 3+ messages in thread
From: gengqi @ 2021-09-06  8:00 UTC (permalink / raw)
  To: 'Kyrylo Tkachov'; +Cc: gcc, cooper.qu

In fact, I had read the relevant aarch64 code before making this
suggestion.

As I understand it, this interface sets the priority based on the
properties of each insn itself. The load/store insns share the same
properties, and these similarities allow them to be scheduled together.
The instructions that I want to pair are different: their insns may not
share any properties. And to schedule the instructions to be fused next to
each other, I have to consider the instructions that might be placed before
and after them.

So this interface does not provide the functionality I need, and I don't
think it is a good choice for implementing my requirements.




