public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* Question on mips multiply patterns in md file
@ 2010-03-15  8:52 Amker.Cheng
  2010-03-15 22:03 ` Jim Wilson
  0 siblings, 1 reply; 6+ messages in thread
From: Amker.Cheng @ 2010-03-15  8:52 UTC (permalink / raw)
  To: gcc

Hi :
  I am studying multiplication-accumulate patterns for mips
 and noticed there are some changes when IRA was merged.

  There are two pattern which confused me, as :

1:  In pattern "*mul_acc_si", there's constraint like "*?*?".
what does this supposed to do?
I could not connect "*?" with document on constraints
 in gcc internal document, and totally have no idea about it.


2:  there is a split pattern for "*mul_acc_si" as following:

(define_split
  [(set (match_operand:SI 0 "d_operand")
	(plus:SI (mult:SI (match_operand:SI 1 "d_operand")
			  (match_operand:SI 2 "d_operand"))
		 (match_operand:SI 3 "d_operand")))
   (clobber (match_operand:SI 4 "lo_operand"))
   (clobber (match_operand:SI 5 "d_operand"))]
  "reload_completed"
  [(parallel [(set (match_dup 5)
		   (mult:SI (match_dup 1) (match_dup 2)))
	      (clobber (match_dup 4))])
   (set (match_dup 0) (plus:SI (match_dup 5) (match_dup 3)))]
  "")

this will generate integer multiply instruction with register write,
but what if the processor has only integer multiply instructions,
which only store results in HILO?

So, any tips? Thanks a lot.

-- 
Best Regards.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question on mips multiply patterns in md file
  2010-03-15  8:52 Question on mips multiply patterns in md file Amker.Cheng
@ 2010-03-15 22:03 ` Jim Wilson
  2010-03-16  8:55   ` Amker.Cheng
  0 siblings, 1 reply; 6+ messages in thread
From: Jim Wilson @ 2010-03-15 22:03 UTC (permalink / raw)
  To: Amker.Cheng; +Cc: gcc

On 03/15/2010 01:00 AM, Amker.Cheng wrote:
> 1:  In pattern "*mul_acc_si", there's constraint like "*?*?".
> what does this supposed to do?

'*' is in the Constraint Modifier Characters section of the docs.  It 
means ignore the next character for register class preferencing.  '?' is 
in the Multiple Alternative Constraints section.  It means that this 
alternative is less desirable than the others.  So '*?' will be ignored 
by the register class preferencing code.  It will be used in reload when 
computing which alternative is better though.  Reload computes a cost 
for each alternative, and then chooses the cheapest one, or which ever 
one occurs first (left to right) when there are ties.  '*?' has the 
effect of slightly increasing the reload cost of an alternative, and is 
used when you want to discourage reload from picking this particular 
alternative.

If you don't know anything about register class preferencing or reload 
as yet, then this is probably not going to make much sense to you, but 
it isn't anything important you need to worry about at this point.  It 
is a very minor performance optimization.

> 2:  there is a split pattern for "*mul_acc_si" as following:
> this will generate integer multiply instruction with register write,
> but what if the processor has only integer multiply instructions,
> which only store results in HILO?

A define_split can only match something generated by a define_insn, and 
the mul_acc_si define_insn is testing "GENERATE_MADD_MSUB && !TARGET_MIPS16"
so there is no serious problem.  We are just running a define_split that 
can never match anything.  This could be cleaned up a little by adding 
an appropriate condition to the define_split, or by combining the 
define_insn and define_split patterns into a define_insn_and_split pattern.

Jim

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question on mips multiply patterns in md file
  2010-03-15 22:03 ` Jim Wilson
@ 2010-03-16  8:55   ` Amker.Cheng
  2010-03-17 20:22     ` Jim Wilson
  0 siblings, 1 reply; 6+ messages in thread
From: Amker.Cheng @ 2010-03-16  8:55 UTC (permalink / raw)
  To: gcc; +Cc: Jim Wilson

> If you don't know anything about register class preferencing or reload as
> yet, then this is probably not going to make much sense to you, but it isn't
> anything important you need to worry about at this point.  It is a very
> minor performance optimization.
>
It makes sense to me now, though I haven't read codes for IRA and reloads yet.
Thanks for the detailed explanation.
>
> A define_split can only match something generated by a define_insn, and the
> mul_acc_si define_insn is testing "GENERATE_MADD_MSUB && !TARGET_MIPS16"
> so there is no serious problem.  We are just running a define_split that can
> never match anything.  This could be cleaned up a little by adding an
> appropriate condition to the define_split, or by combining the define_insn
> and define_split patterns into a define_insn_and_split pattern.

In upper words, you mean that define_split would only get chance to
split insn generated
by the corresponding pattern "define_insn \"*mul_acc_si\"", though the
split condition is
some kind of weak(with only "reload_completed"). Because that kind of
insn would only
be generated by the "define_insn \"*mul_acc_si\"" pattern.
Did I get it right? if so, i'm afraid this is actually not my question.

What wanna know is:
mips processors normally implement following kinds of mult/mult-acc insns:
mult        : HILO <-- s * t
mul         : HILO <-- s * t ; d <-- LO
madd      : HILO <-- HILO + s * t
madd2    : HILO <-- HILO + s * t ;  d <-- HILO
------------------------------------cut here---------------------
In my understanding, the macro GENERATE_MADD_MSUB is true when the processor has
madd insn, rather than madd2. And the macro "ISA_HAS_<D>MUL3" is false if it has
no mul insn.

for this kind processor, gcc will
step 1 : generate insn using gen_mul<mode>3_internal, according to
pattern "mul<mode>3";
step 2 : the combiner try to combine by matching against pattern "*mul_acc_si";
step 3 : it's possible that gcc fail to get LO register allocated for
the combined "*mul_acc_si" insn;
step 4 : after reload, the combined insn will be split according to
the split pattern listed in previous mail.
step 5 : the split insn is actually a "mul<mode>3_internal" , but get
no LO allocated, which break the
            constraints in "mul<mode>3_internal" pattern;

So, what should I do to handle this case? I see no methods except
adding new split pattern like:

(define_split
 [(set (match_operand:SI 0 "d_operand")
       (plus:SI (mult:SI (match_operand:SI 1 "d_operand")
                         (match_operand:SI 2 "d_operand"))
                (match_operand:SI 3 "d_operand")))
  (clobber (match_operand:SI 4 "lo_operand"))
  (clobber (match_operand:SI 5 "d_operand"))]
 "SPECIAL_PROCESSOR && reload_completed"
 [(parallel [(set (match_dup 4)
                  (mult:SI (match_dup 1) (match_dup 2)))
             (clobber (match_dup 4))])
  (set (match_dup 5) (match_dup 4))
  (set (match_dup 0) (plus:SI (match_dup 5) (match_dup 3)))]
 "")

Thanks again, looking forward your new explanations.


-- 
Best Regards.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question on mips multiply patterns in md file
  2010-03-16  8:55   ` Amker.Cheng
@ 2010-03-17 20:22     ` Jim Wilson
  2010-03-18 11:23       ` Amker.Cheng
  0 siblings, 1 reply; 6+ messages in thread
From: Jim Wilson @ 2010-03-17 20:22 UTC (permalink / raw)
  To: Amker.Cheng; +Cc: gcc

On Tue, 2010-03-16 at 15:12 +0800, Amker.Cheng wrote:
> In my understanding, the macro GENERATE_MADD_MSUB is true when the processor has
> madd insn, rather than madd2. And the macro "ISA_HAS_<D>MUL3" is false if it has
> no mul insn.
> 
> for this kind processor, gcc will
> step 1 : generate insn using gen_mul<mode>3_internal, according to
> pattern "mul<mode>3";
> step 2 : the combiner try to combine by matching against pattern "*mul_acc_si";
> step 3 : it's possible that gcc fail to get LO register allocated for
> the combined "*mul_acc_si" insn;
> step 4 : after reload, the combined insn will be split according to
> the split pattern listed in previous mail.
> step 5 : the split insn is actually a "mul<mode>3_internal" , but get
> no LO allocated, which break the
>             constraints in "mul<mode>3_internal" pattern;

OK.  A much more complicated question than I realized.  Yes, it does
look like the mips port will fail in this case.

As a practical matter, both mul and madd are MIPS32R1 and up
instructions, so I would be surprised to see a processor that has the
second one but not the first one.

The new define_split you propose looks like it would work.

Another possible solution is to take the mul_acc_si pattern and make two
copies of it, one which tests ISA_HAS_MUL3 and one which tests !
ISA_HAS_MUL3.  The first one remains the same.  The second one drops the
second alternative that has the 'd' destination.  The reasoning here is
that if splitting will result in worse code, then we shouldn't have
accepted it in the first place.  If dropping this alternative results in
register allocator failures for some strange reason, then we accept it
and generate the 3 instruction sequence with a new define_split.  If
dropping this alternative results in the register allocator generating
worse code for other surrounding operations, then it is better to accept
it and add the new define_split.

Some experimentation might be necessary to see which change is the
better solution.

Jim


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question on mips multiply patterns in md file
  2010-03-17 20:22     ` Jim Wilson
@ 2010-03-18 11:23       ` Amker.Cheng
  2010-03-18 22:58         ` Jim Wilson
  0 siblings, 1 reply; 6+ messages in thread
From: Amker.Cheng @ 2010-03-18 11:23 UTC (permalink / raw)
  To: Jim Wilson; +Cc: gcc

> The reasoning here is
> that if splitting will result in worse code, then we shouldn't have
> accepted it in the first place.  If dropping this alternative results in
> register allocator failures for some strange reason, then we accept it
> and generate the 3 instruction sequence with a new define_split.

Thanks Jim.

I could not get your method well since don't know much about
the IRA and reload pass. Here comes the question,
Does it possible that the method would ever result in register
allocator failure?
In my understanding, doesn't reload pass would do whatever it can to make
all insns' constraints satisfied?


> If dropping this alternative results in the register allocator generating
> worse code for other surrounding operations, then it is better to accept
> it and add the new define_split.

By this , you mean I should go back to the define_split method if dropping
the alternative does results in bad insns generated by RA?


>
> Some experimentation might be necessary to see which change is the
> better solution.
Yes, I profiled MiBench and found gcc generates better codes by using
madd instruction; on the other hand, how bad the code is generated by
define_split still not closely checked.

Another thought on this topic, Maybe I should make two copy of mul_acc_si
like you said, with one remains the constraints, the other drop the "*?*?".
Does this is the same as your method suggested?


-- 
Best Regards.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Question on mips multiply patterns in md file
  2010-03-18 11:23       ` Amker.Cheng
@ 2010-03-18 22:58         ` Jim Wilson
  0 siblings, 0 replies; 6+ messages in thread
From: Jim Wilson @ 2010-03-18 22:58 UTC (permalink / raw)
  To: Amker.Cheng; +Cc: gcc

On Thu, 2010-03-18 at 19:20 +0800, Amker.Cheng wrote:
> Does it possible that the method would ever result in register
> allocator failure?
> In my understanding, doesn't reload pass would do whatever it can to make
> all insns' constraints satisfied?

In theory, there should be no failure.  In practice, maybe there will
be.  Lots of things in gcc are more complicated than they appear to be
at first, and reload is the biggest one.

The question in my mind is why does the mul_acc_si pattern have two
alternatives?  Maybe the second alternative was added as an obscure
minor optimization.  Or maybe it was added to fix an obscure register
allocator failure.  Without more info, it is impossible to know, and
thus unsafe to make assumptions about which one is true.  We may be able
to find which one is true by looking at gcc history though.  If we can
find the ChangeLog entry and/or the patch that added the second
alternative, that may tell us why the second alternative is there.

> By this , you mean I should go back to the define_split method if dropping
> the alternative does results in bad insns generated by RA?

Use the define_split method if dropping the alternative results in
reload aborting, or if dropping the alternative results in code that is
less efficient than the code you get with the define_split.

> Another thought on this topic, Maybe I should make two copy of mul_acc_si
> like you said, with one remains the constraints, the other drop the "*?*?".
> Does this is the same as your method suggested?

Yes, that is another possibility.  This makes the first alternative more
desirable, and thus reduces the number of times the second alternative
is chosen, but still allows the second one to be chosen in some cases
when it might be better. This might even be a better solution than the
first two we had.  The only way to know is to do some experimenting,
looking at assembly code to see what happens with the different choices.

Jim

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-03-18 21:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-15  8:52 Question on mips multiply patterns in md file Amker.Cheng
2010-03-15 22:03 ` Jim Wilson
2010-03-16  8:55   ` Amker.Cheng
2010-03-17 20:22     ` Jim Wilson
2010-03-18 11:23       ` Amker.Cheng
2010-03-18 22:58         ` Jim Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).