public inbox for gcc@gcc.gnu.org
* rs6000: Trivial code generation stupidity
@ 2001-12-17 10:30 degger
  2001-12-17 10:31 ` David Edelsohn
  0 siblings, 1 reply; 10+ messages in thread
From: degger @ 2001-12-17 10:30 UTC (permalink / raw)
  To: gcc

Hija,

I just noticed that the current cvs gcc generates stupid
code for dead simple cases.

Consider:

int
foo (int a, int b)
{
  return b+1;
}

This will compile into:
foo:
        addi 4,4,1
        mr 3,4
        blr

What I would have expected is:
foo:
        addi 3,4,1
        blr

The same happens for floating-point code. If there's more than one
parameter then I will always get a register move which is
really unnecessary. gcc 2.95.3 gets it right. Any ideas?

--
Servus,
       Daniel


* Re: rs6000: Trivial code generation stupidity
  2001-12-17 10:30 rs6000: Trivial code generation stupidity degger
@ 2001-12-17 10:31 ` David Edelsohn
  2001-12-17 12:54   ` degger
  0 siblings, 1 reply; 10+ messages in thread
From: David Edelsohn @ 2001-12-17 10:31 UTC (permalink / raw)
  To: degger; +Cc: gcc

>>>>> degger  writes:

degger> I just noticed that the current cvs gcc generates stupid
degger> code for dead simple cases.

degger> The same happens for floating-point code. If there's more than one
degger> parameter then I will always get a register move which is
degger> really unnecessary. gcc 2.95.3 gets it right. Any ideas?

	Yes, GCC has degraded to introducing unnecessary, intermediate
register moves that it previously avoided.

David


* Re: rs6000: Trivial code generation stupidity
  2001-12-17 10:31 ` David Edelsohn
@ 2001-12-17 12:54   ` degger
  2001-12-17 13:12     ` David Edelsohn
  2001-12-17 13:21     ` David Edelsohn
  0 siblings, 2 replies; 10+ messages in thread
From: degger @ 2001-12-17 12:54 UTC (permalink / raw)
  To: dje; +Cc: gcc

On 17 Dec, David Edelsohn wrote:

> 	Yes, GCC has degraded to introducing unnecessary, intermediate
> register moves that it previously avoided.

Is there anything that can be done about it? Gcc 3.1 (or whatever it is now)
generates quite good code for me at the moment except in some really
obvious cases (like the ones mentioned), and thus it would be a shame to
give away some performance to those nasties.

--
Servus,
       Daniel


* Re: rs6000: Trivial code generation stupidity
  2001-12-17 12:54   ` degger
@ 2001-12-17 13:12     ` David Edelsohn
  2001-12-17 13:21     ` David Edelsohn
  1 sibling, 0 replies; 10+ messages in thread
From: David Edelsohn @ 2001-12-17 13:12 UTC (permalink / raw)
  To: degger; +Cc: gcc

>>>>> degger  writes:

degger> Is there anything that can be done about it? Gcc 3.1 (or whatever it is now)
degger> generates quite good code for me at the moment except in some really
degger> obvious cases (like the ones mentioned), and thus it would be a shame to
degger> give away some performance to those nasties.

	Track down which patch introduced this regression.

David


* Re: rs6000: Trivial code generation stupidity
  2001-12-17 12:54   ` degger
  2001-12-17 13:12     ` David Edelsohn
@ 2001-12-17 13:21     ` David Edelsohn
  2001-12-18  3:53       ` Analysis try (was: Re: rs6000: Trivial code generation stupidity) degger
  1 sibling, 1 reply; 10+ messages in thread
From: David Edelsohn @ 2001-12-17 13:21 UTC (permalink / raw)
  To: degger; +Cc: gcc

>>>>> degger  writes:

degger> Is there anything that can be done about it? Gcc 3.1 (or whatever it is now)
degger> generates quite good code for me at the moment except in some really
degger> obvious cases (like the ones mentioned), and thus it would be a shame to
degger> give away some performance to those nasties.

	Also, you can look at the output from intermediate passes and see
where gcc-2.95 chose the output in the correct register while gcc-3.1
introduces the extra copies.  Track down which phase in gcc-3.1 is not
collapsing the register copies or why it is introducing register copies.

David


* Analysis try (was: Re: rs6000: Trivial code generation stupidity)
  2001-12-17 13:21     ` David Edelsohn
@ 2001-12-18  3:53       ` degger
  2001-12-25 23:13         ` cant_combine_insn_p hard_reg->reg moves David Edelsohn
  0 siblings, 1 reply; 10+ messages in thread
From: degger @ 2001-12-18  3:53 UTC (permalink / raw)
  To: dje; +Cc: gcc

On 17 Dec, David Edelsohn wrote:

> 	Also, you can look at the output from intermediate passes and see
> where gcc-2.95 chose the output in the correct register while gcc-3.1
> introduces the extra copies.  Track down which phase in gcc-3.1 is not
> collapsing the register copies or why it is introducing register
> copies.

You asked for it, but be warned that this is the first time I have
really tried to analyse RTL, so my analysis could be completely bogus
and/or wrong.

Consider:
int 
foo (int a, int b)
{
  return b+1;
}

gcc 3.1 produces:
foo:
        addi 4,4,1       # 15   *addsi3_internal1/2     [length = 4]
        mr 3,4   # 24   *movsi_internal1/1      [length = 4]
        blr      # 36   *return_internal_si     [length = 4]

Here we can see that the register move comes from insn 24.
Comparing the RTL dumps of each pass from gcc 2.95.3 and gcc 3.1 shows
that in the combine pass gcc 2.95.3 merges the plus:SI and the move
into
(insn 14 12 15 (set (reg/i:SI 3 r3)
        (plus:SI (reg:SI 4 r4)
            (const_int 1 [0x1]))) 52 {*addsi3_internal1} (nil)
    (expr_list:REG_DEAD (reg:SI 4 r4)
        (nil)))

while gcc 3.1 keeps them in separate insns, using a virtual register for
the result of the plus:SI:
(insn 15 7 20 (set (reg:SI 118)
        (plus:SI (reg/v:SI 117)
            (const_int 1 [0x1]))) 36 {*addsi3_internal1} (insn_list 6 (nil))
    (expr_list:REG_DEAD (reg/v:SI 117)
        (nil)))

(insn 24 20 27 (set (reg/i:SI 3 r3)
        (reg:SI 118)) 295 {*movsi_internal1} (insn_list 15 (nil))
    (expr_list:REG_DEAD (reg:SI 118)
        (nil)))

So it seems that combine broke because it fails to recognize the
opportunity to get rid of a virtual register.

The bottom lines of the combine pass outputs are
2.95.3:
;; Combiner totals: 3 attempts, 3 substitutions (0 requiring new space),
;; 2 successes.

3.1:
;; Combiner totals: 0 attempts, 0 substitutions (0 requiring new space),
;; 0 successes.

If anyone wants to see the outputs of the intermediate passes feel free
to contact me and you'll get them.

--
Servus,
       Daniel


* cant_combine_insn_p hard_reg->reg moves
  2001-12-18  3:53       ` Analysis try (was: Re: rs6000: Trivial code generation stupidity) degger
@ 2001-12-25 23:13         ` David Edelsohn
  2001-12-26  1:18           ` Richard Henderson
  0 siblings, 1 reply; 10+ messages in thread
From: David Edelsohn @ 2001-12-25 23:13 UTC (permalink / raw)
  To: Bernd Schmidt, Jan Hubicka, Richard Henderson; +Cc: gcc, degger

	At Daniel's prompting, I finally tracked down the cause of the
code pessimization with respect to function arguments and return values.
Daniel tracked it down to somewhere in combine.

	The problem is cant_combine_insn_p disallowing hard registers.
Arguments and return values are specified as hard regs.  So this change
from December 2000 now often inserts a move between the hard reg allocated
to the pseudo of a computation and the hard regs of arguments and return
values.  This is rather pessimistic for some architectures, especially for
ABIs with register-based calling conventions.
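
As a rough illustration of the effect described above (this is a toy model, not the code in combine.c), the sketch below treats register numbers under an arbitrarily chosen FIRST_PSEUDO as hard registers and everything else as pseudos. The Insn struct and the names toy_cant_combine and toy_try_combine are hypothetical and exist only for this example. The point is that a blanket "no hard-register copies" rule also blocks the merge of a computation into the return-value register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption for this sketch only: registers below 32 are hard registers. */
enum { FIRST_PSEUDO = 32 };

typedef enum { OP_MOVE, OP_ADD_IMM } Op;

typedef struct {
    Op  op;
    int dest;  /* destination register number */
    int src;   /* source register number */
    int imm;   /* immediate operand, used by OP_ADD_IMM only */
} Insn;

static bool is_hard_reg(int regno) { return regno < FIRST_PSEUDO; }

/* Hypothetical stand-in for the check discussed above: refuse to combine
   a plain register copy when either side is a hard register. */
static bool toy_cant_combine(const Insn *insn)
{
    return insn->op == OP_MOVE
        && (is_hard_reg(insn->dest) || is_hard_reg(insn->src));
}

/* Try to fold "def: tmp = src + imm" into "use: dest = tmp".
   On success, *out holds the merged insn "dest = src + imm".
   The caller is assumed to know that tmp dies in `use`. */
static bool toy_try_combine(const Insn *def, const Insn *use,
                            bool honor_check, Insn *out)
{
    assert(def != NULL && use != NULL && out != NULL);
    if (def->op != OP_ADD_IMM || use->op != OP_MOVE || use->src != def->dest)
        return false;
    if (honor_check && (toy_cant_combine(def) || toy_cant_combine(use)))
        return false;
    *out = (Insn){ OP_ADD_IMM, use->dest, def->src, def->imm };
    return true;
}
```

Fed with insns shaped like 15 and 24 from Daniel's dump (an add into pseudo 118, then a copy of 118 into r3), toy_try_combine refuses the merge when honor_check is true and yields "r3 = reg117 + 1" when it is false.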

	How can this be improved?  Could cant_combine_insn_p test for hard
regs with either REG_EQUIV involving computations of ARG_POINTER_REGNUM or
FUNCTION_VALUE_REGNO_P?

Thanks, David


* Re: cant_combine_insn_p hard_reg->reg moves
  2001-12-25 23:13         ` cant_combine_insn_p hard_reg->reg moves David Edelsohn
@ 2001-12-26  1:18           ` Richard Henderson
  2001-12-26  4:13             ` Jan Hubicka
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Henderson @ 2001-12-26  1:18 UTC (permalink / raw)
  To: David Edelsohn; +Cc: Bernd Schmidt, Jan Hubicka, gcc, degger

On Tue, Dec 25, 2001 at 09:02:42PM -0500, David Edelsohn wrote:
> Could cant_combine_insn_p test for hard regs with either REG_EQUIV
> involving computations of ARG_POINTER_REGNUM or FUNCTION_VALUE_REGNO_P?

No.  IIRC, this was introduced specifically to cure problems with
argument registers.  x86 regparm is the biggest offender here.

The only solutions are a new register allocator, or post-reload
cleanup optimizations such as -fcprop-registers.  Anything else
risks keeping a hard register live too long.
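
To make the idea of a post-reload cleanup concrete, here is a minimal sketch of one such transformation on a toy instruction list. It is not the implementation behind -fcprop-registers; the Insn struct and the names read_before_write and toy_retarget_copies are invented for illustration, and the liveness test is deliberately simplistic (straight-line code only, and a register with no later read is assumed dead). It rewrites "t = s + imm; d = t" into "d = s + imm" when t is not read again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { OP_MOVE, OP_ADD_IMM, OP_RETURN } Op;

typedef struct {
    Op  op;
    int dest;  /* register written, or -1 if none (OP_RETURN) */
    int src;   /* register read */
    int imm;   /* immediate operand, used by OP_ADD_IMM only */
} Insn;

/* True if `reg` is read in code[from..n) before being overwritten.
   A register that is never mentioned again is treated as dead. */
static bool read_before_write(const Insn *code, size_t from, size_t n, int reg)
{
    for (size_t i = from; i < n; i++) {
        if (code[i].src == reg)
            return true;   /* the read happens before any write in this insn */
        if (code[i].dest == reg)
            return false;
    }
    return false;
}

/* Retarget an add whose only use is an immediately following copy.
   Compacts `code` in place and returns the new instruction count. */
static size_t toy_retarget_copies(Insn *code, size_t n)
{
    assert(code != NULL || n == 0);
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 1 < n
            && code[i].op == OP_ADD_IMM
            && code[i + 1].op == OP_MOVE
            && code[i + 1].src == code[i].dest
            && code[i + 1].dest != code[i].dest
            && !read_before_write(code, i + 2, n, code[i].dest)) {
            Insn merged = code[i];
            merged.dest = code[i + 1].dest;  /* write the copy's target directly */
            code[out++] = merged;
            i++;                             /* drop the now-redundant copy */
        } else {
            code[out++] = code[i];
        }
    }
    return out;
}
```

Applied to the three-instruction sequence from the start of the thread (addi 4,4,1; mr 3,4; blr, with blr modelled as a read of r3), the sketch leaves an add that writes r3 directly followed by the return. If r4 were read again after the copy, the sequence would be left unchanged.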


r~


* Re: cant_combine_insn_p hard_reg->reg moves
  2001-12-26  1:18           ` Richard Henderson
@ 2001-12-26  4:13             ` Jan Hubicka
  2001-12-26 11:25               ` Richard Henderson
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Hubicka @ 2001-12-26  4:13 UTC (permalink / raw)
  To: Richard Henderson, David Edelsohn, Bernd Schmidt, Jan Hubicka,
	gcc, degger

> On Tue, Dec 25, 2001 at 09:02:42PM -0500, David Edelsohn wrote:
> > Could cant_combine_insn_p test for hard regs with either REG_EQUIV
> > involving computations of ARG_POINTER_REGNUM or FUNCTION_VALUE_REGNO_P?
> 
> No.  IIRC, this was introduced specifically to cure problems with
> argument registers.  x86 regparm is the biggest offender here.
> 
> The only solutions are a new register allocator, or post-reload
> cleanup optimizations such as -fcprop-registers.  Anything else
> risks keeping a hard register live too long.

BTW, what is the status of cprop-registers? Will it get enabled for 3.1
once the bugs are tracked down? Are the problems serious?

Honza
> 
> 
> r~


* Re: cant_combine_insn_p hard_reg->reg moves
  2001-12-26  4:13             ` Jan Hubicka
@ 2001-12-26 11:25               ` Richard Henderson
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2001-12-26 11:25 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: David Edelsohn, Bernd Schmidt, gcc, degger

On Wed, Dec 26, 2001 at 01:10:56PM +0100, Jan Hubicka wrote:
> Will it get enabled for 3.1 once the bugs are tracked down?

Yes.

> Are the problems serious?

No, or rather, I don't think so.  At the moment the scheduler
gets miscompiled on AIX, so we have bootstrap comparison failures.
It's probably something stupid, but I haven't found it yet.


r~
