Re: RFC patch: invariant addresses too cheap

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Richard Sandiford <rdsandiford@googlemail.com>
To: Michael Matz <matz@suse.de>
Cc: Paolo Bonzini <bonzini@gnu.org>,
		  Richard Guenther <richard.guenther@gmail.com>,
		  gcc-patches@gcc.gnu.org
Subject: Re: RFC patch: invariant addresses too cheap
Date: Wed, 02 Dec 2009 22:24:00 -0000	[thread overview]
Message-ID: <87pr6xrp6h.fsf@firetop.home> (raw)
In-Reply-To: <Pine.LNX.4.64.0912011254270.18785@wotan.suse.de> (Michael Matz's message of "Tue\, 1 Dec 2009 13\:12\:20 +0100 \(CET\)")

Hi Michael,

Michael Matz <matz@suse.de> writes:
>> > Hmm.  Looking at the example in that PR, am I right in thinking that 
>> > the x86 backend says that the (%rdi,%rax,4) and 16(%rsp,%rax,4) 
>> > addresses have the same cost?  It looks that way from the code, and 
>> > fwprop wouldn't have made the substitution otherwise, right?
>> >
>> > If so, why do they have equal cost, given that replacing one address 
>> > with the other led to such a drastic performance drop?
>> >
>> > (I know I'm missing something here, sorry.)
>> 
>> But without understanding the answer to the question in my earlier mail,
>> it still seems to me that this whole idea of checking for cheap
>> addresses in loop-invariant.c is tackling things in the wrong way.
>
> The answer to your questions are: yes, you're right.  It is so to generate 
> better code, because as the comment in front of ix86_address_cost says, 
> it's better on x86 to generate more complex addresses than to load them 
> into registers, hence a constant offset isn't taken into account, neither 
> is a scaling.  (To say the truth, on some micro architectures addresses 
> that use all components are tougher to the decoder).  In the past we even 
> made more complex addresses _cheaper_ than simple one (!), which is, well 
> ..., wrong ;-) but once generated better code.

Ah, thanks, I see now.

> Unfortunately these target knobs are more an art than a science, and 
> defining them in a logical way derivable from hardware manuals often leads 
> to suboptimal code :-/

Agreed.  I've hit that too.  And I realise that to some extent it's
inevitable.

> Having said that, I'm not sure why the answer to this question matters 
> to you: neither the PR33928 testcase, nor 410.bwaves was about (reg,reg,4) 
> vs. ofs(reg,reg,4).  The former was about ofs(reg) vs ofs2(reg) and the 
> latter was about (reg) vs (reg,reg).

Well, the point was that if fwprop2 is allowed to make the substitution,
it would solve both of the problems you list.  And that seems like
conceptually the right fix.  But as Paolo says, allowing fwprop2 to make
the substitution would regress PR30907, because of the costs issue above.
So the answer mattered because I wanted a patch that involved fwprop2 but
that didn't regress 30907.

> FWIW I'd also say that fwprop2 would be a good place to sink invariants 
> generally (so that loop-invariant could hoist them as it wants).  But I 
> don't see how that will get rid of having to compute an absolute cheapness 
> (as fwprop2 would only want to sink the invariantes into cheap places).

But fwprop.c compares "before" and "after" costs, and I was thinking it
should do the same in this case too (if allowed to).  No absolute cost
checks should be needed.

Before 30907, fwprop2 already sank invariants into cheap places.
The problem was that the x86 backend called a substitution cheap
when in fact it wasn't.  It would be nice if

  (a) fwprop could once again propagate invariants into loops and
  (b) when it did so, it could ask for a stricter cost, so that we don't
      inadvertently replace one loop instruction with a more expensive
      loop instruction (as we did in 30907).

In other words, it sounds from your message that the current x86 costs
are mostly geared for two things:

  (1) stopping forms of CSE, hoisting, etc., from introducing unnecessary
      temporaries
  (2) allowing forward propagation within basic blocks of roughly
      the same frequency, to remove unnecessary temporaries

Which is good.  But it might be nice to have a separate mode that's
suitable for forward-propagating values into blocks of higher frequency,
along the lines of (b).  Does that sound reasonable?

[ Having said that, although I could try to write a patch along
  those lines, I'm not sure I could test it very well. ;(  I don't
  have access to things like SPEC. ]

Richard

next prev parent reply	other threads:[~2009-12-02 22:23 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-10-19 16:17 Michael Matz
2009-10-19 16:45 ` Paolo Bonzini
2009-10-20  9:04   ` Richard Guenther
2009-10-20 12:07     ` Paolo Bonzini
2009-10-20 16:00       ` Michael Matz
2009-10-20 16:06         ` Richard Guenther
2009-10-31 10:05       ` Richard Sandiford
2009-10-31 11:14         ` Paolo Bonzini
2009-10-31 11:16           ` Richard Guenther
2009-10-31 12:56             ` Paolo Bonzini
2009-10-31 13:53               ` Richard Sandiford
2009-10-31 14:31                 ` Paolo Bonzini
2009-11-01 10:24                   ` Richard Sandiford
2009-11-01 10:25                     ` Paolo Bonzini
2009-11-14 11:19                     ` Richard Sandiford
2009-11-15 16:50                       ` Paolo Bonzini
2009-11-15 21:36                         ` Richard Sandiford
2009-11-28 10:53                           ` Richard Sandiford
2009-12-01 13:05                             ` Michael Matz
2009-12-02 22:24                               ` Richard Sandiford [this message]
2009-12-03 12:57                                 ` Michael Matz
2009-11-17 18:02                 ` Mark Mitchell
2009-10-31 19:23         ` Michael Matz
2009-11-01 10:13           ` Richard Sandiford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pr6xrp6h.fsf@firetop.home \
    --to=rdsandiford@googlemail.com \
    --cc=bonzini@gnu.org \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=matz@suse.de \
    --cc=richard.guenther@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).