Re: RFC: improving estimate_num_insns

From: Josh Conner <jconner@apple.com>
To: gcc@gcc.gnu.org
Cc: Steven Bosscher <stevenb@suse.de>, Jan Hubicka <jh@suse.cz>
Subject: Re: RFC: improving estimate_num_insns
Date: Wed, 13 Jul 2005 02:31:00 -0000
Message-ID: <2C2E4C8B-FC5C-4854-9CB0-10887A0D4F97@apple.com>
In-Reply-To: <200507122207.51801.stevenb@suse.de>


On Jul 12, 2005, at 1:07 PM, Steven Bosscher wrote:

> You don't say what compiler you used for these measurements.  I suppose
> you used mainline?

Yes, I am working with the mainline.

> I think you should look at a lot more code than this.

OK - I stopped because I was seeing fairly consistent results.   
However, since both of these files were from the same source base, I  
can see how this may not be representative.  I'll try some more  
examples from different source code, including C++.

>> Thinking that there may be room for improvement on this, I tried this
>> same experiment with a couple of adjustments to estimate_num_insns:
>> - Instead of ignoring constants, assign them a weight of 2
>> instructions (one for the constant itself and one to load into
>> memory)
>>
>
> Why the load into memory?  Do you mean constants that must be loaded
> from the constant pool?  I guess the cost for the constant itself
> depends on whether it is a legitimate constant or not, and how it is
> loaded, and this is all very target specific.  But a cost greater than
> 0 probably makes sense.

Good point - I was thinking of constant pools when I wrote that, even  
though that isn't always the case.

> It would be nice if you retain the comment about constants in the
> existing code somehow.  The cost of constants is not trivial, and you
> should explain your choice better in a comment.

I'm not sure which comment you mean - the one from my email, or the  
one that was originally in the source code?

>> - Instead of ignoring case labels, assign them a weight of 2
>> instructions (to represent the cost of control logic to get there)
>>
>
> This depends on how the switch statement is expanded, e.g. to a binary
> decision tree, or to a table jump, or to something smaller than that.
> So again (like everything else, sadly) this is highly target-specific
> and even context-specific.  I'd think a cost of 2 is too pessimistic in
> most cases.

Do you mean that 2 is too high?  I actually got (slightly) better  
statistical results with a cost of 3, but I had the same reaction  
(that it was too pessimistic), which is why I settled on 2.
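
To make the two weight changes concrete, here is a minimal sketch of a toy estimator. It is not the code from estimate_num_insns: the node kinds, the names toy_estimate, toy_old and toy_new, and the flat node list are all assumptions made for illustration. Only the weights themselves (0 before, 2 in the experiment) come from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds for illustration; these are NOT GCC's tree codes. */
enum toy_code { TOY_CONST, TOY_CASE_LABEL, TOY_ARITH, TOY_CODE_COUNT };

/* One weight per node kind, indexed by enum toy_code. */
struct toy_weights { int w[TOY_CODE_COUNT]; };

/* Behaviour described in this thread: constants and case labels are ignored. */
const struct toy_weights toy_old = { { 0, 0, 1 } };

/* Experimental weights from the proposal: 2 for constants, 2 for case labels. */
const struct toy_weights toy_new = { { 2, 2, 1 } };

/* Sum the weights of a flat list of nodes (a real estimator walks a tree). */
int toy_estimate(const enum toy_code *nodes, size_t n,
                 const struct toy_weights *wt)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        assert((int)nodes[i] >= 0 && (int)nodes[i] < (int)TOY_CODE_COUNT);
        total += wt->w[nodes[i]];
    }
    return total;
}
```

For a body made of one arithmetic node, one constant and two case labels, the old weights give an estimate of 1 and the experimental weights give 7, which shows how quickly a switch-heavy function grows under the new scheme.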

> You could look at the code in stmt.c to see how switch statements are
> expanded.  Maybe there is a cheap test you can do to make CASE_LABELs
> free for very small switch statements (i.e. the ones which should never
> hold you back from inlining the function containing them ;-).

I'll look into this.
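
One possible shape for such a cheap test, as a sketch only: count the labels of a switch and charge nothing when the count is at or below a cutoff. The cutoff of 4 and the names below are assumptions picked for illustration; they are not taken from stmt.c, and the right threshold would depend on how the target expands small switches.

```c
#include <assert.h>

/* Hypothetical cutoff: switches with at most this many labels are free.
   The value 4 is illustrative, not derived from stmt.c. */
enum { TOY_SMALL_SWITCH_MAX_LABELS = 4 };

/* Per-label weight used in the experiment discussed in this thread. */
enum { TOY_CASE_LABEL_WEIGHT = 2 };

/* Total cost charged for the case labels of one switch statement. */
int toy_switch_label_cost(int num_labels)
{
    assert(num_labels >= 0);
    if (num_labels <= TOY_SMALL_SWITCH_MAX_LABELS)
        return 0;                       /* small switch: labels are free */
    return num_labels * TOY_CASE_LABEL_WEIGHT;
}
```

With these assumed values a 4-label switch costs 0 and a 5-label switch costs 10. The step at the cutoff is abrupt, so a real heuristic might prefer to charge only for the labels beyond it.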

>> For what it's worth, code size is equal to or smaller for all
>> benchmarks across all platforms.
>>
>
> What about the compile time?

Oh no!  I didn't measure this.  I will have a look.

>> So, here are the open issues I see at this point:
>> 1. It appears that this change to estimate_num_instructions generates
>> a much better estimate of actual code size.  However, the benchmark
>> results are ambiguous.  Is this patch worth considering as-is?
>>
>
> I would say you'd at least have to look into the ppc gzip and eon
> regressions before this is acceptable.  But it is not my decision to
> make, of course.

Makes sense.  I'll see what kind of time I can put into this.

>> 2. Increasing instruction weights causes the instruction-based values
>> (e.g., --param max-inline-insns-auto) to be effectively lower.
>> However, changing these constants/defaults as in the second patch
>> will cause a semantic change to anyone who is setting these values at
>> the command line.  Is that change acceptable?
>>
>
> This has constantly changed from one release to the next since GCC 3.3,
> so I don't think this should be a problem.

Whew...
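
The semantic change in issue 2 can be shown with a little arithmetic. This is a sketch with made-up numbers: a function estimated at 90 under the old weights and 120 under the new ones, against a limit of 100. The helper names are hypothetical and the rescaling rule is only one plausible way to adjust a default.

```c
#include <assert.h>
#include <stdbool.h>

/* A function is an inlining candidate when its estimate fits the limit. */
bool toy_within_limit(int estimate, int limit)
{
    assert(estimate >= 0 && limit >= 0);
    return estimate <= limit;
}

/* Scale a limit by new_total/old_total (rounded to nearest) so that it
   admits roughly the same functions after the weights have changed. */
int toy_rescale_limit(int limit, int old_total, int new_total)
{
    assert(limit >= 0 && old_total > 0 && new_total >= 0);
    return (int)(((long long)limit * new_total + old_total / 2) / old_total);
}
```

With a limit of 100 the example function passes at 90 but fails at 120, so an unchanged command-line value silently becomes stricter. Rescaling the limit by 120/90 gives 133, which admits the function again.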

>> Thoughts?  Advice?
>>
>
> ...on to advice then.
>
> First of all, I think you should handle ARRAY_RANGE_REF and ARRAY_REF
> the same.  And you probably should make BIT_FIELD_REF more expensive,
> and maybe make its cost depend on which bit you're addressing (you can
> see that in operands 1 and 2 of the BIT_FIELD_REF).  Its current cost
> of just 1 is probably too optimistic, just like the other cases you are
> trying to address with this patch.

Great - thanks for the suggestions, I'll try these changes as well  
and see if I get even better correlation.
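
As a rough sketch of a position-dependent BIT_FIELD_REF cost, assume operand 1 gives the field width in bits and operand 2 the position of its first bit. The cost model here (one instruction for the access, one more for a shift, one more for a mask) and the function name are my own assumptions, not measured values.

```c
#include <assert.h>

/* Hypothetical cost of extracting `size` bits starting at bit `pos`.
   Assumed model: 1 for the access itself, +1 when a shift is needed
   (pos is not byte-aligned), +1 when a mask is needed (size is not a
   natural width of 8, 16, 32 or 64 bits). */
int toy_bit_field_ref_cost(unsigned size, unsigned pos)
{
    assert(size > 0);
    int cost = 1;
    if (pos % 8 != 0)
        cost += 1;                      /* shift the field into place */
    if (size != 8 && size != 16 && size != 32 && size != 64)
        cost += 1;                      /* mask off the neighbouring bits */
    return cost;
}
```

Under this model a byte-aligned 8-bit field keeps the current cost of 1, while a 3-bit field starting at bit 5 costs 3.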

> Second, you may want to add a target hook to return the cost of target
> builtins.  Even builtins that expand to just one instruction are now
> counted as 16 insns plus the cost of the arguments list.  This badly
> hurts when you use e.g. SSE intrinsics.  It's probably not an issue for
> the benchmark you looked at now, but maybe you want to look into it
> anyway, while you're at it.

OK, I'll look into this.
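
A target hook of this kind might look roughly like the sketch below. The hook type, the "-1 means no opinion" convention, the builtin number 7 and the one-per-argument cost are all assumptions for illustration; no such hook exists in the tree as far as this thread says. The default of 16 is the call cost quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Call cost cited in this thread, before the cost of the argument list. */
enum { TOY_DEFAULT_CALL_COST = 16 };

/* Hypothetical hook: returns the cost of a target builtin, or -1 when
   the target has no opinion and the default should be used. */
typedef int (*toy_builtin_cost_hook)(int builtin_code);

/* Cost of a call to a builtin; assumes each argument costs 1. */
int toy_call_cost(toy_builtin_cost_hook hook, int builtin_code, int num_args)
{
    assert(num_args >= 0);
    int base = TOY_DEFAULT_CALL_COST;
    if (hook != NULL) {
        int c = hook(builtin_code);
        if (c >= 0)
            base = c;                   /* target overrides the default */
    }
    return base + num_args;
}

/* Example hook: pretend builtin 7 expands to a single instruction. */
int toy_example_hook(int builtin_code)
{
    return builtin_code == 7 ? 1 : -1;
}
```

Without a hook, a two-argument call to builtin 7 is counted as 18; with the example hook it drops to 3, while other builtins keep the default.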

> Third,
>
>> (measured at -O3 with inlining disabled):
>>
> Then why not just -O2 with inlining disabled?  Now you have enabled
> loop unswitching, which is known to sometimes significantly increase
> code size.

Well, I thought that inlining estimates were most likely to be
relevant at -O3, and so I should include as many other optimizations
as were likely to be performed alongside inlining, even though this
may make it more difficult to get accurate estimates.

> Fourth, look at _much_ more code than this.  I would especially look at
> a lot more C++ code, which is where our inliner heuristics can really
> dramatically improve or destroy performance.

OK.

> See Richard Guenther's inline heuristics tweaks from earlier this year,
> in the thread starting here:
> http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01936.html.

I remember reading this when it was first posted, but looking at it  
again after my experiments I see the additional relevance,
especially the decision to assign a function call a cost of 16, which
I quickly identified as hurting our size estimates, but which also
improves inlining performance!

> Finally, you've apparently used grep to find all the places where
> PARAM_MAX_INLINE_INSNS_SINGLE and its friends are used, but you have
> missed the ones in opts.c and maybe elsewhere.

Hmmm - I looked for all of the places where estimate_num_insns was  
called.  I still don't see anything in opts.c -- can you give me a  
little more of a hint?

> Good luck, and thanks for working on this difficult stuff.

Thank you for the helpful comments!

- Josh
