From: Josh Conner <jconner@apple.com>
To: gcc@gcc.gnu.org
Cc: Steven Bosscher <stevenb@suse.de>, Jan Hubicka <jh@suse.cz>
Subject: Re: RFC: improving estimate_num_insns
Date: Wed, 13 Jul 2005 02:31:00 -0000
Message-ID: <2C2E4C8B-FC5C-4854-9CB0-10887A0D4F97@apple.com>
In-Reply-To: <200507122207.51801.stevenb@suse.de>
On Jul 12, 2005, at 1:07 PM, Steven Bosscher wrote:
> You don't say what compiler you used for these measurements. I suppose
> you used mainline?
Yes, I am working with the mainline.
> I think you should look at a lot more code than this.
OK - I stopped because I was seeing fairly consistent results.
However, since both of these files were from the same source base, I
can see how this may not be representative. I'll try some more
examples from different source code, including C++.
>> Thinking that there may be room for improvement on this, I tried this
>> same experiment with a couple of adjustments to estimate_num_insns:
>> - Instead of ignoring constants, assign them a weight of 2
>> instructions (one for the constant itself and one to load into
>> memory)
>>
>
> Why the load into memory, you mean constants that must be loaded from
> the constant pool? I guess the cost for the constant itself depends on
> whether it is a legitimate constant or not, and how it is loaded, and
> this is all very target specific. But a cost greater than 0 probably
> makes sense.
Good point - I was thinking of constant pools when I wrote that, even
though that isn't always the case.
> It would be nice if you retain the comment about constants in the
> existing code somehow. The cost of constants is not trivial, and you
> should explain your choice better in a comment.
I'm not sure which comment you mean - the one from my email, or the
one that was originally in the source code?
>> - Instead of ignoring case labels, assign them a weight of 2
>> instructions (to represent the cost of control logic to get there)
>>
>
> This depends on how the switch statement is expanded, e.g. to a binary
> decision tree, or to a table jump, or to something smaller than that.
> So again (like everything else, sadly) this is highly target-specific
> and even context-specific. I'd think a cost of 2 is too pessimistic in
> most cases.
Do you mean that 2 is too high? I actually got (slightly) better
statistical results with a cost of 3, but I had the same reaction
(that it was too pessimistic), which is why I settled on 2.
> You could look at the code in stmt.c to see how switch statements are
> expanded. Maybe there is a cheap test you can do to make CASE_LABELs
> free for very small switch statements (i.e. the ones which should never
> hold you back from inlining the function containing them ;-).
I'll look into this.
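
To make the weights under discussion concrete, here is a minimal, self-contained sketch in C. It is not GCC code: the `toy_` types, the node kinds, and the small-switch cutoff of 3 labels are all hypothetical, chosen only to illustrate the idea of charging 2 for constants and case labels, 16 for calls, and making case labels free in very small switch statements.

```c
#include <stddef.h>

/* Hypothetical node kinds for illustration; these are NOT GCC tree codes. */
enum toy_kind { TOY_CONSTANT, TOY_CASE_LABEL, TOY_CALL, TOY_OTHER };

struct toy_node {
  enum toy_kind kind;
  /* Number of labels in the enclosing switch; only used for TOY_CASE_LABEL. */
  unsigned switch_labels;
};

/* Assumed cutoff below which a switch is treated as "very small". */
#define TOY_SMALL_SWITCH_LABELS 3

/* Return the estimated instruction weight of a single node. */
static unsigned toy_node_weight(const struct toy_node *n) {
  switch (n->kind) {
    case TOY_CONSTANT:
      return 2;   /* weight proposed in the thread, instead of 0 */
    case TOY_CASE_LABEL:
      /* Free in very small switches, otherwise the proposed weight of 2. */
      return n->switch_labels <= TOY_SMALL_SWITCH_LABELS ? 0 : 2;
    case TOY_CALL:
      return 16;  /* call cost mentioned in the thread */
    default:
      return 1;   /* assumed weight for everything else */
  }
}

/* Sum the weights of all nodes in a (flattened) function body. */
static unsigned toy_estimate(const struct toy_node *nodes, size_t count) {
  unsigned total = 0;
  for (size_t i = 0; i < count; i++)
    total += toy_node_weight(&nodes[i]);
  return total;
}
```

A real implementation would have to decide the cutoff from how switch statements are actually expanded on the target, which this sketch does not attempt.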
>> For what it's worth, code size is equal to or smaller for all
>> benchmarks across all platforms.
>>
>
> What about the compile time?
Oh no! I didn't measure this. I will have a look.
>> So, here are the open issues I see at this point:
>> 1. It appears that this change to estimate_num_insns generates
>> a much better estimate of actual code size. However, the benchmark
>> results are ambiguous. Is this patch worth considering as-is?
>>
>
> I would say you'd at least have to look into the ppc gzip and eon
> regressions before this is acceptable. But it is not my decision to
> make, of course.
Makes sense. I'll see what kind of time I can put into this.
>> 2. Increasing instruction weights causes the instruction-based values
>> (e.g., --param max-inline-insns-auto) to be effectively lower.
>> However, changing these constants/defaults as in the second patch
>> will cause a semantic change to anyone who is setting these values at
>> the command line. Is that change acceptable?
>>
>
> This has constantly changed from one release to the next since GCC 3.3,
> so I don't think this should be a problem.
Whew...
>> Thoughts? Advice?
>>
>
> ...on to advice then.
>
> First of all, I think you should handle ARRAY_RANGE_REF and ARRAY_REF
> the same. And you probably should make BIT_FIELD_REF more expensive,
> and maybe make its cost depend on which bit you're addressing (you can
> see that in operands 1 and 2 of the BIT_FIELD_REF). Its current cost
> of just 1 is probably too optimistic, just like the other cases you are
> trying to address with this patch.
Great - thanks for the suggestions, I'll try these changes as well
and see if I get even better correlation.
> Second, you may want to add a target hook to return the cost of target
> builtins. Even builtins that expand to just one instruction are now
> counted as 16 insns plus the cost of the arguments list. This badly
> hurts when you use e.g. SSE intrinsics. It's probably not an issue for
> the benchmark you looked at now, but maybe you want to look into it
> anyway, while you're at it.
OK, I'll look into this.
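
As a rough illustration of the target-hook idea, the following C sketch lets a target override the default call cost for builtins it knows to be cheap. Everything here is hypothetical: the hook type, the `toy_` names, the builtin id, and the simplification that each argument costs 1 are assumptions for the example and do not correspond to any existing GCC interface.

```c
#include <stddef.h>

/* Default cost of a call, as mentioned in the thread. */
#define TOY_DEFAULT_CALL_COST 16

/* Hypothetical hook: returns the insn cost of a builtin, or -1 when the
   target has no specific cost for it. */
typedef int (*toy_builtin_cost_hook)(int builtin_id);

/* Cost of a builtin call: the hook's answer if it gives one, otherwise the
   default, plus an assumed cost of 1 per argument. */
static unsigned toy_call_cost(int builtin_id, unsigned n_args,
                              toy_builtin_cost_hook hook) {
  int cost = hook ? hook(builtin_id) : -1;
  unsigned base = cost >= 0 ? (unsigned) cost : TOY_DEFAULT_CALL_COST;
  return base + n_args;
}

/* Example target: one made-up vector-add builtin that expands to a single
   instruction. */
enum { TOY_BUILTIN_VECTOR_ADD = 1 };

static int toy_example_target_hook(int builtin_id) {
  return builtin_id == TOY_BUILTIN_VECTOR_ADD ? 1 : -1;
}
```

With this shape, targets that do not provide a hook keep the existing behaviour, and only builtins the target explicitly prices become cheaper.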
> Third,
>
>> (measured at -O3 with inlining disabled):
>>
> Then why not just -O2 with inlining disabled? Now you have enabled
> loop unswitching, which is known to sometimes significantly increase
> code size.
Well, I thought that inlining estimates were most likely to be
relevant at -O3, and so I should include as many other optimizations
as were likely to be performed alongside inlining, even though this
may make it more difficult to get accurate estimates.
> Fourth, look at _much_ more code than this. I would especially look at
> a lot more C++ code, which is where our inliner heuristics can really
> dramatically improve or destroy performance.
OK.
> See Richard Guenther's
> inline heuristics tweaks from earlier this year, in the thread starting
> here: http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01936.html.
I remember reading this when it was first posted, but looking at it
again after my experiments I see the additional relevance.
Especially the decision to assign a function call a cost of 16, which
I quickly identified as hurting our size estimates, but which also
improves inlining performance!
> Finally, you've apparently used grep to find all the places where
> PARAM_MAX_INLINE_INSNS_SINGLE and its friends are used, but you have
> missed the ones in opts.c and maybe elsewhere.
Hmmm - I looked for all of the places where estimate_num_insns was
called. I still don't see anything in opts.c -- can you give me a
little more of a hint?
> Good luck, and thanks for working on this difficult stuff.
Thank you for the helpful comments!
- Josh