Re: RFC: improving estimate_num_insns

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

From: Steven Bosscher <stevenb@suse.de>
To: gcc@gcc.gnu.org
Cc: Josh Conner <jconner@apple.com>, jh@suse.cz
Subject: Re: RFC: improving estimate_num_insns
Date: Tue, 12 Jul 2005 20:07:00 -0000	[thread overview]
Message-ID: <200507122207.51801.stevenb@suse.de> (raw)
In-Reply-To: <E88CA000-08FD-467A-9DE5-AF9459339DCC@apple.com>

On Tuesday 12 July 2005 19:56, Josh Conner wrote:
> In looking at the function inliner, I did some analysis of the
> correctness of estimate_num_insns (from tree-inline.c), as this
> measurement is critical to the inlining heuristics.

You don't say what compiler you used for these measurements.  I suppose
you used mainline?

>   Here is the  
> overview of what I found on the functions in two sample files from  
> the FAAD2 AAC library (measured at -O3 with inlining disabled):

I think you should look at a lot more code than this.

> Thinking that there may be room for improvement on this, I tried this
> same experiment with a couple of adjustments to estimate_num_insns:
> - Instead of ignoring constants, assign them a weight of 2
> instructions (one for the constant itself and one to load into memory)

Why the load into memory, you mean constants that must be loaded from
the constant pool?  I guess the cost for the constant itself depends on
whether it is a legitimate constant or not, and how it is loaded, and
this is all very target specific.  But a cost greater than 0 probably
makes sense.

It would be nice if you retain the comment about constants in the
existing code somehow.  The cost of constants is not trivial, and you
should explain your choice better in a comment.

> - Instead of ignoring dereferences, assign a weight of 2 to an array
> reference and 1 for a pointer dereference

Makes sense.

> - Instead of ignoring case labels, assign them a weight of 2
> instructions (to represent the cost of control logic to get there)

This depends on how the switch statement is expanded, e.g. to a binary
decision tree, or to a table jump, or to something smaller than that.
So again (like everything else, sadly) this is highly target-specific
and even context-specific.  I'd think a cost of 2 is too pessimistic in
most cases.

You could look at the code in stmt.c to see how switch statements are
expanded.  Maybe there is a cheap test you can do to make CASE_LABELs
free for very small switch statements (i.e. the ones which should never
hold you back from inlining the function containing them ;-).

> For what it's worth, code size is equal to or smaller for all
> benchmarks across all platforms.

What about the compile time?

> So, here are the open issues I see at this point:
> 1. It appears that this change to estimate_num_instructions generates
> a much better estimate of actual code size.  However, the benchmark
> results are ambiguous.  Is this patch worth considering as-is?

I would say you'd at least have to look into the ppc gzip and eon
regressions before this is acceptable.  But it is not my decision to
make, of course.

> 2. Increasing instruction weights causes the instruction-based values
> (e.g., --param max-inline-insns-auto) to be effectively lower.
> However, changing these constants/defaults as in the second patch
> will cause a semantic change to anyone who is setting these values at
> the command line.  Is that change acceptable?

This has constantly changed from one release to the next since GCC 3.3,
so I don't think this should be a problem.

> I do realize that this area is one that will eventually be likely
> best served by target-dependent routines (and constants), however I
> also see a significant benefit to all targets in fixing the default
> implementation first.

I don't really think we want to make too many things target-dependent.
This whole estimate is just that: An estimate.  And it's not intended
to be absolute.  It is just a relative measure of cost.  There are IMHO
very few trees that should have target-dependent inline cost estimates.

OK, that mas the thoughts part...

> Thoughts?  Advice?

...on to advice then.

First of all, I think you should handle ARRAY_RANGE_REF and ARRAY_REF
the same.  And you probably should make BIT_FIELD_REF more expensive,
and maybe make its cost depend on which bit you're addressing (you can
see that in operands 1 and 2 of the BIT_FIELD_REF).  Its current cost
of just 1 is probably too optimistic, just like the other cases you are
trying to address with this patch.

Second, you may want to add a target hook to return the cost of target
builtins.  Even builtins that expand to just one instruction are now
counted as 16 insns plus the cost of the arguments list.  This badly
hurts when you use e.g. SSE intrinsics.  It's probably not an issue for
the benchmark you looked at now, but maybe you want to look into it
anyway, while you're at it.

Third,
> (measured at -O3 with inlining disabled):
Then why not just -O2 with inlining disabled?  Now you have enabled
loop unswitching, which is known to sometimes significantly increase
code size.

Fourth, look at _much_ more code than this.  I would especially look at
a lot more C++ code, which is where our inliner heuristics can really
dramatically improve or destroy performance.  See Richard Guenther's
inline heuristics tweaks from earlier this year, in the thread starting
here: http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01936.html.

Finally, you've apparently used grep to find all the places where
PARAM_MAX_INLINE_INSNS_SINGLE and its friends are used, but you hve
missed the ones in opts.c and maybe elsewhere.

Good luck, and thanks for working on this difficult stuff.

Gr.
Steven

next prev parent reply	other threads:[~2005-07-12 20:07 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-07-12 17:56 Josh Conner
2005-07-12 20:07 ` Steven Bosscher [this message]
2005-07-13  2:31   ` Josh Conner
2005-07-13  7:59     ` Steven Bosscher
2005-07-13  8:47       ` Richard Guenther
2005-07-13  2:44 Richard Kenner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200507122207.51801.stevenb@suse.de \
    --to=stevenb@suse.de \
    --cc=gcc@gcc.gnu.org \
    --cc=jconner@apple.com \
    --cc=jh@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).