From: Bernd Schmidt <bschmidt@redhat.com>
To: Kelvin Nilsen <kdnilsen@linux.vnet.ibm.com>, gcc-patches@gcc.gnu.org
Subject: Re: RFC: Incomplete Draft Patches to Correct Errors in Loop Unrolling Frequencies (bugzilla problem 68212)
Date: Mon, 09 Nov 2015 17:35:00 -0000 [thread overview]
Message-ID: <5640D958.7010300@redhat.com> (raw)
In-Reply-To: <563E0E67.1090101@linux.vnet.ibm.com>
On 11/07/2015 03:44 PM, Kelvin Nilsen wrote:
> This is a draft patch to partially address the concerns described in
> bugzilla problem report
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212). The patch is
> incomplete in the sense that there are some known shortcomings with
> nested loops which I am still working on. I am sending this out for
> comments at this time because we would like these patches to be
> integrated into the GCC 6 release and want to begin responding to
> community feedback as soon as possible in order to make the integration
> possible.
I'll mainly comment on style points right now. Your code generally looks
mostly good but doesn't quite follow our guidelines. In terms of logic
I'm sure there will be followup questions after the first round of
points is addressed. Others might be better qualified to review the
frequencies manipulation; for this first round I'm just assuming that
the general idea is sound (but would appreciate input).
> 1. Before a loop body is unpeeled into a pre-header location, we
> temporarily adjust the loop body frequencies to represent the values
> appropriate for the context into which the loop body is to be
> copied.
>
> 2. After unrolling the loop body (by replicating the loop body (N-1)
> times within the loop), we recompute all frequencies associated with
> blocks contained within the loop.
If these are independent from each other it might be better to split up
the patch.
> (check_loop_frequency_integrity): new helper routine
> (set_zero_probability): added another parameter
> (duplicate_loop_to_header_edge): Add code to recompute loop
> body frequencies after blocks are replicated (unrolled) into the loop
> body. Introduce certain help routines because existing infrastructure
> routines are not reliable during typical executions of
> duplicate_loop_to_header_edge().
Please review our guidelines how to write ChangeLogs - capitalize,
punctuate, and document only the what, not the why. Also, linewrap manually.
> opt_info_start_duplication (opt_info);
> +
> ok = duplicate_loop_to_header_edge (loop, loop_latch_edge (loop),
Please make sure you don't change whitespace unnecessarily. There are a
few other occurrences in the patch, and also cases where you seem to be
adding spaces to otherwise blank lines.
> @@ -1015,14 +1041,44 @@ unroll_loop_runtime_iterations (struct loop *loop)
> bitmap_clear_bit (wont_exit, may_exit_copy);
> opt_info_start_duplication (opt_info);
>
> + {
> + /* Recompute the loop body frequencies. */
> + zero_loop_frequencies (loop);
No reason to start a braced block here.
> + /* Scale the incoming frequencies according to the heuristic that
> + * the loop frequency is the incoming edge frequency divided by
> + * 0.09. This heuristic applies only to loops that iterate over a
> + * run-time value that is not known at compile time. Note that
> + * 1/.09 equals 11.1111. We'll use integer arithmetic on ten
> + * thousandths, and then divide by 10,000 after we've "rounded".
> + */
Please examine the comment style in gcc - no asterisks to start new
lines, and comment terminators don't go on their own line.
> + sum_incoming_frequencies *= 111111; /* multiply by 11.1111 */
> + sum_incoming_frequencies += 5000; /* round by adding 0.5 */
> + sum_incoming_frequencies /= 10000; /* convert ten thousandths
> + to ones
> + */
These comments could also be improved, but really they should just be
removed since they're pretty obvious and redundant with the one before.
> +/* Define ENABLE_CHECKING to enforce the following run-time checks.
"With checking enabled, the following run-time checks are performed:"
> + * This may report false-positive errors due to round-off errors.
That doesn't sound good as it could lead to bootstrap failures when
checking is enabled.
> @@ -44,6 +55,543 @@ static void fix_loop_placements (struct loop *, bo
> static bool fix_bb_placement (basic_block);
> static void fix_bb_placements (basic_block, bool *, bitmap);
>
> +/*
> + * Return true iff block is considered to reside within the loop
> + * represented by loop_ptr.
> + */
Arguments are capitalized in function comments.
> +bool
> +in_loop_p (basic_block block, struct loop *loop_ptr)
> +{
> + basic_block *bbs = get_loop_body (loop_ptr);
> + bool result = false;
> +
> + for (unsigned int i = 0; i < loop_ptr->num_nodes; i++)
> + {
> + if (bbs[i] == block)
> + result = true;
> + }
I think something that starts with bb->loop_father and iterates outwards
would be more efficient.
> +/* A list of block_ladder_rung structs is used to keep track of all the
> + * blocks visited in a depth-first recursive traversal of a control-flow
> + * graph. This list is used to detect and prevent attempts to revisit
> + * a block that is already being visited in the recursive traversal.
> + */
> +typedef struct block_ladder_rung {
> + basic_block block;
> + struct block_ladder_rung *lower_rung;
> +} *ladder_rung_p;
I was going to suggest a vec rather than manually dealing with lists,
but it looks like you're using these in a rather unusual way in a
recursive walk. I'm not actually sure that what you're doing entirely
prevents revisiting a block, it only does so for a single recursive
call-chain, but two separate recursive calls from the same level could
still visit the same blocks. I'm not sure this is intended (the usual
approach would be to use a (s)bitmap of visited blocks). More comments
on the whole recursion scheme below.
> +/* Return true iff an_edge represents the same source and destination
> + * blocks as another_edge.
> + */
> +static bool
> +same_edge_p (edge an_edge, edge another_edge)
> +{
> + return ((an_edge->src == another_edge->src)
> + && (an_edge->dest == another_edge->dest));
> +}
Formatting aside (extra parentheses), I wonder why you can't just test
edges for pointer equality? Does anything make duplicate edges?
> +static void
> +recursively_zero_frequency (struct loop *loop_ptr, vec<edge> exit_edges,
> + ladder_rung_p ladder_rung,
> + edge incoming_edge)
> +{
> + if (incoming_edge->dest == loop_ptr->header)
> + return;
> + else if (in_edge_set_p (incoming_edge, exit_edges))
> + return;
> + else if (in_call_chain_p (incoming_edge, ladder_rung))
> + return;
> + else
> + {
> + struct block_ladder_rung a_rung;
> + basic_block block = incoming_edge->dest;
> +
> + a_rung.block = block;
> + a_rung.lower_rung = ladder_rung;
> + block->frequency = 0;
> +
> + edge_iterator ei;
> + edge successor;
> + FOR_EACH_EDGE (successor, ei, block->succs)
> + {
> + recursively_zero_frequency (loop_ptr, exit_edges,
> + &a_rung, successor);
> + }
> + }
> +}
Does this really have to be recursive, and this complicated? I think
something with a worklist and maybe a bitmap to indicate visited blocks
would be considerably easier to understand, and the recursion could be a
problem with very large testcases.
You might want to check whether something like dfs_enumerate_from or
other functions in cfganal would be helpful.
> +static bool
> +recursion_detected_p (basic_block candidate, ladder_rung_p lower_steps) {
Don't place the open brace on the same line as the declaration.
> +/* Return true iff candidate is contained within the loop represented
> + * by loop_header and loop_latch.
> + *
> + * We consider the block to be within the loop if there exists a path
> + * within the control flow graph from this node to the loop's latch
> + * which does not pass through the loop's header. (If all paths to
> + * the latch pass through the loop header, then the node is contained
> + * within an outer-nested loop but not within this loop.)
> + *
> + * Note that if a candidate's successor is in the loop and the successor
> + * is not the loop header, then the candidate itself is also in the loop.
> + * If none of the successors of a candidate are in the loop, then the
> + * candidate itself is not in the loop.
> + */
> +static bool
> +in_loop_of_header_p (basic_block candidate, basic_block loop_header,
> + basic_block loop_latch, bool start_of_recursion,
> + ladder_rung_p lower_steps)
This is another function that makes me wonder why it's so complicated.
If you want to know whether a basic block is inside a loop, why is
looking for it in the set returned by get_loop_body insufficient, or
iterating backwards from the block's loop_father? And even so I think
there are probably better approaches to writing this using worklists or
existing helper functions, as mentioned above.
> + return false; /* None of the successors was in loop */
Move comment before statement.
> +/* Return true iff block is an element of the block_set vector.
> + */
> +static bool
> +in_block_set_p (basic_block block, vec<basic_block> block_set)
> +{
> + basic_block bb;
> + unsigned int u;
> + FOR_EACH_VEC_ELT (block_set, u, bb)
> + {
> + if (bb == block)
> + return true;
> + }
> + return false;
> +}
You probably want to use (s)bitmaps so you don't have to do this
potentially expensive iteration.
> +
> +/* Return a vector containing all of the edges that exit the loop
> + * represented by the loop_blocks vector.
> + */
> +static vec<edge>
> +get_exit_edges_from_loop_blocks (vec<basic_block> loop_blocks) {
How does this one differ from the existing get_loop_exit_edges? Ah, I
see this:
> + /* When zero_partial_loop_frequencies is invoked, the *loop_ptr
> + * object is not entirely coherent, so the library service
> + * get_loop_exit_edges (loop_p) cannot be called from this context.
Just how incoherent is the information? Is that fixable? I assume this
is the same reason why you have get_loop_blocks and are using it instead
of get_loop_body? That duplication is somewhat unfortunate and it would
be nice to avoid it. I think this is the core point that needs to be
addressed first. If this is unavoidable (and I really hope not), then it
needs to be clearly documented which set of functions should be called
under which circumstances.
> + if (!in_block_set_p (edge_dest, loop_blocks))
> + {
> + results.safe_push (successor);
> + }
No braces around single statements. Please fix throughout.
> + * Zero all frequencies for all blocks contained within the loop
> + * represented by loop_ptr which are reachable from block without
> + * passing through the block header. If block is not within the loop,
> + * this has no effect. The behavior is as outlined by the following
Lose the tab character in here.
> +static void
> +recursively_increment_frequency (struct loop *loop_ptr, vec<edge> exit_edges,
> + ladder_rung_p ladder_rung,
> + edge incoming_edge,
> + int frequency_increment)
Some of these would probably fit on one line which would make this more
compact.
> +#ifdef ENABLE_CHECKING
We're getting rid of this #ifdef. Define unconditionally, and use
flag_checking to determine whether to do the verification (if it is
safe, see comment above - otherwise just make this a DEBUG_FUNCTION).
> +/**
> + * Issue a fatal error message and abort program execution.
> + */
> +static void
> +internal (const char *msg)
> +{
> + fprintf (stderr, "Fatal internal error: %s\n", msg);
> + exit (-1);
> +}
Unnecessary, we have fatal_error.
> + * The integrity check is problematic due to round-off errors. Though
Another tab character.
> @@ -1101,7 +1649,7 @@ can_duplicate_loop_p (const struct loop *loop)
> is redistributed evenly to the remaining edges coming from E->src. */
>
> static void
> -set_zero_probability (edge e)
> +set_zero_probability (struct loop *loop_ptr, edge e)
Arguments should be documented in the function comment.
> + * KELVIN_TODO:
Just "TODO" if at all.
Bernd
next prev parent reply other threads:[~2015-11-09 17:35 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-11-07 14:45 Kelvin Nilsen
2015-11-09 17:35 ` Bernd Schmidt [this message]
2015-11-10 9:56 ` Bernhard Reutner-Fischer
2015-11-10 10:16 ` Bernd Schmidt
2015-11-10 10:54 ` Richard Biener
2015-11-10 14:19 ` Michael Matz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5640D958.7010300@redhat.com \
--to=bschmidt@redhat.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=kdnilsen@linux.vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).