* [RFC] Induction variable candidates not sufficiently general
From: Kelvin Nilsen @ 2018-07-12 22:05 UTC (permalink / raw)
  To: gcc-patches

A somewhat old "issue report" pointed me to the code generated for a 4-fold manually unrolled version of the following loop:

> while (++len != len_limit) /* this is loop */
>   if (pb[len] != cur[len])
>     break;

As unrolled, the loop appears as:

> while (++len != len_limit) /* this is loop */ {
>   if (pb[len] != cur[len])
>     break;
>   if (++len == len_limit)  /* unrolled 2nd iteration */
>     break;
>   if (pb[len] != cur[len])
>     break;
>   if (++len == len_limit)  /* unrolled 3rd iteration */
>     break;
>   if (pb[len] != cur[len])
>     break;
>   if (++len == len_limit)  /* unrolled 4th iteration */
>     break;
>   if (pb[len] != cur[len])
>     break;
> }

In examining the behavior of tree-ssa-loop-ivopts.c, I've found that the only induction variable candidates being considered are all derived from the len variable.  No candidates are considered for the address expressions &pb[len] and &cur[len].
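
For anyone who wants to reproduce this, the unrolled loop can be wrapped in a small compilable function along the following lines.  This is only a sketch: the function names, the uint8_t element type and the uint32_t index type are assumptions, since the surrounding declarations are not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper around the 4-fold unrolled loop quoted above.
   The name and the uint8_t/uint32_t types are assumptions.
   Precondition: len < len_limit, and both buffers hold at least
   len_limit bytes.  Returns the first index greater than the incoming
   len at which the buffers differ, or len_limit if they never do.  */
uint32_t
match_len_indexed (const uint8_t *pb, const uint8_t *cur,
                   uint32_t len, uint32_t len_limit)
{
  while (++len != len_limit) /* this is loop */
    {
      if (pb[len] != cur[len])
        break;
      if (++len == len_limit)  /* unrolled 2nd iteration */
        break;
      if (pb[len] != cur[len])
        break;
      if (++len == len_limit)  /* unrolled 3rd iteration */
        break;
      if (pb[len] != cur[len])
        break;
      if (++len == len_limit)  /* unrolled 4th iteration */
        break;
      if (pb[len] != cur[len])
        break;
    }
  return len;
}

/* Small sanity check on two 8-byte buffers that differ at index 5.  */
void
self_check_indexed (void)
{
  const uint8_t a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  const uint8_t b[8] = { 1, 2, 3, 4, 5, 0, 7, 8 };

  assert (match_len_indexed (a, b, 0, 8) == 5); /* mismatch at index 5 */
  assert (match_len_indexed (a, a, 0, 8) == 8); /* no mismatch: len_limit */
}
```

Compiling such a file with something like gcc -O2 -fdump-tree-ivopts-details should produce a dump listing the candidates that ivopts considered for the loop; the exact contents will depend on the target and the GCC version.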

I rewrote the source code for this loop to make the addressing expressions more explicit, as in the following:

> cur++;
> while (++pb != last_pb) /* this is loop */ {
>   if (*pb != *cur)
>     break;
>   ++cur;
>   if (++pb == last_pb)  /* unrolled 2nd iteration */
>     break;
>   if (*pb != *cur)
>     break;
>   ++cur;
>   if (++pb == last_pb)  /* unrolled 3rd iteration */
>     break;
>   if (*pb != *cur)
>     break;
>   ++cur;
>   if (++pb == last_pb)  /* unrolled 4th iteration */
>     break;
>   if (*pb != *cur)
>     break;
>   ++cur;
> }

Now, gcc does a better job of identifying the "address expression induction variables".  This version of the loop runs about 10% faster than the original on my target architecture.
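
The pointer form relies on some setup before the loop that is not quoted above.  A minimal sketch of one plausible setup follows, checked against the plain non-unrolled indexed loop.  The function names, the uint8_t/uint32_t types, and the way pb, cur and last_pb are derived from the base pointers are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: the original, non-unrolled indexed loop.
   Precondition for both functions: len < len_limit.  */
uint32_t
match_len_ref (const uint8_t *pb, const uint8_t *cur,
               uint32_t len, uint32_t len_limit)
{
  while (++len != len_limit)
    if (pb[len] != cur[len])
      break;
  return len;
}

/* Pointer form of the unrolled loop.  The three initialisations below
   are an assumed setup; only the loop body comes from the message.  */
uint32_t
match_len_ptr (const uint8_t *pb_base, const uint8_t *cur_base,
               uint32_t len, uint32_t len_limit)
{
  const uint8_t *pb = pb_base + len;
  const uint8_t *cur = cur_base + len;
  const uint8_t *last_pb = pb_base + len_limit; /* one past last element */

  cur++;
  while (++pb != last_pb) /* this is loop */
    {
      if (*pb != *cur)
        break;
      ++cur;
      if (++pb == last_pb)  /* unrolled 2nd iteration */
        break;
      if (*pb != *cur)
        break;
      ++cur;
      if (++pb == last_pb)  /* unrolled 3rd iteration */
        break;
      if (*pb != *cur)
        break;
      ++cur;
      if (++pb == last_pb)  /* unrolled 4th iteration */
        break;
      if (*pb != *cur)
        break;
      ++cur;
    }
  /* Recover the index that the indexed version would have returned.  */
  return (uint32_t) (pb - pb_base);
}

/* Compare both forms for every start, limit and mismatch position
   in a 16-byte buffer (diff == 16 means "no mismatch").  */
void
check_pointer_form (void)
{
  uint8_t a[16], b[16];

  for (uint32_t diff = 0; diff <= 16; diff++)
    {
      for (uint32_t i = 0; i < 16; i++)
        a[i] = b[i] = (uint8_t) i;
      if (diff < 16)
        b[diff] ^= 0xff;

      for (uint32_t limit = 1; limit <= 16; limit++)
        for (uint32_t start = 0; start < limit; start++)
          assert (match_len_ref (a, b, start, limit)
                  == match_len_ptr (a, b, start, limit));
    }
}
```

The check only confirms that the two forms compute the same result under these assumptions; it says nothing about the speed difference, which has to be measured on the target.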

This would seem to be a textbook pattern for induction variable analysis.  Does anyone have thoughts on the best way to add these candidates to the set of induction variables considered by tree-ssa-loop-ivopts.c?

Thanks in advance for any suggestions.
