From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-prs-return-20319-listarch-gcc-prs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 11956 invoked by alias); 11 Jul 2002 10:16:01 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc-prs/>
List-Post: <mailto:gcc-prs@gcc.gnu.org>
List-Help: <mailto:gcc-prs-help@gcc.gnu.org>
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 11942 invoked by uid 71); 11 Jul 2002 10:16:01 -0000
Date: Thu, 11 Jul 2002 03:16:00 -0000
Message-ID: <20020711101601.11941.qmail@sources.redhat.com>
To: nobody@gcc.gnu.org
Cc: gcc-prs@gcc.gnu.org,
From: Alan Modra <amodra@bigpond.net.au>
Subject: Re: optimization/7130: miscompiled code for gcc-3.1 in powerpc linux with -funroll-all-loops
Reply-To: Alan Modra <amodra@bigpond.net.au>
X-SW-Source: 2002-07/txt/msg00321.txt.bz2
List-Id: <gcc-prs.sourceware.org>

The following reply was made to PR optimization/7130; it has been noted by GNATS.

From: Alan Modra <amodra@bigpond.net.au>
To: yozo@cs.berkeley.edu, gcc-gnats@gcc.gnu.org, gcc-patches@gcc.gnu.org
Cc:  
Subject: Re: optimization/7130: miscompiled code for gcc-3.1 in powerpc linux with -funroll-all-loops
Date: Thu, 11 Jul 2002 19:38:40 +0930

 On Thu, Jul 11, 2002 at 05:21:57PM +0930, Alan Modra wrote:
 > doloop.c:doloop_modify_runtime says:
 > 
 >      If the loop has been unrolled, then the loop body has been
 >      preconditioned to iterate a multiple of unroll_number times.  If
 >      abs_inc is != 1, the full calculation is
 > 
 >        t1 = abs_inc * unroll_number;
 >        n = abs (final - initial) / t1;
 >        n += (abs (final - initial) % t1) > t1 - abs_inc;
 > 
 > This is wrong.  Taking the example in the PR, we have
 > 
 > abs_inc = 1
 > unroll_number = 4
 > abs (final - initial) = 10
 > 
 > => t1 == 4
 >    abs (final - initial) % t1 == 2
 > => n == 2
 > 
 > We want n == 3, to go around the loop fully twice, and once partially.
 > 
 > A little thought shows the correct calculation is
 > 
 > 	The amount we increment per (partially) unrolled loop
 >         t1 = abs_inc * unroll_number;
 > 
 > 	The number of time we'll go fully round the loop.
 >         n = abs (final - initial) / t1;
 > 
 > 	Plus any partial loops.
 >         n += (abs (final - initial) % t1) >= abs_inc;
 
 And a little more thought, prodded by bootstrap failing :), says
 
 	Plus any partial loops.
 	  n += (abs (final - initial) % t1) != 0;
 
 Why?  Well, GE loops don't care about final - initial being a multiple
 of the loop increment.  Besides, for unroll_number == 1, the above
 calculations ought to agree with this extract from unroll.c
 
      The number of iterations (prior to any loop unrolling) is given by:
 
        n = (abs (final - initial) + abs_inc - 1) / abs_inc.
 
      However, it is possible for the summation to overflow, and a
      safer method is:
 
        n = abs (final - initial) / abs_inc;
        n += (abs (final - initial) % abs_inc) != 0;
 
 
 gcc/ChangeLog
  	PR optimization/7130
  	* doloop.c (doloop_modify_runtime): Correct count for unrolled loops.
 
 gcc/testsuite/ChangeLog
 	* gcc.dg/unroll.c: New.
 
 (And this time, I waited for bootstrap to get through building the stage2
  compiler before firing off emails.)
 
 -- 
 Alan Modra
 IBM OzLabs - Linux Technology Centre
 
 Index: gcc/doloop.c
 ===================================================================
 RCS file: /cvs/gcc/gcc/gcc/doloop.c,v
 retrieving revision 1.20
 diff -u -p -r1.20 doloop.c
 --- gcc/doloop.c	24 Jun 2002 02:16:42 -0000	1.20
 +++ gcc/doloop.c	11 Jul 2002 09:07:06 -0000
 @@ -552,6 +552,7 @@ doloop_modify_runtime (loop, iterations_
  {
    const struct loop_info *loop_info = LOOP_INFO (loop);
    HOST_WIDE_INT abs_inc;
 +  HOST_WIDE_INT abs_loop_inc;
    int neg_inc;
    rtx diff;
    rtx sequence;
 @@ -601,7 +602,7 @@ doloop_modify_runtime (loop, iterations_
  
         t1 = abs_inc * unroll_number;
         n = abs (final - initial) / t1;
 -       n += (abs (final - initial) % t1) > t1 - abs_inc;
 +       n += (abs (final - initial) % t1) != 0;
  
       The division and modulo operations can be avoided by requiring
       that the increment is a power of 2 (precondition_loop_p enforces
 @@ -667,20 +668,21 @@ doloop_modify_runtime (loop, iterations_
  	}
      }
  
 -  if (abs_inc * loop_info->unroll_number != 1)
 +  abs_loop_inc = abs_inc * loop_info->unroll_number;
 +  if (abs_loop_inc != 1)
      {
        int shift_count;
  
 -      shift_count = exact_log2 (abs_inc * loop_info->unroll_number);
 +      shift_count = exact_log2 (abs_loop_inc);
        if (shift_count < 0)
  	abort ();
  
 -      if (abs_inc != 1)
 -	diff = expand_simple_binop (GET_MODE (diff), PLUS,
 -				    diff, GEN_INT (abs_inc - 1),
 -				    diff, 1, OPTAB_LIB_WIDEN);
 +      diff = expand_simple_binop (GET_MODE (diff), PLUS,
 +				  diff, GEN_INT (abs_loop_inc - 1),
 +				  diff, 1, OPTAB_LIB_WIDEN);
  
 -      /* (abs (final - initial) + abs_inc - 1) / (abs_inc * unroll_number)  */
 +      /* (abs (final - initial) + abs_inc * unroll_number - 1)
 +	 / (abs_inc * unroll_number)  */
        diff = expand_simple_binop (GET_MODE (diff), LSHIFTRT,
  				  diff, GEN_INT (shift_count),
  				  diff, 1, OPTAB_LIB_WIDEN);
 Index: gcc/testsuite/gcc.dg/unroll.c
 ===================================================================
 RCS file: gcc/testsuite/gcc.dg/unroll.c
 diff -N gcc/testsuite/gcc.dg/unroll.c
 --- /dev/null	1 Jan 1970 00:00:00 -0000
 +++ gcc/testsuite/gcc.dg/unroll.c	11 Jul 2002 09:31:12 -0000
 @@ -0,0 +1,39 @@
 +/* PR opt/7130 */
 +/* { dg-do run } */
 +/* { dg-options "-O2 -funroll-all-loops" } */
 +
 +#define TYPE long
 +
 +void
 +scale (TYPE *alpha, TYPE *x, int n)
 +{
 +  int i, ix;
 +
 +  if (*alpha != 1)
 +    for (i = 0, ix = 0; i < n; i++, ix += 2)
 +      {
 +	TYPE tmpr, tmpi;
 +	tmpr = *alpha * x[ix];
 +	tmpi = *alpha * x[ix + 1];
 +	x[ix] = tmpr;
 +	x[ix + 1] = tmpi;
 +      }
 +}
 +
 +int
 +main (void)
 +{
 +  int i;
 +  TYPE x[10];
 +  TYPE alpha = 2;
 +
 +  for (i = 0; i < 10; i++)
 +    x[i] = i;
 +
 +  scale (&alpha, x, 5);
 +
 +  if (x[9] != 18)
 +    abort ();
 +
 +  return 0;
 +}