public inbox for gcc@gcc.gnu.org
* Loop unrolling-related Bench++ regressions?
@ 2002-02-05 17:00 Roger Sayle
  2002-02-05 17:39 ` Richard Henderson
  2002-02-06  1:32 ` Paolo Carlini
  0 siblings, 2 replies; 6+ messages in thread
From: Roger Sayle @ 2002-02-05 17:00 UTC
  To: gcc; +Cc: Richard Henderson


Richard Henderson asked for test cases where loop unrolling fails
where it succeeded before.  A good example comes from test L000003
in the bench++ benchmark suite, which has recently regressed.  My
timings show that it took 0.626 nanoseconds a week or so ago, but
now takes 2.88 nanoseconds [over 4.5 times slower] on a 1GHz Pentium
III running Red Hat Linux 7.2.

The core of the test can be reduced to the following code:

const int feature_times = 25;

extern void foo(void);

void
test (void)
{
   int i = 1;

   for (;;)
     {
       foo ();
       i++;
       if (i > feature_times)
         break;
     }
}
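
As a sanity check on the trip count, here is a minimal sketch that
replaces the external foo with a counting stub.  The stub, the counter
foo_calls and the helper count_trips are hypothetical names added for
illustration; they are not part of the benchmark.  It only shows that
the loop always calls foo exactly feature_times (25) times, which is
the fact the unroller needs to discover.

```c
#include <assert.h>

/* Hypothetical stand-in for the external foo(): it only counts calls. */
static int foo_calls = 0;
static void foo(void) { foo_calls++; }

const int feature_times = 25;

/* Same loop shape as the reduced test case above. */
static void test(void)
{
    int i = 1;

    for (;;) {
        foo();
        i++;
        if (i > feature_times)
            break;
    }
}

/* Runs the loop once and returns the number of foo() calls observed. */
static int count_trips(void)
{
    foo_calls = 0;
    test();
    return foo_calls;
}

/* Illustrative self-check: the trip count is a compile-time constant. */
static void check_trip_count(void)
{
    assert(count_trips() == 25);
}
```

Calling check_trip_count() from a main function is enough to exercise
it; the assertion should hold at any optimization level.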


With the system compiler, gcc v2.96, the above loop produces the following
code with "-O3 -funroll-all-loops -fomit-frame-pointer -S":

test:
	pushl	%ebx
	subl	$8, %esp
	movl	$24, %ebx
	.p2align 2
.L3:
	call	foo
	call	foo
	call	foo
	call	foo
	call	foo
	subl	$5, %ebx
	jns	.L3
	addl	$8, %esp
	popl	%ebx
	ret


But with the same command-line options and the current mainline, v3.1,
the following code is generated:

test:
	pushl	%ebx
	movl	$1, %ebx
	subl	$8, %esp
.L2:
	call	foo
	leal	1(%ebx), %edx
	cmpl	$25, %edx
	jg	.L6
	call	foo
	leal	2(%ebx), %edx
	cmpl	$25, %edx
	jg	.L6
	call	foo

	...

	addl	$8, %ebx
	call	foo
	cmpl	$25, %ebx
	jle	.L2
.L6:
	addl	$8, %esp
	popl	%ebx
	ret


So although the loop is still being unrolled (eight times in the second
version, compared to just five in the first), the compiler is no longer
able to determine that the loop iterates a fixed number of times, and so
it inserts termination checks between the calls to foo.
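
To make the difference concrete, the two assembly listings can be
written back as C.  This is only a rough sketch of the shapes involved,
not compiler output: foo is again a hypothetical counting stub, the
function names are invented, and the middle of the second loop (elided
as "..." in the listing) is assumed to continue the same pattern with
offsets 3 through 7.

```c
#include <assert.h>

/* Hypothetical counting stub standing in for the external foo(). */
static int foo_calls = 0;
static void foo(void) { foo_calls++; }

/* Shape of the gcc 2.96 code: the counter starts at 24, each pass makes
   five calls and subtracts 5, and the loop repeats while the result is
   non-negative (the subl/jns pair).  One exit test per five calls. */
static int unrolled_by_five(void)
{
    int n = 24;

    foo_calls = 0;
    do {
        foo(); foo(); foo(); foo(); foo();
        n -= 5;
    } while (n >= 0);
    return foo_calls;
}

/* Shape of the mainline code: unrolled eight times, but with an exit
   test after every call.  Offsets 3..7 are assumed, since the listing
   elides them. */
static int unrolled_by_eight_with_checks(void)
{
    int i = 1;

    foo_calls = 0;
    for (;;) {
        foo(); if (i + 1 > 25) break;
        foo(); if (i + 2 > 25) break;
        foo(); if (i + 3 > 25) break;
        foo(); if (i + 4 > 25) break;
        foo(); if (i + 5 > 25) break;
        foo(); if (i + 6 > 25) break;
        foo(); if (i + 7 > 25) break;
        i += 8;
        foo();
        if (i > 25) break;
    }
    return foo_calls;
}
```

Both functions call foo 25 times.  The first needs five compare-and-branch
steps in total, while the second evaluates one after every call.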


I hope this helps.  I was actually investigating the 45% performance
improvement on some bench++ tests from re-enabling g++ builtins, and
happened to stumble across this performance regression by accident.

Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833
