public inbox for gcc-bugs@sourceware.org
* [Bug other/16803] New: PowerPC - invariant code motion could be removed from loop.
From: gcc-bugzilla at gcc dot gnu dot org @ 2004-07-28 17:01 UTC
  To: gcc-bugs

Description:
The testcase below produces a non-optimal code sequence. Several loop-invariant instructions could be hoisted out of the loop. A store with update form could be used. A branch on count instruction could be used. Any or all of these would improve the loop's performance. Reproduce using gcc 3.5 and the command line:

gcc -O3 -m64 -c test.c

Testcase:
#define SOME_CONST 20
unsigned short *x;
int y;

int main ()
{
   int i;

   for (i = 0; i<= y+SOME_CONST; i++)
       x[i] = 0;

   return 0;
}

Assembly:
Currently gcc 3.5 generates the following code:

	ld 5,.LC0@toc(2) -- load base of "y"
	lwz 3,0(5)       -- load "y"
	addi 9,3,20      -- compute "y+SOME_CONST"
	cmpwi 7,9,0      -- determine if loop should be entered.
	blt- 7,.L2       -- branch around loop if not.
	ld 7,.LC1@toc(2) -- load address of "x"
	li 8,0           -- initialize "i"
	li 6,0           -- load value to store.
.L4:
	ld 10,0(7)       -- load base of "x" - loop invariant
	sldi 12,8,1      -- compute index into "x" (i * 2)
	addi 0,8,1       -- increment i
	extsw 8,0        -- sign extend i
	sthx 6,12,10     -- store "x[i]"
	lwz 4,0(5)       -- load "y" - loop invariant
	addi 11,4,20     -- compute "y+SOME_CONST" - loop invariant
	cmpw 0,11,8      -- compare result of add with i
	bge+ 0,.L4       -- loop back.

Hoisting the invariant instructions out of the loop and using a store with update improves the code to:

	ld 5,.LC0@toc(2) -- load base of "y"
	lwz 3,0(5)       -- load "y"
	addi 9,3,20      -- compute "y+SOME_CONST"
	cmpwi 7,9,0      -- determine if loop should be entered.
	blt- 7,.L7       -- branch around loop if not.
	ld 7,.LC1@toc(2) -- load address of "x"
	li 8,0           -- initialize "i"
	li 6,0           -- load value to store.
	ld 10,0(7)       -- load base of "x" - loop invariant, now hoisted
	addi 10,10,-2    -- bias base so the first sthu stores to x[0]
.L5:
	addi 0,8,1       -- increment i
	extsw 8,0        -- sign extend i
	sthu 6,2(10)     -- use store with update instead of sldi/sthx
	cmpw 0,9,8       -- compare result of add with i
	bge+ 0,.L5       -- loop back.
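
For readers who prefer source level, the same transformation can be sketched in C. This is only an illustration, not compiler output: clear_hoisted and its parameters are hypothetical names, and the sketch assumes the stores through an unsigned short pointer cannot modify "x" or "y" themselves (which is what makes the loads invariant). The post-incrementing pointer stands in for the store with update; sthu itself updates the address before storing, so the assembly form starts one element before the array.

```c
#include <assert.h>
#include <stddef.h>

#define SOME_CONST 20

/* Hypothetical helper, not part of the report: performs the same stores
   as the testcase loop, with the base pointer and the loop bound read
   once before the loop instead of on every iteration. */
static void clear_hoisted(unsigned short *base, int y_val)
{
    assert(base != NULL);
    int bound = y_val + SOME_CONST;   /* "y+SOME_CONST", computed once */
    unsigned short *p = base;         /* base of "x", loaded once */
    int i;

    for (i = 0; i <= bound; i++)
        *p++ = 0;                     /* store, then advance the pointer */
}
```

With y_val = 3 the bound is 23, so elements 0 through 23 are cleared and element 24 is left alone.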

Using a branch on count instruction (bdnz) improves this further:

	ld 5,.LC0@toc(2) -- load base of "y"
	li 8,0           -- initialize "i"
	lwz 3,0(5)       -- load "y"
	addi 9,3,20      -- compute "y+SOME_CONST"
	cmpwi 7,9,0      -- determine if loop should be entered.
	blt- 7,.L7       -- branch around loop if not.
	ld 7,.LC1@toc(2) -- load address of "x"
	li 6,0           -- load value to store.
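	ld 10,0(7)       -- load base of "x" - loop invariant, now hoisted
	addi 10,10,-2    -- bias base so the first sthu stores to x[0]
	addi 11,9,1      -- trip count is "y+SOME_CONST" + 1
	extsw 11,11      -- sign extend the 32-bit count
	mtctr 11         -- load count register
.L5:
	sthu 6,2(10)     -- use store with update instead of sldi/sthx
	bdnz+ .L5        -- decrement count and loop back.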

The loop has gone from 8 instructions and a branch to 1 instruction and a branch.
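
The branch on count form corresponds roughly to a count-down loop in C. The sketch below is an illustration under assumptions, not what gcc emits: clear_countdown and its parameters are hypothetical names, the guard mirrors the cmpwi/blt pair before the loop, and the trip count is "y+SOME_CONST" + 1 because i runs from 0 through the bound inclusive.

```c
#include <assert.h>
#include <stddef.h>

#define SOME_CONST 20

/* Hypothetical helper, not part of the report: clears the same elements
   as the testcase loop, but the loop is driven by a counter that runs
   down to zero, which is the shape a decrement-and-branch can implement. */
static void clear_countdown(unsigned short *base, int y_val)
{
    assert(base != NULL);
    int bound = y_val + SOME_CONST;
    long count;
    unsigned short *p = base;

    if (bound < 0)                    /* loop would not be entered */
        return;

    count = (long)bound + 1;          /* iterations for i = 0..bound */
    do {
        *p++ = 0;                     /* the only work left in the loop */
    } while (--count != 0);           /* decrement, branch if not zero */
}
```

The loop body contains no index arithmetic and no compare against the bound; only the store and the counter decrement remain.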



-- 
           Summary: PowerPC - invariant code motion could be removed from
                    loop.
           Product: gcc
           Version: 3.5.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P1
         Component: other
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: steinmtz at us dot ibm dot com
                CC: gcc-bugs at gcc dot gnu dot org,steinmtz at us dot ibm
                    dot com
 GCC build triplet: powerpc64-linux
  GCC host triplet: powerpc64-linux
GCC target triplet: powerpc64-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=16803



Thread overview: 22+ messages
2004-07-28 17:01 [Bug other/16803] New: PowerPC - invariant code motion could be removed from loop gcc-bugzilla at gcc dot gnu dot org
2004-07-28 17:36 ` [Bug target/16803] " bangerth at dealii dot org
2004-07-28 18:33 ` falk at debian dot org
2004-07-28 18:35 ` pinskia at gcc dot gnu dot org
2004-07-29  0:21 ` [Bug tree-optimization/16803] " pinskia at gcc dot gnu dot org
2004-07-30 17:41 ` steven at gcc dot gnu dot org
2004-07-30 17:49 ` [Bug target/16803] " steven at gcc dot gnu dot org
2004-07-30 19:00 ` [Bug rtl-optimization/16803] " pinskia at gcc dot gnu dot org
2004-07-30 19:15 ` rakdver at gcc dot gnu dot org
2004-08-30 12:55 ` rakdver at gcc dot gnu dot org
2004-08-30 13:39 ` rakdver at gcc dot gnu dot org
2004-08-31  4:02 ` [Bug tree-optimization/16803] " pinskia at gcc dot gnu dot org
2004-08-31 22:42 ` pinskia at gcc dot gnu dot org
2004-09-27  3:47 ` pinskia at gcc dot gnu dot org
2004-11-11 17:40 ` nathan at gcc dot gnu dot org
2004-11-11 18:19 ` pinskia at gcc dot gnu dot org
2004-11-12  9:24 ` nathan at gcc dot gnu dot org
2004-11-15  1:58 ` pinskia at gcc dot gnu dot org
     [not found] <bug-16803-8614@http.gcc.gnu.org/bugzilla/>
2005-11-02 17:16 ` [Bug rtl-optimization/16803] " pinskia at gcc dot gnu dot org
2006-01-07 16:23 ` pinskia at gcc dot gnu dot org
2006-01-07 16:24 ` pinskia at gcc dot gnu dot org
2006-01-28 21:18 ` pinskia at gcc dot gnu dot org
