optimization/9566: Inline function produces much worse code than manual inlining.


* optimization/9566: Inline function produces much worse code than manual inlining.
@ 2003-02-04 10:26 osv
  0 siblings, 0 replies; 3+ messages in thread
From: osv @ 2003-02-04 10:26 UTC (permalink / raw)
  To: gcc-gnats


>Number:         9566
>Category:       optimization
>Synopsis:       Inline function produces much worse code than manual inlining.
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    unassigned
>State:          open
>Class:          pessimizes-code
>Submitter-Id:   net
>Arrival-Date:   Tue Feb 04 10:26:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     Sergei Organov
>Release:        gcc version 3.3 20030203 (prerelease)
>Organization:
>Environment:
Linux 2.4.20 i686
>Description:
In the code below, function g1() calls the inline function copy(),
and g2() is equivalent to g1() except that the body of copy() is inserted
into the body of g2() manually. The assembly code produced for
g1() is much worse than that for g2(). The difference is most visible
on RISC processors (an example PowerPC assembly result is shown),
though it can be seen on CISC processors as well.

The C++ code (note that the code is minimized to demonstrate the problem,
so please ignore the use of uninitialized variables):

struct A {
  char const* src;
  char* dest;
  void copy() { *++dest = *++src; }
};

void g1() {
  A a;
  for(int i = 0; i < 10; ++i)
    a.copy();
}

void g2() {
  A a;
  for(int i = 0; i < 10; ++i)
    *++a.dest = *++a.src;
}

The resulting assembly for PowerPC (note that the loop body is 8 vs 4
instructions, not counting the bdnz branch):

$ ~/try-3.2/tools/bin/ppc-rtems-gcc -c -O4 -save-temps -mregnames struct.cc -o struct.o
$ cat struct.s

	.file	"struct.cc"
	.section	".text"
	.align 2
	.globl _Z2g1v
	.type	_Z2g1v, @function
_Z2g1v:
.LFB5:
	li %r3,10
	mtctr %r3
	stwu %r1,-16(%r1)
.LCFI0:
	addi %r8,%r1,8
.L10:
	lwz %r5,8(%r1)
	lwz %r3,4(%r8)
	addi %r6,%r5,1
	addi %r7,%r3,1
	stw %r7,4(%r8)
	stw %r6,8(%r1)
	lbz %r4,1(%r5)
	stb %r4,1(%r3)
	bdnz .L10
	addi %r1,%r1,16
	blr
.LFE5:
	.size	_Z2g1v, .-_Z2g1v
	.align 2
	.globl _Z2g2v
	.type	_Z2g2v, @function
_Z2g2v:
.LFB6:
	li %r3,10
	mtctr %r3
	li %r7,0
	li %r8,0
.L19:
	addi %r7,%r7,1
	lbz %r4,0(%r7)
	addi %r8,%r8,1
	stb %r4,0(%r8)
	bdnz .L19
	blr
.LFE6:
	.size	_Z2g2v, .-_Z2g2v
	.ident	"GCC: (GNU) 3.3 20030203 (prerelease)"

>How-To-Repeat:
Compile the provided C++ code with '-O4 -save-temps' and look at the resulting assembly.
>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted:



* Re: optimization/9566: Inline function produces much worse code than manual inlining.
@ 2003-04-15  0:16 Andrew Pinski
  0 siblings, 0 replies; 3+ messages in thread
From: Andrew Pinski @ 2003-04-15  0:16 UTC (permalink / raw)
  To: nobody; +Cc: gcc-prs

The following reply was made to PR optimization/9566; it has been noted by GNATS.

From: Andrew Pinski <pinskia@physics.uc.edu>
To: Andrew Pinski <pinskia@physics.uc.edu>
Cc: gcc-gnats@gcc.gnu.org, osv@javad.ru, gcc-bugs@gcc.gnu.org,
   nobody@gcc.gnu.org, gcc-prs@gcc.gnu.org
Subject: Re: optimization/9566: Inline function produces much worse code than manual inlining.
Date: Mon, 14 Apr 2003 20:12:39 -0400

 Actually, it is because C++'s this is a pointer, so the struct is
 spilled to the stack because of it.
 This g3 is the same as g1. Sorry about the previous message; I did not
 look at it closely enough.

 struct A {
    char const* src;
    char* dest;
    void copy() { *++dest = *++src; }
 };

 void g1() {
    A a;
    for(int i = 0; i < 10; ++i)
      a.copy();
 }
 void g3() {
    A a;
    for(int i = 0; i < 10; ++i)
    {
      A *b = &a;
      *++b->dest = *++b->src;
    }
 }

 A way to fix this: if the pointer to a local variable is used only for
 loads and stores and is never passed to a function, or if the pointer
 never changes, remove the pointer and replace it with what it points to
 (this might only be possible on the ssa-branch).
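
 A hand-written sketch of what that replacement would amount to for this
 test case, under stated assumptions: the names copy_via_member and
 copy_via_scalars and their parameters are hypothetical, introduced only
 so that the pointers are initialized and the code can be run. It is not
 the compiler's actual transformation, only the source-level shape of it.

```cpp
#include <cassert>
#include <cstring>

struct A {
  char const* src;
  char* dest;
  void copy() { *++dest = *++src; }
};

// Form from the report: copy() reaches the fields through `this`,
// which is a pointer to the local struct `a`.
void copy_via_member(char const* src, char* dest, int n) {
  A a = { src, dest };
  for (int i = 0; i < n; ++i)
    a.copy();
}

// Hypothetical hand-applied replacement: the address of the struct is
// never taken, and each field lives in its own local scalar that can
// stay in a register for the whole loop.
void copy_via_scalars(char const* src, char* dest, int n) {
  char const* a_src = src;
  char* a_dest = dest;
  for (int i = 0; i < n; ++i)
    *++a_dest = *++a_src;
}
```

 Both functions copy bytes 1..n of src to bytes 1..n of dest, because the
 pointers are incremented before each access; only the second one has the
 shape of g2 from the original report.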

 (As an aside, the code is still bad on the ssa-branch as of
 `3.5-tree-ssa 20030117`.)

 Thanks,
 Andrew Pinski


* Re: optimization/9566: Inline function produces much worse code than manual inlining.
@ 2003-04-14 23:46 Andrew Pinski
  0 siblings, 0 replies; 3+ messages in thread
From: Andrew Pinski @ 2003-04-14 23:46 UTC (permalink / raw)
  To: nobody; +Cc: gcc-prs

The following reply was made to PR optimization/9566; it has been noted by GNATS.

From: Andrew Pinski <pinskia@physics.uc.edu>
To: gcc-gnats@gcc.gnu.org, osv@javad.ru, gcc-bugs@gcc.gnu.org,
   nobody@gcc.gnu.org, gcc-prs@gcc.gnu.org
Cc: Andrew Pinski <pinskia@physics.uc.edu>
Subject: Re: optimization/9566: Inline function produces much worse code than manual inlining.
Date: Mon, 14 Apr 2003 19:35:58 -0400

 http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view%20audit-trail&database=gcc&pr=9566

 The problem is that inlining introduces a temporary variable, and that
 temporary causes the struct to be spilled to the stack.

 struct A {
    char const* src;
    char* dest;
    void copy() { *++dest = *++src; }
 };

 void g1() {
    A a;
    for(int i = 0; i < 10; ++i)
      a.copy();
 }

 void g2() {
    A a;
    for(int i = 0; i < 10; ++i)
        *++a.dest = *++a.src;
 }

 void g3() {
    A a;
    for(int i = 0; i < 10; ++i)
    {
      struct A b = a;
      {
        *++b.dest = *++b.src;
      }
      a = b;
    }
 }

 Note that g3 and g1 produce the same assembly on PPC.

 Thanks,
 Andrew Pinski


