public inbox for gcc@gcc.gnu.org
* failure to optimize away trivial C++ object creation
@ 2002-12-13 15:49 Martin Buchholz
From: Martin Buchholz @ 2002-12-13 15:49 UTC
  To: gcc

g++ 3.2.1 x86 fails to perform some easy optimizations.  Here I look
at "class literal constant folding".  This might be an important part
of the remaining "abstraction penalty".

Consider this very simple, optimizer-friendly class with two data members:


    class Complex
    {
    private:
      const int real_;
      const int imag_;

    public:
      inline Complex (int real, int imag) : real_ (real), imag_ (imag) {}
      inline int Real() const { return real_; }
      inline int Imag() const { return imag_; }

      inline friend Complex operator+ (Complex z1, Complex z2)
      { return Complex (z1.real_ + z2.real_, z1.imag_ + z2.imag_); }
    };


If we now use "Complex Literals" like Complex(3,4), the compiler
should be able to do the obvious optimizations like constant-folding
just as with builtin types.

Now let's look at the x86 assembly code for two functions:

    Complex foo () { return Complex(9,11); }

==> generates the obviously optimal code:

	movl	4(%esp), %eax
	movl	$9, (%eax)
	movl	$11, 4(%eax)
	ret	$4

On the other hand,

    Complex bar () { return Complex(1,2) + Complex(8,9); }

==> generates suboptimal code:

	subl	$20, %esp
	movl	24(%esp), %eax
	movl	$1, 8(%esp)
	movl	$8, (%esp)
	movl	$2, 12(%esp)
	movl	$9, 4(%esp)
	movl	$9, (%eax)
	movl	$11, 4(%eax)
	addl	$20, %esp
	ret	$4
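
For anyone who wants to reproduce this, here is a sketch of a single translation unit containing the class and both functions from this message. SameValue is a hypothetical helper that is not part of the original test; it only confirms that foo and bar return the same value, which is why one would expect the same code for both.

```cpp
// The class and the two functions are copied from the message so that this
// file compiles on its own.
class Complex
{
private:
  const int real_;
  const int imag_;

public:
  inline Complex (int real, int imag) : real_ (real), imag_ (imag) {}
  inline int Real() const { return real_; }
  inline int Imag() const { return imag_; }

  inline friend Complex operator+ (Complex z1, Complex z2)
  { return Complex (z1.real_ + z2.real_, z1.imag_ + z2.imag_); }
};

Complex foo () { return Complex(9,11); }
Complex bar () { return Complex(1,2) + Complex(8,9); }

// Hypothetical helper (an assumption, not in the original message):
// compares two Complex values member by member.
bool SameValue (Complex a, Complex b)
{ return a.Real() == b.Real() && a.Imag() == b.Imag(); }
```

Compiling this file with the flags listed under "Details" should let the assembly for foo and bar be compared side by side.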


The two functions should generate identical code.  There seem to be
actually two simple optimizer bugs here:

- The addition operand objects above are created, but never used
  (since the result is computed at compile time).  So the code to
  generate them can simply be discarded (the constructors have no side
  effects).

- There seems to be no need to adjust %esp, since this is a leaf function.

Note that these bugs are sufficiently simple that I might be able to
write an easy optimizer pass as a postprocessor on the .s files.

But the gcc maintainers should fix the deeper problems.  Stores to
never-used stack slots should be easy to optimize away.
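
To make the postprocessor idea concrete, here is a toy sketch of such a pass over the lines of one function. It is not how gcc does or should do this. All names (Insn, StackSlot, DropDeadStackStores) are hypothetical, and it assumes the function has exactly the shape seen in bar: subl $N,%esp, then only 4-byte movl instructions, then addl $N,%esp and ret. Anything else is returned unchanged.

```cpp
#include <cstdlib>
#include <set>
#include <sstream>
#include <string>
#include <vector>

struct Insn { std::string op, src, dst; };

// If s is "N(%esp)" or "(%esp)", store N (0 if absent) in off.
static bool StackSlot(const std::string& s, long& off) {
  const std::string tail = "(%esp)";
  if (s.size() < tail.size() ||
      s.compare(s.size() - tail.size(), tail.size(), tail) != 0)
    return false;
  off = std::strtol(s.c_str(), nullptr, 10);
  return true;
}

// Split "\tmovl\t$1, 8(%esp)" into mnemonic and up to two operands.
static Insn Parse(const std::string& line) {
  Insn i;
  std::istringstream in(line);
  in >> i.op >> i.src >> i.dst;
  if (!i.src.empty() && i.src.back() == ',') i.src.pop_back();
  return i;
}

static std::string Print(const Insn& i) {
  std::string s = "\t" + i.op;
  if (!i.src.empty()) s += "\t" + i.src;
  if (!i.dst.empty()) s += ", " + i.dst;
  return s;
}

// True if a 4-byte read at some offset in reads overlaps a 4-byte store at off.
static bool IsRead(const std::set<long>& reads, long off) {
  for (long r : reads)
    if (r > off - 4 && r < off + 4) return true;
  return false;
}

std::vector<std::string> DropDeadStackStores(const std::vector<std::string>& lines) {
  std::vector<Insn> code;
  for (const std::string& l : lines) code.push_back(Parse(l));
  if (code.size() < 3) return lines;
  const Insn first = code.front(), last = code[code.size() - 2];
  if (first.op != "subl" || last.op != "addl" || first.dst != "%esp" ||
      last.dst != "%esp" || first.src != last.src || code.back().op != "ret" ||
      first.src.empty() || first.src[0] != '$')
    return lines;  // not the simple shape this toy understands
  const long frame = std::strtol(first.src.c_str() + 1, nullptr, 10);

  // Pass 1: collect every stack offset that is read; give up on anything
  // that is not a plain movl or that copies %esp itself.
  std::set<long> reads;
  for (size_t k = 1; k + 2 < code.size(); ++k) {
    const Insn& i = code[k];
    if (i.op != "movl" || i.src == "%esp" || i.dst == "%esp") return lines;
    long off = 0;
    if (StackSlot(i.src, off)) reads.insert(off);
  }

  // Pass 2: drop stores into the local frame that are never read.
  std::vector<Insn> kept;
  bool frame_used = false;
  for (size_t k = 1; k + 2 < code.size(); ++k) {
    const Insn& i = code[k];
    long off = 0;
    if (StackSlot(i.dst, off) && off >= 0 && off < frame && !IsRead(reads, off))
      continue;  // dead store
    if ((StackSlot(i.src, off) && off < frame) ||
        (StackSlot(i.dst, off) && off < frame))
      frame_used = true;
    kept.push_back(i);
  }

  // Pass 3: if the frame is now unused, drop the %esp adjustment and
  // rebase the remaining %esp-relative operands.
  std::vector<std::string> out;
  if (frame_used) out.push_back(Print(first));
  for (Insn i : kept) {
    long off = 0;
    if (!frame_used && StackSlot(i.src, off))
      i.src = std::to_string(off - frame) + "(%esp)";
    if (!frame_used && StackSlot(i.dst, off))
      i.dst = std::to_string(off - frame) + "(%esp)";
    out.push_back(Print(i));
  }
  if (frame_used) out.push_back(Print(last));
  out.push_back(Print(code.back()));
  return out;
}
```

Fed the ten lines generated for bar, this sketch removes the four stores and the %esp adjustment and rewrites 24(%esp) to 4(%esp), leaving the same four instructions as foo. A real pass would have to handle calls, branches, other operand sizes, and aliasing, which is why the proper fix belongs in the compiler.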

Could this be part of the reason Intel C++ dramatically outperforms
g++ on Scott Ladd's "Complex" benchmark?

The analogous problem does NOT appear to occur with classes containing
only one data member.
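
The one-member class used for that comparison is not shown here. A minimal sketch of what such a test case could look like, with the hypothetical names Wrapper and baz, is:

```cpp
// Hypothetical one-member analogue of Complex (an assumption; the class
// actually used for the comparison is not shown in this message).
class Wrapper
{
private:
  const int value_;

public:
  inline explicit Wrapper (int value) : value_ (value) {}
  inline int Value() const { return value_; }

  inline friend Wrapper operator+ (Wrapper w1, Wrapper w2)
  { return Wrapper (w1.value_ + w2.value_); }
};

// Analogue of bar(): the sum of two "literals" should fold to Wrapper(9).
Wrapper baz () { return Wrapper(1) + Wrapper(8); }
```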

Details: g++ 3.2.1 on Linux x86;  g++ -Wall -S -O3 -fomit-frame-pointer

Disclaimer:  The only assembly code I've ever written was IBM System/370.
