From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 8295 invoked by alias); 15 Dec 2002 20:26:01 -0000
Mailing-List: contact gcc-prs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-prs-owner@gcc.gnu.org
Received: (qmail 8276 invoked by uid 71); 15 Dec 2002 20:26:01 -0000
Resent-Date: 15 Dec 2002 20:26:01 -0000
Resent-Message-ID: <20021215202601.8275.qmail@sources.redhat.com>
Resent-From: gcc-gnats@gcc.gnu.org (GNATS Filer)
Resent-Cc: gcc-prs@gcc.gnu.org, gcc-bugs@gcc.gnu.org
Resent-Reply-To: gcc-gnats@gcc.gnu.org, martin@xemacs.org
Received: (qmail 6303 invoked by uid 61); 15 Dec 2002 20:20:07 -0000
Message-Id: <20021215202007.6302.qmail@sources.redhat.com>
Date: Sun, 15 Dec 2002 12:26:00 -0000
From: martin@xemacs.org
Reply-To: martin@xemacs.org
To: gcc-gnats@gcc.gnu.org
X-Send-Pr-Version: gnatsweb-2.9.3 (1.1.1.1.2.31)
Subject: optimization/8952: failure to optimize away trivial C++ object creation
X-SW-Source: 2002-12/txt/msg00820.txt.bz2
List-Id:

>Number:         8952
>Category:       optimization
>Synopsis:       failure to optimize away trivial C++ object creation
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    unassigned
>State:          open
>Class:          pessimizes-code
>Submitter-Id:   net
>Arrival-Date:   Sun Dec 15 12:26:01 PST 2002
>Closed-Date:
>Last-Modified:
>Originator:     martin@xemacs.org
>Release:        gcc-3.2.1
>Organization:
>Environment:
x86 Linux
>Description:
g++ 3.2.1 x86 fails to perform some easy optimizations.
Here I look at "class literal constant folding".
This might be an important part of the remaining "abstraction penalty".

Consider this most simple and optimizer-friendly class with 2 data members:

class Complex
{
private:
  const int real_;
  const int imag_;
public:
  inline Complex (int real, int imag) : real_ (real), imag_ (imag) {}
  inline int Real() const { return real_; }
  inline int Imag() const { return imag_; }
  inline friend Complex operator+ (Complex z1, Complex z2)
  { return Complex (z1.real_ + z2.real_, z1.imag_ + z2.imag_); }
};

If we now use "Complex Literals" like Complex(3,4), the compiler
should be able to do the obvious optimizations like constant-folding
just as with builtin types.

Now let's look at the x86 assembly code for two functions:

Complex foo () { return Complex(9,11); }

==> generates obvious optimal code:

        movl    4(%esp), %eax
        movl    $9, (%eax)
        movl    $11, 4(%eax)
        ret     $4

On the other hand,

Complex bar () { return Complex(1,2) + Complex(8,9); }

==> generates suboptimal code:

        subl    $20, %esp
        movl    24(%esp), %eax
        movl    $1, 8(%esp)
        movl    $8, (%esp)
        movl    $2, 12(%esp)
        movl    $9, 4(%esp)
        movl    $9, (%eax)
        movl    $11, 4(%eax)
        addl    $20, %esp
        ret     $4

The two functions should generate identical code.

There seem to be actually two simple optimizer bugs here:

- The addition operand objects above are created, but never used
  (since the result is computed at compile time).  So the code to
  generate them can simply be discarded (the constructors have no
  side effects).
- There seems to be no need to adjust %esp, since this is a leaf
  function.

Note that these bugs are sufficiently simple that I might be able to
write an easy optimizer pass as a postprocessor on the .s files.  But
the gcc maintainers should fix the deeper problems.  Stores to
never-used stack slots should be easy to optimize away.

Could this be part of the reason Intel C++ dramatically outperforms
g++ on Scott Ladd's "Complex" benchmark?

The analogous problem does NOT appear to occur with classes containing
only one data member.
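
For comparison, a one-member class of the kind mentioned in the last sentence might look like the sketch below. The names Scalar, foo1, bar1 and check are hypothetical and do not appear in the original report; the report gives no source or assembly for the one-member case, so this only illustrates the shape of such a test, not what was actually compiled.

```cpp
#include <cassert>

// Hypothetical one-member counterpart of Complex (name is an assumption).
class Scalar
{
private:
  const int value_;
public:
  inline Scalar (int value) : value_ (value) {}
  inline int Value() const { return value_; }
  inline friend Scalar operator+ (Scalar s1, Scalar s2)
  { return Scalar (s1.value_ + s2.value_); }
};

// Analogue of foo(): returns a "literal" directly.
Scalar foo1 () { return Scalar(9); }

// Analogue of bar(): the sum of two "literals" that folds to the same value.
Scalar bar1 () { return Scalar(1) + Scalar(8); }

// Both functions must at least agree on the value they return.
inline void check () { assert (foo1().Value() == bar1().Value()); }
```

Compiling this with the same flags as above and comparing the code for foo1 and bar1 in the .s file would show whether the temporaries are eliminated in the one-member case.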

Details:
g++ 3.2.1 on Linux x86; g++ -Wall -S -O3 -fomit-frame-pointer

Disclaimer: The only assembly code I've ever written was IBM System/370.
>How-To-Repeat:
Compile the following on x86 Linux using:

g++ -S -O3

then examine the .s file generated.

class Complex
{
private:
  const int real_;
  const int imag_;
public:
  inline Complex (int real, int imag) : real_ (real), imag_ (imag) {}
  inline int Real() const { return real_; }
  inline int Imag() const { return imag_; }
  inline friend Complex operator+ (Complex z1, Complex z2)
  { return Complex (z1.real_ + z2.real_, z1.imag_ + z2.imag_); }
};

Complex foo () { return Complex(9,11); }

Complex bar () { return Complex(1,2) + Complex(8,9); }
>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted: