From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-131155-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 23589 invoked by alias); 16 Feb 2005 19:17:57 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 23408 invoked by uid 48); 16 Feb 2005 19:17:38 -0000
Date: Wed, 16 Feb 2005 23:47:00 -0000
Message-ID: <20050216191738.23407.qmail@sourceware.org>
From: "roger at eyesopen dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20050215225352.19988.stevenj@fftw.org>
References: <20050215225352.19988.stevenj@fftw.org>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug middle-end/19988] [4.0 Regression] pessimizes fp multiply-add/subtract combo
X-Bugzilla-Reason: CC
X-SW-Source: 2005-02/txt/msg01835.txt.bz2
List-Id: <gcc-bugs.sourceware.org>


------- Additional Comments From roger at eyesopen dot com  2005-02-16 19:17 -------
Hmm.  I don't think the problem in this case is at the tree-level, where I think
keeping X-(Y*C) and -(Y*C) as a more canonical X + (Y*C') and Y*C' should help
with reassociation and other tree-ssa optimizations.  Indeed, it's these types
of transformations that have enabled the use of fmadd on the PowerPC for mainline.
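
As a rough source-level illustration of the canonicalization described above
(the function names and the constant 2.5 are hypothetical, chosen only for
this sketch, and are not taken from GCC or from the bug report), the two
forms compute the same value:

```c
/* Illustrative sketch only: hypothetical names, not GCC internals. */

/* Form as written in the source: X - (Y*C) with C = 2.5. */
static double sub_form(double x, double y)
{
    return x - (y * 2.5);
}

/* Canonical form: X + (Y*C') with the negated constant C' = -2.5. */
static double add_form(double x, double y)
{
    return x + (y * -2.5);
}
```

The second form exposes an addition of a product, which is the shape a
multiply-add pattern expects.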

The regression, however, comes from the (rare) interaction when a floating
point constant and its negative both need to be stored in the constant pool.
It's only when X and -X are required in a function (potentially in short
succession) that this is a problem, and then only on machines that need to
load floating point constants from memory (AVR and other platforms with
immediate floating point constants, for example, are unaffected).

Some aspects of keeping X and -X in the constant pool were addressed by my
patch quoted in comment #1, which attempts to keep floating point constants
positive *when* this doesn't interfere with GCC's other optimizations.

I think the correct solution to this regression is to improve CSE/GCSE to
recognize that X*C can be synthesized from a previously available X*(-C) at
the cost of a negation, which is presumably cheaper than a multiplication on
most platforms.  Indeed, there's probably a set of targets for which loading
a positive constant from the constant pool and then negating it is cheaper
than loading both the positive and the negative constant.
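
A minimal sketch of that idea at the source level, assuming a hypothetical
function that needs both X*C and X*(-C) (the name scale_both and the constant
0.75 are made up for illustration; this is not what CSE/GCSE currently does):

```c
/* Illustrative sketch only: hypothetical function, not GCC code. */
static void scale_both(double x, double *pos, double *neg)
{
    /* Only the negative constant is loaded and multiplied. */
    double t = x * -0.75;

    *neg = t;   /* X * (-C), computed directly */
    *pos = -t;  /* X * C, synthesized by a negation instead of a second
                   constant load and a second multiplication */
}
```

Negation only flips the sign bit, so under the default round-to-nearest
mode -t should be the same value as x * 0.75.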

Unfortunately, I doubt whether it'll be possible to simultaneously address
this performance regression without reintroducing the 3.x issue mentioned in
the original "PS".  I doubt that on many platforms two multiply-adds are much
faster than a single floating point multiplication whose result is shared by
two additions.  Though again it might be possible to do something at the RTL
level, especially if duplicating the multiplication is a win with -Os.
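
To make the trade-off concrete, here is a hedged sketch of the two shapes
being compared.  The function names and the constant 1.5 are hypothetical,
and whether a compiler actually emits multiply-add instructions for the
second form depends on the target and flags:

```c
/* Illustrative sketch only: hypothetical names, not from the bug report. */

/* Shared product: one multiplication whose result feeds two additions. */
static void shared_mul(double a, double b, double y, double *r1, double *r2)
{
    double t = y * 1.5;

    *r1 = a + t;
    *r2 = b - t;
}

/* Duplicated product: each result is a separate multiply-add candidate,
   and both 1.5 and -1.5 are needed as constants. */
static void two_madds(double a, double b, double y, double *r1, double *r2)
{
    *r1 = a + y * 1.5;
    *r2 = b + y * -1.5;
}
```

The first form does one multiplication and needs one constant; the second
does the multiplication twice but may map onto two fused instructions.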


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19988