From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 23589 invoked by alias); 16 Feb 2005 19:17:57 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 23408 invoked by uid 48); 16 Feb 2005 19:17:38 -0000
Date: Wed, 16 Feb 2005 23:47:00 -0000
Message-ID: <20050216191738.23407.qmail@sourceware.org>
From: "roger at eyesopen dot com"
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20050215225352.19988.stevenj@fftw.org>
References: <20050215225352.19988.stevenj@fftw.org>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug middle-end/19988] [4.0 Regression] pessimizes fp multiply-add/subtract combo
X-Bugzilla-Reason: CC
X-SW-Source: 2005-02/txt/msg01835.txt.bz2
List-Id:

------- Additional Comments From roger at eyesopen dot com 2005-02-16 19:17 -------
Hmm. I don't think the problem in this case is at the tree-level, where I think keeping X-(Y*C) and -(Y*C) as a more canonical X + (Y*C') and Y*C' should help with reassociation and other tree-ssa optimizations. Indeed, it's these types of transformations that have enabled the use of fmadd on the PowerPC for mainline.

The regression, however, comes from the (rare) interaction when a floating point constant and its negative now both need to be stored in the constant pool. It's only when X and -X are required in a function (potentially in short succession) that this is a problem, and then only on machines that need to load floating point constants from memory (AVR and other platforms with immediate floating point constants, for example, are unaffected). Some aspects of keeping X and -X in the constant pool were addressed by my patch quoted in comment #1, which attempts to keep floating point constants positive *when* this doesn't interfere with GCC's other optimizations.
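
To make the canonicalization above concrete, here is a minimal C sketch. It is only an illustration, not GCC's implementation: the constant C, the function names and the test values are all made up for this example. It shows X-(Y*C) next to its canonical form X + (Y*C'), with C' = -C, and a function that ends up needing both C and -C, which is the case where two constant pool entries may be required on targets that load floating point constants from memory.

```c
#include <assert.h>

/* Hypothetical constant, chosen only for illustration. */
#define C 0.5

/* As written in the source: X - (Y*C). */
static double sub_form(double x, double y) { return x - y * C; }

/* Canonical form: X + (Y*C'), where C' = -C. */
static double add_form(double x, double y) { return x + y * (-C); }

/* -(Y*C) and its canonical form Y*C'. */
static double neg_form(double y)   { return -(y * C); }
static double negc_form(double y)  { return y * (-C); }

/* After canonicalization this function needs both C and -C, so a target
   without immediate floating point constants may load two pool entries. */
static double uses_both(double x, double y)
{
    return (x + y * C) * (x + y * (-C));
}

/* Checks use values whose products and sums are exactly representable,
   so the results do not depend on whether multiply-adds are fused. */
static void check_examples(void)
{
    assert(sub_form(10.0, 3.0) == add_form(10.0, 3.0));
    assert(sub_form(10.0, 3.0) == 8.5);
    assert(neg_form(3.0) == negc_form(3.0));
    assert(uses_both(10.0, 3.0) == 97.75);
}
```

Calling check_examples() from a test driver exercises all four forms. Whether a given target really emits two constant pool loads for uses_both depends on the back end and the optimization level, so treat that comment as an assumption to verify against the generated assembly.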
I think the correct solution to this regression is to improve CSE/GCSE to recognize that X*C can be synthesized from a previously available X*(-C) at the cost of a negation, which is presumably cheaper than a multiplication on most platforms. Indeed, there's probably a set of targets for which loading a positive constant from the constant pool and then negating it is cheaper than loading both a positive constant and a negative constant.

Unfortunately, I doubt whether it'll be possible to simultaneously address this performance regression without reintroducing the 3.x issue mentioned in the original "PS". I doubt that on many platforms two multiply-adds are much faster than a single floating point multiplication whose result is shared by two additions. Though again it might be possible to do something at the RTL level, especially if duplicating the multiplication is a win with -Os.

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19988