From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 11098 invoked by alias); 19 Aug 2004 08:07:21 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 11083 invoked by uid 48); 19 Aug 2004 08:07:20 -0000
Date: Thu, 19 Aug 2004 08:07:00 -0000
Message-ID: <20040819080720.11074.qmail@sourceware.org>
From: "uros at kss-loka dot si"
To: gcc-bugs@gcc.gnu.org
In-Reply-To: <20040517130734.15492.uros@kss-loka.si>
References: <20040517130734.15492.uros@kss-loka.si>
Reply-To: gcc-bugzilla@gcc.gnu.org
Subject: [Bug target/15492] floating-point arguments are loaded too early to x87 stack
X-Bugzilla-Reason: CC
X-SW-Source: 2004-08/txt/msg01902.txt.bz2
List-Id:

------- Additional Comments From uros at kss-loka dot si 2004-08-19 08:07 -------
According to "How to optimize for the Pentium family of microprocessors" by
Agner Fog, "fld r/m32/m64" consumes one clock cycle on P1, PMMX, PPRO, P2, P3
and P4, and "fld m80" consumes three cycles on P1, PMMX and P4 and two cycles
on PPRO, P2 and P3.

This means that the code in all examples will be faster, because fewer fxch
instructions are needed. Also, more of the fp stack could be used to store
temporary variables if arguments are loaded from the stack when they are
needed, instead of having their values copied between fp stack registers.

It is an open question whether it is worth special-casing "fld m80"
instructions to use fp register copies instead of memory loads. Again, a lot
of fxch instructions would be needed, and fp stack space could be wasted on
register copies.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15492
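
For reference, the cycle counts quoted in the comment above can be written down as a small lookup. This is only a sketch in C: the names (enum cpu, enum fld_operand, fld_cycles, total_load_cycles) are hypothetical and not part of GCC, and it models nothing but the "fld" figures cited from Agner Fog's manual. The cost of fxch and of register-to-register copies is not given in the comment, so it is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* CPU families named in the comment (hypothetical enum, for illustration). */
enum cpu { CPU_P1, CPU_PMMX, CPU_PPRO, CPU_P2, CPU_P3, CPU_P4, CPU_COUNT };

/* Memory operand widths of "fld" that the comment distinguishes. */
enum fld_operand { FLD_M32, FLD_M64, FLD_M80 };

/* Cycles for one "fld" from memory, using only the figures quoted above. */
int fld_cycles(enum cpu cpu, enum fld_operand op)
{
    assert((int)cpu >= 0 && (int)cpu < CPU_COUNT);

    if (op != FLD_M80)
        return 1;               /* fld m32/m64: one cycle on every listed CPU */

    switch (cpu) {
    case CPU_PPRO:
    case CPU_P2:
    case CPU_P3:
        return 2;               /* fld m80 on PPRO, P2, P3 */
    default:
        return 3;               /* fld m80 on P1, PMMX, P4 */
    }
}

/*
 * Hypothetical helper: total "fld" cost when every argument is loaded
 * from memory exactly once, at the point where it is needed.
 */
int total_load_cycles(enum cpu cpu, const enum fld_operand *args, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; i++)
        total += fld_cycles(cpu, args[i]);
    return total;
}
```

Under these figures, a function taking two doubles and one long double would spend 1 + 1 + 2 = 4 cycles on argument loads on a PPRO and 1 + 1 + 3 = 5 on a P1, which is why only the "fld m80" case raises the question of using register copies instead.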