From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 1163 invoked by alias); 11 Jan 2013 22:46:45 -0000
Received: (qmail 32319 invoked by uid 48); 11 Jan 2013 22:46:10 -0000
From: "vhaisman at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/55952] New: x86 FPU, unnecessary fxch instruction
Date: Fri, 11 Jan 2013 22:46:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: vhaisman at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2013-01/txt/msg01070.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55952

             Bug #: 55952
           Summary: x86 FPU, unnecessary fxch instruction
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: vhaisman@gmail.com


I am wondering whether some improvements could be made in the generation
of x86 FPU code. Here is a simple sign function:

float signf4(float x)
{
    return x < 0.0f ? -1.0f : 1.0f;
}

It generates the following assembler code (GCC 4.7.2, g++ -m32 -O3
-fverbose-asm -save-temps -g3 -ggdb -march=native):

_Z6signf4f:
.LFB84:
    .loc 1 27 0
    .cfi_startproc
.LVL4:
    .loc 1 30 0
    fld1
    fldz
    flds    4(%esp)    # x
    fxch    %st(1)    # ??? Why?
    fucomip    %st(1), %st    #,
    ffreep    %st(0)    #
    fld1
    fchs
    fcmovbe    %st(1), %st    #,,
    fstp    %st(1)    #
    .loc 1 31 0
    ret

I am wondering why the fxch instruction is necessary and why the code is
not instead like this:

_Z6signf4f:
.LFB84:
    .loc 1 27 0
    .cfi_startproc
.LVL4:
    .loc 1 30 0
    fld1
    flds    4(%esp)    # ??? Load the parameter before the zero.
    fldz    # ??? to avoid the fxch instruction.
    fucomip    %st(1), %st    #,
[...]
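
To make the claim about load order concrete, here is a small toy model of the x87 register stack. It is only a sketch for illustration, not GCC code and not a faithful FPU emulation: the names X87Stack, emitted_order and proposed_order are made up here, and the model tracks nothing but the order of values on the stack (no condition flags, tag word, or extended precision). Under those assumptions it shows that both sequences leave the stack in the same state just before fucomip.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy model: regs.back() plays the role of %st(0),
// the element before it is %st(1), and so on.
struct X87Stack {
    std::vector<float> regs;

    // Models fld1 / fldz / flds: push a value, which becomes the new %st(0).
    void fld(float v) { regs.push_back(v); }

    // Models fxch %st(i): exchange %st(0) with %st(i).
    void fxch(std::size_t i) {
        assert(i < regs.size());
        const std::size_t top = regs.size() - 1;
        std::swap(regs[top], regs[top - i]);
    }
};

// Load sequence from the GCC 4.7.2 output, up to but not including fucomip.
std::vector<float> emitted_order(float x) {
    X87Stack s;
    s.fld(1.0f);  // fld1
    s.fld(0.0f);  // fldz
    s.fld(x);     // flds 4(%esp)
    s.fxch(1);    // fxch %st(1)
    return s.regs;
}

// Load sequence suggested in the report: parameter first, then zero.
std::vector<float> proposed_order(float x) {
    X87Stack s;
    s.fld(1.0f);  // fld1
    s.fld(x);     // flds 4(%esp)
    s.fld(0.0f);  // fldz
    return s.regs;
}
```

In this model, emitted_order(x) and proposed_order(x) compare equal for any ordinary (non-NaN) x, with 0.0 in %st(0), x in %st(1) and 1.0 in %st(2). That suggests the fxch only compensates for the order in which the values were loaded.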