From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-return-92561-listarch-gcc=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 30339 invoked by alias); 15 Mar 2004 01:55:14 -0000
Mailing-List: contact gcc-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive: <http://gcc.gnu.org/ml/gcc/>
List-Post: <mailto:gcc@gcc.gnu.org>
List-Help: <http://gcc.gnu.org/ml/>
Sender: gcc-owner@gcc.gnu.org
Received: (qmail 30332 invoked from network); 15 Mar 2004 01:55:12 -0000
Received: from unknown (HELO www.eyesopen.COM) (12.96.199.11)
  by sources.redhat.com with SMTP; 15 Mar 2004 01:55:12 -0000
Received: from localhost (roger@localhost)
	by www.eyesopen.COM (8.11.6/8.11.6) with ESMTP id i2F0fQN14232;
	Sun, 14 Mar 2004 17:41:27 -0700
Date: Mon, 15 Mar 2004 01:55:00 -0000
From: Roger Sayle <roger@eyesopen.com>
To: Scott Robert Ladd <coyote@coyotegulch.com>
cc: gcc@gcc.gnu.org
Subject: Re: GCC viciously beaten by ICC in trig test!
In-Reply-To: <4054ED19.8020009@coyotegulch.com>
Message-ID: <Pine.LNX.4.44.0403141713460.12909-100000@www.eyesopen.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-SW-Source: 2004-03/txt/msg00642.txt.bz2


On Sun, 14 Mar 2004, Scott Robert Ladd wrote:
> Consider the following program, compiled and run on a Pentium 4
> (Northwood) system:
>
>      #include <math.h>

For a number of benchmarks, this first line of source code alone is
enough for GCC to lose the race against Intel when compiling on Linux.

Consider the following:

	#include <math.h>

	double foo(double a)
	{
	  return sin(a) * sin(a);
	}


Compiling with gcc -O2 -ffast-math on Linux generates x86 code that's
significantly slower than Intel's compiler output.  However, commenting
out the "#include <math.h>" corrects the situation and GCC can then
generate *exactly* the same sequence as icc.


The issue is that glibc's headers provide inline implementations for sin
and cos, and thereby override all of GCC's internal builtin processing.
Once this is done, there's nothing tree-ssa, the middle-end or the i386
backend can do to improve the code.  If GCC is to have a hope of using
"sincos" or SSE2-specific instruction sequences, the "best intentions" of
glibc's headers will have to be neutralized first.  Perhaps fixincludes :>


For the interested, with "#include <math.h>" GCC 3.3.3 generates:

foo:    fldl    4(%esp)
        fld     %st(0)
#APP
        fsin
#NO_APP
        fxch    %st(1)
#APP
        fsin
#NO_APP
        fmulp   %st, %st(1)
        ret

Without it, the same options ("-O2 -ffast-math -fomit-frame-pointer")
produce output identical to that of Intel v7.0 (and presumably later):

foo:    fldl    4(%esp)
        fsin
        fmul    %st(0), %st
        ret


Just another data point.  Avoiding <math.h> may improve your performance
and influence the results of your "command line option" experiments.

Roger
--