Message-Id: <200403151821.i2FILQ4J020367@scanner2.ics.uci.edu>
To: Stelios Xanthakis
Cc: Roger Sayle, gcc@gcc.gnu.org
Subject: Re: GCC viciously beaten by ICC in trig test!
From: Dan Nicolaescu
Date: Mon, 15 Mar 2004 18:22:00 -0000

Stelios Xanthakis writes:

> On Sun, 14 Mar 2004, Dan Nicolaescu wrote:
>
> > Roger Sayle writes:
> > > fsin
> > > fmul %st(0), %st
> >
> > Intel 8.0 (that was used in the original test) generates something
> > very different:

Please be careful when snipping, the essential part that you deleted
is this:

        call      __libm_sse2_sincos                    #7.15
                                # LOE ebp esi edi xmm0 xmm1
..B1.4:                         # Preds ..B1.1

i.e. ICC 8 generates a call to an SSE library function instead of using
the fsin instruction. Given that this changed from ICC 7 to ICC 8, the
library function is probably faster.

> >         mulsd     %xmm1, %xmm1                  #10.25
> >         mulsd     %xmm0, %xmm0                  #10.15
> >         addsd     %xmm1, %xmm0                  #10.25
> >         movsd     %xmm0, (%esp)                 #10.25
> >         fldl      (%esp)                        #10.25
> >
>
> Does --fpmath=sse fix this?
> Can the processor in question do sse for doubles?
>
> In my experience, "--fpmath=sse --fsingle-precision-constants"
> generates much faster code for a raytracer I have here.

See above.