From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 7293 invoked by alias); 19 Dec 2006 17:18:20 -0000
Received: (qmail 7221 invoked by uid 48); 19 Dec 2006 17:18:10 -0000
Date: Tue, 19 Dec 2006 17:18:00 -0000
Message-ID: <20061219171810.7220.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug target/30255] register spills in x87 unit need to be 80-bit, not 64
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "whaley at cs dot utsa dot edu"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2006-12/txt/msg01608.txt.bz2

------- Comment #10 from whaley at cs dot utsa dot edu 2006-12-19 17:18 -------
Guys,

In the interests of full disclosure, I did some quick timings on the
Core2Duo, and as I kind of suspected, scalar SSE crushed x87 there. I was
pretty sure scalar SSE could achieve 2 flop/cycle, while Intel kept the x87
at 1 flop/cycle, and that's what my timings show. So, it does appear likely
that the only people using the x87 in the future on the Intel will be people
who need the extra precision (and those people would really like this fix,
I will point out :). All other Intel archs (P4, PIII, etc.) do 1 flop/cycle
for both scalar SSE and x87.

On the AMDs, both x87 and scalar SSE can achieve 2 flop/cycle, with x87
running somewhat faster: only a slight advantage in double precision, and a
more commanding one in single. It looks like the next generation of AMDs
will increase the maximal flop rate of vector SSE, but it does not look like
they will increase the max flop rate of scalar SSE, so this may continue to
be the case going forward . . .

Cheers,
Clint

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30255