From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 16753 invoked by alias); 27 Jun 2006 05:49:56 -0000
Received: (qmail 16723 invoked by uid 48); 27 Jun 2006 05:49:49 -0000
Date: Tue, 27 Jun 2006 06:05:00 -0000
Message-ID: <20060627054949.16722.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug target/27827] gcc 4 produces worse x87 code on all platforms than gcc 3
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "uros at kss-loka dot si"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2006-06/txt/msg02383.txt.bz2
List-Id:

------- Comment #22 from uros at kss-loka dot si  2006-06-27 05:49 -------
(In reply to comment #21)
> Note that you are running the opposite of my test case: SSE vs SSE rather than
> x87 vs x87.  This whole bug report is about x87 performance.  You can get more
> detail on why I want x87 in my messages above, particularly comment #11, but
> single precision is indeed the place where SSE cannot compete with the x87
> unit.  To see it, put the flags back the way I had them in the attachment, and
> you'll see that gcc 3 is much faster.
> Also, you should find in single

Hm, these are x87 results:

/usr/local.uros/gcc34/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -DTYPE=float -c mmbench.c
/usr/local.uros/gcc34/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -c sgemm_atlas.c
/usr/local.uros/gcc34/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -o xsmm_gcc mmbench.o sgemm_atlas.o
rm -f *.o
/usr/local.uros/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -DTYPE=float -c mmbench.c
/usr/local.uros/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -c sgemm_atlas.c
/usr/local.uros/bin/gcc -DREPS=1000 -fomit-frame-pointer -O -o xsmm_gc4 mmbench.o sgemm_atlas.o
rm -f *.o
echo "GCC 3.x single performance:"
GCC 3.x single performance:
./xsmm_gcc
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========
  atlasmm     60   1000       0.141     3072.00

echo "GCC 4.x single performance:"
GCC 4.x single performance:
./xsmm_gc4
ALGORITHM     NB   REPS        TIME      MFLOPS
=========  =====  =====  ==========  ==========
  atlasmm     60   1000       0.143     3029.92


-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827