public inbox for gcc-bugs@sourceware.org
* [Bug c/21550] New: i686 floating point performance 33% slower than gcc 3.4.3
@ 2005-05-13 15:22 trt at acm dot org
  2005-05-13 18:03 ` [Bug tree-optimization/21550] [4.0/4.1 Regression] " pinskia at gcc dot gnu dot org
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: trt at acm dot org @ 2005-05-13 15:22 UTC
  To: gcc-bugs

gcc 4.0.0 generates slower code than gcc 3.4.3 for the BLAS "axpy" operation.
(This is no doubt specific to IA32, and perhaps also to the processor version.)
The program is below, here are the timing results:

                     gcc 3.4.3   gcc 4.0.0
Method               cpu secs    cpu secs
z[]=x[]+alpha*y[]    1.45        1.72
z[]=z[]+alpha*y[]    1.47        2.03
z[]=z[]+y[]          1.44        1.57

The second method is a common special case of the first,
so it is unfortunate that gcc 4 does poorly on it.

========
The program is in two files to defeat inlining: rzvaxpy.c and zvaxpy.c
and here is the script I used to compile/run them:

for m in METH1 METH2 METH3
do
    for cc in gcc343 gcc400
    do
        $cc -march=i686 -O3 -D$m rzvaxpy.c zvaxpy.c
        echo $cc $m `(time a.out)2>&1`
    done
done

==== zvaxpy.c
void zvaxpy(double *z, double *x, double *y, int n, double alpha)
{
    int i;

#if defined(METH1)
    for (i = 0; i < n; i++)
        z[i] = x[i] + alpha * y[i];
#elif defined(METH2)
    for (i = 0; i < n; i++)
        z[i] = z[i] + alpha * y[i];
#else
    for (i = 0; i < n; i++)
        z[i] = z[i] + y[i];
#endif
}

==== rzvaxpy.c
#include <stdio.h>

#define N 100
#define NITER ((300*1000*1000)/N)

double a[100], b[100];

extern void zvaxpy(double *, double *, double *, int, double);

int main()
{
    int i;
    double sum;

    for (i = 0; i < 100; i++) {
        a[i] = 0;
        b[i] = 1;
    }
    for (i = 0; i < NITER; i++)
        zvaxpy(a,a, b, N, 1.1);
    sum = 0;
    for (i = 0; i < N; i++)
        sum += a[i];
    printf("sum %g\n", sum);
    return 0;
}

-- 
           Summary: i686 floating point performance 33% slower than gcc 3.4.3
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: trt at acm dot org
                CC: gcc-bugs at gcc dot gnu dot org
GCC target triplet: i686-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21550
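The report says the second method is a special case of the first. In fact the driver calls zvaxpy(a,a, b, N, 1.1), so with METH1 the x argument aliases z and both methods compute the same values. A minimal sketch of that relationship follows; the function names axpy_general and axpy_inplace are hypothetical and do not appear in the report, and the const qualifiers are an addition for illustration.

```c
/* Hypothetical name: general form z = x + alpha*y (METH1 in the report). */
void axpy_general(double *z, const double *x, const double *y,
                  int n, double alpha)
{
    int i;
    for (i = 0; i < n; i++)
        z[i] = x[i] + alpha * y[i];
}

/* Hypothetical name: in-place form z = z + alpha*y (METH2 in the report).
   Equivalent to axpy_general(z, z, y, n, alpha). */
void axpy_inplace(double *z, const double *y, int n, double alpha)
{
    int i;
    for (i = 0; i < n; i++)
        z[i] = z[i] + alpha * y[i];
}
```

Calling axpy_general with the same array for z and x is well defined here because neither pointer is declared restrict. It also means the compiler has to assume the two may overlap when it compiles the general form.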
* [Bug tree-optimization/21550] [4.0/4.1 Regression] i686 floating point performance 33% slower than gcc 3.4.3
  2005-05-13 15:22 [Bug c/21550] New: i686 floating point performance 33% slower than gcc 3.4.3 trt at acm dot org
@ 2005-05-13 18:03 ` pinskia at gcc dot gnu dot org
  2005-07-08  1:35 ` mmitchel at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-05-13 18:03 UTC
  To: gcc-bugs

------- Additional Comments From pinskia at gcc dot gnu dot org  2005-05-13 18:03 -------
I think this basically goes back to the correct selection of IVs (induction
variables) and i386 addressing modes (a*4+b and such); there are other bugs
open about that already.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|c                           |tree-optimization
           Keywords|                            |missed-optimization
            Summary|i686 floating point         |[4.0/4.1 Regression] i686
                   |performance 33% slower than |floating point performance
                   |gcc 3.4.3                   |33% slower than gcc 3.4.3
   Target Milestone|---                         |4.0.1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21550
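To illustrate what the comment above means by IV selection, here are two source-level shapes of the same loop. This is only a sketch: the function names are hypothetical, and the compiler is free to turn either form into the other, so the source form does not determine the generated code. The first form keeps a single integer induction variable and relies on a scaled addressing mode (base + index*8 for doubles); the second keeps one pointer induction variable per array.

```c
/* Hypothetical name: one integer IV, each access is base + i*8. */
void axpy_indexed(double *z, const double *y, int n, double alpha)
{
    int i;
    for (i = 0; i < n; i++)
        z[i] = z[i] + alpha * y[i];
}

/* Hypothetical name: two pointer IVs, no scaled index needed. */
void axpy_pointers(double *z, const double *y, int n, double alpha)
{
    double *zend;

    if (n <= 0)          /* keep z + n inside the array */
        return;
    zend = z + n;
    while (z != zend) {
        *z = *z + alpha * *y;
        z++;
        y++;
    }
}
```

On a register-starved target such as i386, the indexed form can get by with fewer live registers when the scaled addressing mode is used, which is presumably why the choice matters for this loop.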
* [Bug tree-optimization/21550] [4.0/4.1 Regression] i686 floating point performance 33% slower than gcc 3.4.3
  2005-05-13 15:22 [Bug c/21550] New: i686 floating point performance 33% slower than gcc 3.4.3 trt at acm dot org
  2005-05-13 18:03 ` [Bug tree-optimization/21550] [4.0/4.1 Regression] " pinskia at gcc dot gnu dot org
@ 2005-07-08  1:35 ` mmitchel at gcc dot gnu dot org
  2005-09-27 15:57 ` mmitchel at gcc dot gnu dot org
  2005-09-29  3:28 ` pinskia at gcc dot gnu dot org
  3 siblings, 0 replies; 6+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-07-08  1:35 UTC
  To: gcc-bugs

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.0.1                       |4.0.2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21550
* [Bug tree-optimization/21550] [4.0/4.1 Regression] i686 floating point performance 33% slower than gcc 3.4.3
  2005-05-13 15:22 [Bug c/21550] New: i686 floating point performance 33% slower than gcc 3.4.3 trt at acm dot org
  2005-05-13 18:03 ` [Bug tree-optimization/21550] [4.0/4.1 Regression] " pinskia at gcc dot gnu dot org
  2005-07-08  1:35 ` mmitchel at gcc dot gnu dot org
@ 2005-09-27 15:57 ` mmitchel at gcc dot gnu dot org
  2005-09-29  3:28 ` pinskia at gcc dot gnu dot org
  3 siblings, 0 replies; 6+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-09-27 15:57 UTC
  To: gcc-bugs

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.0.2                       |4.0.3

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21550
* [Bug tree-optimization/21550] [4.0/4.1 Regression] i686 floating point performance 33% slower than gcc 3.4.3
  2005-05-13 15:22 [Bug c/21550] New: i686 floating point performance 33% slower than gcc 3.4.3 trt at acm dot org
                   ` (2 preceding siblings ...)
  2005-09-27 15:57 ` mmitchel at gcc dot gnu dot org
@ 2005-09-29  3:28 ` pinskia at gcc dot gnu dot org
  3 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-09-29  3:28 UTC
  To: gcc-bugs

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
 GCC target triplet|i686-pc-linux-gnu           |i686-*-*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21550
[parent not found: <bug-21550-4397@http.gcc.gnu.org/bugzilla/>]
* [Bug tree-optimization/21550] [4.0/4.1 Regression] i686 floating point performance 33% slower than gcc 3.4.3
       [not found] <bug-21550-4397@http.gcc.gnu.org/bugzilla/>
@ 2005-10-16 22:25 ` pinskia at gcc dot gnu dot org
  0 siblings, 0 replies; 6+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-10-16 22:25 UTC
  To: gcc-bugs

------- Comment #2 from pinskia at gcc dot gnu dot org  2005-10-16 22:25 -------
This has been fixed in 4.1.0.
We now get:
.L4:
        fldl    (%edx,%eax,8)
        faddl   (%ebx,%eax,8)
        fstpl   (%edx,%eax,8)
        incl    %eax
        cmpl    %eax, %ecx
        jne     .L4

Likewise for all methods.

-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|4.0.3                       |4.1.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21550
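For readers less used to x87 assembly, the loop quoted in the comment above can be read back into C roughly as follows. This is a sketch under assumptions: the quoted loop has no multiply, so it appears to be the third method (z[i] = z[i] + y[i]); the register roles (%edx = z, %ebx = y, %eax = i, %ecx = n) are inferred from the operands, not stated in the thread; the function name and the n <= 0 guard are hypothetical, since the code before .L4 is not shown.

```c
/* Hypothetical name: C rendering of the quoted .L4 loop. */
void add_loop(double *z, const double *y, int n)
{
    int i = 0;
    double t;

    if (n <= 0)           /* assumed guard; not part of the quoted asm */
        return;
    do {
        t = z[i];         /* fldl  (%edx,%eax,8)  load z[i]        */
        t += y[i];        /* faddl (%ebx,%eax,8)  add y[i]         */
        z[i] = t;         /* fstpl (%edx,%eax,8)  store to z[i]    */
        i++;              /* incl  %eax                            */
    } while (i != n);     /* cmpl  %eax, %ecx ; jne .L4            */
}
```

The point of the listing is that a single counter serves as the index for both arrays through the scaled addressing mode (index*8), so the loop body needs no separate pointer increments.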
end of thread

Thread overview: 6+ messages
2005-05-13 15:22 [Bug c/21550] New: i686 floating point performance 33% slower than gcc 3.4.3 trt at acm dot org
2005-05-13 18:03 ` [Bug tree-optimization/21550] [4.0/4.1 Regression] " pinskia at gcc dot gnu dot org
2005-07-08  1:35 ` mmitchel at gcc dot gnu dot org
2005-09-27 15:57 ` mmitchel at gcc dot gnu dot org
2005-09-29  3:28 ` pinskia at gcc dot gnu dot org
     [not found] <bug-21550-4397@http.gcc.gnu.org/bugzilla/>
2005-10-16 22:25 ` pinskia at gcc dot gnu dot org