* [PATCH 1/4] Adjust benchtests to new support library.
@ 2016-12-19 14:04 Adhemerval Zanella
2016-12-19 14:04 ` [PATCH 2/4] benchtests: Add fmax/fmin benchmarks Adhemerval Zanella
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Adhemerval Zanella @ 2016-12-19 14:04 UTC
To: libc-alpha
This patch replaces the test-skeleton.c inclusion with
support/test-driver.c and makes minor adjustments to bench-string.h.
Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.
* benchtests/bench-string.h (TEST_FUNCTION): Use name without
parenthesis.
(CMDLINE_PROCESS): Define using function instead of macro.
* benchtests/bench-memccpy.c: Include <support/test-driver.c> instead
of test-skeleton.
* benchtests/bench-memchr.c: Likewise.
* benchtests/bench-memcmp.c: Likewise.
* benchtests/bench-memcpy-large.c: Likewise.
* benchtests/bench-memcpy.c: Likewise.
* benchtests/bench-memmem.c: Likewise.
* benchtests/bench-memmove-large.c: Likewise.
* benchtests/bench-memmove.c: Likewise.
* benchtests/bench-memset-large.c: Likewise.
* benchtests/bench-memset.c: Likewise.
* benchtests/bench-rawmemchr.c: Likewise.
* benchtests/bench-strcasecmp.c: Likewise.
* benchtests/bench-strcasestr.c: Likewise.
* benchtests/bench-strcat.c: Likewise.
* benchtests/bench-strchr.c: Likewise.
* benchtests/bench-strcmp.c: Likewise.
* benchtests/bench-strcpy.c: Likewise.
* benchtests/bench-strcpy_chk.c: Likewise.
* benchtests/bench-strlen.c: Likewise.
* benchtests/bench-strncasecmp.c: Likewise.
* benchtests/bench-strncmp.c: Likewise.
* benchtests/bench-strncpy.c: Likewise.
* benchtests/bench-strnlen.c: Likewise.
* benchtests/bench-strpbrk.c: Likewise.
* benchtests/bench-strrchr.c: Likewise.
* benchtests/bench-strsep.c: Likewise.
* benchtests/bench-strspn.c: Likewise.
* benchtests/bench-strstr.c: Likewise.
* benchtests/bench-strtok.c: Likewise.
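The two bench-string.h entries above can be illustrated with a standalone
sketch.  It is not the code from support/test-driver.c; the names
run_driver_hook and driver_test_function and the option value are
hypothetical, chosen only for illustration.  The point is that a driver
which stores callbacks needs a function name (test_main, not test_main ())
and a callable option handler rather than a run of case labels pasted into
its own switch.

```c
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical option code standing in for OPT_SEED.  */
enum { OPT_SEED = 1000 };

static unsigned long seed;
static int do_srandom;

/* Function form of the option handler: it owns its switch statement,
   so a driver can call it without knowing the individual cases.  */
static void
cmdline_process_function (int c)
{
  switch (c)
    {
    case OPT_SEED:
      /* optarg is the getopt global holding the option's argument.  */
      seed = strtoul (optarg, NULL, 0);
      do_srandom = 1;
      break;
    }
}

static int
test_main (void)
{
  return 0;
}

/* Without parentheses these expand to names that can be stored in
   function pointers; "test_main ()" would expand to a call instead.  */
#define TEST_FUNCTION test_main
#define CMDLINE_PROCESS cmdline_process_function

/* Hypothetical stand-ins for what a driver does with the two macros.  */
static int (*driver_test_function) (void) = TEST_FUNCTION;

static void
run_driver_hook (int c)
{
  CMDLINE_PROCESS (c);
}
```

A driver built this way only has to call run_driver_hook for each option
it does not recognise itself, and driver_test_function once to run the
benchmark.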
---
ChangeLog | 36 ++++++++++++++++++++++++++++++++
benchtests/bench-memccpy.c | 2 +-
benchtests/bench-memchr.c | 2 +-
benchtests/bench-memcmp.c | 3 ++-
benchtests/bench-memcpy-large.c | 2 +-
benchtests/bench-memcpy.c | 2 +-
benchtests/bench-memmem.c | 2 +-
benchtests/bench-memmove-large.c | 2 +-
benchtests/bench-memmove.c | 4 ++--
benchtests/bench-memset-large.c | 2 +-
benchtests/bench-memset.c | 2 +-
benchtests/bench-rawmemchr.c | 2 +-
benchtests/bench-strcasecmp.c | 2 +-
benchtests/bench-strcasestr.c | 2 +-
benchtests/bench-strcat.c | 2 +-
benchtests/bench-strchr.c | 2 +-
benchtests/bench-strcmp.c | 2 +-
benchtests/bench-strcpy.c | 2 +-
benchtests/bench-strcpy_chk.c | 5 +++--
benchtests/bench-string.h | 44 ++++++++++++++++++++++++----------------
benchtests/bench-strlen.c | 2 +-
benchtests/bench-strncasecmp.c | 2 +-
benchtests/bench-strncmp.c | 2 +-
benchtests/bench-strncpy.c | 4 ++--
benchtests/bench-strnlen.c | 2 +-
benchtests/bench-strpbrk.c | 2 +-
benchtests/bench-strrchr.c | 2 +-
benchtests/bench-strsep.c | 2 +-
benchtests/bench-strspn.c | 2 +-
benchtests/bench-strstr.c | 2 +-
benchtests/bench-strtok.c | 2 +-
31 files changed, 96 insertions(+), 50 deletions(-)
diff --git a/benchtests/bench-memccpy.c b/benchtests/bench-memccpy.c
index 00ac3e1..9be8fa0 100644
--- a/benchtests/bench-memccpy.c
+++ b/benchtests/bench-memccpy.c
@@ -160,4 +160,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-memchr.c b/benchtests/bench-memchr.c
index aa012f2..7b8299a 100644
--- a/benchtests/bench-memchr.c
+++ b/benchtests/bench-memchr.c
@@ -153,4 +153,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-memcmp.c b/benchtests/bench-memcmp.c
index 96d01a1..4f01d3d 100644
--- a/benchtests/bench-memcmp.c
+++ b/benchtests/bench-memcmp.c
@@ -174,4 +174,5 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+
+#include <support/test-driver.c>
diff --git a/benchtests/bench-memcpy-large.c b/benchtests/bench-memcpy-large.c
index 1ef670e..7be9930 100644
--- a/benchtests/bench-memcpy-large.c
+++ b/benchtests/bench-memcpy-large.c
@@ -120,4 +120,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-memcpy.c b/benchtests/bench-memcpy.c
index 9d9e7b6..62e95c1 100644
--- a/benchtests/bench-memcpy.c
+++ b/benchtests/bench-memcpy.c
@@ -166,4 +166,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-memmem.c b/benchtests/bench-memmem.c
index 863f02f..ab80990 100644
--- a/benchtests/bench-memmem.c
+++ b/benchtests/bench-memmem.c
@@ -161,4 +161,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-memmove-large.c b/benchtests/bench-memmove-large.c
index b8f11c1..840907b 100644
--- a/benchtests/bench-memmove-large.c
+++ b/benchtests/bench-memmove-large.c
@@ -120,4 +120,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-memmove.c b/benchtests/bench-memmove.c
index 3858f2a..9710863 100644
--- a/benchtests/bench-memmove.c
+++ b/benchtests/bench-memmove.c
@@ -139,7 +139,7 @@ do_test (size_t align1, size_t align2, size_t len)
putchar ('\n');
}
-int
+static int
test_main (void)
{
size_t i;
@@ -188,4 +188,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-memset-large.c b/benchtests/bench-memset-large.c
index e694b92..ede0081 100644
--- a/benchtests/bench-memset-large.c
+++ b/benchtests/bench-memset-large.c
@@ -131,4 +131,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-memset.c b/benchtests/bench-memset.c
index 98ec257..d7abb44 100644
--- a/benchtests/bench-memset.c
+++ b/benchtests/bench-memset.c
@@ -190,4 +190,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-rawmemchr.c b/benchtests/bench-rawmemchr.c
index 0c00c66..9cf5d5b 100644
--- a/benchtests/bench-rawmemchr.c
+++ b/benchtests/bench-rawmemchr.c
@@ -123,4 +123,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strcasecmp.c b/benchtests/bench-strcasecmp.c
index a9a748a..7a84ab5 100644
--- a/benchtests/bench-strcasecmp.c
+++ b/benchtests/bench-strcasecmp.c
@@ -173,4 +173,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strcasestr.c b/benchtests/bench-strcasestr.c
index bc58880..11e1aad 100644
--- a/benchtests/bench-strcasestr.c
+++ b/benchtests/bench-strcasestr.c
@@ -177,4 +177,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strcat.c b/benchtests/bench-strcat.c
index 4fc2806..05207be 100644
--- a/benchtests/bench-strcat.c
+++ b/benchtests/bench-strcat.c
@@ -171,4 +171,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strchr.c b/benchtests/bench-strchr.c
index 6512085..7fc7c37 100644
--- a/benchtests/bench-strchr.c
+++ b/benchtests/bench-strchr.c
@@ -207,4 +207,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strcmp.c b/benchtests/bench-strcmp.c
index f2d83c0..8597c69 100644
--- a/benchtests/bench-strcmp.c
+++ b/benchtests/bench-strcmp.c
@@ -238,4 +238,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strcpy.c b/benchtests/bench-strcpy.c
index 5517221..a835662 100644
--- a/benchtests/bench-strcpy.c
+++ b/benchtests/bench-strcpy.c
@@ -190,4 +190,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strcpy_chk.c b/benchtests/bench-strcpy_chk.c
index ded6747..e987546 100644
--- a/benchtests/bench-strcpy_chk.c
+++ b/benchtests/bench-strcpy_chk.c
@@ -53,8 +53,7 @@ simple_strcpy_chk (char *dst, const char *src, size_t len)
#include <setjmp.h>
#include <signal.h>
-static int test_main (void);
-#include "../test-skeleton.c"
+#include <support/support.h>
volatile int chk_fail_ok;
jmp_buf chk_fail_buf;
@@ -241,3 +240,5 @@ test_main (void)
return 0;
}
+
+#include <support/test-driver.c>
diff --git a/benchtests/bench-string.h b/benchtests/bench-string.h
index 9c5371e..82cbe7a 100644
--- a/benchtests/bench-string.h
+++ b/benchtests/bench-string.h
@@ -16,6 +16,7 @@
License along with the GNU C Library; if not, see
<http://www.gnu.org/licenses/>. */
+#include <getopt.h>
#include <sys/cdefs.h>
typedef struct
@@ -55,7 +56,7 @@ extern impl_t __start_impls[], __stop_impls[];
# include "bench-timing.h"
-# define TEST_FUNCTION test_main ()
+# define TEST_FUNCTION test_main
# ifndef TIMEOUT
# define TIMEOUT (4 * 60)
# endif
@@ -87,24 +88,31 @@ size_t iterations = 100000;
# define CMDLINE_OPTIONS ITERATIONS_OPTIONS \
{ "random", no_argument, NULL, OPT_RANDOM }, \
{ "seed", required_argument, NULL, OPT_SEED },
-# define CMDLINE_PROCESS ITERATIONS_PROCESS \
- case OPT_RANDOM: \
- { \
- int fdr = open ("/dev/urandom", O_RDONLY); \
- \
- if (fdr < 0 || read (fdr, &seed, sizeof(seed)) != sizeof (seed)) \
- seed = time (NULL); \
- if (fdr >= 0) \
- close (fdr); \
- do_srandom = 1; \
- break; \
- } \
- \
- case OPT_SEED: \
- seed = strtoul (optarg, NULL, 0); \
- do_srandom = 1; \
- break;
+static void __attribute__ ((used))
+cmdline_process_function (int c)
+{
+ switch (c)
+ {
+ ITERATIONS_PROCESS
+ case OPT_RANDOM:
+ {
+ int fdr = open ("/dev/urandom", O_RDONLY);
+ if (fdr < 0 || read (fdr, &seed, sizeof(seed)) != sizeof (seed))
+ seed = time (NULL);
+ if (fdr >= 0)
+ close (fdr);
+ do_srandom = 1;
+ break;
+ }
+
+ case OPT_SEED:
+ seed = strtoul (optarg, NULL, 0);
+ do_srandom = 1;
+ break;
+ }
+}
+# define CMDLINE_PROCESS cmdline_process_function
# define CALL(impl, ...) \
(* (proto_t) (impl)->fn) (__VA_ARGS__)
diff --git a/benchtests/bench-strlen.c b/benchtests/bench-strlen.c
index a5305b5..f854059 100644
--- a/benchtests/bench-strlen.c
+++ b/benchtests/bench-strlen.c
@@ -139,4 +139,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strncasecmp.c b/benchtests/bench-strncasecmp.c
index 6cbab7d..5570df2 100644
--- a/benchtests/bench-strncasecmp.c
+++ b/benchtests/bench-strncasecmp.c
@@ -204,4 +204,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strncmp.c b/benchtests/bench-strncmp.c
index 2f03668..ac35c86 100644
--- a/benchtests/bench-strncmp.c
+++ b/benchtests/bench-strncmp.c
@@ -296,4 +296,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strncpy.c b/benchtests/bench-strncpy.c
index 5843571..916eedb 100644
--- a/benchtests/bench-strncpy.c
+++ b/benchtests/bench-strncpy.c
@@ -172,7 +172,7 @@ do_test (size_t align1, size_t align2, size_t len, size_t n, int max_char)
putchar ('\n');
}
-int
+static int
test_main (void)
{
size_t i;
@@ -207,4 +207,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strnlen.c b/benchtests/bench-strnlen.c
index f3dbf97..794b994 100644
--- a/benchtests/bench-strnlen.c
+++ b/benchtests/bench-strnlen.c
@@ -150,4 +150,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strpbrk.c b/benchtests/bench-strpbrk.c
index 6591a83..4647be7 100644
--- a/benchtests/bench-strpbrk.c
+++ b/benchtests/bench-strpbrk.c
@@ -203,4 +203,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strrchr.c b/benchtests/bench-strrchr.c
index ff93634..f211a2c 100644
--- a/benchtests/bench-strrchr.c
+++ b/benchtests/bench-strrchr.c
@@ -181,4 +181,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strsep.c b/benchtests/bench-strsep.c
index 70dbb37..774a093 100644
--- a/benchtests/bench-strsep.c
+++ b/benchtests/bench-strsep.c
@@ -171,4 +171,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strspn.c b/benchtests/bench-strspn.c
index 606a8dd..498b140 100644
--- a/benchtests/bench-strspn.c
+++ b/benchtests/bench-strspn.c
@@ -189,4 +189,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strstr.c b/benchtests/bench-strstr.c
index 5e50e8e..b28685c 100644
--- a/benchtests/bench-strstr.c
+++ b/benchtests/bench-strstr.c
@@ -174,4 +174,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
diff --git a/benchtests/bench-strtok.c b/benchtests/bench-strtok.c
index 41e0e45..0c3bfe1 100644
--- a/benchtests/bench-strtok.c
+++ b/benchtests/bench-strtok.c
@@ -177,4 +177,4 @@ test_main (void)
return ret;
}
-#include "../test-skeleton.c"
+#include <support/test-driver.c>
--
2.7.4
* [PATCH 4/4] powerpc: Remove f{max,min}{f} assembly implementations
2016-12-19 14:04 [PATCH 1/4] Adjust benchtests to new support library Adhemerval Zanella
2016-12-19 14:04 ` [PATCH 2/4] benchtests: Add fmax/fmin benchmarks Adhemerval Zanella
2016-12-19 14:04 ` [PATCH 3/4] benchtests: Add fmaxf/fminf benchmarks Adhemerval Zanella
@ 2016-12-19 14:04 ` Adhemerval Zanella
2016-12-26 19:38 ` Adhemerval Zanella
2016-12-27 18:47 ` Tulio Magno Quites Machado Filho
2016-12-19 15:35 ` [PATCH 1/4] Adjust benchtests to new support library Siddhesh Poyarekar
3 siblings, 2 replies; 11+ messages in thread
From: Adhemerval Zanella @ 2016-12-19 14:04 UTC
To: libc-alpha
This patch removes the powerpc assembly implementation of fmax/fmin.
Based on benchtests, the assembly implementations show:
$ ./testrun.sh benchtests/bench-fmax
"fmax": {
"": {
"duration": 5.07586e+09,
"iterations": 2.01676e+09,
"max": 1350.39,
"min": 2.073,
"mean": 2.51684
},
"qNaN": {
"duration": 5.09315e+09,
"iterations": 8.4568e+08,
"max": 2788,
"min": 5.806,
"mean": 6.02255
},
"sNaN": {
"duration": 5.09073e+09,
"iterations": 8.42316e+08,
"max": 4215.84,
"min": 5.737,
"mean": 6.04373
}
}
And:
$ ./testrun.sh benchtests/bench-fmin
"fmin": {
"": {
"duration": 5.07711e+09,
"iterations": 2.02982e+09,
"max": 497.094,
"min": 2.073,
"mean": 2.50126
},
"qNaN": {
"duration": 5.09134e+09,
"iterations": 8.46968e+08,
"max": 2255.14,
"min": 5.807,
"mean": 6.01125
},
"sNaN": {
"duration": 5.09122e+09,
"iterations": 8.4746e+08,
"max": 1969.38,
"min": 5.729,
"mean": 6.00763
}
}
The default implementation (math/s_f{max,min}_template.c) shows slightly
better mean latency in all cases:
$ ./testrun.sh benchtests/bench-fmax
"fmax": {
"": {
"duration": 5.07044e+09,
"iterations": 2.38695e+09,
"max": 2048.58,
"min": 2.073,
"mean": 2.12423
},
"qNaN": {
"duration": 5.09004e+09,
"iterations": 9.45428e+08,
"max": 3306.93,
"min": 5.138,
"mean": 5.38385
},
"sNaN": {
"duration": 5.08458e+09,
"iterations": 1.15959e+09,
"max": 972.008,
"min": 3.321,
"mean": 4.3848
}
}
And:
$ ./testrun.sh benchtests/bench-fmin
"fmin": {
"": {
"duration": 5.06817e+09,
"iterations": 2.3913e+09,
"max": 1177.9,
"min": 2.073,
"mean": 2.11942
},
"qNaN": {
"duration": 5.08857e+09,
"iterations": 9.45656e+08,
"max": 2658.83,
"min": 5.09,
"mean": 5.38099
},
"sNaN": {
"duration": 5.08093e+09,
"iterations": 1.16725e+09,
"max": 1030.74,
"min": 3.323,
"mean": 4.3529
}
}
Both were run with GCC 5.4 (Ubuntu 16 default installation) using default
compiler flags on a POWER8E at 3.4GHz (powerpc64le-linux-gnu).
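The "mean" fields above are "duration" divided by "iterations", and the
gain from the generic C code follows from the two means.  A small helper
makes the arithmetic explicit; the function names are illustrative and not
part of benchtests.

```c
/* Mean latency as reported in the benchmark output: total duration
   divided by the number of iterations.  */
static double
mean_latency (double duration, double iterations)
{
  return duration / iterations;
}

/* Relative reduction in mean latency, in percent, when moving from
   OLD_MEAN to NEW_MEAN.  */
static double
percent_reduction (double old_mean, double new_mean)
{
  return 100.0 * (old_mean - new_mean) / old_mean;
}
```

With the fmax figures quoted above, percent_reduction (2.51684, 2.12423)
is about 15.6 for the default inputs, percent_reduction (6.02255, 5.38385)
about 10.6 for qNaN, and percent_reduction (6.04373, 4.3848) about 27.4
for sNaN.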
---
ChangeLog | 10 +++++
sysdeps/powerpc/fpu/s_fmax.S | 77 ----------------------------------
sysdeps/powerpc/fpu/s_fmaxf.S | 1 -
sysdeps/powerpc/fpu/s_fmin.S | 77 ----------------------------------
sysdeps/powerpc/fpu/s_fminf.S | 1 -
sysdeps/powerpc/powerpc32/fpu/s_fmax.S | 5 ---
sysdeps/powerpc/powerpc32/fpu/s_fmin.S | 5 ---
sysdeps/powerpc/powerpc64/fpu/s_fmax.S | 5 ---
sysdeps/powerpc/powerpc64/fpu/s_fmin.S | 5 ---
9 files changed, 10 insertions(+), 176 deletions(-)
delete mode 100644 sysdeps/powerpc/fpu/s_fmax.S
delete mode 100644 sysdeps/powerpc/fpu/s_fmaxf.S
delete mode 100644 sysdeps/powerpc/fpu/s_fmin.S
delete mode 100644 sysdeps/powerpc/fpu/s_fminf.S
delete mode 100644 sysdeps/powerpc/powerpc32/fpu/s_fmax.S
delete mode 100644 sysdeps/powerpc/powerpc32/fpu/s_fmin.S
delete mode 100644 sysdeps/powerpc/powerpc64/fpu/s_fmax.S
delete mode 100644 sysdeps/powerpc/powerpc64/fpu/s_fmin.S
diff --git a/sysdeps/powerpc/fpu/s_fmax.S b/sysdeps/powerpc/fpu/s_fmax.S
deleted file mode 100644
index e6405c0..0000000
--- a/sysdeps/powerpc/fpu/s_fmax.S
+++ /dev/null
@@ -1,77 +0,0 @@
-/* Floating-point maximum. PowerPC version.
- Copyright (C) 1997-2016 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <http://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-ENTRY(__fmax)
-/* double [f1] fmax (double [f1] x, double [f2] y); */
- fcmpu cr0,fp1,fp2
- blt cr0,0f /* if x < y, neither x nor y can be NaN... */
- bnulr+ cr0
-/* x and y are unordered, so one of x or y must be a NaN... */
- fcmpu cr1,fp2,fp2
- bun cr1,1f
-/* x is a NaN; y is not. Test if x is signaling. */
-#ifdef __powerpc64__
- stfd fp1,-8(r1)
- lwz r3,-8+HIWORD(r1)
-#else
- stwu r1,-16(r1)
- cfi_adjust_cfa_offset (16)
- stfd fp1,8(r1)
- lwz r3,8+HIWORD(r1)
- addi r1,r1,16
- cfi_adjust_cfa_offset (-16)
-#endif
- andis. r3,r3,8
- bne cr0,0f
- b 2f
-1: /* y is a NaN; x may or may not be. */
- fcmpu cr1,fp1,fp1
- bun cr1,2f
-/* y is a NaN; x is not. Test if y is signaling. */
-#ifdef __powerpc64__
- stfd fp2,-8(r1)
- lwz r3,-8+HIWORD(r1)
-#else
- stwu r1,-16(r1)
- cfi_adjust_cfa_offset (16)
- stfd fp2,8(r1)
- lwz r3,8+HIWORD(r1)
- addi r1,r1,16
- cfi_adjust_cfa_offset (-16)
-#endif
- andis. r3,r3,8
- bnelr cr0
-2: /* x and y are NaNs, or one is a signaling NaN. */
- fadd fp1,fp1,fp2
- blr
-0: fmr fp1,fp2
- blr
-END(__fmax)
-
-weak_alias (__fmax,fmax)
-
-/* It turns out that it's safe to use this code even for single-precision. */
-strong_alias(__fmax,__fmaxf)
-weak_alias (__fmax,fmaxf)
-
-#ifdef NO_LONG_DOUBLE
-weak_alias (__fmax,__fmaxl)
-weak_alias (__fmax,fmaxl)
-#endif
diff --git a/sysdeps/powerpc/fpu/s_fmaxf.S b/sysdeps/powerpc/fpu/s_fmaxf.S
deleted file mode 100644
index 3c2d62b..0000000
--- a/sysdeps/powerpc/fpu/s_fmaxf.S
+++ /dev/null
@@ -1 +0,0 @@
-/* __fmaxf is in s_fmax.c */
diff --git a/sysdeps/powerpc/fpu/s_fmin.S b/sysdeps/powerpc/fpu/s_fmin.S
deleted file mode 100644
index 9ae77fe..0000000
--- a/sysdeps/powerpc/fpu/s_fmin.S
+++ /dev/null
@@ -1,77 +0,0 @@
-/* Floating-point minimum. PowerPC version.
- Copyright (C) 1997-2016 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <http://www.gnu.org/licenses/>. */
-
-#include <sysdep.h>
-
-ENTRY(__fmin)
-/* double [f1] fmin (double [f1] x, double [f2] y); */
- fcmpu cr0,fp1,fp2
- bgt cr0,0f /* if x > y, neither x nor y can be NaN... */
- bnulr+ cr0
-/* x and y are unordered, so one of x or y must be a NaN... */
- fcmpu cr1,fp2,fp2
- bun cr1,1f
-/* x is a NaN; y is not. Test if x is signaling. */
-#ifdef __powerpc64__
- stfd fp1,-8(r1)
- lwz r3,-8+HIWORD(r1)
-#else
- stwu r1,-16(r1)
- cfi_adjust_cfa_offset (16)
- stfd fp1,8(r1)
- lwz r3,8+HIWORD(r1)
- addi r1,r1,16
- cfi_adjust_cfa_offset (-16)
-#endif
- andis. r3,r3,8
- bne cr0,0f
- b 2f
-1: /* y is a NaN; x may or may not be. */
- fcmpu cr1,fp1,fp1
- bun cr1,2f
-/* y is a NaN; x is not. Test if y is signaling. */
-#ifdef __powerpc64__
- stfd fp2,-8(r1)
- lwz r3,-8+HIWORD(r1)
-#else
- stwu r1,-16(r1)
- cfi_adjust_cfa_offset (16)
- stfd fp2,8(r1)
- lwz r3,8+HIWORD(r1)
- addi r1,r1,16
- cfi_adjust_cfa_offset (-16)
-#endif
- andis. r3,r3,8
- bnelr cr0
-2: /* x and y are NaNs, or one is a signaling NaN. */
- fadd fp1,fp1,fp2
- blr
-0: fmr fp1,fp2
- blr
-END(__fmin)
-
-weak_alias (__fmin,fmin)
-
-/* It turns out that it's safe to use this code even for single-precision. */
-strong_alias(__fmin,__fminf)
-weak_alias (__fmin,fminf)
-
-#ifdef NO_LONG_DOUBLE
-weak_alias (__fmin,__fminl)
-weak_alias (__fmin,fminl)
-#endif
diff --git a/sysdeps/powerpc/fpu/s_fminf.S b/sysdeps/powerpc/fpu/s_fminf.S
deleted file mode 100644
index 10ab7fe..0000000
--- a/sysdeps/powerpc/fpu/s_fminf.S
+++ /dev/null
@@ -1 +0,0 @@
-/* __fminf is in s_fmin.c */
diff --git a/sysdeps/powerpc/powerpc32/fpu/s_fmax.S b/sysdeps/powerpc/powerpc32/fpu/s_fmax.S
deleted file mode 100644
index 6973576..0000000
--- a/sysdeps/powerpc/powerpc32/fpu/s_fmax.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#include <math_ldbl_opt.h>
-#include <sysdeps/powerpc/fpu/s_fmax.S>
-#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
-compat_symbol (libm, __fmax, fmaxl, GLIBC_2_1)
-#endif
diff --git a/sysdeps/powerpc/powerpc32/fpu/s_fmin.S b/sysdeps/powerpc/powerpc32/fpu/s_fmin.S
deleted file mode 100644
index 6d4a0a9..0000000
--- a/sysdeps/powerpc/powerpc32/fpu/s_fmin.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#include <math_ldbl_opt.h>
-#include <sysdeps/powerpc/fpu/s_fmin.S>
-#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
-compat_symbol (libm, __fmin, fminl, GLIBC_2_1)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/s_fmax.S b/sysdeps/powerpc/powerpc64/fpu/s_fmax.S
deleted file mode 100644
index 6973576..0000000
--- a/sysdeps/powerpc/powerpc64/fpu/s_fmax.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#include <math_ldbl_opt.h>
-#include <sysdeps/powerpc/fpu/s_fmax.S>
-#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
-compat_symbol (libm, __fmax, fmaxl, GLIBC_2_1)
-#endif
diff --git a/sysdeps/powerpc/powerpc64/fpu/s_fmin.S b/sysdeps/powerpc/powerpc64/fpu/s_fmin.S
deleted file mode 100644
index 6d4a0a9..0000000
--- a/sysdeps/powerpc/powerpc64/fpu/s_fmin.S
+++ /dev/null
@@ -1,5 +0,0 @@
-#include <math_ldbl_opt.h>
-#include <sysdeps/powerpc/fpu/s_fmin.S>
-#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
-compat_symbol (libm, __fmin, fminl, GLIBC_2_1)
-#endif
--
2.7.4
* [PATCH 2/4] benchtests: Add fmax/fmin benchmarks
2016-12-19 14:04 [PATCH 1/4] Adjust benchtests to new support library Adhemerval Zanella
@ 2016-12-19 14:04 ` Adhemerval Zanella
2016-12-19 15:42 ` Siddhesh Poyarekar
2016-12-19 16:24 ` Joseph Myers
2016-12-19 14:04 ` [PATCH 3/4] benchtests: Add fmaxf/fminf benchmarks Adhemerval Zanella
` (2 subsequent siblings)
3 siblings, 2 replies; 11+ messages in thread
From: Adhemerval Zanella @ 2016-12-19 14:04 UTC
To: libc-alpha
This patch adds fmax and fmin benchtests. It is based on the
math/s_fmax_template.c implementation, which checks for four different
classes:
1. if x is greater than or equal to y.
2. if x is less than y.
3. if x or y is signaling.
4. if y is NaN.
Cases 1 and 2 are used for the default inputs (by mixing normal double
numbers and infinity), while cases 3 and 4 each get their own
benchmark class.
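The four checks listed above can be written out as a standalone C sketch.
It mirrors the order described here rather than quoting
math/s_fmax_template.c, and sketch_fmax and is_signaling are hypothetical
names.  The signaling test assumes IEEE 754 binary64 with the common
encoding in which a quiet NaN has the top mantissa bit set (true on x86
and POWER, not on every architecture).

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumption: binary64 where a NaN with mantissa bit 51 clear is
   signaling.  */
static int
is_signaling (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return isnan (x) && (bits & (UINT64_C (1) << 51)) == 0;
}

static double
sketch_fmax (double x, double y)
{
  if (isgreaterequal (x, y))	/* 1. x >= y, neither is NaN.  */
    return x;
  else if (isless (x, y))	/* 2. x < y, neither is NaN.  */
    return y;
  else if (is_signaling (x) || is_signaling (y))
    return x + y;		/* 3. Raises "invalid", yields a quiet NaN.  */
  else
    return isnan (y) ? x : y;	/* 4. Return the non-NaN operand, if any.  */
}
```

Read against this sketch, the default input class exercises branches 1
and 2, the sNaN class branch 3, and the qNaN class branch 4.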
Checked on x86_64-linux-gnu and powerpc64-linux-gnu.
* benchtests/Makefile (bench-math): Add fmin and fmax.
* benchtests/fmax-inputs: New file.
* benchtests/fmin-inputs: Likewise.
---
ChangeLog | 4 ++++
benchtests/Makefile | 2 +-
benchtests/fmax-inputs | 23 +++++++++++++++++++++++
benchtests/fmin-inputs | 23 +++++++++++++++++++++++
4 files changed, 51 insertions(+), 1 deletion(-)
create mode 100644 benchtests/fmax-inputs
create mode 100644 benchtests/fmin-inputs
diff --git a/benchtests/Makefile b/benchtests/Makefile
index ba4d068..a3b53e1 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -24,7 +24,7 @@ subdir := benchtests
include ../Makeconfig
bench-math := acos acosh asin asinh atan atanh cos cosh exp exp2 log log2 \
- modf pow rint sin sincos sinh sqrt tan tanh
+ modf pow rint sin sincos sinh sqrt tan tanh fmin fmax
bench-pthread := pthread_once
diff --git a/benchtests/fmax-inputs b/benchtests/fmax-inputs
new file mode 100644
index 0000000..18eb8fe
--- /dev/null
+++ b/benchtests/fmax-inputs
@@ -0,0 +1,23 @@
+## includes: math.h
+## args: double:double
+## ret: double
+78.5, -78.5
+-78.5, 78.5
+0, 78.5
+78.5, 0
+0, -78.5
+-78.5, 0
+__builtin_inf (), 78.5
+__builtin_inf (), -78.5
+78.5, __builtin_inf ()
+-78.5, __builtin_inf ()
+## name: qNaN
+__builtin_nan (""), 78.5
+__builtin_nan (""), -78.5
+78.5, __builtin_nan ("")
+-78.5, __builtin_nan ("")
+## name: sNaN
+__builtin_nans (""), 78.5
+__builtin_nans (""), -78.5
+78.5, __builtin_nans ("")
+-78.5, __builtin_nans ("")
diff --git a/benchtests/fmin-inputs b/benchtests/fmin-inputs
new file mode 100644
index 0000000..18eb8fe
--- /dev/null
+++ b/benchtests/fmin-inputs
@@ -0,0 +1,23 @@
+## includes: math.h
+## args: double:double
+## ret: double
+78.5, -78.5
+-78.5, 78.5
+0, 78.5
+78.5, 0
+0, -78.5
+-78.5, 0
+__builtin_inf (), 78.5
+__builtin_inf (), -78.5
+78.5, __builtin_inf ()
+-78.5, __builtin_inf ()
+## name: qNaN
+__builtin_nan (""), 78.5
+__builtin_nan (""), -78.5
+78.5, __builtin_nan ("")
+-78.5, __builtin_nan ("")
+## name: sNaN
+__builtin_nans (""), 78.5
+__builtin_nans (""), -78.5
+78.5, __builtin_nans ("")
+-78.5, __builtin_nans ("")
--
2.7.4
* [PATCH 3/4] benchtests: Add fmaxf/fminf benchmarks
2016-12-19 14:04 [PATCH 1/4] Adjust benchtests to new support library Adhemerval Zanella
2016-12-19 14:04 ` [PATCH 2/4] benchtests: Add fmax/fmin benchmarks Adhemerval Zanella
@ 2016-12-19 14:04 ` Adhemerval Zanella
2016-12-19 15:44 ` Siddhesh Poyarekar
2016-12-19 14:04 ` [PATCH 4/4] powerpc: Remove f{max,min}{f} assembly implementations Adhemerval Zanella
2016-12-19 15:35 ` [PATCH 1/4] Adjust benchtests to new support library Siddhesh Poyarekar
3 siblings, 1 reply; 11+ messages in thread
From: Adhemerval Zanella @ 2016-12-19 14:04 UTC
To: libc-alpha
This patch adds fmaxf and fminf benchtests. It is based on the
math/s_fmax_template.c implementation, which checks for four
different classes:
1. if x is greater than or equal to y.
2. if x is less than y.
3. if x or y is signaling.
4. if y is NaN.
Cases 1 and 2 are used for the default inputs (by mixing normal float
numbers and infinity), while cases 3 and 4 each get their own
benchmark class.
Checked on x86_64-linux-gnu and powerpc64-linux-gnu.
* benchtests/Makefile (bench-math): Add fminf and fmaxf.
* benchtests/fmaxf-inputs: New file.
* benchtests/fminf-inputs: Likewise.
---
ChangeLog | 4 ++++
benchtests/Makefile | 3 ++-
benchtests/fmaxf-inputs | 23 +++++++++++++++++++++++
benchtests/fminf-inputs | 23 +++++++++++++++++++++++
4 files changed, 52 insertions(+), 1 deletion(-)
create mode 100644 benchtests/fmaxf-inputs
create mode 100644 benchtests/fminf-inputs
diff --git a/benchtests/Makefile b/benchtests/Makefile
index a3b53e1..9e5f6c6 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -24,7 +24,8 @@ subdir := benchtests
include ../Makeconfig
bench-math := acos acosh asin asinh atan atanh cos cosh exp exp2 log log2 \
- modf pow rint sin sincos sinh sqrt tan tanh fmin fmax
+ modf pow rint sin sincos sinh sqrt tan tanh fmin fmax fminf \
+ fmaxf
bench-pthread := pthread_once
diff --git a/benchtests/fmaxf-inputs b/benchtests/fmaxf-inputs
new file mode 100644
index 0000000..7eb7bda
--- /dev/null
+++ b/benchtests/fmaxf-inputs
@@ -0,0 +1,23 @@
+## includes: math.h
+## args: float:float
+## ret: float
+78.5f, -78.5f
+-78.5f, 78.5f
+0.0f, 78.5f
+78.5f, 0.0f
+0.0f, -78.5f
+-78.5f, 0.0f
+__builtin_inff (), 78.5f
+__builtin_inff (), -78.5f
+78.5f, __builtin_inff ()
+-78.5f, __builtin_inff ()
+## name: qNaN
+__builtin_nanf (""), 78.5f
+__builtin_nanf (""), -78.5f
+78.5f, __builtin_nanf ("")
+-78.5f, __builtin_nanf ("")
+## name: sNaN
+__builtin_nansf (""), 78.5f
+__builtin_nansf (""), -78.5f
+78.5f, __builtin_nansf ("")
+-78.5f, __builtin_nansf ("")
diff --git a/benchtests/fminf-inputs b/benchtests/fminf-inputs
new file mode 100644
index 0000000..7eb7bda
--- /dev/null
+++ b/benchtests/fminf-inputs
@@ -0,0 +1,23 @@
+## includes: math.h
+## args: float:float
+## ret: float
+78.5f, -78.5f
+-78.5f, 78.5f
+0.0f, 78.5f
+78.5f, 0.0f
+0.0f, -78.5f
+-78.5f, 0.0f
+__builtin_inff (), 78.5f
+__builtin_inff (), -78.5f
+78.5f, __builtin_inff ()
+-78.5f, __builtin_inff ()
+## name: qNaN
+__builtin_nanf (""), 78.5f
+__builtin_nanf (""), -78.5f
+78.5f, __builtin_nanf ("")
+-78.5f, __builtin_nanf ("")
+## name: sNaN
+__builtin_nansf (""), 78.5f
+__builtin_nansf (""), -78.5f
+78.5f, __builtin_nansf ("")
+-78.5f, __builtin_nansf ("")
--
2.7.4
* Re: [PATCH 1/4] Adjust benchtests to new support library.
2016-12-19 14:04 [PATCH 1/4] Adjust benchtests to new support library Adhemerval Zanella
` (2 preceding siblings ...)
2016-12-19 14:04 ` [PATCH 4/4] powerpc: Remove f{max,min}{f} assembly implementations Adhemerval Zanella
@ 2016-12-19 15:35 ` Siddhesh Poyarekar
3 siblings, 0 replies; 11+ messages in thread
From: Siddhesh Poyarekar @ 2016-12-19 15:35 UTC
To: Adhemerval Zanella, libc-alpha
On Monday 19 December 2016 07:34 PM, Adhemerval Zanella wrote:
> This patch basically replaces the test-skeleton.c inclusion by
> support/test-driver.c and also minor adjustments in bench-string.h.
>
> Checked on x86_64-linux-gnu and powerpc64le-linux-gnu.
Looks OK to me.
Siddhesh
* Re: [PATCH 2/4] benchtests: Add fmax/fmin benchmarks
2016-12-19 14:04 ` [PATCH 2/4] benchtests: Add fmax/fmin benchmarks Adhemerval Zanella
@ 2016-12-19 15:42 ` Siddhesh Poyarekar
2016-12-19 16:24 ` Joseph Myers
1 sibling, 0 replies; 11+ messages in thread
From: Siddhesh Poyarekar @ 2016-12-19 15:42 UTC
To: Adhemerval Zanella, libc-alpha
On Monday 19 December 2016 07:34 PM, Adhemerval Zanella wrote:
> This patch adds fmax and fmin benchtests. It is based math/s_fmax_template.c
> implementation which checks for basically four different classes:
>
> 1. if x is greater or equal than y.
> 2. if x is less than y.
> 3. if x or y is signaling.
> 4. if y is nan.
>
> Cases 1 and 2 are used for default input number (by mixing normal double
> numbers and infinity), while case 3 and 4 are used each for on for a
> benchmark class.
>
> Checked on x86_64-linux-gnu and powerpc64-linux-gnu.
>
> * benchtests/Makefile (bench-math): Add fmin and fmax.
> * benchtests/fmax-inputs: New file.
> * benchtests/fmin-inputs: Likewise.
Looks good to me.
Siddhesh
* Re: [PATCH 3/4] benchtests: Add fmaxf/fminf benchmarks
2016-12-19 14:04 ` [PATCH 3/4] benchtests: Add fmaxf/fminf benchmarks Adhemerval Zanella
@ 2016-12-19 15:44 ` Siddhesh Poyarekar
0 siblings, 0 replies; 11+ messages in thread
From: Siddhesh Poyarekar @ 2016-12-19 15:44 UTC (permalink / raw)
To: Adhemerval Zanella, libc-alpha
On Monday 19 December 2016 07:34 PM, Adhemerval Zanella wrote:
> This patch adds fmaxf and fminf benchtests. It is based on the
> math/s_fmax_template.c implementation, which distinguishes four classes
> of input:
>
> 1. if x is greater than or equal to y.
> 2. if x is less than y.
> 3. if x or y is a signaling NaN.
> 4. if y is a NaN.
>
> Cases 1 and 2 are covered by the default inputs (a mix of normal double
> numbers and infinities), while cases 3 and 4 each get their own
> benchmark class.
>
> Checked on x86_64-linux-gnu and powerpc64-linux-gnu.
>
> * benchtests/Makefile (bench-math): Add fminf and fmaxf.
> * benchtests/fmaxf-inputs: New file.
> * benchtests/fminf-inputs: Likewise.
Looks good to me.
Siddhesh
* Re: [PATCH 2/4] benchtests: Add fmax/fmin benchmarks
2016-12-19 14:04 ` [PATCH 2/4] benchtests: Add fmax/fmin benchmarks Adhemerval Zanella
2016-12-19 15:42 ` Siddhesh Poyarekar
@ 2016-12-19 16:24 ` Joseph Myers
2016-12-19 17:30 ` Adhemerval Zanella
1 sibling, 1 reply; 11+ messages in thread
From: Joseph Myers @ 2016-12-19 16:24 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
I'd advise using -fno-builtin for these benchmarks to make sure they
really do test the libm functions not compiler built-in functions (really,
all benchmarks more generally should use -fno-builtin).
--
Joseph S. Myers
joseph@codesourcery.com
* Re: [PATCH 2/4] benchtests: Add fmax/fmin benchmarks
2016-12-19 16:24 ` Joseph Myers
@ 2016-12-19 17:30 ` Adhemerval Zanella
0 siblings, 0 replies; 11+ messages in thread
From: Adhemerval Zanella @ 2016-12-19 17:30 UTC (permalink / raw)
To: Joseph Myers; +Cc: libc-alpha
On 19/12/2016 14:24, Joseph Myers wrote:
> I'd advise using -fno-builtin for these benchmarks to make sure they
> really do test the libm functions not compiler built-in functions (really,
> all benchmarks more generally should use -fno-builtin).
>
I will add -fno-builtin to the f{max,min}{f} benchtests.
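The concern behind -fno-builtin can be seen in a small standalone program.
The sketch below is an illustration only and is not how benchtests is built:
it uses strlen rather than fmax because strlen lives in libc and needs no
extra link flags, and the names direct_call, forced_library_call and
libc_strlen are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper.  The pointer is volatile, so the compiler must load
   it at run time and cannot assume it still points at strlen; the call
   below therefore stays an indirect call into the library.  */
static size_t (*volatile libc_strlen) (const char *) = strlen;

size_t
direct_call (void)
{
  /* With builtins enabled, the compiler may replace this call with the
     constant 5, in which case no library code is executed or measured.  */
  return strlen ("hello");
}

size_t
forced_library_call (const char *s)
{
  /* Always reaches the library's strlen, with or without -fno-builtin.  */
  return libc_strlen (s);
}
```

Both functions return the same value; the difference is only in what ends
up being timed. For the benchtests themselves, -fno-builtin as suggested
above is the simpler way to get the second behaviour everywhere.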
* Re: [PATCH 4/4] powerpc: Remove f{max,min}{f} assembly implementations
2016-12-19 14:04 ` [PATCH 4/4] powerpc: Remove f{max,min}{f} assembly implementations Adhemerval Zanella
@ 2016-12-26 19:38 ` Adhemerval Zanella
2016-12-27 18:47 ` Tulio Magno Quites Machado Filho
1 sibling, 0 replies; 11+ messages in thread
From: Adhemerval Zanella @ 2016-12-26 19:38 UTC (permalink / raw)
To: libc-alpha
PING.
On 19/12/2016 12:04, Adhemerval Zanella wrote:
> This patch removes the powerpc assembly implementation of fmax/fmin.
> Based on benchtests, the assembly versions show:
>
> $ ./testrun.sh benchtests/bench-fmax
> "fmax": {
> "": {
> "duration": 5.07586e+09,
> "iterations": 2.01676e+09,
> "max": 1350.39,
> "min": 2.073,
> "mean": 2.51684
> },
> "qNaN": {
> "duration": 5.09315e+09,
> "iterations": 8.4568e+08,
> "max": 2788,
> "min": 5.806,
> "mean": 6.02255
> },
> "sNaN": {
> "duration": 5.09073e+09,
> "iterations": 8.42316e+08,
> "max": 4215.84,
> "min": 5.737,
> "mean": 6.04373
> }
> }
>
> And
>
> $ ./testrun.sh benchtests/bench-fmin
> "fmin": {
> "": {
> "duration": 5.07711e+09,
> "iterations": 2.02982e+09,
> "max": 497.094,
> "min": 2.073,
> "mean": 2.50126
> },
> "qNaN": {
> "duration": 5.09134e+09,
> "iterations": 8.46968e+08,
> "max": 2255.14,
> "min": 5.807,
> "mean": 6.01125
> },
> "sNaN": {
> "duration": 5.09122e+09,
> "iterations": 8.4746e+08,
> "max": 1969.38,
> "min": 5.729,
> "mean": 6.00763
> }
> }
>
> The default implementation (math/s_f{max,min}_template.c) shows slightly
> better latency for all cases:
>
> $ ./testrun.sh benchtests/bench-fmax
> "fmax": {
> "": {
> "duration": 5.07044e+09,
> "iterations": 2.38695e+09,
> "max": 2048.58,
> "min": 2.073,
> "mean": 2.12423
> },
> "qNaN": {
> "duration": 5.09004e+09,
> "iterations": 9.45428e+08,
> "max": 3306.93,
> "min": 5.138,
> "mean": 5.38385
> },
> "sNaN": {
> "duration": 5.08458e+09,
> "iterations": 1.15959e+09,
> "max": 972.008,
> "min": 3.321,
> "mean": 4.3848
> }
> }
>
> And:
>
> $ ./testrun.sh benchtests/bench-fmin
> "fmin": {
> "": {
> "duration": 5.06817e+09,
> "iterations": 2.3913e+09,
> "max": 1177.9,
> "min": 2.073,
> "mean": 2.11942
> },
> "qNaN": {
> "duration": 5.08857e+09,
> "iterations": 9.45656e+08,
> "max": 2658.83,
> "min": 5.09,
> "mean": 5.38099
> },
> "sNaN": {
> "duration": 5.08093e+09,
> "iterations": 1.16725e+09,
> "max": 1030.74,
> "min": 3.323,
> "mean": 4.3529
> }
> }
>
> Both were run with GCC 5.4 (the Ubuntu 16 default installation) using the
> default compiler flags on a POWER8E at 3.4GHz (powerpc64le-linux-gnu).
> ---
> ChangeLog | 10 +++++
> sysdeps/powerpc/fpu/s_fmax.S | 77 ----------------------------------
> sysdeps/powerpc/fpu/s_fmaxf.S | 1 -
> sysdeps/powerpc/fpu/s_fmin.S | 77 ----------------------------------
> sysdeps/powerpc/fpu/s_fminf.S | 1 -
> sysdeps/powerpc/powerpc32/fpu/s_fmax.S | 5 ---
> sysdeps/powerpc/powerpc32/fpu/s_fmin.S | 5 ---
> sysdeps/powerpc/powerpc64/fpu/s_fmax.S | 5 ---
> sysdeps/powerpc/powerpc64/fpu/s_fmin.S | 5 ---
> 9 files changed, 10 insertions(+), 176 deletions(-)
> delete mode 100644 sysdeps/powerpc/fpu/s_fmax.S
> delete mode 100644 sysdeps/powerpc/fpu/s_fmaxf.S
> delete mode 100644 sysdeps/powerpc/fpu/s_fmin.S
> delete mode 100644 sysdeps/powerpc/fpu/s_fminf.S
> delete mode 100644 sysdeps/powerpc/powerpc32/fpu/s_fmax.S
> delete mode 100644 sysdeps/powerpc/powerpc32/fpu/s_fmin.S
> delete mode 100644 sysdeps/powerpc/powerpc64/fpu/s_fmax.S
> delete mode 100644 sysdeps/powerpc/powerpc64/fpu/s_fmin.S
>
> diff --git a/sysdeps/powerpc/fpu/s_fmax.S b/sysdeps/powerpc/fpu/s_fmax.S
> deleted file mode 100644
> index e6405c0..0000000
> --- a/sysdeps/powerpc/fpu/s_fmax.S
> +++ /dev/null
> @@ -1,77 +0,0 @@
> -/* Floating-point maximum. PowerPC version.
> - Copyright (C) 1997-2016 Free Software Foundation, Inc.
> - This file is part of the GNU C Library.
> -
> - The GNU C Library is free software; you can redistribute it and/or
> - modify it under the terms of the GNU Lesser General Public
> - License as published by the Free Software Foundation; either
> - version 2.1 of the License, or (at your option) any later version.
> -
> - The GNU C Library is distributed in the hope that it will be useful,
> - but WITHOUT ANY WARRANTY; without even the implied warranty of
> - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> - Lesser General Public License for more details.
> -
> - You should have received a copy of the GNU Lesser General Public
> - License along with the GNU C Library; if not, see
> - <http://www.gnu.org/licenses/>. */
> -
> -#include <sysdep.h>
> -
> -ENTRY(__fmax)
> -/* double [f1] fmax (double [f1] x, double [f2] y); */
> - fcmpu cr0,fp1,fp2
> - blt cr0,0f /* if x < y, neither x nor y can be NaN... */
> - bnulr+ cr0
> -/* x and y are unordered, so one of x or y must be a NaN... */
> - fcmpu cr1,fp2,fp2
> - bun cr1,1f
> -/* x is a NaN; y is not. Test if x is signaling. */
> -#ifdef __powerpc64__
> - stfd fp1,-8(r1)
> - lwz r3,-8+HIWORD(r1)
> -#else
> - stwu r1,-16(r1)
> - cfi_adjust_cfa_offset (16)
> - stfd fp1,8(r1)
> - lwz r3,8+HIWORD(r1)
> - addi r1,r1,16
> - cfi_adjust_cfa_offset (-16)
> -#endif
> - andis. r3,r3,8
> - bne cr0,0f
> - b 2f
> -1: /* y is a NaN; x may or may not be. */
> - fcmpu cr1,fp1,fp1
> - bun cr1,2f
> -/* y is a NaN; x is not. Test if y is signaling. */
> -#ifdef __powerpc64__
> - stfd fp2,-8(r1)
> - lwz r3,-8+HIWORD(r1)
> -#else
> - stwu r1,-16(r1)
> - cfi_adjust_cfa_offset (16)
> - stfd fp2,8(r1)
> - lwz r3,8+HIWORD(r1)
> - addi r1,r1,16
> - cfi_adjust_cfa_offset (-16)
> -#endif
> - andis. r3,r3,8
> - bnelr cr0
> -2: /* x and y are NaNs, or one is a signaling NaN. */
> - fadd fp1,fp1,fp2
> - blr
> -0: fmr fp1,fp2
> - blr
> -END(__fmax)
> -
> -weak_alias (__fmax,fmax)
> -
> -/* It turns out that it's safe to use this code even for single-precision. */
> -strong_alias(__fmax,__fmaxf)
> -weak_alias (__fmax,fmaxf)
> -
> -#ifdef NO_LONG_DOUBLE
> -weak_alias (__fmax,__fmaxl)
> -weak_alias (__fmax,fmaxl)
> -#endif
> diff --git a/sysdeps/powerpc/fpu/s_fmaxf.S b/sysdeps/powerpc/fpu/s_fmaxf.S
> deleted file mode 100644
> index 3c2d62b..0000000
> --- a/sysdeps/powerpc/fpu/s_fmaxf.S
> +++ /dev/null
> @@ -1 +0,0 @@
> -/* __fmaxf is in s_fmax.c */
> diff --git a/sysdeps/powerpc/fpu/s_fmin.S b/sysdeps/powerpc/fpu/s_fmin.S
> deleted file mode 100644
> index 9ae77fe..0000000
> --- a/sysdeps/powerpc/fpu/s_fmin.S
> +++ /dev/null
> @@ -1,77 +0,0 @@
> -/* Floating-point minimum. PowerPC version.
> - Copyright (C) 1997-2016 Free Software Foundation, Inc.
> - This file is part of the GNU C Library.
> -
> - The GNU C Library is free software; you can redistribute it and/or
> - modify it under the terms of the GNU Lesser General Public
> - License as published by the Free Software Foundation; either
> - version 2.1 of the License, or (at your option) any later version.
> -
> - The GNU C Library is distributed in the hope that it will be useful,
> - but WITHOUT ANY WARRANTY; without even the implied warranty of
> - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> - Lesser General Public License for more details.
> -
> - You should have received a copy of the GNU Lesser General Public
> - License along with the GNU C Library; if not, see
> - <http://www.gnu.org/licenses/>. */
> -
> -#include <sysdep.h>
> -
> -ENTRY(__fmin)
> -/* double [f1] fmin (double [f1] x, double [f2] y); */
> - fcmpu cr0,fp1,fp2
> - bgt cr0,0f /* if x > y, neither x nor y can be NaN... */
> - bnulr+ cr0
> -/* x and y are unordered, so one of x or y must be a NaN... */
> - fcmpu cr1,fp2,fp2
> - bun cr1,1f
> -/* x is a NaN; y is not. Test if x is signaling. */
> -#ifdef __powerpc64__
> - stfd fp1,-8(r1)
> - lwz r3,-8+HIWORD(r1)
> -#else
> - stwu r1,-16(r1)
> - cfi_adjust_cfa_offset (16)
> - stfd fp1,8(r1)
> - lwz r3,8+HIWORD(r1)
> - addi r1,r1,16
> - cfi_adjust_cfa_offset (-16)
> -#endif
> - andis. r3,r3,8
> - bne cr0,0f
> - b 2f
> -1: /* y is a NaN; x may or may not be. */
> - fcmpu cr1,fp1,fp1
> - bun cr1,2f
> -/* y is a NaN; x is not. Test if y is signaling. */
> -#ifdef __powerpc64__
> - stfd fp2,-8(r1)
> - lwz r3,-8+HIWORD(r1)
> -#else
> - stwu r1,-16(r1)
> - cfi_adjust_cfa_offset (16)
> - stfd fp2,8(r1)
> - lwz r3,8+HIWORD(r1)
> - addi r1,r1,16
> - cfi_adjust_cfa_offset (-16)
> -#endif
> - andis. r3,r3,8
> - bnelr cr0
> -2: /* x and y are NaNs, or one is a signaling NaN. */
> - fadd fp1,fp1,fp2
> - blr
> -0: fmr fp1,fp2
> - blr
> -END(__fmin)
> -
> -weak_alias (__fmin,fmin)
> -
> -/* It turns out that it's safe to use this code even for single-precision. */
> -strong_alias(__fmin,__fminf)
> -weak_alias (__fmin,fminf)
> -
> -#ifdef NO_LONG_DOUBLE
> -weak_alias (__fmin,__fminl)
> -weak_alias (__fmin,fminl)
> -#endif
> diff --git a/sysdeps/powerpc/fpu/s_fminf.S b/sysdeps/powerpc/fpu/s_fminf.S
> deleted file mode 100644
> index 10ab7fe..0000000
> --- a/sysdeps/powerpc/fpu/s_fminf.S
> +++ /dev/null
> @@ -1 +0,0 @@
> -/* __fminf is in s_fmin.c */
> diff --git a/sysdeps/powerpc/powerpc32/fpu/s_fmax.S b/sysdeps/powerpc/powerpc32/fpu/s_fmax.S
> deleted file mode 100644
> index 6973576..0000000
> --- a/sysdeps/powerpc/powerpc32/fpu/s_fmax.S
> +++ /dev/null
> @@ -1,5 +0,0 @@
> -#include <math_ldbl_opt.h>
> -#include <sysdeps/powerpc/fpu/s_fmax.S>
> -#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
> -compat_symbol (libm, __fmax, fmaxl, GLIBC_2_1)
> -#endif
> diff --git a/sysdeps/powerpc/powerpc32/fpu/s_fmin.S b/sysdeps/powerpc/powerpc32/fpu/s_fmin.S
> deleted file mode 100644
> index 6d4a0a9..0000000
> --- a/sysdeps/powerpc/powerpc32/fpu/s_fmin.S
> +++ /dev/null
> @@ -1,5 +0,0 @@
> -#include <math_ldbl_opt.h>
> -#include <sysdeps/powerpc/fpu/s_fmin.S>
> -#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
> -compat_symbol (libm, __fmin, fminl, GLIBC_2_1)
> -#endif
> diff --git a/sysdeps/powerpc/powerpc64/fpu/s_fmax.S b/sysdeps/powerpc/powerpc64/fpu/s_fmax.S
> deleted file mode 100644
> index 6973576..0000000
> --- a/sysdeps/powerpc/powerpc64/fpu/s_fmax.S
> +++ /dev/null
> @@ -1,5 +0,0 @@
> -#include <math_ldbl_opt.h>
> -#include <sysdeps/powerpc/fpu/s_fmax.S>
> -#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
> -compat_symbol (libm, __fmax, fmaxl, GLIBC_2_1)
> -#endif
> diff --git a/sysdeps/powerpc/powerpc64/fpu/s_fmin.S b/sysdeps/powerpc/powerpc64/fpu/s_fmin.S
> deleted file mode 100644
> index 6d4a0a9..0000000
> --- a/sysdeps/powerpc/powerpc64/fpu/s_fmin.S
> +++ /dev/null
> @@ -1,5 +0,0 @@
> -#include <math_ldbl_opt.h>
> -#include <sysdeps/powerpc/fpu/s_fmin.S>
> -#if LONG_DOUBLE_COMPAT(libm, GLIBC_2_1)
> -compat_symbol (libm, __fmin, fminl, GLIBC_2_1)
> -#endif
>
* Re: [PATCH 4/4] powerpc: Remove f{max,min}{f} assembly implementations
2016-12-19 14:04 ` [PATCH 4/4] powerpc: Remove f{max,min}{f} assembly implementations Adhemerval Zanella
2016-12-26 19:38 ` Adhemerval Zanella
@ 2016-12-27 18:47 ` Tulio Magno Quites Machado Filho
1 sibling, 0 replies; 11+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2016-12-27 18:47 UTC (permalink / raw)
To: Adhemerval Zanella, libc-alpha
Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:
> This patch removes the powerpc assembly implementation of fmax/fmin.
> Based on benchtests, the assembly versions show:
>
> $ ./testrun.sh benchtests/bench-fmax
> "fmax": {
> "": {
> "duration": 5.07586e+09,
> "iterations": 2.01676e+09,
> "max": 1350.39,
> "min": 2.073,
> "mean": 2.51684
> },
> "qNaN": {
> "duration": 5.09315e+09,
> "iterations": 8.4568e+08,
> "max": 2788,
> "min": 5.806,
> "mean": 6.02255
> },
> "sNaN": {
> "duration": 5.09073e+09,
> "iterations": 8.42316e+08,
> "max": 4215.84,
> "min": 5.737,
> "mean": 6.04373
> }
> }
>
> And
>
> $ ./testrun.sh benchtests/bench-fmin
> "fmin": {
> "": {
> "duration": 5.07711e+09,
> "iterations": 2.02982e+09,
> "max": 497.094,
> "min": 2.073,
> "mean": 2.50126
> },
> "qNaN": {
> "duration": 5.09134e+09,
> "iterations": 8.46968e+08,
> "max": 2255.14,
> "min": 5.807,
> "mean": 6.01125
> },
> "sNaN": {
> "duration": 5.09122e+09,
> "iterations": 8.4746e+08,
> "max": 1969.38,
> "min": 5.729,
> "mean": 6.00763
> }
> }
>
> The default implementation (math/s_f{max,min}_template.c) shows slightly
> better latency for all cases:
>
> $ ./testrun.sh benchtests/bench-fmax
> "fmax": {
> "": {
> "duration": 5.07044e+09,
> "iterations": 2.38695e+09,
> "max": 2048.58,
> "min": 2.073,
> "mean": 2.12423
> },
> "qNaN": {
> "duration": 5.09004e+09,
> "iterations": 9.45428e+08,
> "max": 3306.93,
> "min": 5.138,
> "mean": 5.38385
> },
> "sNaN": {
> "duration": 5.08458e+09,
> "iterations": 1.15959e+09,
> "max": 972.008,
> "min": 3.321,
> "mean": 4.3848
> }
> }
>
> And:
>
> $ ./testrun.sh benchtests/bench-fmin
> "fmin": {
> "": {
> "duration": 5.06817e+09,
> "iterations": 2.3913e+09,
> "max": 1177.9,
> "min": 2.073,
> "mean": 2.11942
> },
> "qNaN": {
> "duration": 5.08857e+09,
> "iterations": 9.45656e+08,
> "max": 2658.83,
> "min": 5.09,
> "mean": 5.38099
> },
> "sNaN": {
> "duration": 5.08093e+09,
> "iterations": 1.16725e+09,
> "max": 1030.74,
> "min": 3.323,
> "mean": 4.3529
> }
> }
>
> Both were run with GCC 5.4 (the Ubuntu 16 default installation) using the
> default compiler flags on a POWER8E at 3.4GHz (powerpc64le-linux-gnu).
LGTM.
For the record: with older GCC versions (e.g. 4.8) the results get worse,
but I don't think a regression with an old GCC version should block this
patch.
--
Tulio Magno
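To put numbers on "slightly better", the quoted results can be reduced with
a few lines of C. This is only a sketch: mean_latency and percent_reduction
are hypothetical helper names, and the observation that "mean" equals
"duration" divided by "iterations" is inferred from the figures quoted above
rather than taken from the benchmark sources.

```c
/* Hypothetical helper: average cost per call, computed from the "duration"
   and "iterations" fields of one benchmark variant.  */
double
mean_latency (double duration, double iterations)
{
  return duration / iterations;
}

/* Hypothetical helper: how much lower NEW_MEAN is than OLD_MEAN, in
   percent.  A positive result means the new code is faster.  */
double
percent_reduction (double old_mean, double new_mean)
{
  return 100.0 * (old_mean - new_mean) / old_mean;
}
```

For example, mean_latency (5.07586e+09, 2.01676e+09) comes out at about
2.5168, matching the mean reported for the assembly fmax. For the default
fmax inputs, percent_reduction (2.51684, 2.12423) is roughly 15.6, and for
the sNaN variant, percent_reduction (6.04373, 4.3848) is roughly 27.4.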
Thread overview: 11+ messages
2016-12-19 14:04 [PATCH 1/4] Adjust benchtests to new support library Adhemerval Zanella
2016-12-19 14:04 ` [PATCH 2/4] benchtests: Add fmax/fmin benchmarks Adhemerval Zanella
2016-12-19 15:42 ` Siddhesh Poyarekar
2016-12-19 16:24 ` Joseph Myers
2016-12-19 17:30 ` Adhemerval Zanella
2016-12-19 14:04 ` [PATCH 3/4] benchtests: Add fmaxf/fminf benchmarks Adhemerval Zanella
2016-12-19 15:44 ` Siddhesh Poyarekar
2016-12-19 14:04 ` [PATCH 4/4] powerpc: Remove f{max,min}{f} assembly implementations Adhemerval Zanella
2016-12-26 19:38 ` Adhemerval Zanella
2016-12-27 18:47 ` Tulio Magno Quites Machado Filho
2016-12-19 15:35 ` [PATCH 1/4] Adjust benchtests to new support library Siddhesh Poyarekar