public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] benchtests: Restore the clock_gettime option
@ 2020-05-19 20:30 H.J. Lu
  2020-05-19 21:18 ` Alexander Monakov
  0 siblings, 1 reply; 12+ messages in thread
From: H.J. Lu @ 2020-05-19 20:30 UTC (permalink / raw)
  To: libc-alpha

commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Date:   Tue Jan 29 17:43:45 2019 +0000

    Add generic hp-timing support

removed the clock_gettime option.  On x86, fewer cycles doesn't
necessarily mean faster exection due to frequency drop.  We should
restore the clock_gettime option.
---
 benchtests/Makefile       | 6 ++++++
 benchtests/README         | 7 ++++++-
 benchtests/bench-timing.h | 6 +++++-
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/benchtests/Makefile b/benchtests/Makefile
index 335d643ecb..99e90d17a0 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -132,11 +132,17 @@ endif
 
 CPPFLAGS-nonlib += -DDURATION=$(BENCH_DURATION) -D_ISOMAC
 
+# Use clock_gettime to measure performance of functions.  The default is
+# to use the architecture-specific high precision timing instructions.
+ifdef USE_CLOCK_GETTIME
+CPPFLAGS-nonlib += -DUSE_CLOCK_GETTIME
+else
 # On x86 processors, use RDTSCP, instead of RDTSC, to measure performance
 # of functions.  All x86 processors since 2010 support RDTSCP instruction.
 ifdef USE_RDTSCP
 CPPFLAGS-nonlib += -DUSE_RDTSCP
 endif
+endif
 
 DETAILED_OPT :=
 
diff --git a/benchtests/README b/benchtests/README
index c4f03fd872..f440f3295a 100644
--- a/benchtests/README
+++ b/benchtests/README
@@ -27,7 +27,12 @@ BENCH_DURATION.
 
 The benchmark suite does function call measurements using architecture-specific
 high precision timing instructions whenever available.  When such support is
-not available, it uses clock_gettime (CLOCK_MONOTONIC).
+not available, it uses clock_gettime (CLOCK_MONOTONIC).  One can force the
+benchmark to use clock_gettime by invoking make as follows:
+
+  $ make USE_CLOCK_GETTIME=1 bench
+
+Again, one must run `make bench-clean' before changing the measurement method.
 
 On x86 processors, RDTSCP instruction provides more precise timing data
 than RDTSC instruction.  All x86 processors since 2010 support RDTSCP
diff --git a/benchtests/bench-timing.h b/benchtests/bench-timing.h
index 5b9a8384bb..844a7727c9 100644
--- a/benchtests/bench-timing.h
+++ b/benchtests/bench-timing.h
@@ -19,7 +19,11 @@
 #undef attribute_hidden
 #define attribute_hidden
 #define __clock_gettime clock_gettime
-#include <hp-timing.h>
+#ifdef USE_CLOCK_GETTIME
+# include <sysdeps/generic/hp-timing.h>
+#else
+# include <hp-timing.h>
+#endif
 #include <stdint.h>
 
 #define GL(x) _##x
-- 
2.26.2


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] benchtests: Restore the clock_gettime option
  2020-05-19 20:30 [PATCH] benchtests: Restore the clock_gettime option H.J. Lu
@ 2020-05-19 21:18 ` Alexander Monakov
  2020-05-19 21:45   ` H.J. Lu
  0 siblings, 1 reply; 12+ messages in thread
From: Alexander Monakov @ 2020-05-19 21:18 UTC (permalink / raw)
  To: H.J. Lu; +Cc: libc-alpha



On Tue, 19 May 2020, H.J. Lu via Libc-alpha wrote:

> commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
> Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> Date:   Tue Jan 29 17:43:45 2019 +0000
> 
>     Add generic hp-timing support
> 
> removed the clock_gettime option.  On x86, fewer cycles doesn't
> necessarily mean faster exection due to frequency drop.  We should
> restore the clock_gettime option.

Can you please elaborate more which x86 CPUs you have in mind here,
as since Nehalem (2008) when invariant TSC was introduced, the rdtsc(p)
instructions count at a fixed rate rather than at CPU clock rate.
And before 2008, there were no turbo frequencies and no AVX frequency drop.

Alexander

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] benchtests: Restore the clock_gettime option
  2020-05-19 21:18 ` Alexander Monakov
@ 2020-05-19 21:45   ` H.J. Lu
  2020-05-19 22:16     ` Alexander Monakov
  0 siblings, 1 reply; 12+ messages in thread
From: H.J. Lu @ 2020-05-19 21:45 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: GNU C Library

On Tue, May 19, 2020 at 2:18 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
>
>
> On Tue, 19 May 2020, H.J. Lu via Libc-alpha wrote:
>
> > commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
> > Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> > Date:   Tue Jan 29 17:43:45 2019 +0000
> >
> >     Add generic hp-timing support
> >
> > removed the clock_gettime option.  On x86, fewer cycles doesn't
> > necessarily mean faster exection due to frequency drop.  We should
> > restore the clock_gettime option.
>
> Can you please elaborate more which x86 CPUs you have in mind here,
> as since Nehalem (2008) when invariant TSC was introduced, the rdtsc(p)
> instructions count at a fixed rate rather than at CPU clock rate.
> And before 2008, there were no turbo frequencies and no AVX frequency drop.

https://stackoverflow.com/questions/56852812/simd-instructions-lowering-cpu-frequency

-- 
H.J.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] benchtests: Restore the clock_gettime option
  2020-05-19 21:45   ` H.J. Lu
@ 2020-05-19 22:16     ` Alexander Monakov
  2020-05-20 17:51       ` H.J. Lu
  2020-05-20 18:05       ` Florian Weimer
  0 siblings, 2 replies; 12+ messages in thread
From: Alexander Monakov @ 2020-05-19 22:16 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GNU C Library

On Tue, 19 May 2020, H.J. Lu wrote:

> On Tue, May 19, 2020 at 2:18 PM Alexander Monakov <amonakov@ispras.ru> wrote:
> >
> >
> >
> > On Tue, 19 May 2020, H.J. Lu via Libc-alpha wrote:
> >
> > > commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
> > > Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> > > Date:   Tue Jan 29 17:43:45 2019 +0000
> > >
> > >     Add generic hp-timing support
> > >
> > > removed the clock_gettime option.  On x86, fewer cycles doesn't
> > > necessarily mean faster exection due to frequency drop.  We should
> > > restore the clock_gettime option.
> >
> > Can you please elaborate more which x86 CPUs you have in mind here,
> > as since Nehalem (2008) when invariant TSC was introduced, the rdtsc(p)
> > instructions count at a fixed rate rather than at CPU clock rate.
> > And before 2008, there were no turbo frequencies and no AVX frequency drop.
> 
> https://stackoverflow.com/questions/56852812/simd-instructions-lowering-cpu-frequency

I am well aware. Again: rdtsc does not count CPU cycles on recent Intel CPUs.
RDTSC reads a register that increments at a fixed rate. So its increment is
proportional to wall clock time. When a workload is causing a reduction in
actual CPU frequency, RDTSC increment frequency is not affected and so it
remains suitable for measuring the actual wall-clock time.

Alexander

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] benchtests: Restore the clock_gettime option
  2020-05-19 22:16     ` Alexander Monakov
@ 2020-05-20 17:51       ` H.J. Lu
  2020-05-20 18:01         ` Alexander Monakov
  2020-05-20 18:05       ` Florian Weimer
  1 sibling, 1 reply; 12+ messages in thread
From: H.J. Lu @ 2020-05-20 17:51 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: GNU C Library

On Tue, May 19, 2020 at 3:16 PM Alexander Monakov <amonakov@ispras.ru> wrote:
>
> On Tue, 19 May 2020, H.J. Lu wrote:
>
> > On Tue, May 19, 2020 at 2:18 PM Alexander Monakov <amonakov@ispras.ru> wrote:
> > >
> > >
> > >
> > > On Tue, 19 May 2020, H.J. Lu via Libc-alpha wrote:
> > >
> > > > commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
> > > > Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> > > > Date:   Tue Jan 29 17:43:45 2019 +0000
> > > >
> > > >     Add generic hp-timing support
> > > >
> > > > removed the clock_gettime option.  On x86, fewer cycles doesn't
> > > > necessarily mean faster exection due to frequency drop.  We should
> > > > restore the clock_gettime option.
> > >
> > > Can you please elaborate more which x86 CPUs you have in mind here,
> > > as since Nehalem (2008) when invariant TSC was introduced, the rdtsc(p)
> > > instructions count at a fixed rate rather than at CPU clock rate.
> > > And before 2008, there were no turbo frequencies and no AVX frequency drop.
> >
> > https://stackoverflow.com/questions/56852812/simd-instructions-lowering-cpu-frequency
>
> I am well aware. Again: rdtsc does not count CPU cycles on recent Intel CPUs.
> RDTSC reads a register that increments at a fixed rate. So its increment is
> proportional to wall clock time. When a workload is causing a reduction in
> actual CPU frequency, RDTSC increment frequency is not affected and so it
> remains suitable for measuring the actual wall-clock time.
>
> Alexander

We'd like to have it as an option on x86.

-- 
H.J.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] benchtests: Restore the clock_gettime option
  2020-05-20 17:51       ` H.J. Lu
@ 2020-05-20 18:01         ` Alexander Monakov
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Monakov @ 2020-05-20 18:01 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GNU C Library

On Wed, 20 May 2020, H.J. Lu wrote:

> On Tue, May 19, 2020 at 3:16 PM Alexander Monakov <amonakov@ispras.ru> wrote:
> >
> > On Tue, 19 May 2020, H.J. Lu wrote:
> >
> > > On Tue, May 19, 2020 at 2:18 PM Alexander Monakov <amonakov@ispras.ru> wrote:
> > > >
> > > >
> > > >
> > > > On Tue, 19 May 2020, H.J. Lu via Libc-alpha wrote:
> > > >
> > > > > commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
> > > > > Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> > > > > Date:   Tue Jan 29 17:43:45 2019 +0000
> > > > >
> > > > >     Add generic hp-timing support
> > > > >
> > > > > removed the clock_gettime option.  On x86, fewer cycles doesn't
> > > > > necessarily mean faster exection due to frequency drop.  We should
> > > > > restore the clock_gettime option.
> > > >
> > > > Can you please elaborate more which x86 CPUs you have in mind here,
> > > > as since Nehalem (2008) when invariant TSC was introduced, the rdtsc(p)
> > > > instructions count at a fixed rate rather than at CPU clock rate.
> > > > And before 2008, there were no turbo frequencies and no AVX frequency drop.
> > >
> > > https://stackoverflow.com/questions/56852812/simd-instructions-lowering-cpu-frequency
> >
> > I am well aware. Again: rdtsc does not count CPU cycles on recent Intel CPUs.
> > RDTSC reads a register that increments at a fixed rate. So its increment is
> > proportional to wall clock time. When a workload is causing a reduction in
> > actual CPU frequency, RDTSC increment frequency is not affected and so it
> > remains suitable for measuring the actual wall-clock time.
> >
> > Alexander
> 
> We'd like to have it as an option on x86.

Then I would suggest rewording the proposed commit message to explain that.
As written, commit message gives a misleading/wrong motivation.

Alexander

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] benchtests: Restore the clock_gettime option
  2020-05-19 22:16     ` Alexander Monakov
  2020-05-20 17:51       ` H.J. Lu
@ 2020-05-20 18:05       ` Florian Weimer
  2020-05-20 18:14         ` V2 " H.J. Lu
  1 sibling, 1 reply; 12+ messages in thread
From: Florian Weimer @ 2020-05-20 18:05 UTC (permalink / raw)
  To: Alexander Monakov via Libc-alpha

* Alexander Monakov via Libc-alpha:

> I am well aware. Again: rdtsc does not count CPU cycles on recent
> Intel CPUs.

H.J. probably has a different view on what those “recent Intel CPUs”
are. 8-) I have not reviewed the mechanics of the patch, but if we need
this for some CPUs, we should make the change.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 12+ messages in thread

* V2 [PATCH] benchtests: Restore the clock_gettime option
  2020-05-20 18:05       ` Florian Weimer
@ 2020-05-20 18:14         ` H.J. Lu
  2020-05-20 18:17           ` Florian Weimer
                             ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: H.J. Lu @ 2020-05-20 18:14 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Alexander Monakov via Libc-alpha, Alexander Monakov

[-- Attachment #1: Type: text/plain, Size: 840 bytes --]

On Wed, May 20, 2020 at 11:05 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Alexander Monakov via Libc-alpha:
>
> > I am well aware. Again: rdtsc does not count CPU cycles on recent
> > Intel CPUs.
>
> H.J. probably has a different view on what those “recent Intel CPUs”
> are. 8-) I have not reviewed the mechanics of the patch, but if we need
> this for some CPUs, we should make the change.
>

Here the patch with updated commit message:

commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Date:   Tue Jan 29 17:43:45 2019 +0000

    Add generic hp-timing support

removed the clock_gettime option.  Restore the clock_gettime option for
some x86 CPUs on which value from RDTSC may not be incremented at a fixed
rate.

OK for master?

Thanks.

-- 
H.J.

[-- Attachment #2: 0001-benchtests-Restore-the-clock_gettime-option.patch --]
[-- Type: text/x-patch, Size: 2645 bytes --]

From 7e48f7adbd53c18df7ab5fdd488bfcc134627480 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Mon, 18 May 2020 17:28:42 -0700
Subject: [PATCH] benchtests: Restore the clock_gettime option

commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Date:   Tue Jan 29 17:43:45 2019 +0000

    Add generic hp-timing support

removed the clock_gettime option.  Restore the clock_gettime option for
some x86 CPUs on which value from RDTSC may not be incremented at a fixed
rate.
---
 benchtests/Makefile       | 6 ++++++
 benchtests/README         | 7 ++++++-
 benchtests/bench-timing.h | 6 +++++-
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/benchtests/Makefile b/benchtests/Makefile
index 335d643ecb..99e90d17a0 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -132,11 +132,17 @@ endif
 
 CPPFLAGS-nonlib += -DDURATION=$(BENCH_DURATION) -D_ISOMAC
 
+# Use clock_gettime to measure performance of functions.  The default is
+# to use the architecture-specific high precision timing instructions.
+ifdef USE_CLOCK_GETTIME
+CPPFLAGS-nonlib += -DUSE_CLOCK_GETTIME
+else
 # On x86 processors, use RDTSCP, instead of RDTSC, to measure performance
 # of functions.  All x86 processors since 2010 support RDTSCP instruction.
 ifdef USE_RDTSCP
 CPPFLAGS-nonlib += -DUSE_RDTSCP
 endif
+endif
 
 DETAILED_OPT :=
 
diff --git a/benchtests/README b/benchtests/README
index c4f03fd872..f440f3295a 100644
--- a/benchtests/README
+++ b/benchtests/README
@@ -27,7 +27,12 @@ BENCH_DURATION.
 
 The benchmark suite does function call measurements using architecture-specific
 high precision timing instructions whenever available.  When such support is
-not available, it uses clock_gettime (CLOCK_MONOTONIC).
+not available, it uses clock_gettime (CLOCK_MONOTONIC).  One can force the
+benchmark to use clock_gettime by invoking make as follows:
+
+  $ make USE_CLOCK_GETTIME=1 bench
+
+Again, one must run `make bench-clean' before changing the measurement method.
 
 On x86 processors, RDTSCP instruction provides more precise timing data
 than RDTSC instruction.  All x86 processors since 2010 support RDTSCP
diff --git a/benchtests/bench-timing.h b/benchtests/bench-timing.h
index 5b9a8384bb..844a7727c9 100644
--- a/benchtests/bench-timing.h
+++ b/benchtests/bench-timing.h
@@ -19,7 +19,11 @@
 #undef attribute_hidden
 #define attribute_hidden
 #define __clock_gettime clock_gettime
-#include <hp-timing.h>
+#ifdef USE_CLOCK_GETTIME
+# include <sysdeps/generic/hp-timing.h>
+#else
+# include <hp-timing.h>
+#endif
 #include <stdint.h>
 
 #define GL(x) _##x
-- 
2.26.2


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: V2 [PATCH] benchtests: Restore the clock_gettime option
  2020-05-20 18:14         ` V2 " H.J. Lu
@ 2020-05-20 18:17           ` Florian Weimer
  2020-05-20 18:18           ` Adhemerval Zanella
  2020-06-04 21:24           ` Carlos O'Donell
  2 siblings, 0 replies; 12+ messages in thread
From: Florian Weimer @ 2020-05-20 18:17 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Alexander Monakov via Libc-alpha, Alexander Monakov

* H. J. Lu:

> On Wed, May 20, 2020 at 11:05 AM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * Alexander Monakov via Libc-alpha:
>>
>> > I am well aware. Again: rdtsc does not count CPU cycles on recent
>> > Intel CPUs.
>>
>> H.J. probably has a different view on what those “recent Intel CPUs”
>> are. 8-) I have not reviewed the mechanics of the patch, but if we need
>> this for some CPUs, we should make the change.
>>
>
> Here the patch with updated commit message:
>
> commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
> Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> Date:   Tue Jan 29 17:43:45 2019 +0000
>
>     Add generic hp-timing support
>
> removed the clock_gettime option.  Restore the clock_gettime option for
> some x86 CPUs on which value from RDTSC may not be incremented at a fixed
> rate.
>
> OK for master?

Patch looks okay to me.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: V2 [PATCH] benchtests: Restore the clock_gettime option
  2020-05-20 18:14         ` V2 " H.J. Lu
  2020-05-20 18:17           ` Florian Weimer
@ 2020-05-20 18:18           ` Adhemerval Zanella
  2020-05-20 18:52             ` H.J. Lu
  2020-06-04 21:24           ` Carlos O'Donell
  2 siblings, 1 reply; 12+ messages in thread
From: Adhemerval Zanella @ 2020-05-20 18:18 UTC (permalink / raw)
  To: libc-alpha



On 20/05/2020 15:14, H.J. Lu via Libc-alpha wrote:
> On Wed, May 20, 2020 at 11:05 AM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * Alexander Monakov via Libc-alpha:
>>
>>> I am well aware. Again: rdtsc does not count CPU cycles on recent
>>> Intel CPUs.
>>
>> H.J. probably has a different view on what those “recent Intel CPUs”
>> are. 8-) I have not reviewed the mechanics of the patch, but if we need
>> this for some CPUs, we should make the change.
>>
> 
> Here the patch with updated commit message:
> 
> commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
> Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> Date:   Tue Jan 29 17:43:45 2019 +0000
> 
>     Add generic hp-timing support
> 
> removed the clock_gettime option.  Restore the clock_gettime option for
> some x86 CPUs on which value from RDTSC may not be incremented at a fixed
> rate.
> 
> OK for master?

What kind of result discrepancies are you seeing using hp-timing.h on x86?
The clock_gettime help in what exactly here (it was not clear from
discussion, neither from patch submission)?  

I am asking because we rely on hp-timing.h to get the loader profiling,
so if this does provide accurate information in some cases it might be
the case to disable on ld.so/libc.so as well.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: V2 [PATCH] benchtests: Restore the clock_gettime option
  2020-05-20 18:18           ` Adhemerval Zanella
@ 2020-05-20 18:52             ` H.J. Lu
  0 siblings, 0 replies; 12+ messages in thread
From: H.J. Lu @ 2020-05-20 18:52 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: GNU C Library

On Wed, May 20, 2020 at 11:38 AM Adhemerval Zanella via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
>
>
> On 20/05/2020 15:14, H.J. Lu via Libc-alpha wrote:
> > On Wed, May 20, 2020 at 11:05 AM Florian Weimer <fweimer@redhat.com> wrote:
> >>
> >> * Alexander Monakov via Libc-alpha:
> >>
> >>> I am well aware. Again: rdtsc does not count CPU cycles on recent
> >>> Intel CPUs.
> >>
> >> H.J. probably has a different view on what those “recent Intel CPUs”
> >> are. 8-) I have not reviewed the mechanics of the patch, but if we need
> >> this for some CPUs, we should make the change.
> >>
> >
> > Here the patch with updated commit message:
> >
> > commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
> > Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> > Date:   Tue Jan 29 17:43:45 2019 +0000
> >
> >     Add generic hp-timing support
> >
> > removed the clock_gettime option.  Restore the clock_gettime option for
> > some x86 CPUs on which value from RDTSC may not be incremented at a fixed
> > rate.
> >
> > OK for master?
>
> What kind of result discrepancies are you seeing using hp-timing.h on x86?
> The clock_gettime help in what exactly here (it was not clear from
> discussion, neither from patch submission)?
>
> I am asking because we rely on hp-timing.h to get the loader profiling,
> so if this does provide accurate information in some cases it might be
> the case to disable on ld.so/libc.so as well.

We are using glibc benchmarks for tuning memory/string functions.
We want to compare results between RDTSC and clock_gettime.

-- 
H.J.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: V2 [PATCH] benchtests: Restore the clock_gettime option
  2020-05-20 18:14         ` V2 " H.J. Lu
  2020-05-20 18:17           ` Florian Weimer
  2020-05-20 18:18           ` Adhemerval Zanella
@ 2020-06-04 21:24           ` Carlos O'Donell
  2 siblings, 0 replies; 12+ messages in thread
From: Carlos O'Donell @ 2020-06-04 21:24 UTC (permalink / raw)
  To: H.J. Lu, Florian Weimer; +Cc: Alexander Monakov via Libc-alpha

On 5/20/20 2:14 PM, H.J. Lu via Libc-alpha wrote:
> On Wed, May 20, 2020 at 11:05 AM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * Alexander Monakov via Libc-alpha:
>>
>>> I am well aware. Again: rdtsc does not count CPU cycles on recent
>>> Intel CPUs.
>>
>> H.J. probably has a different view on what those “recent Intel CPUs”
>> are. 8-) I have not reviewed the mechanics of the patch, but if we need
>> this for some CPUs, we should make the change.
>>
> 
> Here the patch with updated commit message:
> 
> commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
> Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> Date:   Tue Jan 29 17:43:45 2019 +0000
> 
>     Add generic hp-timing support
> 
> removed the clock_gettime option.  Restore the clock_gettime option for
> some x86 CPUs on which value from RDTSC may not be incremented at a fixed
> rate.
> 
> OK for master?

OK for master. 

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
 
> Thanks.
> 
> From 7e48f7adbd53c18df7ab5fdd488bfcc134627480 Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" <hjl.tools@gmail.com>
> Date: Mon, 18 May 2020 17:28:42 -0700
> Subject: [PATCH] benchtests: Restore the clock_gettime option
> 
> commit 7621e38bf3c58b2d0359545f1f2898017fd89d05
> Author: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> Date:   Tue Jan 29 17:43:45 2019 +0000
> 
>     Add generic hp-timing support
> 
> removed the clock_gettime option.  Restore the clock_gettime option for
> some x86 CPUs on which value from RDTSC may not be incremented at a fixed
> rate.
> ---
>  benchtests/Makefile       | 6 ++++++
>  benchtests/README         | 7 ++++++-
>  benchtests/bench-timing.h | 6 +++++-
>  3 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/benchtests/Makefile b/benchtests/Makefile
> index 335d643ecb..99e90d17a0 100644
> --- a/benchtests/Makefile
> +++ b/benchtests/Makefile
> @@ -132,11 +132,17 @@ endif
>  
>  CPPFLAGS-nonlib += -DDURATION=$(BENCH_DURATION) -D_ISOMAC
>  
> +# Use clock_gettime to measure performance of functions.  The default is
> +# to use the architecture-specific high precision timing instructions.
> +ifdef USE_CLOCK_GETTIME
> +CPPFLAGS-nonlib += -DUSE_CLOCK_GETTIME
> +else
>  # On x86 processors, use RDTSCP, instead of RDTSC, to measure performance
>  # of functions.  All x86 processors since 2010 support RDTSCP instruction.
>  ifdef USE_RDTSCP
>  CPPFLAGS-nonlib += -DUSE_RDTSCP
>  endif
> +endif
>  
>  DETAILED_OPT :=
>  
> diff --git a/benchtests/README b/benchtests/README
> index c4f03fd872..f440f3295a 100644
> --- a/benchtests/README
> +++ b/benchtests/README
> @@ -27,7 +27,12 @@ BENCH_DURATION.
>  
>  The benchmark suite does function call measurements using architecture-specific
>  high precision timing instructions whenever available.  When such support is
> -not available, it uses clock_gettime (CLOCK_MONOTONIC).
> +not available, it uses clock_gettime (CLOCK_MONOTONIC).  One can force the
> +benchmark to use clock_gettime by invoking make as follows:
> +
> +  $ make USE_CLOCK_GETTIME=1 bench
> +
> +Again, one must run `make bench-clean' before changing the measurement method.
>  
>  On x86 processors, RDTSCP instruction provides more precise timing data
>  than RDTSC instruction.  All x86 processors since 2010 support RDTSCP
> diff --git a/benchtests/bench-timing.h b/benchtests/bench-timing.h
> index 5b9a8384bb..844a7727c9 100644
> --- a/benchtests/bench-timing.h
> +++ b/benchtests/bench-timing.h
> @@ -19,7 +19,11 @@
>  #undef attribute_hidden
>  #define attribute_hidden
>  #define __clock_gettime clock_gettime
> -#include <hp-timing.h>
> +#ifdef USE_CLOCK_GETTIME
> +# include <sysdeps/generic/hp-timing.h>
> +#else
> +# include <hp-timing.h>
> +#endif
>  #include <stdint.h>
>  
>  #define GL(x) _##x
> -- 
> 2.26.2
> 


-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-06-04 21:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-19 20:30 [PATCH] benchtests: Restore the clock_gettime option H.J. Lu
2020-05-19 21:18 ` Alexander Monakov
2020-05-19 21:45   ` H.J. Lu
2020-05-19 22:16     ` Alexander Monakov
2020-05-20 17:51       ` H.J. Lu
2020-05-20 18:01         ` Alexander Monakov
2020-05-20 18:05       ` Florian Weimer
2020-05-20 18:14         ` V2 " H.J. Lu
2020-05-20 18:17           ` Florian Weimer
2020-05-20 18:18           ` Adhemerval Zanella
2020-05-20 18:52             ` H.J. Lu
2020-06-04 21:24           ` Carlos O'Donell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).