public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* patch for hp-timing.h
@ 2023-02-03 13:29 Wilco Dijkstra
  2023-02-03 13:38 ` Adhemerval Zanella Netto
  2023-02-03 14:02 ` Tang, Jun
  0 siblings, 2 replies; 7+ messages in thread
From: Wilco Dijkstra @ 2023-02-03 13:29 UTC (permalink / raw)
  To: Tang, Jun; +Cc: 'GNU C Library'

Hi Jun,

> hp-timing is only used on benchtest today, but it can be used on other
> measurements in the future.

It can't as proposed below since this would overflow in ~16 seconds.
We could improve this by removing the zero bits from the multiply
and doing some shifts, maybe that will make the interval long enough
for integer multiply. Otherwise I'd suggest the floating point version since
it's not like the benchtests don't already use floating point.

Cheers,
Wilco

diff --git a/benchtests/Makefile b/benchtests/Makefile
index 292976b26b..a624614207 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -499,4 +499,5 @@ $(objpfx)bench-%.c: %-inputs $(bench-deps)
 	  cat $($*-INCLUDE); \
 	fi; \
 	$(PYTHON) scripts/bench.py $(patsubst %-inputs,%,$<); } > $@-tmp
+	cp -f $@-tmp $@-bak

Unintended change I guess?

 	mv -f $@-tmp $@
diff --git a/sysdeps/aarch64/hp-timing.h b/sysdeps/aarch64/hp-timing.h
index f7f7ac7cae..c699effe6a 100644
--- a/sysdeps/aarch64/hp-timing.h
+++ b/sysdeps/aarch64/hp-timing.h
@@ -41,7 +41,7 @@ typedef uint64_t hp_timing_t;
 #define HP_TIMING_DIFF(Diff, Start, End)			\
 ({  hp_timing_t freq;						\
     __asm__ __volatile__ ("mrs %0, cntfrq_el0" : "=r" (freq));	\
-   (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq);	\
+   (Diff) = (((End) - (Start)) * UINT64_C(1000000000)) / freq; \
 })
 
 #endif	/* hp-timing.h */

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: patch for hp-timing.h
  2023-02-03 13:29 patch for hp-timing.h Wilco Dijkstra
@ 2023-02-03 13:38 ` Adhemerval Zanella Netto
  2023-02-03 14:02 ` Tang, Jun
  1 sibling, 0 replies; 7+ messages in thread
From: Adhemerval Zanella Netto @ 2023-02-03 13:38 UTC (permalink / raw)
  To: Wilco Dijkstra, Tang, Jun; +Cc: 'GNU C Library'



On 03/02/23 10:29, Wilco Dijkstra via Libc-alpha wrote:
> Hi Jun,
> 
>> hp-timing is only used on benchtest today, but it can be used on other
>> measurements in the future.
> 
> It can't as proposed below since this would overflow in ~16 seconds.
> We could improve this by removing the zero bits from the multiply
> and doing some shifts, maybe that will make the interval long enough
> for integer multiply. Otherwise I'd suggest the floating point version since
> it's not like the benchtests don't already use floating point.
> 
> Cheers,
> Wilco

Do we really need to handle more than ~16 seconds for the hp-timing? It
is usually used along with a accumulator and I would expect that once you
need to measure more than 16s using clock_gettime works as expected.

> 
> diff --git a/benchtests/Makefile b/benchtests/Makefile
> index 292976b26b..a624614207 100644
> --- a/benchtests/Makefile
> +++ b/benchtests/Makefile
> @@ -499,4 +499,5 @@ $(objpfx)bench-%.c: %-inputs $(bench-deps)
>  	  cat $($*-INCLUDE); \
>  	fi; \
>  	$(PYTHON) scripts/bench.py $(patsubst %-inputs,%,$<); } > $@-tmp
> +	cp -f $@-tmp $@-bak
> 
> Unintended change I guess?
> 
>  	mv -f $@-tmp $@
> diff --git a/sysdeps/aarch64/hp-timing.h b/sysdeps/aarch64/hp-timing.h
> index f7f7ac7cae..c699effe6a 100644
> --- a/sysdeps/aarch64/hp-timing.h
> +++ b/sysdeps/aarch64/hp-timing.h
> @@ -41,7 +41,7 @@ typedef uint64_t hp_timing_t;
>  #define HP_TIMING_DIFF(Diff, Start, End)			\
>  ({  hp_timing_t freq;						\
>      __asm__ __volatile__ ("mrs %0, cntfrq_el0" : "=r" (freq));	\
> -   (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq);	\
> +   (Diff) = (((End) - (Start)) * UINT64_C(1000000000)) / freq; \
>  })
>  
>  #endif	/* hp-timing.h */

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: patch for hp-timing.h
  2023-02-03 13:29 patch for hp-timing.h Wilco Dijkstra
  2023-02-03 13:38 ` Adhemerval Zanella Netto
@ 2023-02-03 14:02 ` Tang, Jun
  2023-02-06 16:53   ` Wilco Dijkstra
  1 sibling, 1 reply; 7+ messages in thread
From: Tang, Jun @ 2023-02-03 14:02 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: 'GNU C Library'

Wilco,

Removing zero bits from multiply and making shifts afterwards can extend the time before overflow.  Does it also reduce the resolution of the timer?

Regards,

Jun

-----Original Message-----
From: Wilco Dijkstra <Wilco.Dijkstra@arm.com> 
Sent: Friday, February 3, 2023 7:30 AM
To: Tang, Jun <juntangc@amazon.com>
Cc: 'GNU C Library' <libc-alpha@sourceware.org>
Subject: [EXTERNAL] patch for hp-timing.h

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.



Hi Jun,

> hp-timing is only used on benchtest today, but it can be used on other 
> measurements in the future.

It can't as proposed below since this would overflow in ~16 seconds.
We could improve this by removing the zero bits from the multiply and doing some shifts, maybe that will make the interval long enough for integer multiply. Otherwise I'd suggest the floating point version since it's not like the benchtests don't already use floating point.

Cheers,
Wilco

diff --git a/benchtests/Makefile b/benchtests/Makefile index 292976b26b..a624614207 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -499,4 +499,5 @@ $(objpfx)bench-%.c: %-inputs $(bench-deps)
          cat $($*-INCLUDE); \
        fi; \
        $(PYTHON) scripts/bench.py $(patsubst %-inputs,%,$<); } > $@-tmp
+       cp -f $@-tmp $@-bak

Unintended change I guess?

        mv -f $@-tmp $@
diff --git a/sysdeps/aarch64/hp-timing.h b/sysdeps/aarch64/hp-timing.h index f7f7ac7cae..c699effe6a 100644
--- a/sysdeps/aarch64/hp-timing.h
+++ b/sysdeps/aarch64/hp-timing.h
@@ -41,7 +41,7 @@ typedef uint64_t hp_timing_t;
 #define HP_TIMING_DIFF(Diff, Start, End)                       \
 ({  hp_timing_t freq;                                          \
     __asm__ __volatile__ ("mrs %0, cntfrq_el0" : "=r" (freq)); \
-   (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq); \
+   (Diff) = (((End) - (Start)) * UINT64_C(1000000000)) / freq; \
 })

 #endif /* hp-timing.h */

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: patch for hp-timing.h
  2023-02-03 14:02 ` Tang, Jun
@ 2023-02-06 16:53   ` Wilco Dijkstra
  2023-02-06 19:34     ` Tang, Jun
  0 siblings, 1 reply; 7+ messages in thread
From: Wilco Dijkstra @ 2023-02-06 16:53 UTC (permalink / raw)
  To: Tang, Jun; +Cc: 'GNU C Library'

Hi Jun,

> Removing zero bits from multiply and making shifts afterwards can extend the time before
> overflow.  Does it also reduce the resolution of the timer?

It wouldn't if we shift out zero bits. For example 1000000 has 6 zero bits
at the bottom, so assuming every implementation uses a multiple of 1MHz
then there is no loss of accuracy. And 6 extra bits would mean 1024 seconds
before overflow at 1GHz which seems long enough.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: patch for hp-timing.h
  2023-02-06 16:53   ` Wilco Dijkstra
@ 2023-02-06 19:34     ` Tang, Jun
  2023-02-08 14:58       ` Wilco Dijkstra
  0 siblings, 1 reply; 7+ messages in thread
From: Tang, Jun @ 2023-02-06 19:34 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: 'GNU C Library'

Wilco,

Thanks for the suggestion, 

There are 7 zero bits at the end of freq (1050000000).  So the time can be extended to 2048 second before losing resolution.  Below is the updated patch for 1024 sec -

-   (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq); \
+   (Diff) = (((End) - (Start)) * UINT64_C(1000000000>>6)) / (freq>>6); \
 
Regards,

Jun
 
-----Original Message-----
From: Wilco Dijkstra <Wilco.Dijkstra@arm.com> 
Sent: Monday, February 6, 2023 10:54 AM
To: Tang, Jun <juntangc@amazon.com>
Cc: 'GNU C Library' <libc-alpha@sourceware.org>
Subject: RE: [EXTERNAL]patch for hp-timing.h

CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.



Hi Jun,

> Removing zero bits from multiply and making shifts afterwards can 
> extend the time before overflow.  Does it also reduce the resolution of the timer?

It wouldn't if we shift out zero bits. For example 1000000 has 6 zero bits at the bottom, so assuming every implementation uses a multiple of 1MHz then there is no loss of accuracy. And 6 extra bits would mean 1024 seconds before overflow at 1GHz which seems long enough.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: patch for hp-timing.h
  2023-02-06 19:34     ` Tang, Jun
@ 2023-02-08 14:58       ` Wilco Dijkstra
  0 siblings, 0 replies; 7+ messages in thread
From: Wilco Dijkstra @ 2023-02-08 14:58 UTC (permalink / raw)
  To: Tang, Jun; +Cc: 'GNU C Library'

Hi Jun,

> There are 7 zero bits at the end of freq (1050000000).  So the time can be extended to
> 2048 second before losing resolution.  Below is the updated patch for 1024 sec -
>
> -   (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq); \
> +   (Diff) = (((End) - (Start)) * UINT64_C(1000000000>>6)) / (freq>>6); \
 
Minor codestyle: there should be a space before/after '>>'.

Apart from that it looks good and works fine.

Can you resend an updated patch with comment header etc in a new email with
subject like "[PATCH v2] AArch64: Fix HP_TIMING_DIFF computation" or similar?

I guess this is your first patch and you don't have commit rights?

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 7+ messages in thread

* patch for hp-timing.h
@ 2023-01-31 23:20 Tang, Jun
  0 siblings, 0 replies; 7+ messages in thread
From: Tang, Jun @ 2023-01-31 23:20 UTC (permalink / raw)
  To: libc-alpha


[-- Attachment #1.1: Type: text/plain, Size: 364 bytes --]

hp-timing is only used on benchtest today, but it can be used on other measurements in the future. It's broken on graviton 3 because the freq we have is 1050000000. 1000000000ULL/1050000000ULL = 0 and thus the difference becomes 0.  Attached patch fix the bug.  Link to the bug report - https://sourceware.org/bugzilla/show_bug.cgi?id=29329.

Regards,

Jun


[-- Attachment #2: hp-timing-patch.diff --]
[-- Type: application/octet-stream, Size: 875 bytes --]

diff --git a/benchtests/Makefile b/benchtests/Makefile
index 292976b26b..a624614207 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -499,4 +499,5 @@ $(objpfx)bench-%.c: %-inputs $(bench-deps)
 	  cat $($*-INCLUDE); \
 	fi; \
 	$(PYTHON) scripts/bench.py $(patsubst %-inputs,%,$<); } > $@-tmp
+	cp -f $@-tmp $@-bak
 	mv -f $@-tmp $@
diff --git a/sysdeps/aarch64/hp-timing.h b/sysdeps/aarch64/hp-timing.h
index f7f7ac7cae..c699effe6a 100644
--- a/sysdeps/aarch64/hp-timing.h
+++ b/sysdeps/aarch64/hp-timing.h
@@ -41,7 +41,7 @@ typedef uint64_t hp_timing_t;
 #define HP_TIMING_DIFF(Diff, Start, End)			\
 ({  hp_timing_t freq;						\
     __asm__ __volatile__ ("mrs %0, cntfrq_el0" : "=r" (freq));	\
-   (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq);	\
+   (Diff) = (((End) - (Start)) * UINT64_C(1000000000)) / freq; \
 })
 
 #endif	/* hp-timing.h */

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-02-08 14:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-03 13:29 patch for hp-timing.h Wilco Dijkstra
2023-02-03 13:38 ` Adhemerval Zanella Netto
2023-02-03 14:02 ` Tang, Jun
2023-02-06 16:53   ` Wilco Dijkstra
2023-02-06 19:34     ` Tang, Jun
2023-02-08 14:58       ` Wilco Dijkstra
  -- strict thread matches above, loose matches on Subject: below --
2023-01-31 23:20 Tang, Jun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).