public inbox for libc-alpha@sourceware.org
* Monday Patch Queue Review update (2023-11-06)
@ 2023-11-06 14:41 Carlos O'Donell
  2023-11-06 18:18 ` Noah Goldstein
  0 siblings, 1 reply; 5+ messages in thread
From: Carlos O'Donell @ 2023-11-06 14:41 UTC (permalink / raw)
  To: libc-alpha

Most recent meeting status is always here:
https://sourceware.org/glibc/wiki/PatchworkReviewMeetings#Update

Meeting: 2023-11-06 @ 0900h EST5EDT

Video/Audio: https://bbb.linuxfoundation.org/room/adm-alk-1uu-7fu

IRC: #glibc on OFTC.

Review new patches and restart review at the top.

 * State NEW delegate NOBODY at 459 patches.
 * Carlos's SLI at 214 days average patch age in queue and 103254 accumulated patch days.
 * Starting at 79195.
 * v2: Multiple floating-point environment fixes (Adhemerval)
  * Carlos to look at the hppa code.
 * Update BAD_TYPECHECK to work on x86_64 (Flavio)
  * Needs Hurd review.
 * Remove ia64-linux-gnu (Adhemerval)
  * On-thread discussion that upstream kernel support is needed for a Linux port.
 * [1/6] aarch64: Add vector implementations of asin routines (Joe)
  * Szabolcs: More vector math functions.
 * x86: Only align destination to 1x VEC_SIZE in memset 4x loop (Noah)
 * x86: Fix unchecked AVX512-VBMI2 usage in strrchr-evex-base.S (Noah)
 * v3: Add a tunable to decorate anonymous memory maps (Adhemerval)
  * Missing 3 RBs for v3.
 * x86: Improve ERMS usage on Zen3+ (Adhemerval)
  * Would be valuable to reach out to AMD for feedback.
 * 78835: [v2,7/7] linux: Use fchmodat2 on fchmod for flags different than 0 (BZ 26401) (Adhemerval)
  * Carlos: If Florian gave an overall review then we should put it into master and start testing in the rolling release distributions.
 * resolv: free only initialized items from gai pool (Jan Palus)
  * Needs review.
 * Stopped at 78539.
 * Carlos: Siddhesh, may we please run the cleanup scripts again to remove older patches that haven't seen movement in the queue, based on the community-agreed time periods?
 * Siddhesh: Yes, I can run the scripts and clean up the items that don't meet the current time period.
 * Adhemerval: Review for pthread cancel?
  * Carlos: Yes, Thursday/Friday at the latest this week for review.
 * Siddhesh: Will you share a v3 for the improved loader env var handling?
  * Adhemerval: Yes, in the queue.

-- 
Cheers,
Carlos.



* Re: Monday Patch Queue Review update (2023-11-06)
  2023-11-06 14:41 Monday Patch Queue Review update (2023-11-06) Carlos O'Donell
@ 2023-11-06 18:18 ` Noah Goldstein
  2023-11-06 21:24   ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 5+ messages in thread
From: Noah Goldstein @ 2023-11-06 18:18 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: libc-alpha

On Mon, Nov 6, 2023 at 8:42 AM Carlos O'Donell <carlos@redhat.com> wrote:
>  * x86: Improve ERMS usage on Zen3+ (Adhemerval)
>   * Would be valuable to reach out to AMD for feedback.

Missed the meeting, but is there a bit more context to this?


* Re: Monday Patch Queue Review update (2023-11-06)
  2023-11-06 18:18 ` Noah Goldstein
@ 2023-11-06 21:24   ` Adhemerval Zanella Netto
  2023-11-06 21:46     ` Noah Goldstein
  0 siblings, 1 reply; 5+ messages in thread
From: Adhemerval Zanella Netto @ 2023-11-06 21:24 UTC (permalink / raw)
  To: libc-alpha, Noah Goldstein



On 06/11/23 15:18, Noah Goldstein wrote:
> On Mon, Nov 6, 2023 at 8:42 AM Carlos O'Donell <carlos@redhat.com> wrote:
>>  * x86: Improve ERMS usage on Zen3+ (Adhemerval)
>>   * Would be valuable to reach out to AMD for feedback.
> 
> Missed the meeting, but is there a bit more context to this?

I wrote up my findings on a Zen3 core in the patchset summary [1].
Essentially, ERMS is not really an advantage for the sizes where it is
currently enabled on Zen3+ cores: between rep_movsb_threshold (2113)
and rep_movsb_stop_threshold (the L2 cache size, 524288 on a Zen3
core).
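
To make the size window concrete, here is a rough C sketch of the
selection described above. The real selection is done in assembly, so
this is only an illustration: the names erms_window and
select_copy_strategy are hypothetical, the bounds are the Zen3 figures
quoted above, and treating the window as [start, stop) is an assumption.
The large-size non-temporal path is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only; not glibc identifiers.  */
enum copy_strategy { COPY_VECTOR, COPY_REP_MOVSB };

struct erms_window
{
  size_t rep_movsb_threshold;      /* lower bound, 2113 in this thread */
  size_t rep_movsb_stop_threshold; /* upper bound, 524288 (L2) on Zen3 */
};

/* Return COPY_REP_MOVSB when N falls inside the ERMS window, assumed
   here to be [rep_movsb_threshold, rep_movsb_stop_threshold), and
   COPY_VECTOR otherwise.  An empty window never selects ERMS.  */
enum copy_strategy
select_copy_strategy (const struct erms_window *w, size_t n)
{
  assert (w != NULL);
  if (n >= w->rep_movsb_threshold && n < w->rep_movsb_stop_threshold)
    return COPY_REP_MOVSB;
  return COPY_VECTOR;
}
```

With the Zen3 values, a 4096-byte copy lands in the window and a
1024-byte copy does not. A window whose two bounds are equal selects the
vectorized path for every size, which is one way to picture ERMS being
disabled.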

The microbenchmark provided in BZ#30995 shows that for some alignments
the resulting throughput is *really* bad, while for others it is still
slightly worse than the vectorized alternative.  So the patchset just
disables ERMS for Zen3+ cores, and I even see a small improvement on
SPECcpu2017 502.gcc_r (which hits memset really hard).

To still allow the user to enable ERMS usage, I added a new tunable,
glibc.cpu.x86_rep_movsb_stop_threshold, so one can define a size
range in which ERMS is used.
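
As an illustration of defining such a range, here is a minimal C sketch
that formats a GLIBC_TUNABLES value pairing
glibc.cpu.x86_rep_movsb_threshold with the stop threshold tunable
proposed in this patchset. The helper name format_erms_tunables is
hypothetical, and the stop threshold tunable is assumed to exist only
with the patchset applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write a GLIBC_TUNABLES value that selects the
   ERMS size range START..STOP into BUF.  Returns the number of
   characters written (excluding the terminating NUL), or -1 if BUF is
   too small.  */
int
format_erms_tunables (char *buf, size_t len, size_t start, size_t stop)
{
  assert (buf != NULL);
  int n = snprintf (buf, len,
                    "glibc.cpu.x86_rep_movsb_threshold=%zu:"
                    "glibc.cpu.x86_rep_movsb_stop_threshold=%zu",
                    start, stop);
  if (n < 0 || (size_t) n >= len)
    return -1;
  return n;
}
```

Tunables are read at process startup, so the resulting string has to be
placed in the environment of the program being launched (for example
with setenv before exec), not set from inside an already running
process.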

Also, on BZ#30994 and BZ#30995 there is some discussion that the
current strategy for large sizes (>32MB) is also suboptimal for Zen3+.
But that comparison is against non-temporal stores, and it would most
likely require more data to see if the tradeoff is still good (and I
don't have access to a Zen4 core with AVX512).

[1] https://sourceware.org/pipermail/libc-alpha/2023-October/152416.html


* Re: Monday Patch Queue Review update (2023-11-06)
  2023-11-06 21:24   ` Adhemerval Zanella Netto
@ 2023-11-06 21:46     ` Noah Goldstein
  2023-11-07 13:30       ` Adhemerval Zanella Netto
  0 siblings, 1 reply; 5+ messages in thread
From: Noah Goldstein @ 2023-11-06 21:46 UTC (permalink / raw)
  To: Adhemerval Zanella Netto; +Cc: libc-alpha

On Mon, Nov 6, 2023 at 3:24 PM Adhemerval Zanella Netto
<adhemerval.zanella@linaro.org> wrote:
>
> I wrote the patchset summary with my finding on a Zen3 core [1], but
> essentially what I have found is ERMS is not really an advantage on
> the sizes where it is being enabled for Zen3+ cores (between 2113,
> rep_movsb_threshold, and L2 cache size, rep_movsb_stop_threshold or
> 524288 on a Zen3 core).
>
> The microbenchmark provided in BZ#30995 shows that for some alignments
> the resulting throughput is *really* bad, while for others it is still
> slightly worse than the vectorized alternative.  So the patchset just
> disables ERMS for Zen3+ cores, and I even see a small improvement on
> SPECcpu2017 502.gcc_r (which hits memset really hard).
>

The BZ seems to be memcpy only; is this param being exported to memset
as well? If so, we probably need an NT-store memset impl.

> To still allow the user to enable ERMS usage, I added a new tunable
> glibc.cpu.x86_rep_movsb_stop_threshold.  So one can define a size
> range to use ERMS.
>
> Also on BZ#30994 and BZ#30995, there are some discussion that current
> strategy for large size (>32MB) is also suboptimal for Zen3+.  But this
> compares against non-temporal store, which would most likely require
> some more data to see if the tradeoff is still good (and I don't have
> access to a Zen4 core with AVX512).
>
> [1] https://sourceware.org/pipermail/libc-alpha/2023-October/152416.html
>

* Re: Monday Patch Queue Review update (2023-11-06)
  2023-11-06 21:46     ` Noah Goldstein
@ 2023-11-07 13:30       ` Adhemerval Zanella Netto
  0 siblings, 0 replies; 5+ messages in thread
From: Adhemerval Zanella Netto @ 2023-11-07 13:30 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: libc-alpha



On 06/11/23 18:46, Noah Goldstein wrote:
> On Mon, Nov 6, 2023 at 3:24 PM Adhemerval Zanella Netto
> <adhemerval.zanella@linaro.org> wrote:
>>
>> I wrote the patchset summary with my finding on a Zen3 core [1], but
>> essentially what I have found is ERMS is not really an advantage on
>> the sizes where it is being enabled for Zen3+ cores (between 2113,
>> rep_movsb_threshold, and L2 cache size, rep_movsb_stop_threshold or
>> 524288 on a Zen3 core).
>>
>> The microbenchmark provided in BZ#30995 shows that for some alignments
>> the resulting throughput is *really* bad, while for others it is still
>> slightly worse than the vectorized alternative.  So the patchset just
>> disables ERMS for Zen3+ cores, and I even see a small improvement on
>> SPECcpu2017 502.gcc_r (which hits memset really hard).
>>
> 
> The BZ seems to be memcpy only; is this param being exported to memset
> as well? If so, we probably need an NT-store memset impl.

We have x86_rep_stosb_threshold for memset and I have extended the memset
comment to state how it is used [1].

[1] https://patchwork.sourceware.org/project/glibc/patch/20231031200925.3297456-5-adhemerval.zanella@linaro.org/
