public inbox for libc-alpha@sourceware.org
* [RFC]: Removing old Falkor ifuncs
@ 2022-12-09 15:14 Wilco Dijkstra
  2022-12-09 15:24 ` Siddhesh Poyarekar
  0 siblings, 1 reply; 5+ messages in thread
From: Wilco Dijkstra @ 2022-12-09 15:14 UTC (permalink / raw)
  To: siddhesh; +Cc: 'GNU C Library'

Hi Siddhesh,

Do we need the ifuncs for Falkor? The SIMD memcpy is now the default
generic memcpy and that is quite similar to the Falkor one, so it seems
time to remove the Falkor variants. Since you are the original author,
what do you think?

Cheers,
Wilco


* Re: [RFC]: Removing old Falkor ifuncs
  2022-12-09 15:14 [RFC]: Removing old Falkor ifuncs Wilco Dijkstra
@ 2022-12-09 15:24 ` Siddhesh Poyarekar
  2022-12-09 18:00   ` Wilco Dijkstra
  0 siblings, 1 reply; 5+ messages in thread
From: Siddhesh Poyarekar @ 2022-12-09 15:24 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: 'GNU C Library'

On 2022-12-09 10:14, Wilco Dijkstra wrote:
> Do we need the ifuncs for Falkor? The SIMD memcpy is now the default
> generic memcpy and that is quite similar to the Falkor one, so it seems
> time to remove the Falkor variants. Since you are the original author,
> what do you think?

The key differentiator in memcpy/memmove at that time was the register
number usage, since that affected how the hardware prefetcher performed.
Changing that might affect performance on Falkor, although I don't
remember exactly by how much.

The other key differentiator is in memset, where reading dczid_el0 was
expensive on Falkor, so the value is hard-coded to avoid reading it
every time.

Sid
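
For context on the dczid_el0 point in the message above, here is a minimal
sketch of how a DCZID_EL0 value decodes into a DC ZVA block size. The field
layout (BS in bits [3:0], DZP in bit 4) follows the Arm architecture; the
helper name zva_size_from_dczid is hypothetical and not taken from glibc. On
real hardware the value would come from "mrs %0, dczid_el0"; here it is passed
in as a plain integer so the sketch builds on any host.

```c
#include <stdint.h>

/* DCZID_EL0 fields per the Arm architecture:
   bits [3:0] (BS)  = log2 of the block size, counted in 4-byte words
   bit  4     (DZP) = set when DC ZVA is prohibited.  */
#define DCZID_BS_MASK  0xfu
#define DCZID_DZP_MASK (1u << 4)

/* Hypothetical helper (not a glibc function): return the DC ZVA block
   size in bytes, or 0 if DC ZVA is prohibited.  */
static unsigned
zva_size_from_dczid (uint64_t dczid)
{
  if (dczid & DCZID_DZP_MASK)
    return 0;
  return (unsigned) (4u << (dczid & DCZID_BS_MASK));
}
```

A register value of 4 decodes to a 64-byte block. A variant that assumes a
fixed block size can skip both the register read and this decode on every call,
which appears to be the saving described above.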


* Re: [RFC]: Removing old Falkor ifuncs
  2022-12-09 15:24 ` Siddhesh Poyarekar
@ 2022-12-09 18:00   ` Wilco Dijkstra
  2022-12-21 17:27     ` Wilco Dijkstra
  0 siblings, 1 reply; 5+ messages in thread
From: Wilco Dijkstra @ 2022-12-09 18:00 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: 'GNU C Library'

Hi Siddhesh,

>On 2022-12-09 10:14, Wilco Dijkstra wrote:
>> Do we need the ifuncs for Falkor? The SIMD memcpy is now the default
>> generic memcpy and that is quite similar to the Falkor one, so it seems
>> time to remove the Falkor variants. Since you are the original author,
>> what do you think?
>
> The key differentiator in memcpy/memmove at that time was the register
> number usage, since that affected how the hardware prefetcher performed.
> Changing that might affect performance on Falkor, although I don't
> remember exactly by how much.

If there was a difference, it would likely be on large copies. But it would be hard
to test without access to a machine...

> The other key differentiator is in memset, where reading dczid_el0 was
> expensive on Falkor, so the value is hard-coded to avoid reading it
> every time.

Yes, this is true for various microarchitectures. I posted a patch to make this
more general a few years ago, i.e. always use an ifunc if the ZVA size is 64
rather than do something CPU-specific.

Cheers,
Wilco
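
To illustrate the "use an ifunc if the ZVA size is 64" idea from the message
above, here is a rough sketch of size-based selection. It is not the glibc
implementation: the names memset_fn, memset_generic, memset_zva64 and
select_memset are all hypothetical, and both variants simply forward to the
standard memset so the sketch builds on any host. In a real GNU ifunc the
selection would be done once by a resolver attached with
__attribute__ ((ifunc ("..."))) rather than by an ordinary function call.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*memset_fn) (void *, int, size_t);

/* Hypothetical stand-ins for two memset variants.  Both forward to the
   standard memset; a real ZVA variant would hard-code a 64-byte block.  */
static void *
memset_generic (void *s, int c, size_t n)
{
  return memset (s, c, n);
}

static void *
memset_zva64 (void *s, int c, size_t n)
{
  return memset (s, c, n);
}

/* Choose a variant from the ZVA block size alone, with no check of the
   CPU model.  A size of 0 means DC ZVA is unavailable.  */
static memset_fn
select_memset (unsigned zva_size)
{
  return zva_size == 64 ? memset_zva64 : memset_generic;
}
```

The design point is that the choice depends only on a property of the CPU
(the ZVA size) and not on its identity, so any core with a 64-byte block
would get the specialised variant.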


* Re: [RFC]: Removing old Falkor ifuncs
  2022-12-09 18:00   ` Wilco Dijkstra
@ 2022-12-21 17:27     ` Wilco Dijkstra
  2022-12-21 23:38       ` Siddhesh Poyarekar
  0 siblings, 1 reply; 5+ messages in thread
From: Wilco Dijkstra @ 2022-12-21 17:27 UTC (permalink / raw)
  To: Siddhesh Poyarekar; +Cc: 'GNU C Library'

Hi Siddhesh,

>>On 2022-12-09 10:14, Wilco Dijkstra wrote:
>>> Do we need the ifuncs for Falkor? The SIMD memcpy is now the default
>>> generic memcpy and that is quite similar to the Falkor one, so it seems
>>> time to remove the Falkor variants. Since you are the original author,
>>> what do you think?
>>
>> The key differentiator in memcpy/memmove at that time was the register
>> number usage, since that affected how the hardware prefetcher performed.
>> Changing that might affect performance on Falkor, although I don't
>> remember exactly by how much.
>
> If there was a difference, it would likely be on large copies. But it would be hard
> to test without access to a machine...

I managed to get an old Falkor revived, so I was finally able to run the benchtests.
The new generic memcpy is about 10% faster on the bench-memcpy-random test
when sizes fit in L1, and about 5% faster overall. Bench-memcpy-large and -walk
are very similar, so it doesn't seem to have any effect on prefetching in large copies.

So it looks like the new generic memcpy is better overall.

Cheers,
Wilco


* Re: [RFC]: Removing old Falkor ifuncs
  2022-12-21 17:27     ` Wilco Dijkstra
@ 2022-12-21 23:38       ` Siddhesh Poyarekar
  0 siblings, 0 replies; 5+ messages in thread
From: Siddhesh Poyarekar @ 2022-12-21 23:38 UTC (permalink / raw)
  To: Wilco Dijkstra; +Cc: 'GNU C Library'

On 2022-12-21 12:27, Wilco Dijkstra wrote:
> Hi Siddhesh,
> 
>>> On 2022-12-09 10:14, Wilco Dijkstra wrote:
>>>> Do we need the ifuncs for Falkor? The SIMD memcpy is now the default
>>>> generic memcpy and that is quite similar to the Falkor one, so it seems
>>>> time to remove the Falkor variants. Since you are the original author,
>>>> what do you think?
>>>
>>> The key differentiator in memcpy/memmove at that time was the register
>>> number usage, since that affected how the hardware prefetcher performed.
>>> Changing that might affect performance on Falkor, although I don't
>>> remember exactly by how much.
>>
>> If there was a difference, it would likely be on large copies. But it would be hard
>> to test without access to a machine...
> 
> I managed to get an old Falkor revived, so I was finally able to run the benchtests.
> The new generic memcpy is about 10% faster on the bench-memcpy-random test
> when sizes fit in L1, and about 5% faster overall. Bench-memcpy-large and -walk
> are very similar, so it doesn't seem to have any effect on prefetching in large copies.
> 
> So it looks like the new generic memcpy is better overall.

Great, I'd say go for it then :)

Thanks,
Sid
