Hi Adhemerval,

Please see the following link for test results comparing against the new generic version:
https://sourceware.org/pipermail/libc-alpha/2022-September/142016.html

Compared with the previous patch, we further optimized strchr and strchrnul, removing 4 instructions before the loop.

Best regards,
Deng jianbo

From: Adhemerval Zanella Netto
Date: Fri, 2 Sep 2022 09:27:33 -0300
To: Joseph Myers, Carlos O'Donell
CC: caiyinyu, libc-alpha@sourceware.org, i.swmail@xen0n.name, xuchenghua@loongson.cn
Subject: Re: [PATCH 0/2] LoongArch: Add optimized functions.

On 15/08/22 17:46, Joseph Myers wrote:
> On Mon, 15 Aug 2022, Carlos O'Donell via Libc-alpha wrote:
>
>> On 8/15/22 04:57, caiyinyu wrote:
>>> Tested on LoongArch machine: gcc 13.0.0, Linux kernel 5.19.0 rc2,
>>> binutils branch master 2eb132bdfb9.
>>
>> Could you please post microbenchmark results for these changes?
>>
>> How much faster are they from the generic versions?
>
> Note that so far we haven't merged the improved generic string functions
> that were posted a while back
> (https://sourceware.org/legacy-ml/libc-alpha/2018-01/msg00318.html is the
> version linked from https://sourceware.org/glibc/wiki/NewPorts - don't
> know if it's the most recent version). So even if assembly versions are
> better than the current generic string functions, they might not be better
> than improved generic versions with architecture-specific implementations
> of the headers to provide per-architecture tuning.

And it seems that some of these newer implementations do basically what my patch does. The memmove is an improvement, since the generic code we have does an internal libcall to memcpy (which some architectures optimize by implementing memcpy and memmove in the same TU, so that it is just a branch instead of a function call).

I will rebase and resend my improved generic string patch set; I think it would yield numbers very similar to the str* assembly implementations proposed.
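
For readers following the memmove remark, here is a rough sketch of the pattern being described: a memmove that hands the copy to memcpy when it can. This is only an illustration under stated assumptions, not glibc's actual source. The name `sketch_memmove` is hypothetical, and the sketch forwards to memcpy only when the two regions are fully disjoint, falling back to byte loops otherwise, so that it stays within what ISO C guarantees for memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only; not taken from glibc.  */
void *
sketch_memmove (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  uintptr_t da = (uintptr_t) d;
  uintptr_t sa = (uintptr_t) s;

  /* The regions are disjoint when the distance between their start
     addresses is at least n bytes.  In that case the copy can be
     handed to memcpy (this is the "internal libcall").  */
  if (da >= sa ? da - sa >= n : sa - da >= n)
    return memcpy (dst, src, n);

  if (da < sa)
    {
      /* Overlap with dst below src: copy forward.  */
      for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    }
  else
    {
      /* Overlap with dst at or above src: copy backward.  */
      for (size_t i = n; i > 0; i--)
        d[i - 1] = s[i - 1];
    }
  return dst;
}
```

When both routines are built in the same translation unit, as the message notes some architectures do, the hand-off to memcpy can become a plain branch rather than a full function call.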