Message-ID: <6afb2b9136ff6c96d5b729340427f59e24ebf268.camel@xry111.site>
Subject: Re: [PATCH 0/2] LoongArch: Add optimized functions.
From: Xi Ruoyao
To: Richard Henderson, Adhemerval Zanella Netto, "dengjianbo@loongson.cn"
Cc: xuchenghua, "i.swmail", libc-alpha, joseph, caiyinyu, Lulu Cheng
Date: Thu, 29 Sep 2022 00:42:19 +0800
In-Reply-To: <1679af30-ee17-3016-1bd3-192f744ad8ef@linaro.org>

On Wed, 2022-09-28 at 07:22 -0700, Richard Henderson wrote:
> On 9/26/22 06:49, Xi Ruoyao via Libc-alpha wrote:
> > Hi Adhemerval and Jianbo,
> >
> > I've customized string-fzi.h and string-maskoff.h for LoongArch (see
> > attachment).  With them on top of Adhermerval's v5 "Improve generic
> > string routines" patch and GCC & Binutils trunk, the benchmark result
> > seems comparable with the assembly version for strchr, strcmp, and
> > strchrnul.

Hi Richard,

> There is nothing in string-maskoff.h that the compiler should not be able to produce
> itself from the generic version.  Having a brief look, the compiler simply needs to be
> improved to unify two current AND patterns (which is an existing bug) and add the
> additional case for bstrins.d.

Added GCC LoongArch port maintainer into Cc:.

It's actually more complicated.
Without the inline assembly in repeat_bytes(), the compiler does not
hoist the 4-instruction 64-bit immediate load sequence out of a loop for
"some reason I don't know yet".

> Similarly, there is nothing in string-fzi.h that should not be gotten from longlong.h;
> your only changes are to use __builtin_clz, which longlong.h exports as count_trailing_zeros.

No, it does not.  By default longlong.h uses a table-driven approach for
count_{trailing,leading}_zeros.  I can add __loongarch__ (or
__loongarch_lp64) into longlong.h though.

IIUC I need to submit the change to GCC, then Glibc merges longlong.h
from GCC, right?

-- 
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
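
For reference, a minimal sketch of the two operations discussed above. It is not taken from the attachment, from glibc, or from longlong.h. All function names are hypothetical, and the shift loop only stands in for a generic non-builtin fallback (it is not the table-driven code in longlong.h). It assumes a GCC-compatible compiler that provides __builtin_ctzll.

```c
#include <stdint.h>

/* Hypothetical helper: broadcast one byte into all eight bytes of a
   64-bit word.  The multiplier is the kind of 64-bit constant whose
   load sequence one would like hoisted out of a loop. */
static inline uint64_t repeat_byte_sketch(unsigned char c)
{
    return (uint64_t)c * UINT64_C(0x0101010101010101);
}

/* Stand-in for a generic fallback: count trailing zero bits with a
   shift loop.  x must be nonzero. */
static inline int ctz_fallback_sketch(uint64_t x)
{
    int n = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Builtin version: the compiler may lower this to a single hardware
   instruction where the target has one.  x must be nonzero. */
static inline int ctz_builtin_sketch(uint64_t x)
{
    return __builtin_ctzll(x);
}

/* Index (counted from the least significant byte) of the lowest-order
   zero byte in w, or -1 if w contains no zero byte. */
static inline int first_zero_byte_sketch(uint64_t w)
{
    uint64_t ones = repeat_byte_sketch(0x01);
    uint64_t highs = repeat_byte_sketch(0x80);
    /* The lowest set bit of mask marks the lowest-order zero byte;
       higher bits may be spurious because of borrow propagation. */
    uint64_t mask = (w - ones) & ~w & highs;
    return mask ? ctz_builtin_sketch(mask) / 8 : -1;
}
```

Both ctz variants return the same value for any nonzero input. The difference under discussion is only in the code the compiler generates for each.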