From: Adhemerval Zanella Netto
Organization: Linaro
Date: Thu, 29 Sep 2022 08:45:40 -0300
Subject: Re: [PATCH 0/2] LoongArch: Add optimized functions.
To: Xi Ruoyao, "dengjianbo@loongson.cn"
Cc: libc-alpha, caiyinyu, xuchenghua, "i.swmail", joseph
Message-ID: <4d22961b-017f-857d-bdba-f6f544860858@linaro.org>
In-Reply-To: <8411c465e01de9608633f8b1fd2d82d3ef16f001.camel@xry111.site>

On 26/09/22 10:49, Xi Ruoyao wrote:
> Hi Adhemerval and Jianbo,
>
> I've customized string-fzi.h and string-maskoff.h for LoongArch (see
> attachment). With them on top of Adhemerval's v5 "Improve generic
> string routines" patch and GCC & Binutils trunk, the benchmark result
> seems comparable with the assembly version for strchr, strcmp, and
> strchrnul.

Excellent, that's exactly what I was aiming for with these generic string
routines. I see that Richard's view is that gcc should handle it better,
although I do not oppose adding this header to support older gcc versions.

>
> By the way I've tried to unroll the loop in strchr manually, but then
> the compiler produced some bad thing (moving words from a register to
> another with no reason) and the result is slower.

Have you tried using compiler flags to explicitly unroll the loops:

CFLAGS-strchr.c += --param max-variable-expansions-in-unroller=2 --param max-unroll-times=2 -funroll-loops

We use this on powerpc, so it might be worth a try. Another option would
be to use loop_unroll.h and add a per-arch flag to define the expansion
factor. I would prefer it if we could set this through the compiler.

>
> I've not really plotted the result, just took a quick look with my
> eyes. You can try the bench with my headers in sysdeps/loongarch.