Message-ID: <587651e8-c4e1-7d83-76fa-7395f68e457f@linaro.org>
Date: Mon, 31 Jul 2023 17:12:23 -0300
Subject: Re: [PATCH v2] x86_64: Optimize ffsll function code size.
To: Sunil K Pandey, libc-alpha@sourceware.org
Cc: hjl.tools@gmail.com
References: <20230731183549.2396362-1-skpgkp2@gmail.com>
From: Adhemerval Zanella Netto
Organization: Linaro
In-Reply-To: <20230731183549.2396362-1-skpgkp2@gmail.com>

On 31/07/23 15:35, Sunil K Pandey via Libc-alpha wrote:
> Ffsll function size is 17 byte, this patch optimizes size to 16 byte.
> Currently ffsll function randomly regress by ~20%, depending on how
> code get aligned.
>
> This patch fixes ffsll function random performance regression.
>
> Changes from v1:
> - Further reduce size ffsll function size to 12 bytes.
> ---
>  sysdeps/x86_64/ffsll.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/sysdeps/x86_64/ffsll.c b/sysdeps/x86_64/ffsll.c
> index a1c13d4906..6a5803c7c1 100644
> --- a/sysdeps/x86_64/ffsll.c
> +++ b/sysdeps/x86_64/ffsll.c
> @@ -26,13 +26,13 @@ int
>  ffsll (long long int x)
>  {
>    long long int cnt;
> -  long long int tmp;
>
> -  asm ("bsfq %2,%0\n" /* Count low bits in X and store in %1. */
> -       "cmoveq %1,%0\n" /* If number was zero, use -1 as result. */
> -       : "=&r" (cnt), "=r" (tmp) : "rm" (x), "1" (-1));
> +  asm ("mov $-1,%k0\n" /* Intialize CNT to -1. */
> +       "bsf %1,%0\n" /* Count low bits in X and store in CNT. */
> +       "inc %k0\n" /* Increment CNT by 1. */
> +       : "=&r" (cnt) : "r" (x));
>
> -  return cnt + 1;
> +  return cnt;
>  }
>
>  #ifndef __ILP32__

I would still prefer that we just remove this arch-optimized function in
favor of compiler builtins.
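
As a rough illustration of the builtin-based alternative, a generic version
could simply forward to the GCC/Clang builtin __builtin_ffsll, which returns
one plus the index of the least significant set bit, or zero for a zero
argument. The names ffsll_builtin, ffsll_reference, and ffsll_check below are
hypothetical and exist only for this sketch; this is not the code glibc
ships, and it assumes a compiler that provides the builtin.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical wrapper: let the compiler pick the instruction sequence
   instead of hand-written inline asm.  */
int
ffsll_builtin (long long int x)
{
  return __builtin_ffsll (x);
}

/* Portable reference used only to cross-check the builtin: returns the
   1-based position of the lowest set bit, or 0 when X is zero.  */
int
ffsll_reference (long long int x)
{
  unsigned long long int u = (unsigned long long int) x;
  int pos;

  if (u == 0)
    return 0;
  for (pos = 1; (u & 1) == 0; pos++)
    u >>= 1;
  return pos;
}

/* Assert that both variants agree on a few edge cases, including the
   zero input that the asm version special-cases.  */
void
ffsll_check (void)
{
  static const long long int inputs[] = { 0, 1, 2, 8, -1, LLONG_MIN };
  unsigned int i;

  for (i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
    assert (ffsll_builtin (inputs[i]) == ffsll_reference (inputs[i]));
}
```

Calling ffsll_check () from a small test driver exercises the zero case and
the highest bit. With this approach, code size and alignment behaviour are
left to the compiler rather than to the asm statement.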