From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20230726160524.1955013-1-skpgkp2@gmail.com>
In-Reply-To:
From: Sunil Pandey
Date: Wed, 26 Jul 2023 09:51:34 -0700
Message-ID:
Subject: Re: [PATCH] x86_64: Optimize ffsll function code size.
To: Richard Henderson
Cc: libc-alpha@sourceware.org, hjl.tools@gmail.com
Content-Type: multipart/alternative; boundary="0000000000002ddc7f060166ab8d"

--0000000000002ddc7f060166ab8d
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Wed, Jul 26, 2023 at 9:38 AM Richard Henderson <richard.henderson@linaro.org> wrote:

> On 7/26/23 09:05, Sunil K Pandey via Libc-alpha wrote:
> > Ffsll function size is 17 byte, this patch optimizes size to 16 byte.
> > Currently ffsll function randomly regress by ~20%, depending on how
> > code get aligned.
> >
> > This patch fixes ffsll function random performance regression.
> > ---
> >   sysdeps/x86_64/ffsll.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/sysdeps/x86_64/ffsll.c b/sysdeps/x86_64/ffsll.c
> > index a1c13d4906..dbded6f0a1 100644
> > --- a/sysdeps/x86_64/ffsll.c
> > +++ b/sysdeps/x86_64/ffsll.c
> > @@ -29,7 +29,7 @@ ffsll (long long int x)
> >     long long int tmp;
> >
> >     asm ("bsfq %2,%0\n"          /* Count low bits in X and store in %1.  */
> > -       "cmoveq %1,%0\n"         /* If number was zero, use -1 as result.  */
> > +       "cmove %k1,%k0\n"        /* If number was zero, use -1 as result.  */
>
> This no longer produces -1, but 0xffffffff in cnt.  However, since the
> return type is 'int', cnt need not be 'long long int' either.  I'm not
> sure why tmp exists at all, since cnt is the only register modified.
>

Here is the exact assembly produced with this change.

./build-x86_64-linux/string/ffsll.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 :
   0:   ba ff ff ff ff          mov    $0xffffffff,%edx
   5:   48 0f bc c7             bsf    %rdi,%rax
   9:   0f 44 c2                cmove  %edx,%eax
   c:   83 c0 01                add    $0x1,%eax
   f:   c3                      ret

>
>
> r~
>
--0000000000002ddc7f060166ab8d--
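
To illustrate why the 32-bit cmove in the disassembly above still yields the
right return value, here is a rough C model of that instruction sequence. It
is only a sketch: model_ffsll is a hypothetical name, not glibc code, and the
loop merely stands in for bsf. The point it tries to show is that the 64-bit
register holds 0xffffffff rather than -1 for a zero input, but only the low
32 bits are incremented and returned, so the result wraps to 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the patched sequence, for illustration only.  */
static int
model_ffsll (long long int x)
{
  uint64_t cnt = 0;

  if (x == 0)
    {
      /* cmove %edx,%eax: the 64-bit register now holds 0x00000000ffffffff,
         which is not -1 when viewed as a 64-bit value.  */
      cnt = UINT32_C (0xffffffff);
    }
  else
    {
      /* Stand-in for bsf: index of the lowest set bit.  */
      uint64_t ux = (uint64_t) x;
      while ((ux & 1) == 0)
        {
          ux >>= 1;
          cnt++;
        }
    }

  /* add $0x1,%eax: 32-bit arithmetic, so 0xffffffff + 1 wraps to 0.  */
  uint32_t low = (uint32_t) cnt;
  low += 1;

  /* The return type is int, so only the low 32 bits are observed.  */
  return (int) low;
}
```

Under this model a zero input returns 0 and an input of 8 returns 4, which
matches what ffsll is documented to return.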