From: Paul Eggert
To: Siddhesh Poyarekar
Cc: Adhemerval Zanella, libc-alpha@sourceware.org
Subject: Re: [PATCH v3] wcrtomb: Make behavior POSIX compliant
Date: Fri, 13 May 2022 10:58:41 -0700
Message-ID: <5163f97d-dc3f-937f-583f-a94a95ac2247@cs.ucla.edu>
In-Reply-To: <63c7ffd1-f609-d99a-a2fd-3359bfe7c663@gotplt.org>

On 5/13/22 06:42, Siddhesh Poyarekar wrote:
> Ugh, both of you just kill all the fun.  I'll push a fix with just the
> memcpy ;)

Just as well, as the code I sent before is slow for ASCII anyway. I
shouldn't let that stand as my suggestion, even if just for fun.

> with gcc 11 it seems to produce worse code, merging the two memcpys
> instead of inlining the first one.

Here's a better version if someone ever wants to tune this stuff. In
this version there's only one memcpy so GCC won't merge it. On x86-64
with GCC 12.1 -O2, if RESULT<=2 this executes 6 instructions (including
1 comparison and 1 conditional branch); if 3<=RESULT<=4 this executes 8
instructions total (including 2 comparisons and 2 conditional branches).

   if (result <= 2)
     {
       s[0] = buf[0];
       s[result - 1] = buf[result - 1];
     }
   else if (result <= 4)
     {
       s[0] = buf[0];
       s[1] = buf[1];
       s[result - 2] = buf[result - 2];
       s[result - 1] = buf[result - 1];
     }
   else
     memcpy (s, buf, result);

There are of course other variants....
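
For anyone who wants to sanity-check the overlapping-store trick before
tuning further, here is a minimal standalone sketch. The names
copy_small and count_mismatches are made up for illustration and are
not glibc functions. The sketch assumes RESULT is at least 1 and at
most MB_LEN_MAX, which is what a successful conversion would give; it
compares the branchy copy against plain memcpy for every such length
and also checks that no byte past RESULT is written.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): copy RESULT bytes from BUF
   to S.  For short lengths it uses fixed stores whose targets may
   overlap, so no per-length branch is needed within each size class.
   Assumption: RESULT >= 1.  */
void
copy_small (char *s, const char *buf, size_t result)
{
  assert (result >= 1);
  if (result <= 2)
    {
      /* RESULT is 1 or 2: first and last byte (same byte when 1).  */
      s[0] = buf[0];
      s[result - 1] = buf[result - 1];
    }
  else if (result <= 4)
    {
      /* RESULT is 3 or 4: first two and last two bytes; the middle
         byte is stored twice when RESULT is 3.  */
      s[0] = buf[0];
      s[1] = buf[1];
      s[result - 2] = buf[result - 2];
      s[result - 1] = buf[result - 1];
    }
  else
    memcpy (s, buf, result);
}

/* Hypothetical self-check: return how many lengths in 1..MB_LEN_MAX
   give a result that differs from memcpy, including any stray write
   into the padding bytes after the copied region.  */
int
count_mismatches (void)
{
  char src[MB_LEN_MAX];
  for (size_t i = 0; i < sizeof src; i++)
    src[i] = (char) ('A' + i);

  int mismatches = 0;
  for (size_t len = 1; len <= sizeof src; len++)
    {
      char got[MB_LEN_MAX + 1];
      char want[MB_LEN_MAX + 1];
      memset (got, '#', sizeof got);
      memset (want, '#', sizeof want);
      copy_small (got, src, len);
      memcpy (want, src, len);
      if (memcmp (got, want, sizeof got) != 0)
        mismatches++;
    }
  return mismatches;
}
```

This only checks correctness; the instruction counts quoted above would
still have to be confirmed by looking at the compiler output for the
target in question.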