Date: Mon, 17 Jul 2023 19:57:26 +0300 (MSK)
From: Alexander Monakov
To: Adhemerval Zanella
cc: libc-alpha@sourceware.org
Subject: Re: [PATCH v5 1/6] stdlib: Optimization qsort{_r} swap implementation
In-Reply-To: <20230713132540.2854320-2-adhemerval.zanella@linaro.org>
Message-ID: <49920fe1-b5a1-939c-9d34-bd553cd4352e@ispras.ru>
References: <20230713132540.2854320-1-adhemerval.zanella@linaro.org> <20230713132540.2854320-2-adhemerval.zanella@linaro.org>

On Thu, 13 Jul 2023, Adhemerval Zanella via Libc-alpha wrote:

> +/* Returns true if elements can be copied using word loads and stores.
> +   The SIZE and BASE must be a multiple of the ALIGN.  */
> +__attribute_const__ __always_inline static bool

Kernel folks sure love their double underscores, but plain
'static inline bool' should work just as well here.

I'd recommend properly crediting Linux lib/sort.c, since this commit
obviously originated as a direct copy of the code therein rather than
merely being inspired by it.

> +is_aligned (const void *base, size_t size, unsigned char align)
> +{
> +  return (((uintptr_t) base | size) & (align - 1)) == 0;
> +}
> +
> +static void
> +swap_words_64 (void * restrict a, void * restrict b, size_t n)
> +{
> +  do
> +   {
> +     n -= 8;
> +     uint64_t t = *(uint64_t *)(a + n);
> +     *(uint64_t *)(a + n) = *(uint64_t *)(b + n);
> +     *(uint64_t *)(b + n) = t;
> +   } while (n);
> +}

Accessing an object of a different type through a 'uint64_t *' is
undefined behavior, and it is known to cause miscompilation in practice:
https://www.spec.org/cpu2017/Docs/benchmarks/605.mcf_s.html#portability

I see that the new string routines do it properly via

  typedef unsigned long int __attribute__ ((__may_alias__)) op_t;

It would be nice to avoid new instances of such UB here.

Alexander
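
For illustration, here is a minimal sketch of how the 64-bit swap could go through a may_alias type instead of a plain 'uint64_t *'. The names u64_alias_t and swap_words_64_alias are hypothetical and appear neither in the patch nor in glibc. The sketch assumes a compiler that supports the GCC-style __may_alias__ attribute, and it casts to 'char *' for the pointer arithmetic rather than relying on arithmetic on 'void *'.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical typedef: a 64-bit integer type that is allowed to alias
   objects of any other type (GCC/Clang extension).  */
typedef uint64_t __attribute__ ((__may_alias__)) u64_alias_t;

/* Swap N bytes between A and B, eight bytes at a time.
   Assumptions: N is a nonzero multiple of 8, both pointers are 8-byte
   aligned, and the two regions do not overlap.  */
static void
swap_words_64_alias (void *restrict a, void *restrict b, size_t n)
{
  do
    {
      n -= 8;
      /* Do the pointer arithmetic on char *, then access the memory
         only through the may_alias type.  */
      u64_alias_t *pa = (u64_alias_t *) ((char *) a + n);
      u64_alias_t *pb = (u64_alias_t *) ((char *) b + n);
      u64_alias_t t = *pa;
      *pa = *pb;
      *pb = t;
    }
  while (n);
}
```

Copying each word with an 8-byte memcpy into a local uint64_t would be another well-defined option; optimizing compilers typically reduce that to a single load and store.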