Date: Tue, 15 Nov 2022 16:59:12 +0300 (MSK)
From: Alexander Monakov
To: Jonathan Wakely
cc: Hongyu Wang, gcc-patches@gcc.gnu.org, hongtao.liu@intel.com, ubizjak@gmail.com
Subject: Re: [PATCH] doc: Reword the description of -mrelax-cmpxchg-loop [PR 107676]
Message-ID: <7a41f182-1638-1a70-c0dc-b90b1985c31@ispras.ru>
References: <20221115033559.66827-1-hongyu.wang@intel.com> <9289c261-6aeb-2fdf-6599-4e8d77c30f8@ispras.ru>

On Tue, 15 Nov 2022, Jonathan Wakely wrote:

> > How about the following:
> >
> > When emitting a compare-and-swap loop for @ref{__sync Builtins}
> > and @ref{__atomic Builtins} lacking a native instruction, optimize
> > for the highly contended case by issuing an atomic load before the
> > @code{CMPXCHG} instruction, and invoke the @code{PAUSE} instruction
> > when restarting the loop.
>
> That's much better, thanks. My only remaining quibble would be that
> "invoking" an instruction seems only marginally better than running
> one. Emitting? Issuing? Using? Adding?

Right, it should be 'using'; let me also add 'to save CPU power':

When emitting a compare-and-swap loop for @ref{__sync Builtins}
and @ref{__atomic Builtins} lacking a native instruction, optimize
for the highly contended case by issuing an atomic load before the
@code{CMPXCHG} instruction, and using the @code{PAUSE} instruction
to save CPU power when restarting the loop.

Alexander
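
For readers who want to see the loop shape the proposed wording describes,
here is a rough C-level sketch. It is only an illustration written with the
__atomic builtins, not GCC's actual expansion: the helper name
fetch_nand_relaxed_sketch is hypothetical, fetch-and-nand is picked as an
example of an operation without a native x86 instruction, and the memory
orders are assumptions.

```c
#include <stdint.h>

/* Hypothetical helper: a hand-written analogue of a "relaxed" cmpxchg
   loop.  It returns the value *p held before the update, like
   __atomic_fetch_nand would.  */
static uint32_t
fetch_nand_relaxed_sketch (uint32_t *p, uint32_t mask)
{
  uint32_t expected = __atomic_load_n (p, __ATOMIC_RELAXED);

  for (;;)
    {
      uint32_t desired = ~(expected & mask);

      /* Plain atomic load before the CMPXCHG: under heavy contention
         this avoids a locked instruction that is bound to fail.  */
      uint32_t seen = __atomic_load_n (p, __ATOMIC_RELAXED);

      if (seen == expected)
        {
          /* On success 'expected' still holds the old value; on failure
             the builtin stores the current value of *p into it.  */
          if (__atomic_compare_exchange_n (p, &expected, desired, 0,
                                           __ATOMIC_SEQ_CST,
                                           __ATOMIC_RELAXED))
            return expected;
        }
      else
        expected = seen;

      /* PAUSE when restarting the loop (x86 only; elsewhere this
         sketch simply retries).  */
#if defined (__i386__) || defined (__x86_64__)
      __builtin_ia32_pause ();
#endif
    }
}
```

With *p initially 0xF0 and mask 0x3C, the call returns 0xF0 and leaves
~(0xF0 & 0x3C) in *p, the same observable result as __atomic_fetch_nand;
only the retry path differs.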