Subject: Re: [PATCH] LoongArch: Fix atomic_exchange make comparison and may jump out
To: Xi Ruoyao, Chenghua Xu, Lulu Cheng
Cc: Weining Lu, Xing Li, yala, Peng Fan, gcc-patches@gcc.gnu.org, Huang Pei
References: <20221115130328.15413-1-hejinyang@loongson.cn> <8039c23568889fe85afbe6940ed625448cf6cd56.camel@xry111.site> <1dd9ace0-a83f-c530-2d65-5f762e0cc81e@loongson.cn> <0390618e9d9e74eb2ea22ae8a934cbc37cd483a7.camel@xry111.site>
From: Jinyang He
Date: Thu, 17 Nov 2022 09:39:36 +0800
In-Reply-To: <0390618e9d9e74eb2ea22ae8a934cbc37cd483a7.camel@xry111.site>

On 2022/11/16 7:46 PM, Xi Ruoyao wrote:
> On Wed, 2022-11-16 at 10:11 +0800, Jinyang He wrote:
>
>>>> +  return "%G6\\n\\t"
>>>> +        "1:\\n\\t"
>>>> +        "ll.\\t%0,%1\\n\\t"
>>>> +        "and\\t%7,%0,%z3\\n\\t"
>>>> +        "or%i5\\t%7,%7,%5\\n\\t"
>>>> +        "sc.\\t%7,%1\\n\\t"
>>>> +        "beqz\\t%7,1b\\n\\t";
>>> Do we need a "dbar 0x700" after beqz?
>>>
>>> /* snip */
>> That's worth discussing. Actually I don't see any dbar hint definition
>> like 0x700 in the manual right now.
>> Besides, I think what should be provided here is a relaxed version. And
>> whether the barrier exists or not depends on the specific memory_order.
> It's not related to memory order, but for a hardware issue workaround.
> Jiaxun told me (via LKML):
>
>   I had checked with Loongson guys and they confirmed that the
>   workaround still needs to be applied to latest 3A4000 processors,
>   including 3A4000 for MIPS and 3A5000 for LoongArch.
>
>   Though, the reason behind the workaround varies with the evaluation
>   of their uArch, for GS464V based core, barrier is required as the
>   uArch design allows regular load to be reordered after an atomic
>   linked load, and that would break assumption of compiler atomic
>   constraints.

That certainly seems to be needed, but should the barrier go before or
after? That is beyond my knowledge, so I am cc'ing huangpei@loongson.cn
for help.

> Without these dbar instructions I'd got random test failures in GCC
> libgomp test suite.
>
> We use a non-zero hint here because it is treated exactly same as zero
> in 3A5000, and the future LoongArch processors can fix the issue and
> ignore the dbar 0x700 instruction.

Thanks, it's a nice workaround.
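
For readers following the thread, the masked ll/and/or/sc/beqz loop quoted from the patch can be roughly sketched in portable C. This is only an illustration under several assumptions: it uses a C11 compare-exchange retry loop in place of real LL/SC, it assumes %z3 is the inverted mask and %5 the already-shifted new value, and the helper name exchange_byte_in_word is hypothetical (it is not part of GCC or the patch). It uses relaxed ordering and does not model the dbar workaround discussed above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper: atomically replace one byte inside an aligned
   32-bit word and return the previous byte.  byte_index counts from the
   least significant byte of the word value.  */
static uint8_t
exchange_byte_in_word (_Atomic uint32_t *word, unsigned byte_index,
                       uint8_t new_value)
{
  unsigned shift = byte_index * 8;
  uint32_t mask = (uint32_t) 0xff << shift;        /* bits being replaced */
  uint32_t shifted = (uint32_t) new_value << shift;

  /* Roughly "ll.w": read the current word.  */
  uint32_t old = atomic_load_explicit (word, memory_order_relaxed);
  uint32_t desired;

  do
    {
      /* Roughly "and" with the inverted mask, then "or" with the
         shifted value.  */
      desired = (old & ~mask) | shifted;
      /* Roughly "sc.w" plus "beqz ...,1b": retry if the store failed.
         On failure, old is refreshed with the current word.  */
    }
  while (!atomic_compare_exchange_weak_explicit (word, &old, desired,
                                                 memory_order_relaxed,
                                                 memory_order_relaxed));

  return (uint8_t) ((old & mask) >> shift);
}
```

Note that the loop never compares against an expected value supplied by the caller and never exits early; it only retries when the store fails, which is the behavior an exchange (as opposed to a compare-and-swap) should have.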