From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] LoongArch: Fix atomic_exchange make comparison and may jump out
To: Xi Ruoyao, Chenghua Xu, Lulu Cheng
Cc: Weining Lu, Xing Li, yala, Peng Fan, gcc-patches@gcc.gnu.org, Huang Pei
References: <20221115130328.15413-1-hejinyang@loongson.cn> <8039c23568889fe85afbe6940ed625448cf6cd56.camel@xry111.site> <1dd9ace0-a83f-c530-2d65-5f762e0cc81e@loongson.cn> <0390618e9d9e74eb2ea22ae8a934cbc37cd483a7.camel@xry111.site> <2b83052845d53312f6d5af2953162cfa693b6538.camel@xry111.site>
From: Jinyang He
Message-ID: <24e66a2a-6e0e-db69-7ecc-fd98ca3bb963@loongson.cn>
Date: Thu, 17 Nov 2022 11:46:42 +0800
In-Reply-To: <2b83052845d53312f6d5af2953162cfa693b6538.camel@xry111.site>

On 2022/11/17 11:38 AM, Xi Ruoyao wrote:
> On Thu, 2022-11-17 at 10:55 +0800, Jinyang He wrote:
>> On 2022/11/17 9:39 AM, Jinyang He wrote:
>>
>>> On 2022/11/16 7:46 PM, Xi Ruoyao wrote:
>>>
>>>> On Wed, 2022-11-16 at 10:11 +0800, Jinyang He wrote:
>>>>
>>>>>>> +  return "%G6\\n\\t"
>>>>>>> +        "1:\\n\\t"
>>>>>>> +        "ll.\\t%0,%1\\n\\t"
>>>>>>> +        "and\\t%7,%0,%z3\\n\\t"
>>>>>>> +        "or%i5\\t%7,%7,%5\\n\\t"
>>>>>>> +        "sc.\\t%7,%1\\n\\t"
>>>>>>> +        "beqz\\t%7,1b\\n\\t";
>>>>>> Do we need a "dbar 0x700" after beqz?
>>>>>>
>>>>>> /* snip */
>>>>> That's worth discussing. Actually I don't see any dbar hint definition
>>>>> like 0x700 in the manual right now.
>>>>> Besides, I think what should be provided here is a relaxed version.
>>>>> And whether the barrier exists or not depends on the specific
>>>>> memory_order.
>>>> It's not related to memory order, but for a hardware issue workaround.
>>>> Jiaxun told me (via LKML):
>>>>
>>>>     I had checked with Loongson guys and they confirmed that the
>>>>     workaround still needs to be applied to latest 3A4000 processors,
>>>>     including 3A4000 for MIPS and 3A5000 for LoongArch.
>>>>
>>>>     Though, the reason behind the workaround varies with the evaluation
>>>>     of their uArch, for GS464V based core, barrier is required as the
>>>>     uArch design allows regular load to be reordered after an atomic
>>>>     linked load, and that would break assumption of compiler atomic
>>>>     constraints.
>>> That certainly seems to be needed, but before or after? It's beyond my
>>> knowledge, so I cc huangpei@loongson.cn for help.
>>
>> Pei told me that ll-sc currently works as follows:
>>
>> uArch like:
>>    ll -> (ll.dbar ll.ld_atomic)
>>    sc -> (sc.dbar sc.st_atomic)
>>
>> exchange:
>> ll.dbar
>> <---------------------------+
>> ll.ld_atomic $rd            |
>> ...(no jmp)                 |
>> sc.dbar                     |
>> sc.st_atomic $rd            |
>> ld $rj -can-not-emit-at-----+
>>
>> The load of $rj cannot be emitted between ll.dbar and ll.ld_atomic
>> because the sc.dbar acts as a barrier against it.
>>
>>
>> compare and exchange:
>> ll.dbar
>> <-----------------------+
>> ll.ld_atomic $rd        |
>> ...(jmp) ---------------+------+
>> sc.dbar                 |      |
>> sc.st_atomic $rd        |      |
>>                         |   <--+
>> ld $rj -may-emit-at-----+
>>
>> Jumping out of the ll-sc sequence skips sc.dbar, so the load of $rj may
>> be emitted between ll.dbar and ll.ld_atomic.
>>
>>
>> Thus, exchange does not need dbar.
>>
>>
>>>
>>>> Without these dbar instructions I'd got random test failures in GCC
>>>> libgomp test suite.
>> Which test suite?
> I mean when we didn't use dbar 0x700 for compare-and-exchange (during
> the early development stage of GCC for LoongArch) I observed these
> failures.
>
> So we do need an additional dbar for compare-and-exchange, but do not
> need it for a bare atomic exchange?

Yes.
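
To make the distinction concrete, here is a minimal C-level sketch of the
two operations discussed above. It assumes GCC's __atomic builtins. The
names xchg_relaxed and cmpxchg_relaxed are hypothetical and not part of the
patch. Which instructions the compiler emits for each one (including any
dbar) depends on the target and the GCC version.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bare atomic exchange (hypothetical helper name).
 * On an ll/sc target this is typically a retry loop whose only exit is
 * a successful sc, i.e. the "no jmp" case in the diagram above. */
static inline uint32_t xchg_relaxed(uint32_t *p, uint32_t newval)
{
    return __atomic_exchange_n(p, newval, __ATOMIC_RELAXED);
}

/* Compare-and-exchange (hypothetical helper name).
 * On an ll/sc target the loop may branch out before the sc when the
 * loaded value differs from *expected, i.e. the "jmp" case above.
 * The fourth argument (false) requests the strong variant, so a failure
 * always means the values really differed. */
static inline bool cmpxchg_relaxed(uint32_t *p, uint32_t *expected,
                                   uint32_t desired)
{
    return __atomic_compare_exchange_n(p, expected, desired, false,
                                       __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}
```

On failure, cmpxchg_relaxed writes the value it observed back into
*expected. That failure path is the one that leaves the ll-sc sequence
early.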