Subject: Re: [PATCH v2 0/5] LoongArch tls le model linker relaxation support.
To: 常佳琛, binutils@sourceware.org
Cc: xuchenghua@loongson.cn, chenglulu@loongson.cn, liuzhensong@loongson.cn, xry111@xry111.site, i.swmail@xen0n.name, maskray@google.com, cailulu@loongson.cn, luweining@loongson.cn, wanglei@loongson.cn, Lazy_Linux@126.com, mengqinggang@loongson.cn
References: <5a46833f-00db-fb85-2723-461caf330263@loongson.cn>
From: Jinyang He
Message-ID: <41fe69f8-e96a-599d-4e62-7f7571c604e4@loongson.cn>
Date: Mon, 4 Dec 2023 16:57:55 +0800

On 2023-12-04 11:39, 常佳琛 wrote:
> The above is a simple explanation of the O0 optimization,
> which is currently available with O2 and O3 turned on.
>
> example:
> test.c:
> __thread int count1;
> int main(){
>     count1 = 1;
> }
> (Enable O2 option and no relax)
> 0000000120000480 <main>:
>    120000480:1400000c lu12i.w     $t0, 0
>    120000484:0280040d li.w        $t1, 1
>    120000488:0010898c add.d       $t0, $t0, $tp
>    12000048c:00150004 move        $a0, $zero
>    120000490:2980018d st.w        $t1, $t0, 0
>    120000494:4c000020 ret
>
> (Enable O2 option and relax)
> 0000000120000480 <main>:
>    120000480:0280040d li.w        $t1, 1
>    120000484:00150004 move        $a0, $zero
>    120000488:2980004d st.w        $t1, $tp, 0
>    12000048c:4c000020 ret
>
> As you can see, with the O2 option turned on, the order of
> instructions changes, but the relax optimization is still not
> affected, and the address calculation of the tls variable count1
> is correct before and after optimization. The situation of
> enabling O3 is similar to that of enabling O2.
>

How can I get your gcc (or patches)? I tried comparing against access
to a non-thread variable with an old gcc.

Condition:
__thread int a;
int b;
extern int foo(int *);

Compare in old gcc:

a = 1;                                 b = 1;
lu12i.w $r12,%le_hi20(a)               pcalau12i $r12,%pc_hi20(b)
ori     $r12,$r12,%le_lo12(a)
addi.w  $r13,$r0,1                     addi.w  $r13,$r0,1
stx.w   $r13,$r12,$r2                  st.w    $r13,$r12,%pc_lo12(b)

a = 1; return foo(&a);                 b = 1; return foo(&b);
lu12i.w $r12,%le_hi20(a)               pcalau12i $r4,%pc_hi20(b)
ori     $r12,$r12,%le_lo12(a)          addi.d  $r4,$r4,%pc_lo12(b)
addi.w  $r13,$r0,1                     addi.w  $r12,$r0,1
add.d   $r4,$r12,$r2
stx.w   $r13,$r12,$r2                  stptr.w $r12,$r4,0
b       %plt(foo)                      b       %plt(foo)

I worry about the case where we need the address of the thread-local
variable after accessing it, which may produce a worse sequence with
your gcc. For the non-thread variable, the compiler loads the address
into a register first and then accesses the variable through that
register. How does your gcc handle this case?
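
For anyone who wants to reproduce the comparison, here is a minimal
sketch of a test file covering the four cases. The file name tls-addr.c
and the function names are made up for illustration, and foo is given a
weak definition only so the file builds on its own (in the comparison
above it is just declared extern). Compiling it with gcc -O2 -S should
show how the compiler forms the address in each case.

```c
/* tls-addr.c: hypothetical test file, names chosen for illustration. */

__thread int a;   /* thread-local variable */
int b;            /* ordinary global variable */

/* Assumption: a weak definition stands in for the extern foo so the
   file links by itself while the compiler still emits a real call. */
__attribute__((weak)) int foo(int *p)
{
    return *p;
}

/* Plain stores: the address is only needed for the store itself. */
void store_tls(void)    { a = 1; }
void store_global(void) { b = 1; }

/* Store, then pass the address on: the address must also end up in
   the first argument register for the call to foo. */
int store_tls_then_pass(void)
{
    a = 1;
    return foo(&a);
}

int store_global_then_pass(void)
{
    b = 1;
    return foo(&b);
}
```

The pair to look at is store_tls_then_pass versus
store_global_then_pass, where the address is needed again after the
store.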
>
>
> From: Jinyang He
> Date: 2023-12-04 10:25:13
> To: changjiachen, binutils@sourceware.org
> Cc: xuchenghua@loongson.cn,chenglulu@loongson.cn,liuzhensong@loongson.cn,xry111@xry111.site,i.swmail@xen0n.name,maskray@google.com,cailulu@loongson.cn,luweining@loongson.cn,wanglei@loongson.cn,Lazy_Linux@126.com,mengqinggang@loongson.cn
> Subject: Re: [PATCH v2 0/5] LoongArch tls le model linker relaxation support.
>
> >On 2023-12-02 14:53, changjiachen wrote:
> >> This is the v2 version of patches to support loongarch linker tls le model relax.
> >>
> >> Changes from v1:
> >>
> >> * Modified v1-0000-cover-letter.patch part of the explanatory content.
> >>
> >> Before Modify:
> >>
> >> example: __thread int a = 1;
> >>
> >> old insn sequence:
> >>
> >> lu12i.w $r12,%le_hi20_r(a)
> >> ori $r12,$r12,%le_lo12_r(a)
> >> add.d $r12,$r12,$r2,%le_add_r(a)
> >> li.w $r13,$r0,1
> >> stptr.w $r13,$r12,0
> >>
> >> new insn sequence:
> >>
> >> lu12i.w $r12,%le_hi20_r(a)
> >> add.d $r12,$r12,$r2,%le_add_r(a)
> >> li.w $r13,$r0,1
> >> st.w $r13,$r12,%le_lo12_r(a)
> >>
> >> After Modify:
> >>
> >> example: __thread int a = 1;
> >>
> >> old insn sequence(at the O0 optimization level):
> >
> >If the sequence appear only at -O0, is it worth optimizing by relaxation?
> >
> >>
> >> lu12i.w $r12,%le_hi20(a)
> >> ori $r12,$r12,%le_lo12(a)
> >> add.d $r12,$r12,$r2
> >> addi.w $r13,$r0,1
> >> stptr.w $r13,$r12,0
> >>
> >> new insn sequence(at the O0 optimization level):
> >>
> >> lu12i.w $r12,%le_hi20_r(a)
> >> add.d $r12,$r12,$r2,%le_add_r(a)
> >And here, if the sequence appear in other optimization level, will
> >register value ($r12) being different between the old sequence and
> >the new sequence cause other problems, e.g. worse sequence? Have you
> >tried this relaxation at other optimization levels?
> >
> >
> >Thanks.
> >
> >> addi.w $r13,$r0,1
> >> st.w $r13,$r12,%le_lo12_r(a)
> >>
> >> changjiachen (5):
> >> LoongArch: bfd: Add support for tls le relax.
> >> LoongArch: include: Add support for tls le relax.
> >> LoongArch: opcodes: Add support for tls le relax.
> >> LoongArch: gas: Add support for tls le relax.
> >> LoongArch: ld: Add support for tls le relax.
> >>
> >> bfd/bfd-in2.h                                 |   4 +
> >> bfd/elfnn-loongarch.c                         |  74 +++++++++
> >> bfd/elfxx-loongarch.c                         |  50 ++++++
> >> bfd/libbfd.h                                  |   3 +
> >> bfd/reloc.c                                   |   6 +
> >> gas/config/tc-loongarch.c                     |  12 +-
> >> gas/testsuite/gas/loongarch/reloc.d           |  18 +++
> >> gas/testsuite/gas/loongarch/reloc.s           |  11 ++
> >> include/elf/loongarch.h                       |  13 ++
> >> ld/testsuite/ld-loongarch-elf/old-tls-le.s    |  19 +++
> >> .../relax-bound-check-tls-le.s                |  48 ++++++
> >> .../ld-loongarch-elf/relax-check-tls-le.s     |  43 ++++++
> >> ld/testsuite/ld-loongarch-elf/relax-tls-le.s  |  17 ++
> >> ld/testsuite/ld-loongarch-elf/relax.exp       | 146 +++++++++++++++++-
> >> .../tls-relax-compatible-check-old.s          |  39 +++++
> >> opcodes/loongarch-opc.c                       |   1 +
> >> 16 files changed, 501 insertions(+), 3 deletions(-)
> >> create mode 100644 ld/testsuite/ld-loongarch-elf/old-tls-le.s
> >> create mode 100644 ld/testsuite/ld-loongarch-elf/relax-bound-check-tls-le.s
> >> create mode 100644 ld/testsuite/ld-loongarch-elf/relax-check-tls-le.s
> >> create mode 100644 ld/testsuite/ld-loongarch-elf/relax-tls-le.s
> >> create mode 100644 ld/testsuite/ld-loongarch-elf/tls-relax-compatible-check-old.s
> >>
>
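
As a side note on the sequences quoted above, the following is a rough
sketch of the offset arithmetic, not code taken from the patch. It
assumes the %le_hi20_r/%le_lo12_r pair splits the thread-pointer offset
with the usual rounding, so that the low part fits the signed 12-bit
immediate of st.w. The struct and function names are hypothetical, and
the exact bound the linker checks may differ from what is shown here.

```c
#include <stdint.h>

/* Hypothetical helper types and names, for illustration only. */
struct le_split {
    int64_t hi20;   /* value that would be loaded by lu12i.w (shifted left by 12) */
    int64_t lo12;   /* signed immediate in [-2048, 2047] for st.w */
};

/* Split a non-negative offset so that hi20 * 4096 + lo12 == off. */
struct le_split split_le_offset(uint32_t off)
{
    struct le_split s;
    s.hi20 = ((int64_t)off + 0x800) / 4096;   /* round so lo12 can be signed */
    s.lo12 = (int64_t)off - s.hi20 * 4096;
    return s;
}

/* When the high part is zero, $tp plus the low part already reaches
   the variable, so the lu12i.w/add.d pair contributes nothing. */
int hi_part_is_redundant(uint32_t off)
{
    return split_le_offset(off).hi20 == 0;
}
```

With count1 at offset 0 in the example above, the high part is zero,
which matches the relaxed store going through $tp directly.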