Subject: Re: [PATCH] LoongArch: Improve GAR store for va_list
To: Xi Ruoyao, gcc-patches@gcc.gnu.org
Cc: WANG Xuerui, Chenghua Xu
From: Lulu Cheng
Date: Tue, 18 Apr 2023 20:03:01 +0800
Message-ID: <0ab6a266-6576-f7bf-9a53-77e2eac92812@loongson.cn>
In-Reply-To: <4e1857872733daabb18b54696169e932338667cf.camel@xry111.site>

On 2023/4/18 at 7:48 PM, Xi Ruoyao wrote:
> On Tue, 2023-04-18 at 19:21 +0800, Lulu Cheng wrote:
>> On 2023/4/18 at 5:27 PM, Xi Ruoyao wrote:
>>> On Mon, 2023-04-10 at 17:45 +0800, Lulu Cheng wrote:
>>>> Sorry, it's my fault. I still have some questions that I haven't
>>>> understood, so I haven't replied to the email yet.:-(
>>> I've verified the value of cfun->va_list_gpr_size with
>>> -fdump-tree-stdarg and various testcases (including extracting
>>> aggregates and floating-point values in the va list) and the result
>>> seems correct. And gcc/testsuite/gcc.c-torture/execute/va-arg-*.c
>>> should provide good enough test coverage.
>>>
>>> Is there still something that seems problematic?
>>
>> I think there is no problem with the code modification, but I found that
>> the $r12 register is stored whether or not this patch is added. I don't
>> understand why.:-(
> It was already stored before the change:
>
> test:
> .LFB0 = .
>         .cfi_startproc
>         addi.d  $r3,$r3,-80
>         .cfi_def_cfa_offset 80
>         addi.d  $r12,$r3,24
>         st.d    $r5,$r3,24
>         st.d    $r6,$r3,32
>         st.d    $r7,$r3,40
>         st.d    $r8,$r3,48
>         st.d    $r9,$r3,56
>         st.d    $r10,$r3,64
>         st.d    $r11,$r3,72
>         st.d    $r12,$r3,8      # <=====
>         add.w   $r4,$r5,$r4
>         addi.d  $r3,$r3,80
>         .cfi_def_cfa_offset 0
>         jr      $r1
>         .cfi_endproc
>
> AFAIK it's related to how variable arguments are implemented in
> general. The problem is that when we expand __builtin_va_list or
> __builtin_va_arg, the registers containing the variable arguments and
> the pointer to the variable argument store area (r12 in this case) may
> already be clobbered, so the compiler has to store them when expanding
> the prologue of the function (when the prologue is expanded we don't
> know if the following code will clobber the registers).
>
> This also makes it difficult to avoid saving the GARs for *used*
> variable arguments.
>
> On x86_64 we have the same issue:
>
> test:
> .LFB0:
>         .cfi_startproc
>         leaq    8(%rsp), %rax
>         movq    %rsi, -40(%rsp)
>         movq    %rax, -64(%rsp) # <=====
>         leaq    -48(%rsp), %rax
>         movq    %rax, -56(%rsp)
>         movl    -40(%rsp), %eax
>         movl    $8, -72(%rsp)
>         addl    %edi, %eax
>         ret
>         .cfi_endproc
>
> I'll try to remove all of these in the GCC 14 development cycle (as they
> are causing sub-optimal code in various Glibc functions), but it's not
> easy...

Ok, I have no more questions.
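
For readers following the thread: the C source behind the quoted listings is not shown in this message. The sketch below is only an assumption about what such a test case might look like, inferred from the final add of the first named argument and the first variable argument in both listings. The function body and the parameter name `a` are hypothetical.

```c
#include <stdarg.h>

/* Hypothetical reconstruction of a test case like the one discussed
   above: the actual source is not quoted in this thread.  It reads a
   single int from the variable argument list and adds it to the named
   argument.  */
int
test (int a, ...)
{
  va_list ap;
  int b;

  va_start (ap, a);        /* initialise ap to point at the saved varargs */
  b = va_arg (ap, int);    /* fetch the first variable argument */
  va_end (ap);

  return a + b;
}
```

Calling test (1, 2) returns 3. Compiling a function like this to assembly at an optimisation level such as -O2 (the option used for the quoted listings is not stated, so that is an assumption too) is one way to inspect which argument registers and which save-area pointer the prologue stores.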