From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH] LoongArch: Add Syscall Assembly Implementation
To: Xi Ruoyao, libc-alpha@sourceware.org
Cc: adhemerval.zanella@linaro.org
References: <20230323084013.1100656-1-caiyinyu@loongson.cn>
From: caiyinyu
Message-ID: <8a4e2e72-9daf-d264-f49d-719daa2407b5@loongson.cn>
Date: Thu, 23 Mar 2023 20:01:18 +0800

On 2023/3/23 at 5:25 PM, Xi Ruoyao wrote:
> General question: is there a notable benefit optimizing syscall with
> assembly? AFAIK nobody will put syscall on a hot path, and the cycles
> saved by using the assembly implementation should be negligible
> comparing with all the cost of context switch etc.

Yes.
New patch:
https://sourceware.org/pipermail/libc-alpha/2023-March/146588.html

Without this patch (objdump -d libc.so...):

00000000000dd45c :
   dd45c:       02fec063        addi.d          $sp, $sp, -80(0xfb0)
   dd460:       02c0606c        addi.d          $t0, $sp, 24(0x18)
   dd464:       29c06065        st.d            $a1, $sp, 24(0x18)
   dd468:       29c08066        st.d            $a2, $sp, 32(0x20)
   dd46c:       29c0a067        st.d            $a3, $sp, 40(0x28)
   dd470:       29c0c068        st.d            $a4, $sp, 48(0x30)
   dd474:       29c0e069        st.d            $a5, $sp, 56(0x38)
   dd478:       29c1206b        st.d            $a7, $sp, 72(0x48)
   dd47c:       29c1006a        st.d            $a6, $sp, 64(0x40)
   dd480:       0015008b        move            $a7, $a0
   dd484:       29c0206c        st.d            $t0, $sp, 8(0x8)
   dd488:       001500a4        move            $a0, $a1
   dd48c:       001500c5        move            $a1, $a2
   dd490:       001500e6        move            $a2, $a3
   dd494:       00150107        move            $a3, $a4
   dd498:       00150128        move            $a4, $a5
   dd49c:       00150149        move            $a5, $a6
   dd4a0:       002b0000        syscall         0x0
   dd4a4:       15ffffec        lu12i.w         $t0, -1(0xfffff)
   dd4a8:       68000d84        bltu            $t0, $a0, 12(0xc)       # dd4b4
   dd4ac:       02c14063        addi.d          $sp, $sp, 80(0x50)
   dd4b0:       4c000020        jirl            $zero, $ra, 0
   dd4b4:       1a00128c        pcalau12i       $t0, 148(0x94)
   dd4b8:       28d9e18c        ld.d            $t0, $t0, 1656(0x678)
   dd4bc:       0011100d        sub.w           $t1, $zero, $a0
   dd4c0:       02bffc04        addi.w          $a0, $zero, -1(0xfff)
   dd4c4:       3818098d        stx.w           $t1, $t0, $tp
   dd4c8:       02c14063        addi.d          $sp, $sp, 80(0x50)
   dd4cc:       4c000020        jirl            $zero, $ra, 0

>
> On Thu, 2023-03-23 at 16:40 +0800, caiyinyu wrote:
>> + ENTRY (syscall)
>> + move t0, a7
>> + move a7, a0 /* Syscall number -> a0. */
>> + move a0, a1 /* shift arg1 - arg6. */
>> + move a1, a2
>> + move a2, a3
>> + move a3, a4
>> + move a4, a5
>> + move a5, a6
>> + move a6, t0 /* arg7 is saved in t0. */
>> + syscall 0 /* Do the system call. */
>> +       lu12i.w t0, -1
> "li.w t0, -4096" will do the same thing, and be more readable.
>
> And this line seems indented with a tab, while other lines are indented
> with 8 spaces.

Fixed.

>> +        bltu   t0, a0, L (error)
>> +        ret                     /* Return to caller.  */
> "ret" is not recognized by GNU assembler <= 2.39, it's better to use
> the old-style "jr ra" for backward compatibility.

Fixed.

>
>> +
>> +L (error):
>> +        b      __syscall_error
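
For reference, the error check discussed above can be modelled in C. This is only a sketch under stated assumptions, not code from the patch or from glibc: the helper names lu12i_w, is_syscall_error and finish_syscall are hypothetical, and the error path simply mirrors what the disassembly appears to do (store the negated return value as the error code, then return -1). It shows why "lu12i.w t0, -1" and "li.w t0, -4096" load the same value, and why the unsigned "bltu" only branches for raw return values in the range -4095 to -1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a glibc function): models `lu12i.w rd, si20`
   for a 20-bit signed immediate.  The immediate lands in bits 31..12,
   bits 11..0 are zero, and the result is sign-extended to 64 bits,
   which equals si20 * 4096.  */
static int64_t
lu12i_w (int32_t si20)
{
  return (int64_t) si20 * 4096;
}

/* Hypothetical helper: models `bltu t0, a0, L (error)` with t0 = -4096.
   The comparison is unsigned, so the branch is taken only when the raw
   kernel return value lies in [-4095, -1].  */
static int
is_syscall_error (int64_t ret)
{
  uint64_t threshold = (uint64_t) lu12i_w (-1);
  return threshold < (uint64_t) ret;
}

/* Hypothetical model of the error path: on error, store -ret in *err
   (standing in for errno) and return -1; otherwise pass ret through.  */
static int64_t
finish_syscall (int64_t ret, int *err)
{
  if (is_syscall_error (ret))
    {
      *err = (int) -ret;
      return -1;
    }
  return ret;
}

/* Small usage example; call it from any test driver.  */
static void
example_usage (void)
{
  int err = 0;
  assert (lu12i_w (-1) == -4096);            /* Same value as li.w t0, -4096.  */
  assert (finish_syscall (-2, &err) == -1);  /* Raw -2 is treated as an error.  */
  assert (err == 2);
  assert (finish_syscall (7, &err) == 7);    /* Success values pass through.  */
}
```

Doing the comparison as unsigned lets a single branch cover the whole error range, since any non-negative result and any value at or below -4096 compares as not greater than the threshold.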