From: Hongtao Liu
Date: Thu, 25 Jan 2024 11:36:03 +0800
Subject: Re: [PATCH v3 0/2] x86: Don't save callee-saved registers if not needed
To: "H.J. Lu"
Cc: gcc-patches@gcc.gnu.org, ubizjak@gmail.com, hongtao.liu@intel.com, jh@suse.cz
References: <20240123145951.2092315-1-hjl.tools@gmail.com>
In-Reply-To: <20240123145951.2092315-1-hjl.tools@gmail.com>

On Tue, Jan 23, 2024 at 11:00 PM H.J. Lu wrote:
>
> Changes in v3:
>
> 1. Rebase against commit 02e68389494
> 2. Don't add call_no_callee_saved_registers to machine_function since
> all callee-saved registers are properly clobbered by callee with
> no_callee_saved_registers attribute.
>
The patch LGTM. It should be low risk, since there is already a
no_caller_saved_registers attribute and the patch just extends the same
approach to no_callee_saved_registers.
So if there are no objections (or any concerns) in the next couple of days,
I'm OK with the patch going into GCC 14 and being backported.

> Changes in v2:
>
> 1. Rebase against commit f9df00340e3
> 2. Don't add redundant clobbered_registers check in ix86_expand_call.
>
> In some cases, there is no need to save callee-saved registers:
>
> 1. If a noreturn function doesn't throw nor support exceptions, it can
> skip saving callee-saved registers.
>
> 2. When an interrupt handler is implemented by an assembly stub which does:
>
>    1. Save all registers.
>    2. Call a C function.
>    3. Restore all registers.
>    4. Return from interrupt.
>
> it is completely unnecessary to save and restore any registers in the C
> function called by the assembly stub, even if they would normally be
> callee-saved.
>
> This patch set adds no_callee_saved_registers function attribute, which
> is complementary to no_caller_saved_registers function attribute, to
> classify x86 backend call-saved register handling type with
>
> 1. Default call-saved registers.
> 2. No caller-saved registers with no_caller_saved_registers attribute.
> 3. No callee-saved registers with no_callee_saved_registers attribute.
>
> Functions of no callee-saved registers won't save callee-saved registers.
> If a noreturn function doesn't throw nor support exceptions, it is
> classified as the no callee-saved registers type.
>
> With these changes, __libc_start_main in glibc 2.39, which is a noreturn
> function, is changed from
>
> __libc_start_main:
>         endbr64
>         push   %r15
>         push   %r14
>         mov    %rcx,%r14
>         push   %r13
>         push   %r12
>         push   %rbp
>         mov    %esi,%ebp
>         push   %rbx
>         mov    %rdx,%rbx
>         sub    $0x28,%rsp
>         mov    %rdi,(%rsp)
>         mov    %fs:0x28,%rax
>         mov    %rax,0x18(%rsp)
>         xor    %eax,%eax
>         test   %r9,%r9
>
> to
>
> __libc_start_main:
>         endbr64
>         sub    $0x28,%rsp
>         mov    %esi,%ebp
>         mov    %rdx,%rbx
>         mov    %rcx,%r14
>         mov    %rdi,(%rsp)
>         mov    %fs:0x28,%rax
>         mov    %rax,0x18(%rsp)
>         xor    %eax,%eax
>         test   %r9,%r9
>
> In Linux kernel 6.7.0 on x86-64, do_exit is changed from
>
> do_exit:
>         endbr64
>         call
>         push   %r15
>         push   %r14
>         push   %r13
>         push   %r12
>         mov    %rdi,%r12
>         push   %rbp
>         push   %rbx
>         mov    %gs:0x0,%rbx
>         sub    $0x28,%rsp
>         mov    %gs:0x28,%rax
>         mov    %rax,0x20(%rsp)
>         xor    %eax,%eax
>         call   *0x0(%rip)        #
>         test   $0x2,%ah
>         je
>
> to
>
> do_exit:
>         endbr64
>         call
>         sub    $0x28,%rsp
>         mov    %rdi,%r12
>         mov    %gs:0x28,%rax
>         mov    %rax,0x20(%rsp)
>         xor    %eax,%eax
>         mov    %gs:0x0,%rbx
>         call   *0x0(%rip)        #
>         test   $0x2,%ah
>         je
>
> I compared GCC master branch bootstrap and test times on a slow machine
> with 6.6 Linux kernels compiled with the original GCC 13 and the GCC 13
> with the backported patch. The performance data isn't precise since the
> measurements were done on different days with different GCC sources under
> different 6.6 kernel versions.
>
> GCC master branch build time in seconds:
>
> before                after                 improvement
> 30043.75user          30013.16user          0%
> 1274.85system         1243.72system         2.4%
>
> GCC master branch test time in seconds (new tests added):
>
> before                after                 improvement
> 216035.90user         216547.51user         0%
> 27365.51system        26658.54system        2.6%
>
> Backported to GCC 13 to rebuild system glibc and kernel on Fedora 39.
> Systems perform normally.
>
>
> H.J. Lu (2):
>   x86: Add no_callee_saved_registers function attribute
>   x86: Don't save callee-saved registers in noreturn functions
>
>  gcc/config/i386/i386-expand.cc              | 52 +++++++++++++---
>  gcc/config/i386/i386-options.cc             | 61 +++++++++++++++----
>  gcc/config/i386/i386.cc                     | 57 +++++++++++++----
>  gcc/config/i386/i386.h                      | 16 ++++-
>  gcc/doc/extend.texi                         |  8 +++
>  .../gcc.dg/torture/no-callee-saved-run-1a.c | 23 +++++++
>  .../gcc.dg/torture/no-callee-saved-run-1b.c | 59 ++++++++++++++++++
>  .../gcc.target/i386/no-callee-saved-1.c     | 30 +++++++++
>  .../gcc.target/i386/no-callee-saved-10.c    | 46 ++++++++++++++
>  .../gcc.target/i386/no-callee-saved-11.c    | 11 ++++
>  .../gcc.target/i386/no-callee-saved-12.c    | 10 +++
>  .../gcc.target/i386/no-callee-saved-13.c    | 16 +++++
>  .../gcc.target/i386/no-callee-saved-14.c    | 16 +++++
>  .../gcc.target/i386/no-callee-saved-15.c    | 17 ++++++
>  .../gcc.target/i386/no-callee-saved-16.c    | 16 +++++
>  .../gcc.target/i386/no-callee-saved-17.c    | 16 +++++
>  .../gcc.target/i386/no-callee-saved-18.c    | 51 ++++++++++++++++
>  .../gcc.target/i386/no-callee-saved-2.c     | 30 +++++++++
>  .../gcc.target/i386/no-callee-saved-3.c     |  8 +++
>  .../gcc.target/i386/no-callee-saved-4.c     |  8 +++
>  .../gcc.target/i386/no-callee-saved-5.c     | 11 ++++
>  .../gcc.target/i386/no-callee-saved-6.c     | 12 ++++
>  .../gcc.target/i386/no-callee-saved-7.c     | 49 +++++++++++++++
>  .../gcc.target/i386/no-callee-saved-8.c     | 50 +++++++++++++++
>  .../gcc.target/i386/no-callee-saved-9.c     | 49 +++++++++++++++
>  gcc/testsuite/gcc.target/i386/pr38534-1.c   | 26 ++++++++
>  gcc/testsuite/gcc.target/i386/pr38534-2.c   | 18 ++++++
>  gcc/testsuite/gcc.target/i386/pr38534-3.c   | 19 ++++++
>  gcc/testsuite/gcc.target/i386/pr38534-4.c   | 18 ++++++
>  .../gcc.target/i386/stack-check-17.c        | 19 +++---
>  30 files changed, 775 insertions(+), 47 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1a.c
>  create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1b.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-10.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-11.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-12.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-13.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-14.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-15.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-16.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-17.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-18.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-6.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-7.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-8.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-9.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-4.c
>
> --
> 2.43.0
>

--
BR,
Hongtao
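
P.S. For anyone following the thread, the interrupt-stub case from the
cover letter can be sketched in C roughly as below. This is only an
illustration under stated assumptions, not code from the patch: the macro
NO_CALLEE_SAVED and the names tick_count, handle_tick and ticks_seen are
hypothetical, and the attribute is applied only when the compiler reports
support for it through __has_attribute (otherwise the macro expands to
nothing and the function uses the default convention).

```c
#include <stdint.h>

/* Expands to the new attribute only when the compiler reports support for
   it (assumed: an x86 GCC that carries this patch set); otherwise it
   expands to nothing and the default calling convention applies. */
#if defined(__has_attribute)
#  if __has_attribute(no_callee_saved_registers)
#    define NO_CALLEE_SAVED __attribute__((no_callee_saved_registers))
#  endif
#endif
#ifndef NO_CALLEE_SAVED
#  define NO_CALLEE_SAVED
#endif

/* Hypothetical state updated by the handler body. */
static volatile uint64_t tick_count;

/* Hypothetical C body of an interrupt handler. The assumption is that an
   assembly stub has already saved every register before calling it and
   restores them afterwards, so the function itself does not need to
   preserve the normally callee-saved registers. */
NO_CALLEE_SAVED void handle_tick(uint64_t delta)
{
  tick_count += delta;
}

/* Ordinary function with the default convention, used to read the state. */
uint64_t ticks_seen(void)
{
  return tick_count;
}
```

In the intended use the only caller would be the assembly stub. A direct
call from C, e.g. handle_tick(3), should still behave correctly, since per
the v3 notes quoted above the callee-saved registers are treated as
clobbered by a callee carrying the attribute.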