From: "H.J. Lu"
Date: Sat, 27 Jan 2024 03:17:28 -0800
Subject: Re: [PATCH v3 0/2] x86: Don't save callee-saved registers if not needed
To: Hongtao Liu
Cc: gcc-patches@gcc.gnu.org, ubizjak@gmail.com, hongtao.liu@intel.com, jh@suse.cz
References: <20240123145951.2092315-1-hjl.tools@gmail.com>

On Wed, Jan 24, 2024 at 7:36 PM Hongtao Liu wrote:
>
> On Tue, Jan 23, 2024 at 11:00 PM H.J. Lu wrote:
> >
> > Changes in v3:
> >
> > 1. Rebase against commit 02e68389494
> > 2. Don't add call_no_callee_saved_registers to machine_function since
> >    all callee-saved registers are properly clobbered by callee with
> >    no_callee_saved_registers attribute.
> >
> The patch LGTM. It should be low risk since there's already a
> no_caller_saved_registers attribute; the patch just extends this to
> no_callee_saved_registers with the same approach.
> So if there's no objection (or any concerns) in the next couple of days,
> I'm ok for the patch to be in GCC14 and backport.

I am checking it in.

Thanks.

H.J.
> > Changes in v2:
> >
> > 1. Rebase against commit f9df00340e3
> > 2. Don't add redundant clobbered_registers check in ix86_expand_call.
> >
> > In some cases, there is no need to save callee-saved registers:
> >
> > 1. If a noreturn function neither throws nor supports exceptions, it can
> > skip saving callee-saved registers.
> >
> > 2. When an interrupt handler is implemented by an assembly stub which does:
> >
> >   1. Save all registers.
> >   2. Call a C function.
> >   3. Restore all registers.
> >   4. Return from interrupt.
> >
> > it is completely unnecessary to save and restore any registers in the C
> > function called by the assembly stub, even if they would normally be
> > callee-saved.
> >
> > This patch set adds the no_callee_saved_registers function attribute,
> > which is complementary to the no_caller_saved_registers function
> > attribute, to classify the x86 backend's call-saved register handling
> > into three types:
> >
> >   1. Default call-saved registers.
> >   2. No caller-saved registers with no_caller_saved_registers attribute.
> >   3. No callee-saved registers with no_callee_saved_registers attribute.
> >
> > Functions of the no callee-saved registers type don't save callee-saved
> > registers. If a noreturn function neither throws nor supports
> > exceptions, it is classified as the no callee-saved registers type.
> >
> > With these changes, __libc_start_main in glibc 2.39, which is a noreturn
> > function, is changed from
> >
> > __libc_start_main:
> >         endbr64
> >         push   %r15
> >         push   %r14
> >         mov    %rcx,%r14
> >         push   %r13
> >         push   %r12
> >         push   %rbp
> >         mov    %esi,%ebp
> >         push   %rbx
> >         mov    %rdx,%rbx
> >         sub    $0x28,%rsp
> >         mov    %rdi,(%rsp)
> >         mov    %fs:0x28,%rax
> >         mov    %rax,0x18(%rsp)
> >         xor    %eax,%eax
> >         test   %r9,%r9
> >
> > to
> >
> > __libc_start_main:
> >         endbr64
> >         sub    $0x28,%rsp
> >         mov    %esi,%ebp
> >         mov    %rdx,%rbx
> >         mov    %rcx,%r14
> >         mov    %rdi,(%rsp)
> >         mov    %fs:0x28,%rax
> >         mov    %rax,0x18(%rsp)
> >         xor    %eax,%eax
> >         test   %r9,%r9
> >
> > In Linux kernel 6.7.0 on x86-64, do_exit is changed from
> >
> > do_exit:
> >         endbr64
> >         call
> >         push   %r15
> >         push   %r14
> >         push   %r13
> >         push   %r12
> >         mov    %rdi,%r12
> >         push   %rbp
> >         push   %rbx
> >         mov    %gs:0x0,%rbx
> >         sub    $0x28,%rsp
> >         mov    %gs:0x28,%rax
> >         mov    %rax,0x20(%rsp)
> >         xor    %eax,%eax
> >         call   *0x0(%rip)        #
> >         test   $0x2,%ah
> >         je
> >
> > to
> >
> > do_exit:
> >         endbr64
> >         call
> >         sub    $0x28,%rsp
> >         mov    %rdi,%r12
> >         mov    %gs:0x28,%rax
> >         mov    %rax,0x20(%rsp)
> >         xor    %eax,%eax
> >         mov    %gs:0x0,%rbx
> >         call   *0x0(%rip)        #
> >         test   $0x2,%ah
> >         je
> >
> > I compared GCC master branch bootstrap and test times on a slow machine
> > with 6.6 Linux kernels compiled with the original GCC 13 and the GCC 13
> > with the backported patch.  The performance data isn't precise since the
> > measurements were done on different days with different GCC sources under
> > different 6.6 kernel versions.
> >
> > GCC master branch build time in seconds:
> >
> > before                after                  improvement
> > 30043.75user          30013.16user           0%
> > 1274.85system         1243.72system          2.4%
> >
> > GCC master branch test time in seconds (new tests added):
> >
> > before                after                  improvement
> > 216035.90user         216547.51user          0%
> > 27365.51system        26658.54system         2.6%
> >
> > Backported to GCC 13 to rebuild system glibc and kernel on Fedora 39.
> > Systems perform normally.
> >
> >
> > H.J. Lu (2):
> >   x86: Add no_callee_saved_registers function attribute
> >   x86: Don't save callee-saved registers in noreturn functions
> >
> >  gcc/config/i386/i386-expand.cc                | 52 +++++++++++++---
> >  gcc/config/i386/i386-options.cc               | 61 +++++++++++++++----
> >  gcc/config/i386/i386.cc                       | 57 +++++++++++++----
> >  gcc/config/i386/i386.h                        | 16 ++++-
> >  gcc/doc/extend.texi                           |  8 +++
> >  .../gcc.dg/torture/no-callee-saved-run-1a.c   | 23 +++++++
> >  .../gcc.dg/torture/no-callee-saved-run-1b.c   | 59 ++++++++++++++++++
> >  .../gcc.target/i386/no-callee-saved-1.c       | 30 +++++++++
> >  .../gcc.target/i386/no-callee-saved-10.c      | 46 ++++++++++++++
> >  .../gcc.target/i386/no-callee-saved-11.c      | 11 ++++
> >  .../gcc.target/i386/no-callee-saved-12.c      | 10 +++
> >  .../gcc.target/i386/no-callee-saved-13.c      | 16 +++++
> >  .../gcc.target/i386/no-callee-saved-14.c      | 16 +++++
> >  .../gcc.target/i386/no-callee-saved-15.c      | 17 ++++++
> >  .../gcc.target/i386/no-callee-saved-16.c      | 16 +++++
> >  .../gcc.target/i386/no-callee-saved-17.c      | 16 +++++
> >  .../gcc.target/i386/no-callee-saved-18.c      | 51 ++++++++++++++++
> >  .../gcc.target/i386/no-callee-saved-2.c       | 30 +++++++++
> >  .../gcc.target/i386/no-callee-saved-3.c       |  8 +++
> >  .../gcc.target/i386/no-callee-saved-4.c       |  8 +++
> >  .../gcc.target/i386/no-callee-saved-5.c       | 11 ++++
> >  .../gcc.target/i386/no-callee-saved-6.c       | 12 ++++
> >  .../gcc.target/i386/no-callee-saved-7.c       | 49 +++++++++++++++
> >  .../gcc.target/i386/no-callee-saved-8.c       | 50 +++++++++++++++
> >  .../gcc.target/i386/no-callee-saved-9.c       | 49 +++++++++++++++
> >  gcc/testsuite/gcc.target/i386/pr38534-1.c     | 26 ++++++++
> >  gcc/testsuite/gcc.target/i386/pr38534-2.c     | 18 ++++++
> >  gcc/testsuite/gcc.target/i386/pr38534-3.c     | 19 ++++++
> >  gcc/testsuite/gcc.target/i386/pr38534-4.c     | 18 ++++++
> >  .../gcc.target/i386/stack-check-17.c          | 19 +++---
> >  30 files changed, 775 insertions(+), 47 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1a.c
> >  create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1b.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-10.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-11.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-12.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-13.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-14.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-15.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-16.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-17.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-18.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-5.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-6.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-7.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-8.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-9.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-4.c
> >
> > --
> > 2.43.0
> >
>
>
> --
> BR,
> Hongtao



--
H.J.