From: "Kewen.Lin"
Date: Wed, 28 Sep 2022 13:37:48 +0800
Subject: PING^1 [PATCH v4] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]
To: GCC Patches
Cc: David Edelsohn, Segher Boessenkool, AlanM

Hi,

Gentle ping:

https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600277.html

BR,
Kewen

on 2022/8/25 13:50, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> As PR99888 and its related PRs show, the current support for
> -fpatchable-function-entry on powerpc ELFv2 doesn't work
> well when a function has a global entry point.  For example,
> with the command line option -fpatchable-function-entry=3,2,
> we get the following without this patch:
>
>   .LPFE1:
>           nop
>           nop
>           .type   foo, @function
>   foo:
>           nop
>   .LFB0:
>           .cfi_startproc
>   .LCF0:
>   0:      addis 2,12,.TOC.-.LCF0@ha
>           addi 2,2,.TOC.-.LCF0@l
>           .localentry     foo,.-foo
>
> This assembly is unexpected, since the patched nops have
> no effect when the function is entered through the local
> entry point.
>
> This patch emits the patched nops just before and after the
> local entry point instead, so the output looks like:
>
>           .type   foo, @function
>   foo:
>   .LFB0:
>           .cfi_startproc
>   .LCF0:
>   0:      addis 2,12,.TOC.-.LCF0@ha
>           addi 2,2,.TOC.-.LCF0@l
>           nop
>           nop
>           .localentry     foo,.-foo
>           nop
>
> Bootstrapped and regtested on powerpc64-linux-gnu P7 & P8,
> and powerpc64le-linux-gnu P9 & P10.
>
> v4: Change the remaining NOP to nop and update documentation of option
>     -fpatchable-function-entry for PowerPC ELFv2 ABI dual entry points
>     as Segher suggested.
>
> v3: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599925.html
>
> v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599617.html
>
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599461.html
>
> Is it ok for trunk?
>
> BR,
> Kewen
> ----
>         PR target/99888
>         PR target/105649
>
> gcc/ChangeLog:
>
>         * doc/invoke.texi (option -fpatchable-function-entry): Adjust the
>         documentation for PowerPC ELFv2 ABI dual entry points.
>         * config/rs6000/rs6000-internal.h
>         (rs6000_print_patchable_function_entry): New function declaration.
>         * config/rs6000/rs6000-logue.cc (rs6000_output_function_prologue):
>         Support patchable-function-entry by emitting nops before and after
>         local entry for the function that needs global entry.
>         * config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry): Skip
>         the function that needs global entry till global entry has been
>         emitted.
>         * config/rs6000/rs6000.h (struct machine_function): New bool member
>         global_entry_emitted.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/powerpc/pr99888-1.c: New test.
>         * gcc.target/powerpc/pr99888-2.c: New test.
>         * gcc.target/powerpc/pr99888-3.c: New test.
>         * gcc.target/powerpc/pr99888-4.c: New test.
>         * gcc.target/powerpc/pr99888-5.c: New test.
>         * gcc.target/powerpc/pr99888-6.c: New test.
>         * c-c++-common/patchable_function_entry-default.c: Adjust for
>         powerpc_elfv2 to avoid compilation error.
> ---
>  gcc/config/rs6000/rs6000-internal.h           |  5 +++
>  gcc/config/rs6000/rs6000-logue.cc             | 32 ++++++++++++++
>  gcc/config/rs6000/rs6000.cc                   | 10 ++++-
>  gcc/config/rs6000/rs6000.h                    |  4 ++
>  gcc/doc/invoke.texi                           |  8 +++-
>  .../patchable_function_entry-default.c        |  3 ++
>  gcc/testsuite/gcc.target/powerpc/pr99888-1.c  | 43 +++++++++++++++++++
>  gcc/testsuite/gcc.target/powerpc/pr99888-2.c  | 43 +++++++++++++++++++
>  gcc/testsuite/gcc.target/powerpc/pr99888-3.c  | 11 +++++
>  gcc/testsuite/gcc.target/powerpc/pr99888-4.c  | 13 ++++++
>  gcc/testsuite/gcc.target/powerpc/pr99888-5.c  | 13 ++++++
>  gcc/testsuite/gcc.target/powerpc/pr99888-6.c  | 14 ++++++
>  12 files changed, 195 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-2.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-3.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-4.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-5.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-6.c
>
> diff --git a/gcc/config/rs6000/rs6000-internal.h b/gcc/config/rs6000/rs6000-internal.h
> index 8ee8c987b81..d80c04b5ae5 100644
> --- a/gcc/config/rs6000/rs6000-internal.h
> +++ b/gcc/config/rs6000/rs6000-internal.h
> @@ -183,10 +183,15 @@ extern tree rs6000_fold_builtin (tree fndecl ATTRIBUTE_UNUSED,
>                                   tree *args ATTRIBUTE_UNUSED,
>                                   bool ignore ATTRIBUTE_UNUSED);
>
> +extern void rs6000_print_patchable_function_entry (FILE *,
> +                                                   unsigned HOST_WIDE_INT,
> +                                                   bool);
> +
>  extern bool rs6000_passes_float;
>  extern bool rs6000_passes_long_double;
>  extern bool rs6000_passes_vector;
>  extern bool rs6000_returns_struct;
>  extern bool cpu_builtin_p;
>
> +
>  #endif
> diff --git a/gcc/config/rs6000/rs6000-logue.cc b/gcc/config/rs6000/rs6000-logue.cc
> index 59fe1c8cb8b..51f55d1d527 100644
> --- a/gcc/config/rs6000/rs6000-logue.cc
> +++ b/gcc/config/rs6000/rs6000-logue.cc
> @@ -4013,11 +4013,43 @@ rs6000_output_function_prologue (FILE *file)
>            fprintf (file, "\tadd 2,2,12\n");
>          }
>
> +      unsigned short patch_area_size = crtl->patch_area_size;
> +      unsigned short patch_area_entry = crtl->patch_area_entry;
> +      /* Need to emit the patching area.  */
> +      if (patch_area_size > 0)
> +        {
> +          cfun->machine->global_entry_emitted = true;
> +          /* As ELFv2 ABI shows, the allowable bytes between the global
> +             and local entry points are 0, 4, 8, 16, 32 and 64 when
> +             there is a local entry point.  Considering there are two
> +             non-prefixed instructions for global entry point prologue
> +             (8 bytes), the count for patchable nops before local entry
> +             point would be 2, 6 and 14.  It's possible to support those
> +             other counts of nops by not making a local entry point, but
> +             we don't have clear user cases for them, so leave them
> +             unsupported for now.  */
> +          if (patch_area_entry > 0)
> +            {
> +              if (patch_area_entry != 2
> +                  && patch_area_entry != 6
> +                  && patch_area_entry != 14)
> +                error ("unsupported number of nops before function entry (%u)",
> +                       patch_area_entry);
> +              rs6000_print_patchable_function_entry (file, patch_area_entry,
> +                                                     true);
> +              patch_area_size -= patch_area_entry;
> +            }
> +        }
> +
>        fputs ("\t.localentry\t", file);
>        assemble_name (file, name);
>        fputs (",.-", file);
>        assemble_name (file, name);
>        fputs ("\n", file);
> +      /* Emit the nops after local entry.  */
> +      if (patch_area_size > 0)
> +        rs6000_print_patchable_function_entry (file, patch_area_size,
> +                                               patch_area_entry == 0);
>      }
>
>    else if (rs6000_pcrel_p ())
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 777a06599c3..c79a0ee7a49 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -14898,8 +14898,14 @@ rs6000_print_patchable_function_entry (FILE *file,
>    if (!(TARGET_64BIT && DEFAULT_ABI != ABI_ELFv2)
>        && HAVE_GAS_SECTION_LINK_ORDER)
>      flags |= SECTION_LINK_ORDER;
> -  default_print_patchable_function_entry_1 (file, patch_area_size, record_p,
> -                                            flags);
> +  bool global_entry_needed_p = rs6000_global_entry_point_prologue_needed_p ();
> +  /* For a function which needs global entry point, we will emit the
> +     patchable area before and after local entry point under the control of
> +     cfun->machine->global_entry_emitted, see the handling in function
> +     rs6000_output_function_prologue.  */
> +  if (!global_entry_needed_p || cfun->machine->global_entry_emitted)
> +    default_print_patchable_function_entry_1 (file, patch_area_size, record_p,
> +                                              flags);
>  }
>
>
>  enum rtx_code
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index ad9bf0f7358..c352421e87c 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -2439,6 +2439,10 @@ typedef struct GTY(()) machine_function
>    bool lr_is_wrapped_separately;
>    bool toc_is_wrapped_separately;
>    bool mma_return_type_error;
> +  /* Indicate global entry is emitted, only useful when the function requires
> +     global entry.  It helps to control the patchable area before and after
> +     local entry.  */
> +  bool global_entry_emitted;
>  } machine_function;
>  #endif
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index b7c75028789..c98f5ceb0b9 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -16717,9 +16717,13 @@ the area size or to remove it completely on a single function.
>  If @code{N=0}, no pad location is recorded.
>
>  The NOP instructions are inserted at---and maybe before, depending on
> -@var{M}---the function entry address, even before the prologue.
> +@var{M}---the function entry address, even before the prologue.  On
> +PowerPC with the ELFv2 ABI, for one function with dual entry points,
> +the local entry point is taken as the function entry for generation.
>
> -The maximum value of @var{N} and @var{M} is 65535.
> +The maximum value of @var{N} and @var{M} is 65535.  On PowerPC with the
> +ELFv2 ABI, for one function with dual entry points, the supported values
> +for @var{M} are 0, 2, 6 and 14.
>  @end table
>
>
> diff --git a/gcc/testsuite/c-c++-common/patchable_function_entry-default.c b/gcc/testsuite/c-c++-common/patchable_function_entry-default.c
> index 7036f7bfbea..a501efccb19 100644
> --- a/gcc/testsuite/c-c++-common/patchable_function_entry-default.c
> +++ b/gcc/testsuite/c-c++-common/patchable_function_entry-default.c
> @@ -1,6 +1,9 @@
>  /* { dg-do compile { target { ! { nvptx*-*-* visium-*-* } } } } */
>  /* { dg-options "-O2 -fpatchable-function-entry=3,1" } */
>  /* { dg-additional-options "-fno-pie" { target sparc*-*-* } } */
> +/* See PR99888, one single preceding nop isn't allowed on powerpc_elfv2,
> +   so overriding with two preceding nops to make it pass there.  */
> +/* { dg-additional-options "-fpatchable-function-entry=3,2" { target powerpc_elfv2 } } */
>  /* { dg-final { scan-assembler-times "nop|NOP|SWYM" 3 { target { ! { alpha*-*-* } } } } } */
>  /* { dg-final { scan-assembler-times "bis" 3 { target alpha*-*-* } } } */
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-1.c b/gcc/testsuite/gcc.target/powerpc/pr99888-1.c
> new file mode 100644
> index 00000000000..9370b4e7438
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-1.c
> @@ -0,0 +1,43 @@
> +/* Verify no errors for different nops after local entry on ELFv2.  */
> +
> +extern int a;
> +
> +__attribute__ ((noipa, patchable_function_entry (1, 0)))
> +int test1 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (2, 0)))
> +int test2 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (3, 0)))
> +int test3 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (4, 0)))
> +int test4 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (5, 0)))
> +int test5 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (6, 0)))
> +int test6 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (7, 0)))
> +int test7 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (8, 0)))
> +int test8 (int b) {
> +  return a + b;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-2.c b/gcc/testsuite/gcc.target/powerpc/pr99888-2.c
> new file mode 100644
> index 00000000000..45061712602
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-2.c
> @@ -0,0 +1,43 @@
> +/* Verify no errors for 2, 6 and 14 nops before local entry on ELFv2.  */
> +
> +extern int a;
> +
> +__attribute__ ((noipa, patchable_function_entry (2, 2)))
> +int test1 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (4, 2)))
> +int test2 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (6, 6)))
> +int test3 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (8, 6)))
> +int test4 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (16, 6)))
> +int test5 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (14, 14)))
> +int test6 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (28, 14)))
> +int test7 (int b) {
> +  return a + b;
> +}
> +
> +__attribute__ ((noipa, patchable_function_entry (64, 14)))
> +int test8 (int b) {
> +  return a + b;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-3.c b/gcc/testsuite/gcc.target/powerpc/pr99888-3.c
> new file mode 100644
> index 00000000000..4531ae32036
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-3.c
> @@ -0,0 +1,11 @@
> +/* { dg-options "-fpatchable-function-entry=1" } */
> +
> +/* Verify no errors on ELFv2, using command line option instead of
> +   function attribute.  */
> +
> +extern int a;
> +
> +int test (int b) {
> +  return a + b;
> +}
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-4.c b/gcc/testsuite/gcc.target/powerpc/pr99888-4.c
> new file mode 100644
> index 00000000000..00a8d4d316e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-4.c
> @@ -0,0 +1,13 @@
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +/* There is no global entry point prologue with pcrel.  */
> +/* { dg-options "-mno-pcrel -fpatchable-function-entry=1,1" } */
> +
> +/* Verify one error emitted for unexpected 1 nop before local
> +   entry.  */
> +
> +extern int a;
> +
> +int test (int b) {
> +  return a + b;
> +}
> +/* { dg-error "unsupported number of nops before function entry \\(1\\)" "" { target *-*-* } .-1 } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-5.c b/gcc/testsuite/gcc.target/powerpc/pr99888-5.c
> new file mode 100644
> index 00000000000..39d3b4465f1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-5.c
> @@ -0,0 +1,13 @@
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +/* There is no global entry point prologue with pcrel.  */
> +/* { dg-options "-mno-pcrel -fpatchable-function-entry=7,3" } */
> +
> +/* Verify one error emitted for unexpected 3 nops before local
> +   entry.  */
> +
> +extern int a;
> +
> +int test (int b) {
> +  return a + b;
> +}
> +/* { dg-error "unsupported number of nops before function entry \\(3\\)" "" { target *-*-* } .-1 } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr99888-6.c b/gcc/testsuite/gcc.target/powerpc/pr99888-6.c
> new file mode 100644
> index 00000000000..c6c18dcc7ac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr99888-6.c
> @@ -0,0 +1,14 @@
> +/* { dg-require-effective-target powerpc_elfv2 } */
> +/* There is no global entry point prologue with pcrel.  */
> +/* { dg-options "-mno-pcrel" } */
> +
> +/* Verify one error emitted for unexpected 4 nops before local
> +   entry.  */
> +
> +extern int a;
> +
> +__attribute__ ((patchable_function_entry (20, 4)))
> +int test (int b) {
> +  return a + b;
> +}
> +/* { dg-error "unsupported number of nops before function entry \\(4\\)" "" { target *-*-* } .-1 } */
> --
> 2.27.0
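
For anyone following the arithmetic in the rs6000-logue.cc comment, below is
a small standalone sketch (not part of the patch) of why only 0, 2, 6 and 14
nops are accepted before the local entry point.  It assumes 4-byte nops and
the 8-byte global entry prologue mentioned in that comment, and takes the
list of allowed global-to-local offsets from the same comment.  The helper
name nops_before_local_entry_ok and the constants are made up for
illustration; GCC itself just compares against 2, 6 and 14 directly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Byte distances between the global and local entry points that the
   comment in the patch lists as allowed by the ELFv2 ABI.  */
static const unsigned allowed_offsets[] = { 0, 4, 8, 16, 32, 64 };

/* Assumptions taken from the patch comment: the global entry prologue
   is two non-prefixed instructions (8 bytes) and each nop is 4 bytes.  */
enum { GLOBAL_PROLOGUE_BYTES = 8, NOP_BYTES = 4 };

/* Hypothetical helper: return true if M nops placed between the global
   entry prologue and the local entry point give an allowed distance.  */
static bool
nops_before_local_entry_ok (unsigned m)
{
  unsigned offset = GLOBAL_PROLOGUE_BYTES + m * NOP_BYTES;
  for (size_t i = 0; i < sizeof allowed_offsets / sizeof allowed_offsets[0]; i++)
    if (allowed_offsets[i] == offset)
      return true;
  return false;
}
```

Under these assumptions the helper accepts M = 0, 2, 6 and 14 (distances of
8, 16, 32 and 64 bytes) and rejects, for example, 1, 3 and 4, which are the
values pr99888-4.c to pr99888-6.c expect an error for.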