From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 78502 invoked by alias); 13 Aug 2018 23:58:16 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 78487 invoked by uid 89); 13 Aug 2018 23:58:15 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00,GIT_PATCH_0,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=10,000, 12a, truly, sep X-HELO: aserp2130.oracle.com Received: from aserp2130.oracle.com (HELO aserp2130.oracle.com) (141.146.126.79) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 13 Aug 2018 23:58:10 +0000 Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.22/8.16.0.22) with SMTP id w7DNsXAI042301; Mon, 13 Aug 2018 23:58:08 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : from : to : cc : references : message-id : date : mime-version : in-reply-to : content-type; s=corp-2018-07-02; bh=9VRW53rj0bazeN6khbFT8fziwhBusJv9ZEdTrS1iq8Q=; b=AFKu/hFjUtTkXC32GIU12I+z+veNk8RuRVEjJJSxEpIEWK1XYO6zqTuUEX0Rd0PSuLYA 02Yc0t6ClfgD10G/D2aIURhS5oXW10UEJW5nprvCm5az9zcadP2JqVuXLXQSQaJGWML1 M749sgm05WT6K0UzfP9sxlwPP1T110cVAGi84y1uJmyzuQS3GzqlQ8gJ762KYF9Kar4Z llMh/lzwJkqf3pzCtxNzowR64u1O0OHQVZJyq6hSTOywbWkSA3mxrJw8u3DTXTpO3q65 v4L94t9ce6ZSETf7UvuJZ70Pd581BS+LCkZ/9hvtJc7yVTbTlKACzxcxJsJWB6aAtX4U 2w== Received: from userv0021.oracle.com (userv0021.oracle.com [156.151.31.71]) by aserp2130.oracle.com with ESMTP id 2ksnacxhgs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 13 Aug 2018 23:58:07 +0000 Received: from userv0122.oracle.com (userv0122.oracle.com [156.151.31.75]) by userv0021.oracle.com (8.14.4/8.14.4) with ESMTP id w7DNw6WB032252 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 13 Aug 2018 23:58:06 GMT Received: from abhmp0016.oracle.com (abhmp0016.oracle.com [141.146.116.22]) by userv0122.oracle.com (8.14.4/8.14.4) with ESMTP id w7DNw68t010478; Mon, 13 Aug 2018 23:58:06 GMT Received: from [10.39.248.140] (/10.39.248.140) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 13 Aug 2018 16:58:05 -0700 Subject: [PING][PATCH] Make function clone name numbering independent. From: Michael Ploujnikov To: Richard Biener Cc: Bernhard Fischer , GCC Patches , hubicka@ucw.cz, Sriraman Tallam References: <6A1E6E88-63F7-43E4-8A53-78C7694D34BE@gmail.com> <069cb2a6-2e37-58fb-2009-35e0300270f1@oracle.com> <618e046e-3f28-f008-237d-497bcff2531e@oracle.com> <121c01f4-c1a2-249e-2cac-c6d5ec250fcb@oracle.com> <702ab3f1-fe25-9013-7a1e-f5e1615c9396@oracle.com> <85bd8119-0bce-a06d-df3f-1a1a6ed88187@oracle.com> Openpgp: preference=signencrypt Message-ID: <3167e521-aa5b-e47c-6d9b-956a1af2a886@oracle.com> Date: Mon, 13 Aug 2018 23:58:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="Wx6z3LXerzgi60klt99l4E1Pb2J0qUIp6" X-IsSubscribed: yes X-SW-Source: 2018-08/txt/msg00776.txt.bz2 This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --Wx6z3LXerzgi60klt99l4E1Pb2J0qUIp6 Content-Type: multipart/mixed; boundary="ITsagRFYGVPKq4W7mLWSYVeE9plV5YfFt"; protected-headers="v1" From: Michael Ploujnikov To: Richard Biener Cc: Bernhard Fischer , GCC Patches , hubicka@ucw.cz, Sriraman Tallam Message-ID: <3167e521-aa5b-e47c-6d9b-956a1af2a886@oracle.com> Subject: [PING][PATCH] Make function clone name numbering independent. References: <6A1E6E88-63F7-43E4-8A53-78C7694D34BE@gmail.com> <069cb2a6-2e37-58fb-2009-35e0300270f1@oracle.com> <618e046e-3f28-f008-237d-497bcff2531e@oracle.com> <121c01f4-c1a2-249e-2cac-c6d5ec250fcb@oracle.com> <702ab3f1-fe25-9013-7a1e-f5e1615c9396@oracle.com> <85bd8119-0bce-a06d-df3f-1a1a6ed88187@oracle.com> In-Reply-To: --ITsagRFYGVPKq4W7mLWSYVeE9plV5YfFt Content-Type: multipart/mixed; boundary="------------923C9FABC8F498159BD89C87" Content-Language: en-US This is a multi-part message in MIME format. --------------923C9FABC8F498159BD89C87 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Content-length: 35236 Ping and I've updated the patch since last time as follows: - unittest scans assembly rather than the constprop dump because its forward changed - unittests should handle different hosts where any of NO_DOT_IN_LABEL, NO_DOLLAR_IN_LABEL or __USER_LABEL_PREFIX__ may be defined - not 100% it's safe to change DECL_NAME to DECL_ASSEMBLER_NAME in cgraph_node::create_virtual_clone, but I've attempted to reduce some code duplication - lto-partition.c: privatize_symbol_name_1 *does* need numbered names - but cold sections don't - Expecting an IDENTIFIER_NODE in clone_function_name_1 avoids unreliable string pointer use as pointed out in the first review - renamed clone_function_name_1 and clone_function_name to numbered_clone_function_name_1 and numbered_clone_function_name to clarify purpose and discourage future unintended uses Also bootstrapped and regtested. - Michael On 2018-08-02 03:05 PM, Michael Ploujnikov wrote: > On 2018-08-01 06:37 AM, Richard Biener wrote: >> On Tue, Jul 31, 2018 at 7:40 PM Michael Ploujnikov >> wrote: >>> >>> On 2018-07-26 01:27 PM, Michael Ploujnikov wrote: >>>> On 2018-07-24 09:57 AM, Michael Ploujnikov wrote: >>>>> On 2018-07-20 06:05 AM, Richard Biener wrote: >>>>>>> /* Return a new assembler name for a clone with SUFFIX of a decl n= amed >>>>>>> NAME. */ >>>>>>> @@ -521,14 +521,13 @@ tree >>>>>>> clone_function_name_1 (const char *name, const char *suffix) >>>>>> >>>>>> pass this function the counter to use.... >>>>>> >>>>>>> { >>>>>>> size_t len =3D strlen (name); >>>>>>> - char *tmp_name, *prefix; >>>>>>> + char *prefix; >>>>>>> >>>>>>> prefix =3D XALLOCAVEC (char, len + strlen (suffix) + 2); >>>>>>> memcpy (prefix, name, len); >>>>>>> strcpy (prefix + len + 1, suffix); >>>>>>> prefix[len] =3D symbol_table::symbol_suffix_separator (); >>>>>>> - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); >>>>>> >>>>>> and keep using ASM_FORMAT_PRIVATE_NAME here. You need to change >>>>>> the lto/lto-partition.c caller (just use zero as counter). >>>>>> >>>>>>> - return get_identifier (tmp_name); >>>>>>> + return get_identifier (prefix); >>>>>>> } >>>>>>> >>>>>>> /* Return a new assembler name for a clone of DECL with SUFFIX. */ >>>>>>> @@ -537,7 +536,17 @@ tree >>>>>>> clone_function_name (tree decl, const char *suffix) >>>>>>> { >>>>>>> tree name =3D DECL_ASSEMBLER_NAME (decl); >>>>>>> - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); >>>>>>> + const char *decl_name =3D IDENTIFIER_POINTER (name); >>>>>>> + char *numbered_name; >>>>>>> + unsigned int *suffix_counter; >>>>>>> + if (!clone_fn_ids) { >>>>>>> + /* Initialize the per-function counter hash table if this is t= he first call */ >>>>>>> + clone_fn_ids =3D hash_map::create_ggc = (64); >>>>>>> + } >>>>>> >>>>>> I still do not like throwing memory at the problem in this way for t= he >>>>>> little benefit >>>>>> this change provides :/ >>>>>> >>>>>> So no approval from me at this point... >>>>>> >>>>>> Richard. >>>>> >>>>> Can you give me an idea of the memory constraints that are involved? >>>>> >>>>> The highest memory usage increase that I could find was when compiling >>>>> a source file (from Linux) with roughly 10,000 functions. It showed a= 2kB >>>>> increase over the before-patch use of 6936kB which is barely 0.03%. >>>>> >>>>> Using a single counter can result in more confusing namespacing when >>>>> you have .bar.clone.4 despite there only being 3 clones of .bar. >>>>> >>>>> From a practical point of view this change is helpful to anyone >>>>> diffing binary output such as forensic analysts, Debian Reproducible >>>>> Builds or even someone validating compiler output (before and after a= n input >>>>> source patch). The extra changes that this patch alleviates are a >>>>> distraction and could even be misleading. For example, applying a >>>>> source patch to the same Linux source produces the following binary >>>>> diff before my change: >>>>> >>>>> --- /tmp/output.o.objdump >>>>> +++ /tmp/patched-output.o.objdump >>>>> @@ -1,5 +1,5 @@ >>>>> >>>>> -/tmp/uverbs_cmd/output.o: file format elf32-i386 >>>>> +/tmp/uverbs_cmd/patched-output.o: file format elf32-i386 >>>>> >>>>> >>>>> Disassembly of section .text.get_order: >>>>> @@ -265,12 +265,12 @@ >>>>> 3: e9 fc ff ff ff jmp 4 >>>>> 4: R_386_PC32 .text.put_uobj_read >>>>> >>>>> -Disassembly of section .text.trace_kmalloc.constprop.3: >>>>> +Disassembly of section .text.trace_kmalloc.constprop.4: >>>>> >>>>> -00000000 : >>>>> +00000000 : >>>>> 0: 83 3d 04 00 00 00 00 cmpl $0x0,0x4 >>>>> 2: R_386_32 __tracepoint_kmalloc >>>>> - 7: 74 34 je 3d >>>>> + 7: 74 34 je 3d >>>>> 9: 55 push %ebp >>>>> a: 89 cd mov %ecx,%ebp >>>>> c: 57 push %edi >>>>> @@ -281,7 +281,7 @@ >>>>> 13: 8b 1d 10 00 00 00 mov 0x10,%ebx >>>>> 15: R_386_32 __tracepoint_kmalloc >>>>> 19: 85 db test %ebx,%ebx >>>>> - 1b: 74 1b je 38 >>>>> + 1b: 74 1b je 38 >>>>> 1d: 68 d0 00 00 00 push $0xd0 >>>>> 22: 89 fa mov %edi,%edx >>>>> 24: 89 f0 mov %esi,%eax >>>>> @@ -292,7 +292,7 @@ >>>>> 31: 58 pop %eax >>>>> 32: 83 3b 00 cmpl $0x0,(%ebx) >>>>> 35: 5a pop %edx >>>>> - 36: eb e3 jmp 1b >>>>> + 36: eb e3 jmp 1b >>>>> 38: 5b pop %ebx >>>>> 39: 5e pop %esi >>>>> 3a: 5f pop %edi >>>>> @@ -846,7 +846,7 @@ >>>>> 78: b8 5f 00 00 00 mov $0x5f,%eax >>>>> 79: R_386_32 .text.ib_uverbs_alloc_pd >>>>> 7d: e8 fc ff ff ff call 7e >>>>> - 7e: R_386_PC32 .text.trace_kmalloc.constprop.3 >>>>> + 7e: R_386_PC32 .text.trace_kmalloc.constprop.4 >>>>> 82: c7 45 d4 f4 ff ff ff movl $0xfffffff4,-0x2c(%ebp) >>>>> 89: 59 pop %ecx >>>>> 8a: 85 db test %ebx,%ebx >>>>> @@ -1068,7 +1068,7 @@ >>>>> 9c: b8 83 00 00 00 mov $0x83,%eax >>>>> 9d: R_386_32 .text.ib_uverbs_reg_mr >>>>> a1: e8 fc ff ff ff call a2 >>>>> - a2: R_386_PC32 .text.trace_kmalloc.constprop.3 >>>>> + a2: R_386_PC32 .text.trace_kmalloc.constprop.4 >>>>> a6: ba f4 ff ff ff mov $0xfffffff4,%edx >>>>> ab: 58 pop %eax >>>>> ac: 85 db test %ebx,%ebx >>>>> @@ -1385,7 +1385,7 @@ >>>>> 99: b8 7b 00 00 00 mov $0x7b,%eax >>>>> 9a: R_386_32 .text.ib_uverbs_create_cq >>>>> 9e: e8 fc ff ff ff call 9f >>>>> - 9f: R_386_PC32 .text.trace_kmalloc.constprop.3 >>>>> + 9f: R_386_PC32 .text.trace_kmalloc.constprop.4 >>>>> a3: 58 pop %eax >>>>> a4: 85 db test %ebx,%ebx >>>>> a6: 75 0a jne b2 >>>>> @@ -1607,129 +1607,107 @@ >>>>> 00000000 : >>>>> 0: 55 push %ebp >>>>> 1: 57 push %edi >>>>> - 2: 89 c7 mov %eax,%edi >>>>> - 4: 56 push %esi >>>>> - 5: 89 ce mov %ecx,%esi >>>>> - 7: b9 10 00 00 00 mov $0x10,%ecx >>>>> - c: 53 push %ebx >>>>> - d: 83 ec 18 sub $0x18,%esp >>>>> - 10: 8d 44 24 08 lea 0x8(%esp),%eax >>>>> - 14: e8 fc ff ff ff call 15 >>>>> - 15: R_386_PC32 copy_from_user >>>>> - 19: 85 c0 test %eax,%eax >>>>> - 1b: 0f 85 34 01 00 00 jne 155 >>>>> - 21: 6b 44 24 14 34 imul $0x34,0x14(%esp),%eax >>>>> - 26: ba d0 00 00 00 mov $0xd0,%edx >>>>> - 2b: e8 fc ff ff ff call 2c >>>>> - 2c: R_386_PC32 __kmalloc >>>>> - 30: 89 c5 mov %eax,%ebp >>>>> - 32: 85 c0 test %eax,%eax >>>>> - 34: 0f 84 22 01 00 00 je 15c >>>>> - 3a: 6b 44 24 14 30 imul $0x30,0x14(%esp),%eax >>>>> - 3f: ba d0 00 00 00 mov $0xd0,%edx >>>>> - 44: 83 c0 08 add $0x8,%eax >>>>> - 47: 89 44 24 04 mov %eax,0x4(%esp) >>>>> - 4b: e8 fc ff ff ff call 4c >>>>> - 4c: R_386_PC32 __kmalloc >>>>> - 50: ba f4 ff ff ff mov $0xfffffff4,%edx >>>>> - 55: 89 04 24 mov %eax,(%esp) >>>>> - 58: 85 c0 test %eax,%eax >>>>> - 5a: 0f 84 e1 00 00 00 je 141 >>>>> - 60: 8b 4f 58 mov 0x58(%edi),%ecx >>>>> - 63: 6a 00 push $0x0 >>>>> - 65: b8 00 00 00 00 mov $0x0,%eax >>>>> - 66: R_386_32 ib_uverbs_cq_idr >>>>> - 6a: 8b 54 24 14 mov 0x14(%esp),%edx >>>>> - 6e: e8 fc ff ff ff call 6f >>>>> - 6f: R_386_PC32 .text.idr_read_obj >>>>> - 73: ba ea ff ff ff mov $0xffffffea,%edx >>>>> - 78: 89 c7 mov %eax,%edi >>>>> - 7a: 58 pop %eax >>>>> - 7b: 85 ff test %edi,%edi >>>>> - 7d: 0f 84 ae 00 00 00 je 131 >>>>> - 83: 8b 1f mov (%edi),%ebx >>>>> - 85: 8b 54 24 14 mov 0x14(%esp),%edx >>>>> - 89: 89 e9 mov %ebp,%ecx >>>>> - 8b: 89 f8 mov %edi,%eax >>>>> - 8d: ff 93 70 01 00 00 call *0x170(%ebx) >>>>> - 93: 8b 1c 24 mov (%esp),%ebx >>>>> - 96: 89 03 mov %eax,(%ebx) >>>>> - 98: 89 f8 mov %edi,%eax >>>>> - 9a: e8 fc ff ff ff call 9b >>>>> - 9b: R_386_PC32 .text.put_cq_read >>>>> - 9f: 8b 1c 24 mov (%esp),%ebx >>>>> - a2: 89 e8 mov %ebp,%eax >>>>> - a4: 6b 3b 34 imul $0x34,(%ebx),%edi >>>>> - a7: 8d 53 08 lea 0x8(%ebx),%edx >>>>> - aa: 01 ef add %ebp,%edi >>>>> - ac: 39 f8 cmp %edi,%eax >>>>> - ae: 74 67 je 117 >>>>> - b0: 8b 08 mov (%eax),%ecx >>>>> - b2: 8b 58 04 mov 0x4(%eax),%ebx >>>>> - b5: 83 c2 30 add $0x30,%edx >>>>> - b8: 83 c0 34 add $0x34,%eax >>>>> - bb: 89 4a d0 mov %ecx,-0x30(%edx) >>>>> - be: 89 5a d4 mov %ebx,-0x2c(%edx) >>>>> - c1: 8b 48 d4 mov -0x2c(%eax),%ecx >>>>> - c4: 89 4a d8 mov %ecx,-0x28(%edx) >>>>> - c7: 8b 48 d8 mov -0x28(%eax),%ecx >>>>> - ca: 89 4a dc mov %ecx,-0x24(%edx) >>>>> - cd: 8b 48 dc mov -0x24(%eax),%ecx >>>>> - d0: 89 4a e0 mov %ecx,-0x20(%edx) >>>>> - d3: 8b 48 e0 mov -0x20(%eax),%ecx >>>>> - d6: 89 4a e4 mov %ecx,-0x1c(%edx) >>>>> - d9: 8b 48 e8 mov -0x18(%eax),%ecx >>>>> - dc: 89 4a e8 mov %ecx,-0x18(%edx) >>>>> - df: 8b 48 e4 mov -0x1c(%eax),%ecx >>>>> - e2: 8b 49 20 mov 0x20(%ecx),%ecx >>>>> - e5: 89 4a ec mov %ecx,-0x14(%edx) >>>>> - e8: 8b 48 ec mov -0x14(%eax),%ecx >>>>> - eb: 89 4a f0 mov %ecx,-0x10(%edx) >>>>> - ee: 8b 48 f0 mov -0x10(%eax),%ecx >>>>> - f1: 89 4a f4 mov %ecx,-0xc(%edx) >>>>> - f4: 8b 48 f4 mov -0xc(%eax),%ecx >>>>> - f7: 66 89 4a f8 mov %cx,-0x8(%edx) >>>>> - fb: 66 8b 48 f6 mov -0xa(%eax),%cx >>>>> - ff: 66 89 4a fa mov %cx,-0x6(%edx) >>>>> - 103: 8a 48 f8 mov -0x8(%eax),%cl >>>>> - 106: 88 4a fc mov %cl,-0x4(%edx) >>>>> - 109: 8a 48 f9 mov -0x7(%eax),%cl >>>>> - 10c: 88 4a fd mov %cl,-0x3(%edx) >>>>> - 10f: 8a 48 fa mov -0x6(%eax),%cl >>>>> - 112: 88 4a fe mov %cl,-0x2(%edx) >>>>> - 115: eb 95 jmp ac >>>>> - 117: 8b 14 24 mov (%esp),%edx >>>>> - 11a: 8b 4c 24 04 mov 0x4(%esp),%ecx >>>>> - 11e: 8b 44 24 08 mov 0x8(%esp),%eax >>>>> - 122: e8 fc ff ff ff call 123 >>>>> - 123: R_386_PC32 copy_to_user >>>>> - 127: 83 f8 01 cmp $0x1,%eax >>>>> - 12a: 19 d2 sbb %edx,%edx >>>>> - 12c: f7 d2 not %edx >>>>> - 12e: 83 e2 f2 and $0xfffffff2,%edx >>>>> - 131: 8b 04 24 mov (%esp),%eax >>>>> - 134: 89 54 24 04 mov %edx,0x4(%esp) >>>>> - 138: e8 fc ff ff ff call 139 >>>>> - 139: R_386_PC32 kfree >>>>> - 13d: 8b 54 24 04 mov 0x4(%esp),%edx >>>>> - 141: 89 e8 mov %ebp,%eax >>>>> - 143: 89 14 24 mov %edx,(%esp) >>>>> - 146: e8 fc ff ff ff call 147 >>>>> - 147: R_386_PC32 kfree >>>>> - 14b: 8b 14 24 mov (%esp),%edx >>>>> - 14e: 85 d2 test %edx,%edx >>>>> - 150: 0f 45 f2 cmovne %edx,%esi >>>>> - 153: eb 0c jmp 161 >>>>> - 155: be f2 ff ff ff mov $0xfffffff2,%esi >>>>> - 15a: eb 05 jmp 161 >>>>> - 15c: be f4 ff ff ff mov $0xfffffff4,%esi >>>>> - 161: 83 c4 18 add $0x18,%esp >>>>> - 164: 89 f0 mov %esi,%eax >>>>> - 166: 5b pop %ebx >>>>> - 167: 5e pop %esi >>>>> - 168: 5f pop %edi >>>>> - 169: 5d pop %ebp >>>>> - 16a: c3 ret >>>>> + 2: bf f2 ff ff ff mov $0xfffffff2,%edi >>>>> + 7: 56 push %esi >>>>> + 8: 53 push %ebx >>>>> + 9: 89 c3 mov %eax,%ebx >>>>> + b: 81 ec 84 00 00 00 sub $0x84,%esp >>>>> + 11: 89 4c 24 04 mov %ecx,0x4(%esp) >>>>> + 15: 8d 44 24 10 lea 0x10(%esp),%eax >>>>> + 19: b9 10 00 00 00 mov $0x10,%ecx >>>>> + 1e: e8 fc ff ff ff call 1f >>>>> + 1f: R_386_PC32 copy_from_user >>>>> + 23: 89 04 24 mov %eax,(%esp) >>>>> + 26: 85 c0 test %eax,%eax >>>>> + 28: 0f 85 1f 01 00 00 jne 14d >>>>> + 2e: 8b 4b 58 mov 0x58(%ebx),%ecx >>>>> + 31: 6a 00 push $0x0 >>>>> + 33: b8 00 00 00 00 mov $0x0,%eax >>>>> + 34: R_386_32 ib_uverbs_cq_idr >>>>> + 38: bf ea ff ff ff mov $0xffffffea,%edi >>>>> + 3d: 8b 54 24 1c mov 0x1c(%esp),%edx >>>>> + 41: e8 fc ff ff ff call 42 >>>>> + 42: R_386_PC32 .text.idr_read_obj >>>>> + 46: 89 c3 mov %eax,%ebx >>>>> + 48: 58 pop %eax >>>>> + 49: 85 db test %ebx,%ebx >>>>> + 4b: 0f 84 fc 00 00 00 je 14d >>>>> + 51: 8b 6c 24 10 mov 0x10(%esp),%ebp >>>>> + 55: b9 02 00 00 00 mov $0x2,%ecx >>>>> + 5a: 8d 7c 24 08 lea 0x8(%esp),%edi >>>>> + 5e: 8b 04 24 mov (%esp),%eax >>>>> + 61: 8d 75 08 lea 0x8(%ebp),%esi >>>>> + 64: f3 ab rep stos %eax,%es:(%edi) >>>>> + 66: 8b 44 24 1c mov 0x1c(%esp),%eax >>>>> + 6a: 39 44 24 08 cmp %eax,0x8(%esp) >>>>> + 6e: 73 1f jae 8f >>>>> + 70: 8b 3b mov (%ebx),%edi >>>>> + 72: 8d 4c 24 50 lea 0x50(%esp),%ecx >>>>> + 76: ba 01 00 00 00 mov $0x1,%edx >>>>> + 7b: 89 d8 mov %ebx,%eax >>>>> + 7d: ff 97 70 01 00 00 call *0x170(%edi) >>>>> + 83: 89 c7 mov %eax,%edi >>>>> + 85: 85 c0 test %eax,%eax >>>>> + 87: 0f 88 b9 00 00 00 js 146 >>>>> + 8d: 75 21 jne b0 >>>>> + 8f: b9 08 00 00 00 mov $0x8,%ecx >>>>> + 94: 8d 54 24 08 lea 0x8(%esp),%edx >>>>> + 98: 89 e8 mov %ebp,%eax >>>>> + 9a: bf f2 ff ff ff mov $0xfffffff2,%edi >>>>> + 9f: e8 fc ff ff ff call a0 >>>>> + a0: R_386_PC32 copy_to_user >>>>> + a4: 85 c0 test %eax,%eax >>>>> + a6: 0f 44 7c 24 04 cmove 0x4(%esp),%edi >>>>> + ab: e9 96 00 00 00 jmp 146 >>>>> + b0: 8b 44 24 50 mov 0x50(%esp),%eax >>>>> + b4: 8b 54 24 54 mov 0x54(%esp),%edx >>>>> + b8: b9 30 00 00 00 mov $0x30,%ecx >>>>> + bd: 89 44 24 20 mov %eax,0x20(%esp) >>>>> + c1: 8b 44 24 58 mov 0x58(%esp),%eax >>>>> + c5: 89 54 24 24 mov %edx,0x24(%esp) >>>>> + c9: 8b 54 24 74 mov 0x74(%esp),%edx >>>>> + cd: 89 44 24 28 mov %eax,0x28(%esp) >>>>> + d1: 8b 44 24 5c mov 0x5c(%esp),%eax >>>>> + d5: 89 44 24 2c mov %eax,0x2c(%esp) >>>>> + d9: 8b 44 24 60 mov 0x60(%esp),%eax >>>>> + dd: 89 44 24 30 mov %eax,0x30(%esp) >>>>> + e1: 8b 44 24 64 mov 0x64(%esp),%eax >>>>> + e5: 89 44 24 34 mov %eax,0x34(%esp) >>>>> + e9: 8b 44 24 6c mov 0x6c(%esp),%eax >>>>> + ed: 89 44 24 38 mov %eax,0x38(%esp) >>>>> + f1: 8b 44 24 68 mov 0x68(%esp),%eax >>>>> + f5: 8b 40 20 mov 0x20(%eax),%eax >>>>> + f8: 89 54 24 44 mov %edx,0x44(%esp) >>>>> + fc: 8d 54 24 20 lea 0x20(%esp),%edx >>>>> + 100: c6 44 24 4f 00 movb $0x0,0x4f(%esp) >>>>> + 105: 89 44 24 3c mov %eax,0x3c(%esp) >>>>> + 109: 8b 44 24 70 mov 0x70(%esp),%eax >>>>> + 10d: 89 44 24 40 mov %eax,0x40(%esp) >>>>> + 111: 8b 44 24 78 mov 0x78(%esp),%eax >>>>> + 115: 89 44 24 48 mov %eax,0x48(%esp) >>>>> + 119: 8b 44 24 7c mov 0x7c(%esp),%eax >>>>> + 11d: 66 89 44 24 4c mov %ax,0x4c(%esp) >>>>> + 122: 8a 44 24 7e mov 0x7e(%esp),%al >>>>> + 126: 88 44 24 4e mov %al,0x4e(%esp) >>>>> + 12a: 89 f0 mov %esi,%eax >>>>> + 12c: e8 fc ff ff ff call 12d >>>>> + 12d: R_386_PC32 copy_to_user >>>>> + 131: 85 c0 test %eax,%eax >>>>> + 133: 75 0c jne 141 >>>>> + 135: 83 c6 30 add $0x30,%esi >>>>> + 138: ff 44 24 08 incl 0x8(%esp) >>>>> + 13c: e9 25 ff ff ff jmp 66 >>>>> + 141: bf f2 ff ff ff mov $0xfffffff2,%edi >>>>> + 146: 89 d8 mov %ebx,%eax >>>>> + 148: e8 fc ff ff ff call 149 >>>>> + 149: R_386_PC32 .text.put_cq_read >>>>> + 14d: 81 c4 84 00 00 00 add $0x84,%esp >>>>> + 153: 89 f8 mov %edi,%eax >>>>> + 155: 5b pop %ebx >>>>> + 156: 5e pop %esi >>>>> + 157: 5f pop %edi >>>>> + 158: 5d pop %ebp >>>>> + 159: c3 ret >>>>> >>>>> Disassembly of section .text.ib_uverbs_req_notify_cq: >>>>> >>>>> @@ -1915,7 +1893,7 @@ >>>>> 94: b8 7b 00 00 00 mov $0x7b,%eax >>>>> 95: R_386_32 .text.ib_uverbs_create_qp >>>>> 99: e8 fc ff ff ff call 9a >>>>> - 9a: R_386_PC32 .text.trace_kmalloc.constprop.3 >>>>> + 9a: R_386_PC32 .text.trace_kmalloc.constprop.4 >>>>> 9e: 59 pop %ecx >>>>> 9f: c7 85 50 ff ff ff f4 movl $0xfffffff4,-0xb0(%ebp) >>>>> a6: ff ff ff >>>>> @@ -2241,7 +2219,7 @@ >>>>> 68: b8 4f 00 00 00 mov $0x4f,%eax >>>>> 69: R_386_32 .text.ib_uverbs_query_qp >>>>> 6d: e8 fc ff ff ff call 6e >>>>> - 6e: R_386_PC32 .text.trace_kmalloc.constprop.3 >>>>> + 6e: R_386_PC32 .text.trace_kmalloc.constprop.4 >>>>> 72: 59 pop %ecx >>>>> 73: c7 85 5c ff ff ff 10 movl $0x10,-0xa4(%ebp) >>>>> 7a: 00 00 00 >>>>> @@ -2260,7 +2238,7 @@ >>>>> a3: b8 86 00 00 00 mov $0x86,%eax >>>>> a4: R_386_32 .text.ib_uverbs_query_qp >>>>> a8: e8 fc ff ff ff call a9 >>>>> - a9: R_386_PC32 .text.trace_kmalloc.constprop.3 >>>>> + a9: R_386_PC32 .text.trace_kmalloc.constprop.4 >>>>> ad: 5a pop %edx >>>>> ae: 85 db test %ebx,%ebx >>>>> b0: 0f 84 c1 01 00 00 je 277 >>>>> @@ -2462,7 +2440,7 @@ >>>>> 88: b8 6f 00 00 00 mov $0x6f,%eax >>>>> 89: R_386_32 .text.ib_uverbs_modify_qp >>>>> 8d: e8 fc ff ff ff call 8e >>>>> - 8e: R_386_PC32 .text.trace_kmalloc.constprop.3 >>>>> + 8e: R_386_PC32 .text.trace_kmalloc.constprop.4 >>>>> 92: 5a pop %edx >>>>> 93: ba f4 ff ff ff mov $0xfffffff4,%edx >>>>> 98: 85 db test %ebx,%ebx >>>>> @@ -3129,7 +3107,7 @@ >>>>> 6a: b8 4c 00 00 00 mov $0x4c,%eax >>>>> 6b: R_386_32 .text.ib_uverbs_create_ah >>>>> 6f: e8 fc ff ff ff call 70 >>>>> - 70: R_386_PC32 .text.trace_kmalloc.constprop.3 >>>>> + 70: R_386_PC32 .text.trace_kmalloc.constprop.4 >>>>> 74: 58 pop %eax >>>>> 75: 85 db test %ebx,%ebx >>>>> 77: 75 0a jne 83 >>>>> @@ -3396,7 +3374,7 @@ >>>>> af: b8 91 00 00 00 mov $0x91,%eax >>>>> b0: R_386_32 .text.ib_uverbs_attach_mcast >>>>> b4: e8 fc ff ff ff call b5 >>>>> - b5: R_386_PC32 .text.trace_kmalloc.constprop.3 >>>>> + b5: R_386_PC32 .text.trace_kmalloc.constprop.4 >>>>> b9: 58 pop %eax >>>>> ba: 85 db test %ebx,%ebx >>>>> bc: 75 07 jne c5 >>>>> @@ -3572,7 +3550,7 @@ >>>>> 77: b8 5e 00 00 00 mov $0x5e,%eax >>>>> 78: R_386_32 .text.ib_uverbs_create_srq >>>>> 7c: e8 fc ff ff ff call 7d >>>>> - 7d: R_386_PC32 .text.trace_kmalloc.constprop.3 >>>>> + 7d: R_386_PC32 .text.trace_kmalloc.constprop.4 >>>>> 81: ba f4 ff ff ff mov $0xfffffff4,%edx >>>>> 86: 58 pop %eax >>>>> 87: 85 db test %ebx,%ebx >>>>> >>>>> Needless to say, after my change the diff only shows the truly changed >>>>> functions. >>>>> >>>>> - Michael >>>>> >>>> >>>> Updated patch >>>> >>>> >>>> gcc: >>>> 2018-07-26 Michael Ploujnikov >>>> >>>> Make function clone name numbering independent. >>>> * cgraph.h (clone_function_name_1): Add clone_num argument >>>> * cgraphclones.c: Replace clone_fn_id_num with clone_fn_ids. >>>> (clone_function_name): Use counters from the hashtable. >>>> >>>> lto: >>>> 2018-07-26 Michael Ploujnikov >>>> >>>> * lto-partition.c (privatize_symbol_name_1): Pass 0 to >>>> clone_function_name_1. >>>> >>>> >>>> testsuite: >>>> 2018-07-26 Michael Ploujnikov >>>> >>>> Clone numbering should be independent for each function. >>>> * gcc/testsuite/gcc.dg/independent-cloneids-1.c: New test. >>>> >>>> --- >>>> gcc/cgraph.h | 2 +- >>>> gcc/cgraphclones.c | 16 ++++++++--- >>>> gcc/lto/lto-partition.c | 2 +- >>>> gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 ++++++++++++++++++= +++++++++ >>>> 4 files changed, 52 insertions(+), 6 deletions(-) >>>> create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c >>>> >>>> diff --git gcc/cgraph.h gcc/cgraph.h >>>> index a8b1b4c..fe44bd0 100644 >>>> --- gcc/cgraph.h >>>> +++ gcc/cgraph.h >>>> @@ -2368,7 +2368,7 @@ basic_block init_lowered_empty_function (tree, b= ool, >>>> profile_count); >>>> tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT,= tree); >>>> /* In cgraphclones.c */ >>>> >>>> -tree clone_function_name_1 (const char *, const char *); >>>> +tree clone_function_name_1 (const char *, const char *, unsigned int); >>>> tree clone_function_name (tree decl, const char *); >>>> >>>> void tree_function_versioning (tree, tree, vec *, >>>> diff --git gcc/cgraphclones.c gcc/cgraphclones.c >>>> index 6e84a31..7de7a65 100644 >>>> --- gcc/cgraphclones.c >>>> +++ gcc/cgraphclones.c >>>> @@ -512,13 +512,13 @@ cgraph_node::create_clone (tree new_decl, profil= e_count >>>> prof_count, >>>> return new_node; >>>> } >>>> >>>> -static GTY(()) unsigned int clone_fn_id_num; >>>> +static GTY(()) hash_map *clone_fn_ids; >>>> >>>> /* Return a new assembler name for a clone with SUFFIX of a decl named >>>> NAME. */ >>>> >>>> tree >>>> -clone_function_name_1 (const char *name, const char *suffix) >>>> +clone_function_name_1 (const char *name, const char *suffix, unsigned= int >>>> clone_num) >>>> { >>>> size_t len =3D strlen (name); >>>> char *tmp_name, *prefix; >>>> @@ -527,7 +527,7 @@ clone_function_name_1 (const char *name, const cha= r *suffix) >>>> memcpy (prefix, name, len); >>>> strcpy (prefix + len + 1, suffix); >>>> prefix[len] =3D symbol_table::symbol_suffix_separator (); >>>> - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); >>>> + ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_num); >>>> return get_identifier (tmp_name); >>>> } >>>> >>>> @@ -537,7 +537,15 @@ tree >>>> clone_function_name (tree decl, const char *suffix) >>>> { >>>> tree name =3D DECL_ASSEMBLER_NAME (decl); >>>> - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); >>>> + const char *decl_name =3D IDENTIFIER_POINTER (name); >>>> + unsigned int *suffix_counter; >>>> + if (!clone_fn_ids) { >>>> + /* Initialize the per-function counter hash table on first call */ >>>> + clone_fn_ids =3D hash_map::create_ggc (64= ); >>>> + } >>>> + suffix_counter =3D &clone_fn_ids->get_or_insert(decl_name); >>>> + *suffix_counter =3D *suffix_counter + 1; >>>> + return clone_function_name_1 (decl_name, suffix, *suffix_counter); >>>> } >>>> >>>> >>>> diff --git gcc/lto/lto-partition.c gcc/lto/lto-partition.c >>>> index c7a5710..4b15232 100644 >>>> --- gcc/lto/lto-partition.c >>>> +++ gcc/lto/lto-partition.c >>>> @@ -965,7 +965,7 @@ privatize_symbol_name_1 (symtab_node *node, tree d= ecl) >>>> name =3D maybe_rewrite_identifier (name); >>>> symtab->change_decl_assembler_name (decl, >>>> clone_function_name_1 (name, >>>> - "lto_priv")= ); >>>> + "lto_priv",= 0)); >>>> >>>> if (node->lto_file_data) >>>> lto_record_renamed_decl (node->lto_file_data, name, >>>> diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c >>>> gcc/testsuite/gcc.dg/independent-cloneids-1.c >>>> new file mode 100644 >>>> index 0000000..d723e20 >>>> --- /dev/null >>>> +++ gcc/testsuite/gcc.dg/independent-cloneids-1.c >>>> @@ -0,0 +1,38 @@ >>>> +/* { dg-do compile } */ >>>> +/* { dg-options "-O3 -fipa-cp -fipa-cp-clone -fdump-ipa-cp" } */ >>>> + >>>> +extern int printf (const char *, ...); >>>> + >>>> +static int __attribute__ ((noinline)) >>>> +foo (int arg) >>>> +{ >>>> + return 7 * arg; >>>> +} >>>> + >>>> +static int __attribute__ ((noinline)) >>>> +bar (int arg) >>>> +{ >>>> + return arg * arg; >>>> +} >>>> + >>>> +int >>>> +baz (int arg) >>>> +{ >>>> + printf("%d\n", bar (3)); >>>> + printf("%d\n", bar (4)); >>>> + printf("%d\n", foo (5)); >>>> + printf("%d\n", foo (6)); >>>> + /* adding or removing the following call should not affect foo >>>> + function's clone numbering */ >>>> + printf("%d\n", bar (7)); >>>> + return foo (8); >>>> +} >>>> + >>>> +/* { dg-final { scan-ipa-dump "Function bar.constprop.0" "cp" } } */ >>>> +/* { dg-final { scan-ipa-dump "Function bar.constprop.1" "cp" } } */ >>>> +/* { dg-final { scan-ipa-dump "Function bar.constprop.3" "cp" } } */ >>>> +/* { dg-final { scan-ipa-dump "Function foo.constprop.0" "cp" } } */ >>>> +/* { dg-final { scan-ipa-dump "Function foo.constprop.1" "cp" } } */ >>>> +/* { dg-final { scan-ipa-dump "Function foo.constprop.2" "cp" } } */ >>>> +/* { dg-final { scan-ipa-dump-not "Function foo.constprop.3" "cp" } }= */ >>>> +/* { dg-final { scan-ipa-dump-not "Function foo.constprop.4" "cp" } }= */ >>>> >>> >>> Ping? Could someone else please take a look at this since Richard appea= rs to be >>> overloaded at the moment? >> >> The implementation is now how I suggested. I'm still not 100% agreeing = to >> the solution of using a hash-map here as I hoped that most callers have >> a good idea themselves what "number" of clone they are creating. Thus >> most of them should be explicit in passing the number. But I didn't inv= estigate >> any of them (but the LTO one which I think doesn't need the suffix at al= l). >> >> So if anybody besides me is fine with this approach feel free to approve. >> >> One minor nit: >> >>>> + if (!clone_fn_ids) { >>>> + /* Initialize the per-function counter hash table on first call */ >>>> + clone_fn_ids =3D hash_map::create_ggc (64= ); >>>> + } >> >> omit {} around single stmts, the comment doesn't add much information but >> if you want to keep it move it before the if(). >> >> Thanks and sorry for the delay (slowly wading through older patches...), >> Richard. >=20 > No worries, and thank you for the review. >=20 > I dug a little deeper and found that LTO does need the numbered suffixes. > Otherwise I get errors in a lot of tests that look like: >=20 > ... > spawn -ignore SIGHUP /gcc/build/gcc/xgcc -B/gcc/build/gcc/ c_lto_pr60820_= 0.o > c_lto_pr60820_1.o -fno-diagnostics-show-caret -fdiagnostics-color=3Dnever= -flto -r > -nostdlib -O2 -flinker-output=3Dnolto-rel -o gcc-dg-lto-pr60820-01.exe >=20 > pid is 52067 -52067 > lto1: error: two or more sections for > .gnu.lto_local_in6addr_any.lto_priv.0.lto_priv.0.3f1a2975b4946e93 >=20 > lto1: internal compiler error: cannot read LTO decls from /tmp/ccwo9PaB.l= trans0.o >=20 > ... >=20 > which makes sense because lto-partition.c:rename_statics calls > privatize_symbol_name in a loop over symbols with the same asm name. >=20 > So I'm planning to go back to the version where clone_function_name_1 jus= t takes > two strings and I'm wondering if I should do: >=20 > name =3D IDENTIFIER_POINTER (get_identifier (maybe_rewrite_identifier (na= me))); >=20 > in privatize_symbol_name_1 to guarantee that name always persists. >=20 >=20 > - Michael >=20 --------------923C9FABC8F498159BD89C87 Content-Type: text/x-patch; name="0001-Make-function-assembly-more-independent.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="0001-Make-function-assembly-more-independent.patch" Content-length: 5972 =46rom adde0632266d3f1b0540e09b9db931df4302d2bc Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Mon, 16 Jul 2018 12:55:49 -0400 Subject: [PATCH 1/4] Make function assembly more independent. This changes clone_function_name such that clone names are based on a per-function counter rather than a global one. gcc: 2018-08-02 Michael Ploujnikov Make function clone name numbering independent. * cgraphclones.c: Replace clone_fn_id_num with clone_fn_ids. (clone_function_name_1): Take an IDENTIFIER_NODE instead of a string. Use clone_fn_id_num. lto: 2018-08-02 Michael Ploujnikov * lto-partition.c (privatize_symbol_name_1): Pass a persistent identifier node to clone_function_name_1. testsuite: 2018-08-02 Michael Ploujnikov Clone id counters should be completely independent from one another. * gcc.dg/independent-cloneids-1.c: New test. --- gcc/cgraph.h | 2 +- gcc/cgraphclones.c | 19 +++++++++----- gcc/lto/lto-partition.c | 5 ++-- gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++++++++++++++++++++++= ++++ 4 files changed, 54 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c diff --git gcc/cgraph.h gcc/cgraph.h index a8b1b4c..fadc107 100644 --- gcc/cgraph.h +++ gcc/cgraph.h @@ -2368,7 +2368,7 @@ basic_block init_lowered_empty_function (tree, bool, = profile_count); tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree= ); /* In cgraphclones.c */ =20 -tree clone_function_name_1 (const char *, const char *); +tree clone_function_name_1 (tree identifier, const char *); tree clone_function_name (tree decl, const char *); =20 void tree_function_versioning (tree, tree, vec *, diff --git gcc/cgraphclones.c gcc/cgraphclones.c index 6e84a31..fb81fbd 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -512,14 +512,15 @@ cgraph_node::create_clone (tree new_decl, profile_cou= nt prof_count, return new_node; } =20 -static GTY(()) unsigned int clone_fn_id_num; +static GTY(()) hash_map *clone_fn_ids; =20 -/* Return a new assembler name for a clone with SUFFIX of a decl named - NAME. */ +/* Return a new assembler name for a numbered clone with SUFFIX of a + decl named IDENTIFIER. */ =20 tree -clone_function_name_1 (const char *name, const char *suffix) +clone_function_name_1 (tree identifier, const char *suffix) { + const char *name =3D IDENTIFIER_POINTER (identifier); size_t len =3D strlen (name); char *tmp_name, *prefix; =20 @@ -527,7 +528,12 @@ clone_function_name_1 (const char *name, const char *s= uffix) memcpy (prefix, name, len); strcpy (prefix + len + 1, suffix); prefix[len] =3D symbol_table::symbol_suffix_separator (); - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); + /* Initialize the per-function counter hash table on first call */ + if (!clone_fn_ids) + clone_fn_ids =3D hash_map::create_ggc (64); + unsigned int &suffix_counter =3D clone_fn_ids->get_or_insert(name); + ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, suffix_counter); + suffix_counter++; return get_identifier (tmp_name); } =20 @@ -536,8 +542,7 @@ clone_function_name_1 (const char *name, const char *su= ffix) tree clone_function_name (tree decl, const char *suffix) { - tree name =3D DECL_ASSEMBLER_NAME (decl); - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); + return clone_function_name_1 (DECL_ASSEMBLER_NAME (decl), suffix); } =20 =20 diff --git gcc/lto/lto-partition.c gcc/lto/lto-partition.c index c7a5710..e3efa71 100644 --- gcc/lto/lto-partition.c +++ gcc/lto/lto-partition.c @@ -962,11 +962,12 @@ privatize_symbol_name_1 (symtab_node *node, tree decl) if (must_not_rename (node, name)) return false; =20 - name =3D maybe_rewrite_identifier (name); + tree identifier =3D get_identifier (maybe_rewrite_identifier (name)); symtab->change_decl_assembler_name (decl, - clone_function_name_1 (name, + clone_function_name_1 (identifier, "lto_priv")); =20 + name =3D IDENTIFIER_POINTER (identifier); if (node->lto_file_data) lto_record_renamed_decl (node->lto_file_data, name, IDENTIFIER_POINTER diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.= dg/independent-cloneids-1.c new file mode 100644 index 0000000..3203895 --- /dev/null +++ gcc/testsuite/gcc.dg/independent-cloneids-1.c @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fipa-cp -fipa-cp-clone" } */ + +extern int printf (const char *, ...); + +static int __attribute__ ((noinline)) +foo (int arg) +{ + return 7 * arg; +} + +static int __attribute__ ((noinline)) +bar (int arg) +{ + return arg * arg; +} + +int +baz (int arg) +{ + printf("%d\n", bar (3)); + printf("%d\n", bar (4)); + printf("%d\n", foo (5)); + printf("%d\n", foo (6)); + /* adding or removing the following call should not affect foo + function's clone numbering */ + printf("%d\n", bar (7)); + return foo (8); +} + +/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]0:} 1 = } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]1:} 1 = } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]2:} 1 = } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]0:} 1 = } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]1:} 1 = } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]2:} 1 = } } */ +/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]3:} } } = */ +/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]4:} } } = */ --=20 2.7.4 --------------923C9FABC8F498159BD89C87 Content-Type: text/x-patch; name="0002-Rename-clone_function_name_1-and-clone_function_name.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename*0="0002-Rename-clone_function_name_1-and-clone_function_name.pa"; filename*1="tch" Content-length: 8835 =46rom bbea3705e59e23289cdba72da143f89b3a08b78c Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Tue, 7 Aug 2018 20:36:53 -0400 Subject: [PATCH 2/4] Rename clone_function_name_1 and clone_function_name to clarify usage. gcc: 2018-08-02 Michael Ploujnikov * gcc/cgraph.h: Rename clone_function_name_1 to numbered_clone_function_name_1. Rename clone_function_name to numbered_clone_function_name. * cgraphclones.c: Ditto. * config/rs6000/rs6000.c: Ditto. * lto/lto-partition.c: Ditto. * multiple_target.c: Ditto. * omp-expand.c: Ditto. * omp-low.c: Ditto. * omp-simd-clone.c: Ditto. * symtab.c: Ditto. --- gcc/cgraph.h | 4 ++-- gcc/cgraphclones.c | 14 +++++++------- gcc/config/rs6000/rs6000.c | 2 +- gcc/lto/lto-partition.c | 4 ++-- gcc/multiple_target.c | 8 ++++---- gcc/omp-expand.c | 2 +- gcc/omp-low.c | 4 ++-- gcc/omp-simd-clone.c | 2 +- gcc/symtab.c | 2 +- 9 files changed, 21 insertions(+), 21 deletions(-) diff --git gcc/cgraph.h gcc/cgraph.h index fadc107..322df1e 100644 --- gcc/cgraph.h +++ gcc/cgraph.h @@ -2368,8 +2368,8 @@ basic_block init_lowered_empty_function (tree, bool, = profile_count); tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree= ); /* In cgraphclones.c */ =20 -tree clone_function_name_1 (tree identifier, const char *); -tree clone_function_name (tree decl, const char *); +tree numbered_clone_function_name_1 (tree identifier, const char *); +tree numbered_clone_function_name (tree decl, const char *); =20 void tree_function_versioning (tree, tree, vec *, bool, bitmap, bool, bitmap, basic_block); diff --git gcc/cgraphclones.c gcc/cgraphclones.c index fb81fbd..08fc922 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -316,7 +316,7 @@ duplicate_thunk_for_node (cgraph_node *thunk, cgraph_no= de *node) gcc_checking_assert (!DECL_RESULT (new_decl)); gcc_checking_assert (!DECL_RTL_SET_P (new_decl)); =20 - DECL_NAME (new_decl) =3D clone_function_name (thunk->decl, "artificial_t= hunk"); + DECL_NAME (new_decl) =3D numbered_clone_function_name (thunk->decl, "art= ificial_thunk"); SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl)); =20 new_thunk =3D cgraph_node::create (new_decl); @@ -518,7 +518,7 @@ static GTY(()) hash_map *clone_= fn_ids; decl named IDENTIFIER. */ =20 tree -clone_function_name_1 (tree identifier, const char *suffix) +numbered_clone_function_name_1 (tree identifier, const char *suffix) { const char *name =3D IDENTIFIER_POINTER (identifier); size_t len =3D strlen (name); @@ -537,12 +537,12 @@ clone_function_name_1 (tree identifier, const char *s= uffix) return get_identifier (tmp_name); } =20 -/* Return a new assembler name for a clone of DECL with SUFFIX. */ +/* Return a new assembler name for a clone of DECL with numbered SUFFIX. = */ =20 tree -clone_function_name (tree decl, const char *suffix) +numbered_clone_function_name (tree decl, const char *suffix) { - return clone_function_name_1 (DECL_ASSEMBLER_NAME (decl), suffix); + return numbered_clone_function_name_1 (DECL_ASSEMBLER_NAME (decl), suffi= x); } =20 =20 @@ -590,7 +590,7 @@ cgraph_node::create_virtual_clone (vec r= edirect_callers, strcpy (name + len + 1, suffix); name[len] =3D '.'; DECL_NAME (new_decl) =3D get_identifier (name); - SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name (old_decl, suffix= )); + SET_DECL_ASSEMBLER_NAME (new_decl, numbered_clone_function_name (old_dec= l, suffix)); SET_DECL_RTL (new_decl, NULL); =20 new_node =3D create_clone (new_decl, count, false, @@ -969,7 +969,7 @@ cgraph_node::create_version_clone_with_body =3D build_function_decl_skip_args (old_decl, args_to_skip, skip_retu= rn); =20 /* Generate a new name for the new version. */ - DECL_NAME (new_decl) =3D clone_function_name (old_decl, suffix); + DECL_NAME (new_decl) =3D numbered_clone_function_name (old_decl, suffix); SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl)); SET_DECL_RTL (new_decl, NULL); =20 diff --git gcc/config/rs6000/rs6000.c gcc/config/rs6000/rs6000.c index c2af4b8..f9c9936 100644 --- gcc/config/rs6000/rs6000.c +++ gcc/config/rs6000/rs6000.c @@ -36512,7 +36512,7 @@ make_resolver_func (const tree default_decl, { /* Make the resolver function static. The resolver function returns void *. */ - tree decl_name =3D clone_function_name (default_decl, "resolver"); + tree decl_name =3D numbered_clone_function_name (default_decl, "resolver= "); const char *resolver_name =3D IDENTIFIER_POINTER (decl_name); tree type =3D build_function_type_list (ptr_type_node, NULL_TREE); tree decl =3D build_fn_decl (resolver_name, type); diff --git gcc/lto/lto-partition.c gcc/lto/lto-partition.c index e3efa71..c872325 100644 --- gcc/lto/lto-partition.c +++ gcc/lto/lto-partition.c @@ -964,8 +964,8 @@ privatize_symbol_name_1 (symtab_node *node, tree decl) =20 tree identifier =3D get_identifier (maybe_rewrite_identifier (name)); symtab->change_decl_assembler_name (decl, - clone_function_name_1 (identifier, - "lto_priv")); + numbered_clone_function_name_1 (identifier, + "lto_priv")); =20 name =3D IDENTIFIER_POINTER (identifier); if (node->lto_file_data) diff --git gcc/multiple_target.c gcc/multiple_target.c index a1fe09a..592e7b9 100644 --- gcc/multiple_target.c +++ gcc/multiple_target.c @@ -162,8 +162,8 @@ create_dispatcher_calls (struct cgraph_node *node) } =20 symtab->change_decl_assembler_name (node->decl, - clone_function_name (node->decl, - "default")); + numbered_clone_function_name (node->decl, + "default")); =20 /* FIXME: copy of cgraph_node::make_local that should be cleaned up in next stage1. */ @@ -312,8 +312,8 @@ create_target_clone (cgraph_node *node, bool definition= , char *name) new_node =3D cgraph_node::get_create (new_decl); /* Generate a new name for the new version. */ symtab->change_decl_assembler_name (new_node->decl, - clone_function_name (node->decl, - name)); + numbered_clone_function_name (node->decl, + name)); } return new_node; } diff --git gcc/omp-expand.c gcc/omp-expand.c index d2a77c0..3aeaae0 100644 --- gcc/omp-expand.c +++ gcc/omp-expand.c @@ -7625,7 +7625,7 @@ grid_expand_target_grid_body (struct omp_region *targ= et) expand_omp (gpukernel->inner); =20 tree kern_fndecl =3D copy_node (orig_child_fndecl); - DECL_NAME (kern_fndecl) =3D clone_function_name (kern_fndecl, "kernel"); + DECL_NAME (kern_fndecl) =3D numbered_clone_function_name (kern_fndecl, "= kernel"); SET_DECL_ASSEMBLER_NAME (kern_fndecl, DECL_NAME (kern_fndecl)); tree tgtblock =3D gimple_block (tgt_stmt); tree fniniblock =3D make_node (BLOCK); diff --git gcc/omp-low.c gcc/omp-low.c index 843c66f..6748e05 100644 --- gcc/omp-low.c +++ gcc/omp-low.c @@ -1531,8 +1531,8 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) static tree create_omp_child_function_name (bool task_copy) { - return clone_function_name (current_function_decl, - task_copy ? "_omp_cpyfn" : "_omp_fn"); + return numbered_clone_function_name (current_function_decl, + task_copy ? "_omp_cpyfn" : "_omp_fn"); } =20 /* Return true if CTX may belong to offloaded code: either if current func= tion diff --git gcc/omp-simd-clone.c gcc/omp-simd-clone.c index b15adf0..806c486 100644 --- gcc/omp-simd-clone.c +++ gcc/omp-simd-clone.c @@ -444,7 +444,7 @@ simd_clone_create (struct cgraph_node *old_node) { tree old_decl =3D old_node->decl; tree new_decl =3D copy_node (old_node->decl); - DECL_NAME (new_decl) =3D clone_function_name (old_decl, "simdclone"); + DECL_NAME (new_decl) =3D numbered_clone_function_name (old_decl, "si= mdclone"); SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl)); SET_DECL_RTL (new_decl, NULL); DECL_STATIC_CONSTRUCTOR (new_decl) =3D 0; diff --git gcc/symtab.c gcc/symtab.c index c5464cb..de65b65 100644 --- gcc/symtab.c +++ gcc/symtab.c @@ -1787,7 +1787,7 @@ symtab_node::noninterposable_alias (void) /* Otherwise create a new one. */ new_decl =3D copy_node (node->decl); DECL_DLLIMPORT_P (new_decl) =3D 0; - DECL_NAME (new_decl) =3D clone_function_name (node->decl, "localalias"); + DECL_NAME (new_decl) =3D numbered_clone_function_name (node->decl, "loca= lalias"); if (TREE_CODE (new_decl) =3D=3D FUNCTION_DECL) DECL_STRUCT_FUNCTION (new_decl) =3D NULL; DECL_INITIAL (new_decl) =3D NULL; --=20 2.7.4 --------------923C9FABC8F498159BD89C87 Content-Type: text/x-patch; name="0003-Cold-sections-don-t-need-to-be-numbered.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="0003-Cold-sections-don-t-need-to-be-numbered.patch" Content-length: 5734 =46rom f6a5f30d8975214c564ec3740163b6156ca2ad97 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Tue, 7 Aug 2018 20:47:12 -0400 Subject: [PATCH 3/4] Cold sections don't need to be numbered. gcc: 2018-08-02 Michael Ploujnikov * cgraph.h: Add suffixed_function_name. * cgraphclones.c: Ditto. * final.c (final_scan_insn_1): Use it. testsuite: 2018-08-02 Michael Ploujnikov * gcc.dg/tree-prof/cold_partition_label.c: Update for new cold section names. * gcc.dg/tree-prof/section-attr-1.c: ditto. * gcc.dg/tree-prof/section-attr-2.c: ditto. * gcc.dg/tree-prof/section-attr-3.c: ditto. --- gcc/cgraph.h | 1 + gcc/cgraphclones.c | 16 ++++++++++++++++ gcc/final.c | 2 +- gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c | 4 ++-- gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c | 2 +- gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c | 2 +- gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c | 2 +- 7 files changed, 23 insertions(+), 6 deletions(-) diff --git gcc/cgraph.h gcc/cgraph.h index 322df1e..836ceb1 100644 --- gcc/cgraph.h +++ gcc/cgraph.h @@ -2368,6 +2368,7 @@ basic_block init_lowered_empty_function (tree, bool, = profile_count); tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree= ); /* In cgraphclones.c */ =20 +tree suffixed_function_name (tree decl, const char *); tree numbered_clone_function_name_1 (tree identifier, const char *); tree numbered_clone_function_name (tree decl, const char *); =20 diff --git gcc/cgraphclones.c gcc/cgraphclones.c index 08fc922..784f1df 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -514,6 +514,22 @@ cgraph_node::create_clone (tree new_decl, profile_coun= t prof_count, =20 static GTY(()) hash_map *clone_fn_ids; =20 +/* Return a new assembler name for a clone of DECL with SUFFIX. */ + +tree +suffixed_function_name (tree decl, const char *suffix) +{ + const char *name =3D IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)); + size_t len =3D strlen (name); + char *prefix; + + prefix =3D XALLOCAVEC (char, len + strlen (suffix) + 2); + memcpy (prefix, name, len); + prefix[len] =3D symbol_table::symbol_suffix_separator (); + strcpy (prefix + len + 1, suffix); + return get_identifier (prefix); +} + /* Return a new assembler name for a numbered clone with SUFFIX of a decl named IDENTIFIER. */ =20 diff --git gcc/final.c gcc/final.c index 842e5e0..ac83417 100644 --- gcc/final.c +++ gcc/final.c @@ -2203,7 +2203,7 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int op= timize_p ATTRIBUTE_UNUSED, =20 if (in_cold_section_p) cold_function_name - =3D clone_function_name (current_function_decl, "cold"); + =3D suffixed_function_name (current_function_decl, "cold"); =20 if (dwarf2out_do_frame ()) { diff --git gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c gcc/testsu= ite/gcc.dg/tree-prof/cold_partition_label.c index 52518cf..450308d 100644 --- gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c +++ gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c @@ -37,6 +37,6 @@ main (int argc, char *argv[]) return 0; } =20 -/* { dg-final-use { scan-assembler "foo\[._\]+cold\[\._\]+0" { target *-*-= linux* *-*-gnu* } } } */ -/* { dg-final-use { scan-assembler "size\[ \ta-zA-Z0-0\]+foo\[._\]+cold\[\= ._\]+0" { target *-*-linux* *-*-gnu* } } } */ +/* { dg-final-use { scan-assembler "foo\[._\]+cold" { target *-*-linux* *-= *-gnu* } } } */ +/* { dg-final-use { scan-assembler "size\[ \ta-zA-Z0-0\]+foo\[._\]+cold" {= target *-*-linux* *-*-gnu* } } } */ /* { dg-final-use { scan-tree-dump-not "Invalid sum" "optimized"} } */ diff --git gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c gcc/testsuite/gc= c.dg/tree-prof/section-attr-1.c index ee6662e..1bb8cd9 100644 --- gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c +++ gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c @@ -42,4 +42,4 @@ foo (int path) } } =20 -/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n= \\r\]+\[\t \]*\.size\[\t \]*foo\.cold\.0" { target *-*-linux* *-*-gnu* } } = } */ +/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n= \\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target *-*-linux* *-*-gnu* } } } */ diff --git gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c gcc/testsuite/gc= c.dg/tree-prof/section-attr-2.c index 898a395..31be7eb 100644 --- gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c +++ gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c @@ -41,4 +41,4 @@ foo (int path) } } =20 -/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n= \\r\]+\[\t \]*\.size\[\t \]*foo\.cold\.0" { target *-*-linux* *-*-gnu* } } = } */ +/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n= \\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target *-*-linux* *-*-gnu* } } } */ diff --git gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c gcc/testsuite/gc= c.dg/tree-prof/section-attr-3.c index 36829dc..0e64001 100644 --- gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c +++ gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c @@ -42,4 +42,4 @@ foo (int path) } } =20 -/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n= \\r\]+\[\t \]*\.size\[\t \]*foo\.cold\.0" { target *-*-linux* *-*-gnu* } } = } */ +/* { dg-final-use { scan-assembler "\.section\[\t \]*\.text\.unlikely\[\\n= \\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target *-*-linux* *-*-gnu* } } } */ --=20 2.7.4 --------------923C9FABC8F498159BD89C87 Content-Type: text/x-patch; name="0004-Use-suffixed_function_name-in-cgraph_node-create_vir.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename*0="0004-Use-suffixed_function_name-in-cgraph_node-create_vir.pa"; filename*1="tch" Content-length: 1705 =46rom d96bedcf51c289f3c80a12a681a7831d9e287874 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Mon, 13 Aug 2018 13:59:46 -0400 Subject: [PATCH 4/4] Use suffixed_function_name in cgraph_node::create_virtual_clone. gcc: 2018-08-02 Michael Ploujnikov * cgraphclones.c (cgraph_node::create_virtual_clone): Use suffixed_function_name. --- gcc/cgraphclones.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) diff --git gcc/cgraphclones.c gcc/cgraphclones.c index 784f1df..876b7fa 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -576,9 +576,8 @@ cgraph_node::create_virtual_clone (vec r= edirect_callers, tree old_decl =3D decl; cgraph_node *new_node =3D NULL; tree new_decl; - size_t len, i; + size_t i; ipa_replace_map *map; - char *name; =20 gcc_checking_assert (local.versionable); gcc_assert (local.can_change_signature || !args_to_skip); @@ -600,12 +599,7 @@ cgraph_node::create_virtual_clone (vec = redirect_callers, sometimes storing only clone decl instead of original. */ =20 /* Generate a new name for the new version. */ - len =3D IDENTIFIER_LENGTH (DECL_NAME (old_decl)); - name =3D XALLOCAVEC (char, len + strlen (suffix) + 2); - memcpy (name, IDENTIFIER_POINTER (DECL_NAME (old_decl)), len); - strcpy (name + len + 1, suffix); - name[len] =3D '.'; - DECL_NAME (new_decl) =3D get_identifier (name); + DECL_NAME (new_decl) =3D suffixed_function_name (old_decl, suffix); SET_DECL_ASSEMBLER_NAME (new_decl, numbered_clone_function_name (old_dec= l, suffix)); SET_DECL_RTL (new_decl, NULL); =20 --=20 2.7.4 --------------923C9FABC8F498159BD89C87-- --ITsagRFYGVPKq4W7mLWSYVeE9plV5YfFt-- --Wx6z3LXerzgi60klt99l4E1Pb2J0qUIp6 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" Content-length: 473 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAEBCAAGBQJbchsLAAoJEBnHrw6/GfukvJ4H/Ag/XAaWTMkWvGluDfXdsiXC 0It/sRzAoXn8FM0bE8PSobxPa9yA3oVmnRC6clDswzuQdbbhgwg4PdODPlOXDW8k 9JNMyKZ0dNQURve00d/KGBPrjxTE32QYFyEaviUKzWj/IX7DViRMFCsobpLEZHwF 9WFGXVRatyX1CKpgZRA4hgD3dne7OZjWOmjb7PhOiJECXCIE6pP103/+ku6UIyla ugKIjoKU4xOwQnQmLhmUra4P1evpQXiORyUKwqsJw64e+Ck7iXMrK/MXyLPQjYwa 2ZN+BqVd/NsBE66RWJ8JFByovYHgiOM4cEHcDckgXMLo4Z/z73LoRPkiZjrhIN0= =QRB9 -----END PGP SIGNATURE----- --Wx6z3LXerzgi60klt99l4E1Pb2J0qUIp6--