From: Bill Schmidt
Reply-To: wschmidt@linux.ibm.com
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com
Subject: Re: [PATCH 00/57] Replace the Power target-specific built-in machinery
Date: Tue, 11 May 2021 10:57:56 -0500

Hi!  I'd like to ping this series.
This is a big change, so I'd like to get it committed fairly early in
stage 1.  I know you have a lot stacked up, though.

Thanks!
Bill

On 4/27/21 10:32 AM, Bill Schmidt wrote:
> The design of the target-specific built-in function support in the
> Power back end has not stood the test of time. The machinery is
> grossly inefficient, confusing, and arcane; and adding new built-in
> functions is inefficient and error-prone. This patch set introduces a
> replacement.
>
> Because of the scope of the changes, it's important to be able to
> verify that the new system makes only intended changes to the
> functions that are supported. Therefore this patch set adds a new
> mechanism, and (in the final patch) enables it instead of the existing
> support, but does not yet remove the old support. That will happen in
> a follow-up patch once we're comfortable with the new system.
>
> Most of the patches in this set are specific to the rs6000 back end.
> However, the first two patches make changes in common code and require
> review from the appropriate maintainers. Jakub and Jeff, I would
> appreciate it if you could look at these two small patches.
>
> After these changes are upstream, adding new built-in functions will
> usually be as simple as adding two lines to a file,
> rs6000-builtin-new.def, that give the prototype of the function and a
> little additional information. Adding new overloaded functions will
> require adding a new section to another file, rs6000-overload.def,
> with one line describing the overload information, and two lines for
> each function to be dispatched to from the overloaded function.
>
> The patches are divided into the following sections.
>
> Patches 0001-0002: Common code patches
>
> Patch 0001 adds a mechanism to the Makefile to allow specifying
> additional dependencies for "out_object_file", which is rs6000.o for
> the rs6000 back end. I found this necessary to be able to have
> rs6000.o depend on a header file generated during the build.
>
> Patch 0002 expands the gengtype machinery to scan header files
> created during the build for GC roots.
>
> Patches 0003, 0005-0023: Generator program
>
> A new program, rs6000-gen-builtins, is created and executed during
> the build. It reads rs6000-builtin-new.def and rs6000-overload.def
> and produces three output files: rs6000-builtins.h,
> rs6000-builtins.c, and rs6000-vecdefines.h. rs6000-builtins.h
> defines the data structures representing the built-in functions,
> overloaded functions, overload instantiations, and function type
> specifiers. rs6000-builtins.c contains static initializers for the
> data structures, as well as the function rs6000_autoinit_builtins
> that performs additional run-time initialization.
> rs6000-vecdefines.h contains a set of #defines that map external
> identifiers such as vec_add to their internal builtin names, such as
> __builtin_vec_add. This replaces most of the similar #defines
> previously contained in altivec.h, which now #includes the new file
> instead.
>
> This set of patches adds the source for the generator program.
>
> Patches 0024-0025: Target build machinery
>
> These patches make changes to config.gcc and t-rs6000 to build and
> run the new generator program, and to ensure that the garbage
> collection roots in rs6000-builtins.h are scanned by gengtype.
>
> Patches 0004, 0026-0031, 0033-0037: Input files
>
> These patches build up the input files to the generator program,
> listing all of the built-in functions and overloads to be
> processed.
>
> Patch 0032: Add pointer types
>
> This patch creates and caches a bunch of pointer type nodes. The
> existing built-in machinery, for some reason, only created base
> types up front and created the pointer types on demand (over and
> over and over again). The new mechanism needs all the type nodes
> available, so we add them here.
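
To make the patch 0032 point concrete, here is a minimal standalone sketch of building pointer type nodes once and handing out the cached node afterwards, rather than creating one on every request. It is an illustration only: `type_node`, `init_pointer_type_nodes`, and `pointer_type_for` are hypothetical names, and the real code works on GCC tree nodes rather than this toy struct.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC type nodes; none of these names
   come from the patch series.  */
enum base_type { BT_CHAR, BT_SHORT, BT_INT, BT_LONG_LONG, BT_COUNT };

struct type_node
{
  enum base_type base;   /* Which base type this node refers to.  */
  int is_pointer;        /* Nonzero for "pointer to BASE".  */
};

static struct type_node ptr_nodes[BT_COUNT];
static int ptr_nodes_built;

/* Build every pointer type node once, up front.  */
void
init_pointer_type_nodes (void)
{
  for (int i = 0; i < BT_COUNT; i++)
    {
      ptr_nodes[i].base = (enum base_type) i;
      ptr_nodes[i].is_pointer = 1;
    }
  ptr_nodes_built = 1;
}

/* Return the cached pointer type for BASE.  Nothing is created here,
   so repeated requests always yield the same node.  */
const struct type_node *
pointer_type_for (enum base_type base)
{
  assert (ptr_nodes_built);
  assert ((unsigned) base < (unsigned) BT_COUNT);
  return &ptr_nodes[base];
}
```

After one call to `init_pointer_type_nodes`, two calls to `pointer_type_for (BT_INT)` return the same address, which is the property a table of static initializers can rely on.
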
>
> Patch 0038: Call rs6000_autoinit_builtins
>
> Patch 0039: A little special handling for Darwin
>
> Patches 0040-0041: Miscellaneous support patches
>
> Patch 0042: Rewrite the overload processing
>
> Most of this code remains largely the same as before, with the same
> special handling for a few interesting built-in functions. But the
> general handling of overloaded functions is now much more efficient
> since the new data structures are designed for quick lookup, whereas
> the old machinery does a brutal linear search.
>
> Patch 0043: Rewrite gimple folding
>
> The "rewrite" here consists entirely of changing the names of the
> builtins to be processed, since we need a separate enumeration of
> builtins for the new machinery.
>
> Patch 0044: Vectorization support
>
> Small updates to the functions used for mapping built-ins to their
> vectorized counterparts.
>
> Patches 0045-0050: Rewrite built-in function expansion
>
> This is where most of the meat comes in. Lookup of built-ins at
> expand time is again much more efficient, replacing the old
> mechanism of multiple linear searches over the whole built-in
> table. Another major change is that all built-in functions are
> always defined, but a test at expand time is used to determine
> whether they are enabled. This allows proper handling of
> built-ins in the presence of "#pragma target" directives. Also,
> handling of special cases is made more uniform using an attribute
> system, which I hope makes this much easier to maintain.
>
> Patches 0051-0052: Miscellaneous changes
>
> Patch 0053: Debug support
>
> Small changes here to allow gathering of a little more data from
> -mdebug=builtin. I used this to look for differences between
> functions defined by the old and new built-in support.
>
> Patch 0054: Change altivec.h to use rs6000-vecdefines.h
>
> Patch 0055: Test case adjustments
>
> Most of these changes are due to automating checks for literal
> arguments that must be within a certain range. This gives us more
> regular error messages, which don't always match the previous error
> messages. There are also some adjustments because altivec.h now
> includes rs6000-vecdefines.h.
>
> Patch 0056: Flip the switch to enable the new support
>
> Victory is ours...
>
> Patch 0057: Fix one last late-breaking change
>
> Keeping the code up-to-date with upstream has been fun. When I
> rebased to create the patch set, I found one new issue where a
> small change had been made to the overload handling for the
> vec_insert builtins. This patch reflects that change into the
> new handling. My version of git is having trouble with
> interactive rebasing, so it was easier to just add the extra patch.
>
> Now, with all that done, there are a few things that are not yet
> done:
>
> (1) A future patch will remove the old code.
>
> (2) There are times where we ought to dispatch an overload to one
> function if VSX is available, and to another function if it is not.
> We need a general mechanism for allowing conditional dispatch. I've
> outlined a method for this in rs6000-overload.def that I want to
> implement down the road.
>
> (3) I want to investigate why vec_mul requires special handling in
> rs6000-c.c; it doesn't seem like it should.
>
> (4) Similarly, can we remove some of the special handling for
> vec_adde, vec_addec, vec_sube, and vec_subec?
>
> (5) The parser in the generator program doesn't yet handle
> "escape-newline" sequences for breaking long lines. I should add that
> capability.
>
> (6) Longer term, can we use a similar mechanism for built-in functions
> used for all targets in common code?
>
>
> A word about compatibility:
>
> I deliberately implemented all the old built-ins exactly as previously
> defined, wherever possible, despite an overwhelming desire to pitch
> out a bunch of them that have already been considered deprecated for
> ages. I found that it was too difficult to both implement a new
> system and remove deprecated things at the same time, and in general
> it seems like a dangerous thing to do. Better to do this in stages if
> we're going to do it at all. Unfortunately a lot of deprecated things
> still appear all over our own test suite, and I'm afraid we can assume
> they appear in user code all over the place as well.
>
> What I've done instead is to make very clear which interfaces are
> considered deprecated in the input files themselves. Over time,
> perhaps we can start to remove some of these, but in reality I think
> we're going to just continue to be stuck with them.
>
> Here is a complete list of known incompatibilities with the old
> mechanism:
>
> (1) __builtin_vec_vpopcntu[bdhw] were all registered as overloads but
> didn't have any instantiations. Therefore they could not have been
> used anywhere, and I haven't implemented them.
>
> (2) I added ten new built-ins named __builtin_vsx_xxpermx_ to be
> used for the overloaded vec_xxpermx function, instead of the bloody
> hack that was used before. The functionality of vec_xxpermx is
> unchanged.
>
> (3) A number of built-ins used "long" for DImode types, which would
> break these for 32-bit. I changed those arguments and return values
> to "long long" to avoid such problems, when those built-ins were not
> restricted to 64-bit mode already. There aren't many such cases.
>
> (4) A small handful of builtins didn't have the correct return type to
> match the mode of the pattern, so I fixed those. They are all new in
> GCC 11 and can't have worked properly.
>
> (5) I handled the MMA internal functions slightly differently, so that
> all the ones with extra vector_quad arguments are listed as such,
> rather than having that hacked on during expand time.
>
> (6) __builtin_vsx_xl_len_r took only a char * rather than a void *;
> fixing this was backward compatible.
>
> (7) __builtin_vsx_splat_2d[fi] were incompletely defined and couldn't
> have ever worked; fixed.
>
> (8) A small handful of builtins weren't marked as "const," but are
> obviously const, so I fixed those.
>
> I've kept a complete list of discrepancies for my records, in case any
> issues arise from my misunderstanding something.
>
> I do want to thank all the people who have contributed to the built-in
> design over the years. For all my griping, there are some marvelous
> bits in there that I hope I have kept intact. My hope is to make the
> whole system much easier to use and maintain going forward. Time will
> tell.
>
> The patches have been bootstrapped and tested on a Power10
> little-endian system, and on a Power8 big-endian system with both 32-
> and 64-bit enabled, with no regressions. I'm not crazy enough to
> believe I don't have any errors in here, but I have endeavoured to
> test and minimize them to the best of my ability.
>
> Is this series okay for trunk, in GCC 12 stage 1?
>
> Thanks!
> Bill
>
> Bill Schmidt (57):
>   Allow targets to specify build dependencies for out_object_file
>   Support scanning of build-time GC roots in gengtype
>   rs6000: Initial create of rs6000-gen-builtins.c
>   rs6000: Add initial input files
>   rs6000: Add file support and functions for diagnostic support
>   rs6000: Add helper functions for parsing
>   rs6000: Add functions for matching types, part 1 of 3
>   rs6000: Add functions for matching types, part 2 of 3
>   rs6000: Add functions for matching types, part 3 of 3
>   rs6000: Red-black tree implementation for balanced tree search
>   rs6000: Main function with stubs for parsing and output
>   rs6000: Parsing built-in input file, part 1 of 3
>   rs6000: Parsing built-in input file, part 2 of 3
>   rs6000: Parsing built-in input file, part 3 of 3
>   rs6000: Parsing of overload input file
>   rs6000: Build and store function type identifiers
>   rs6000: Write output to the builtin definition include file
>   rs6000: Write output to the builtins header file
>   rs6000: Write output to the builtins init file, part 1 of 3
>   rs6000: Write output to the builtins init file, part 2 of 3
>   rs6000: Write output to the builtins init file, part 3 of 3
>   rs6000: Write static initializations for built-in table
>   rs6000: Write static initializations for overload tables
>   rs6000: Incorporate new builtins code into the build machinery
>   rs6000: Add gengtype handling to the build machinery
>   rs6000: Add the rest of the [altivec] stanza to the builtins file
>   rs6000: Add VSX builtins
>   rs6000: Add available-everywhere and ancient builtins
>   rs6000: Add power7 and power7-64 builtins
>   rs6000: Add power8-vector builtins
>   rs6000: Add Power9 builtins
>   rs6000: Add more type nodes to support builtin processing
>   rs6000: Add Power10 builtins
>   rs6000: Add MMA builtins
>   rs6000: Add miscellaneous builtins
>   rs6000: Add Cell builtins
>   rs6000: Add remaining overloads
>   rs6000: Execute the automatic built-in initialization code
>   rs6000: Darwin builtin support
>   rs6000: Add sanity to V2DI_type_node definitions
>   rs6000: Always initialize vector_pair and vector_quad nodes
>   rs6000: Handle overloads during program parsing
>   rs6000: Handle gimple folding of target built-ins
>   rs6000: Support for vectorizing built-in functions
>   rs6000: Builtin expansion, part 1
>   rs6000: Builtin expansion, part 2
>   rs6000: Builtin expansion, part 3
>   rs6000: Builtin expansion, part 4
>   rs6000: Builtin expansion, part 5
>   rs6000: Builtin expansion, part 6
>   rs6000: Update rs6000_builtin_decl
>   rs6000: Miscellaneous uses of rs6000_builtin_decls_x
>   rs6000: Debug support
>   rs6000: Update altivec.h for automated interfaces
>   rs6000: Test case adjustments
>   rs6000: Enable the new builtin support
>   rs6000: Adjust to late-breaking change
>
>  gcc/Makefile.in | 8 +-
>  gcc/config.gcc | 2 +
>  gcc/config/rs6000/altivec.h | 516 +-
>  gcc/config/rs6000/darwin.h | 8 +-
>  gcc/config/rs6000/rbtree.c | 233 +
>  gcc/config/rs6000/rbtree.h | 51 +
>  gcc/config/rs6000/rs6000-builtin-new.def | 3875 +++++++++++
>  gcc/config/rs6000/rs6000-c.c | 1083 +++
>  gcc/config/rs6000/rs6000-call.c | 3371 ++++++++-
>  gcc/config/rs6000/rs6000-gen-builtins.c | 2997 ++++++++
>  gcc/config/rs6000/rs6000-overload.def | 6076 +++++++++++++++++
>  gcc/config/rs6000/rs6000.c | 219 +-
>  gcc/config/rs6000/rs6000.h | 82 +
>  gcc/config/rs6000/t-rs6000 | 44 +-
>  gcc/gengtype-state.c | 29 +-
>  gcc/gengtype.c | 19 +-
>  gcc/gengtype.h | 5 +
>  .../powerpc/bfp/scalar-extract-exp-2.c | 2 +-
>  .../powerpc/bfp/scalar-extract-sig-2.c | 2 +-
>  .../powerpc/bfp/scalar-insert-exp-2.c | 2 +-
>  .../powerpc/bfp/scalar-insert-exp-5.c | 2 +-
>  .../powerpc/bfp/scalar-insert-exp-8.c | 2 +-
>  .../powerpc/bfp/scalar-test-neg-2.c | 2 +-
>  .../powerpc/bfp/scalar-test-neg-3.c | 2 +-
>  .../powerpc/bfp/scalar-test-neg-5.c | 2 +-
>  .../gcc.target/powerpc/byte-in-set-2.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/cmpb-2.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/cmpb32-2.c | 2 +-
>  .../gcc.target/powerpc/crypto-builtin-2.c | 14 +-
>  .../powerpc/fold-vec-splat-floatdouble.c | 4 +-
>  .../powerpc/fold-vec-splat-longlong.c | 10 +-
>  .../powerpc/fold-vec-splat-misc-invalid.c | 8 +-
>  .../gcc.target/powerpc/p8vector-builtin-8.c | 1 +
>  gcc/testsuite/gcc.target/powerpc/pr80315-1.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/pr80315-2.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/pr80315-3.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/pr80315-4.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/pr88100.c | 12 +-
>  .../gcc.target/powerpc/pragma_misc9.c | 2 +-
>  .../gcc.target/powerpc/pragma_power8.c | 2 +
>  .../gcc.target/powerpc/pragma_power9.c | 3 +
>  .../powerpc/test_fpscr_drn_builtin_error.c | 4 +-
>  .../powerpc/test_fpscr_rn_builtin_error.c | 12 +-
>  gcc/testsuite/gcc.target/powerpc/test_mffsl.c | 3 +-
>  gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c | 2 +-
>  .../gcc.target/powerpc/vsu/vec-all-nez-7.c | 2 +-
>  .../gcc.target/powerpc/vsu/vec-any-eqz-7.c | 2 +-
>  .../gcc.target/powerpc/vsu/vec-cmpnez-7.c | 2 +-
>  .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c | 2 +-
>  .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c | 2 +-
>  .../gcc.target/powerpc/vsu/vec-xl-len-13.c | 2 +-
>  .../gcc.target/powerpc/vsu/vec-xst-len-12.c | 2 +-
>  52 files changed, 17915 insertions(+), 824 deletions(-)
>  create mode 100644 gcc/config/rs6000/rbtree.c
>  create mode 100644 gcc/config/rs6000/rbtree.h
>  create mode 100644 gcc/config/rs6000/rs6000-builtin-new.def
>  create mode 100644 gcc/config/rs6000/rs6000-gen-builtins.c
>  create mode 100644 gcc/config/rs6000/rs6000-overload.def
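
For reviewers skimming the cover letter, the "always defined, tested at expand time" idea from the notes on patches 0045-0050 can be sketched roughly as follows. This is only a toy model under my own assumptions: the feature bits, the `builtin_id` enumeration, and `builtin_enabled_p` are hypothetical names for illustration and are not the identifiers used in the series.

```c
#include <assert.h>

/* Hypothetical feature bits; the real back end has its own option
   masks.  */
enum { FEAT_ALTIVEC = 1 << 0, FEAT_VSX = 1 << 1, FEAT_P10 = 1 << 2 };

/* Every built-in gets an entry unconditionally, whatever the
   command-line target options are.  */
enum builtin_id { BIF_ALTIVEC_OP, BIF_VSX_OP, BIF_P10_OP, BIF_COUNT };

/* Features each built-in requires, indexed directly by id so that
   lookup is constant time rather than a table scan.  */
static const unsigned required_features[BIF_COUNT] = {
  FEAT_ALTIVEC,   /* BIF_ALTIVEC_OP */
  FEAT_VSX,       /* BIF_VSX_OP */
  FEAT_P10,       /* BIF_P10_OP */
};

/* Return nonzero if built-in ID may be expanded when the features in
   ENABLED are active at the point of the call.  ENABLED can differ
   from function to function, which is what makes per-function target
   changes work.  */
int
builtin_enabled_p (enum builtin_id id, unsigned enabled)
{
  assert ((unsigned) id < (unsigned) BIF_COUNT);
  return (required_features[id] & ~enabled) == 0;
}
```

With AltiVec and VSX enabled, `builtin_enabled_p (BIF_VSX_OP, FEAT_ALTIVEC | FEAT_VSX)` is nonzero while the Power10 entry is rejected; the entry still exists in the table either way, so a later function compiled with different target options can use it.
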