* RE: [0/7] Type promotion pass and elimination of zext/sext
       [not found] <A610E03AD50BFC4D95529A36D37FA55E8A7AB808CC@GEORGE.Emea.Arm.com>
@ 2015-09-07 10:51 ` Wilco Dijkstra
  2015-09-07 11:31   ` Kugan
  0 siblings, 1 reply; 28+ messages in thread
From: Wilco Dijkstra @ 2015-09-07 10:51 UTC (permalink / raw)
  To: 'Kugan', Renlin Li
  Cc: 'GCC Patches', 'Richard Biener'

> Kugan wrote:
> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
> fine if I remove the -g. I am looking into it; it needs to be fixed as well.

This is a known assembler bug I found a while back; Renlin is looking into it.
Basically, when debug tables are inserted at the end of a code section, the
assembler doesn't align to the alignment required by the debug tables.

Wilco

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 10:51 ` [0/7] Type promotion pass and elimination of zext/sext Wilco Dijkstra
@ 2015-09-07 11:31   ` Kugan
  2015-09-07 12:17     ` pinskia
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-09-07 11:31 UTC (permalink / raw)
  To: Wilco Dijkstra, Renlin Li; +Cc: 'GCC Patches', 'Richard Biener'

On 07/09/15 20:46, Wilco Dijkstra wrote:
>> Kugan wrote:
>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>> fine if I remove the -g. I am looking into it; it needs to be fixed as well.
>
> This is a known assembler bug I found a while back; Renlin is looking into it.
> Basically, when debug tables are inserted at the end of a code section, the
> assembler doesn't align to the alignment required by the debug tables.

This is precisely what seems to be happening. Renlin, could you please
let me know when you have a patch (even if it is a prototype or a hack)?

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 11:31   ` Kugan
@ 2015-09-07 12:17     ` pinskia
  2015-09-07 12:49       ` Wilco Dijkstra
  2015-09-08  8:03       ` Renlin Li
  0 siblings, 2 replies; 28+ messages in thread
From: pinskia @ 2015-09-07 12:17 UTC (permalink / raw)
  To: Kugan; +Cc: Wilco Dijkstra, Renlin Li, GCC Patches, Richard Biener

> On Sep 7, 2015, at 7:22 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
> On 07/09/15 20:46, Wilco Dijkstra wrote:
>>> Kugan wrote:
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it; it needs to be fixed as well.
>>
>> This is a known assembler bug I found a while back; Renlin is looking into it.
>> Basically, when debug tables are inserted at the end of a code section, the
>> assembler doesn't align to the alignment required by the debug tables.
>
> This is precisely what seems to be happening. Renlin, could you please
> let me know when you have a patch (even if it is a prototype or a hack)?

I had noticed that, but I read through the assembler code and it sounded
very much like it was designed this way, and that the compiler was not
supposed to emit assembly like this but should fix up the alignment itself.

Thanks,
Andrew

> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
* RE: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 12:17     ` pinskia
@ 2015-09-07 12:49       ` Wilco Dijkstra
  2015-09-08  8:03       ` Renlin Li
  1 sibling, 0 replies; 28+ messages in thread
From: Wilco Dijkstra @ 2015-09-07 12:49 UTC (permalink / raw)
  To: pinskia, Kugan; +Cc: Renlin Li, GCC Patches, Richard Biener

> pinskia@gmail.com wrote:
>> On Sep 7, 2015, at 7:22 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> On 07/09/15 20:46, Wilco Dijkstra wrote:
>>>> Kugan wrote:
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove the -g. I am looking into it; it needs to be fixed as well.
>>>
>>> This is a known assembler bug I found a while back; Renlin is looking into it.
>>> Basically, when debug tables are inserted at the end of a code section, the
>>> assembler doesn't align to the alignment required by the debug tables.
>>
>> This is precisely what seems to be happening. Renlin, could you please
>> let me know when you have a patch (even if it is a prototype or a hack)?
>
> I had noticed that, but I read through the assembler code and it sounded
> very much like it was designed this way, and that the compiler was not
> supposed to emit assembly like this but should fix up the alignment itself.

No, the bug is introduced solely by the assembler - there is no way to
avoid it, as you can't expect users to align the end of the code section
to an unspecified debug alignment (which could potentially vary depending
on the generated debug info). The assembler aligns unaligned instructions
without a warning, and doesn't require the section size to be a multiple
of the section alignment, i.e. the design is that the assembler can deal
with any alignment.

Wilco

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07 12:17     ` pinskia
  2015-09-07 12:49       ` Wilco Dijkstra
@ 2015-09-08  8:03       ` Renlin Li
  2015-09-08 12:37         ` Wilco Dijkstra
  1 sibling, 1 reply; 28+ messages in thread
From: Renlin Li @ 2015-09-08 8:03 UTC (permalink / raw)
  To: pinskia, Kugan
  Cc: Wilco Dijkstra, GCC Patches, Richard Biener, Nicholas Clifton

Hi Andrew,

Previously, there was a discussion thread on the binutils mailing list:

https://sourceware.org/ml/binutils/2015-04/msg00032.html

Nick proposed a way to fix it; Richard Henderson holds a similar opinion
to yours.

Regards,
Renlin

On 07/09/15 12:45, pinskia@gmail.com wrote:
>
>> On Sep 7, 2015, at 7:22 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> On 07/09/15 20:46, Wilco Dijkstra wrote:
>>>> Kugan wrote:
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove the -g. I am looking into it; it needs to be fixed as well.
>>> This is a known assembler bug I found a while back; Renlin is looking into it.
>>> Basically, when debug tables are inserted at the end of a code section, the
>>> assembler doesn't align to the alignment required by the debug tables.
>> This is precisely what seems to be happening. Renlin, could you please
>> let me know when you have a patch (even if it is a prototype or a hack)?
>
> I had noticed that, but I read through the assembler code and it sounded
> very much like it was designed this way, and that the compiler was not
> supposed to emit assembly like this but should fix up the alignment itself.
>
> Thanks,
> Andrew
>
>> Thanks,
>> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
* RE: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-08  8:03       ` Renlin Li
@ 2015-09-08 12:37         ` Wilco Dijkstra
  0 siblings, 0 replies; 28+ messages in thread
From: Wilco Dijkstra @ 2015-09-08 12:37 UTC (permalink / raw)
  To: Renlin Li, pinskia, Kugan; +Cc: GCC Patches, Richard Biener, nickc

> Renlin Li wrote:
> Hi Andrew,
>
> Previously, there was a discussion thread on the binutils mailing list:
>
> https://sourceware.org/ml/binutils/2015-04/msg00032.html
>
> Nick proposed a way to fix it; Richard Henderson holds a similar opinion
> to yours.

Both Nick and Richard H seem to think it is an issue with unaligned
instructions rather than an alignment bug in the debug code in the
assembler (probably due to the misleading error message). Although it
would work, since we don't have/need unaligned instructions, that
proposed patch is not the right fix for this issue. Anyway, aligning the
debug tables correctly should be a safe and trivial fix.

Wilco

^ permalink raw reply	[flat|nested] 28+ messages in thread
* [0/7] Type promotion pass and elimination of zext/sext
@ 2015-09-07  2:55 Kugan
  2015-10-20 20:13 ` Kugan
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-09-07 2:55 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

This is a new version of the patch posted in
https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
more testing and split the patch to make it easier to review. There are
still a couple of issues to be addressed, and I am working on them.

1. AARCH64 bootstrap now fails with the commit
94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
in stage2 and fwprop.c is failing. It looks to me like there is a latent
issue which gets exposed by my patch. I can also reproduce this on x86_64
if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
time being, I am using patch
0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
workaround. This needs to be fixed before the patches are ready to be
committed.

2. vector-compare-1.c from c-c++-common/torture fails to assemble with
-O3 -g: Error: unaligned opcodes detected in executable segment. It works
fine if I remove the -g. I am looking into it; it needs to be fixed as well.

In the meantime, I would appreciate it if you could take some time to
review this. I have bootstrapped on x86_64-linux-gnu, arm-linux-gnu and
aarch64-linux-gnu (with the workaround) and regression tested.

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-09-07  2:55 Kugan
@ 2015-10-20 20:13 ` Kugan
  2015-10-21 12:56   ` Richard Biener
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-10-20 20:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener

On 07/09/15 12:53, Kugan wrote:
>
> This is a new version of the patch posted in
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
> more testing and split the patch to make it easier to review. There are
> still a couple of issues to be addressed, and I am working on them.
>
> 1. AARCH64 bootstrap now fails with the commit
> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
> in stage2 and fwprop.c is failing. It looks to me like there is a latent
> issue which gets exposed by my patch. I can also reproduce this on x86_64
> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
> time being, I am using patch
> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
> workaround. This needs to be fixed before the patches are ready to be
> committed.
>
> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
> fine if I remove the -g. I am looking into it; it needs to be fixed as well.

Hi Richard,

Now that stage 1 is going to close, I would like to get these patches
accepted for stage 1. I will try my best to address your review comments
ASAP.

* Issue 1 above (the AARCH64 bootstrap failure with that commit) is no
longer present, as it is fixed in trunk. Patch 6 is no longer needed.

* Issue 2 is also reported as a known issue.

* Promotion of PARM_DECLs and RESULT_DECLs in an IPA pass, and patterns
in match.pd for SEXT_EXPR, I would like to propose as a follow-up patch
once this is accepted.

* I am happy to turn this pass off by default till the IPA and match.pd
changes are accepted. I can do regular testing to make sure that this
pass works properly till we enable it by default.

Please let me know what you think.

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-20 20:13 ` Kugan
@ 2015-10-21 12:56   ` Richard Biener
  2015-10-21 13:57     ` Richard Biener
  2015-10-22 11:01     ` Kugan
  0 siblings, 2 replies; 28+ messages in thread
From: Richard Biener @ 2015-10-21 12:56 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Tue, Oct 20, 2015 at 10:03 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
> On 07/09/15 12:53, Kugan wrote:
>>
>> This is a new version of the patch posted in
>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>> more testing and split the patch to make it easier to review. There are
>> still a couple of issues to be addressed, and I am working on them.
>>
>> 1. AARCH64 bootstrap now fails with the commit
>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>> in stage2 and fwprop.c is failing. It looks to me like there is a latent
>> issue which gets exposed by my patch. I can also reproduce this on x86_64
>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
>> time being, I am using patch
>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>> workaround. This needs to be fixed before the patches are ready to be
>> committed.
>>
>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>> fine if I remove the -g. I am looking into it; it needs to be fixed as well.
>
> Hi Richard,
>
> Now that stage 1 is going to close, I would like to get these patches
> accepted for stage 1. I will try my best to address your review comments
> ASAP.

Ok, can you make the whole patch series available so I can poke at the
implementation a bit? Please state the revision it was rebased on
(or point me to a git/svn branch the work resides on).

> * Issue 1 above (the AARCH64 bootstrap failure with that commit) is no
> longer present, as it is fixed in trunk. Patch 6 is no longer needed.
>
> * Issue 2 is also reported as a known issue.
>
> * Promotion of PARM_DECLs and RESULT_DECLs in an IPA pass, and patterns
> in match.pd for SEXT_EXPR, I would like to propose as a follow-up patch
> once this is accepted.

I thought more about this and don't think it can be made to work without
a lot of hassle. Instead, to get rid of the remaining "badly" typed
registers in the function, we can key different type requirements on a
pass property (PROP_promoted_regs), thus simply changing the expectation
of the types of function parameters / results according to their
promotion. The promotion pass would set PROP_promoted_regs then.

I will look over the patch(es) this week, but as said I'd like to play
with some code examples myself and thus would like to have the current
patchset in a more easily accessible form (and sure to apply to some rev.).

Thanks,
Richard.

> * I am happy to turn this pass off by default till the IPA and match.pd
> changes are accepted. I can do regular testing to make sure that this
> pass works properly till we enable it by default.
>
> Please let me know what you think.
>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 12:56   ` Richard Biener
@ 2015-10-21 13:57     ` Richard Biener
  2015-10-21 17:17       ` Joseph Myers
  2015-10-21 18:11       ` Richard Henderson
  1 sibling, 2 replies; 28+ messages in thread
From: Richard Biener @ 2015-10-21 13:57 UTC (permalink / raw)
  To: Kugan, Richard Henderson; +Cc: gcc-patches

On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> On 07/09/15 12:53, Kugan wrote:
>>>
>>> This is a new version of the patch posted in
>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>> more testing and split the patch to make it easier to review. There are
>>> still a couple of issues to be addressed, and I am working on them.
>>>
>>> 1. AARCH64 bootstrap now fails with the commit
>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>> in stage2 and fwprop.c is failing. It looks to me like there is a latent
>>> issue which gets exposed by my patch. I can also reproduce this on x86_64
>>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
>>> time being, I am using patch
>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>> workaround. This needs to be fixed before the patches are ready to be
>>> committed.
>>>
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it; it needs to be fixed as well.
>>
>> Hi Richard,
>>
>> Now that stage 1 is going to close, I would like to get these patches
>> accepted for stage 1. I will try my best to address your review comments
>> ASAP.
>
> Ok, can you make the whole patch series available so I can poke at the
> implementation a bit? Please state the revision it was rebased on
> (or point me to a git/svn branch the work resides on).
>
>> * Issue 1 above (the AARCH64 bootstrap failure with that commit) is no
>> longer present, as it is fixed in trunk. Patch 6 is no longer needed.
>>
>> * Issue 2 is also reported as a known issue.
>>
>> * Promotion of PARM_DECLs and RESULT_DECLs in an IPA pass, and patterns
>> in match.pd for SEXT_EXPR, I would like to propose as a follow-up patch
>> once this is accepted.
>
> I thought more about this and don't think it can be made to work without
> a lot of hassle. Instead, to get rid of the remaining "badly" typed
> registers in the function, we can key different type requirements on a
> pass property (PROP_promoted_regs), thus simply changing the expectation
> of the types of function parameters / results according to their
> promotion.

Or maybe we should simply make GIMPLE _always_ adhere to the ABI
details from the start (gimplification). Note that this does not only
involve PROMOTE_MODE. Note that for what GIMPLE is concerned I'd only
"lower" passing / returning in registers (whee, and then we have
things like targetm.calls.split_complex_arg ... not to mention passing
GIMPLE memory in registers).

Maybe I'm shooting too far here in the attempt to make GIMPLE closer
to the target (to expose those redundant extensions on GIMPLE) and
we'll end up with a bigger mess than with not doing this?

Richard.

> The promotion pass would set PROP_promoted_regs then.
>
> I will look over the patch(es) this week, but as said I'd like to play
> with some code examples myself and thus would like to have the current
> patchset in a more easily accessible form (and sure to apply to some rev.).
>
> Thanks,
> Richard.
>
>> * I am happy to turn this pass off by default till the IPA and match.pd
>> changes are accepted. I can do regular testing to make sure that this
>> pass works properly till we enable it by default.
>>
>> Please let me know what you think.
>>
>> Thanks,
>> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 13:57     ` Richard Biener
@ 2015-10-21 17:17       ` Joseph Myers
  2015-10-21 18:11       ` Richard Henderson
  0 siblings, 0 replies; 28+ messages in thread
From: Joseph Myers @ 2015-10-21 17:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: Kugan, Richard Henderson, gcc-patches

On Wed, 21 Oct 2015, Richard Biener wrote:

> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
> details from the start (gimplification). Note that this does not only
> involve PROMOTE_MODE. Note that for what GIMPLE is concerned I'd only
> "lower" passing / returning in registers (whee, and then we have
> things like targetm.calls.split_complex_arg ... not to mention passing
> GIMPLE memory in registers).
>
> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
> to the target (to expose those redundant extensions on GIMPLE) and
> we'll end up with a bigger mess than with not doing this?

I don't know at what point target-specific promotion should appear, but
right now it's visible before then (front ends use
targetm.calls.promote_prototypes), which is definitely too early.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 13:57     ` Richard Biener
  2015-10-21 17:17       ` Joseph Myers
@ 2015-10-21 18:11       ` Richard Henderson
  2015-10-22 12:48         ` Richard Biener
  1 sibling, 1 reply; 28+ messages in thread
From: Richard Henderson @ 2015-10-21 18:11 UTC (permalink / raw)
  To: Richard Biener, Kugan; +Cc: gcc-patches

On 10/21/2015 03:56 AM, Richard Biener wrote:
> On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> On 07/09/15 12:53, Kugan wrote:
>>>>
>>>> This is a new version of the patch posted in
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>> more testing and split the patch to make it easier to review. There are
>>>> still a couple of issues to be addressed, and I am working on them.
>>>>
>>>> 1. AARCH64 bootstrap now fails with the commit
>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>>> in stage2 and fwprop.c is failing. It looks to me like there is a latent
>>>> issue which gets exposed by my patch. I can also reproduce this on x86_64
>>>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
>>>> time being, I am using patch
>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>> workaround. This needs to be fixed before the patches are ready to be
>>>> committed.
>>>>
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove the -g. I am looking into it; it needs to be fixed as well.
>>>
>>> Hi Richard,
>>>
>>> Now that stage 1 is going to close, I would like to get these patches
>>> accepted for stage1. I will try my best to address your review comments
>>> ASAP.
>>
>> Ok, can you make the whole patch series available so I can poke at the
>> implementation a bit? Please state the revision it was rebased on
>> (or point me to a git/svn branch the work resides on).
>>
>>> * Issue 1 above (the AARCH64 bootstrap failure with that commit) is no
>>> longer present, as it is fixed in trunk. Patch 6 is no longer needed.
>>>
>>> * Issue 2 is also reported as a known issue.
>>>
>>> * Promotion of PARM_DECLs and RESULT_DECLs in an IPA pass, and patterns
>>> in match.pd for SEXT_EXPR, I would like to propose as a follow-up patch
>>> once this is accepted.
>>
>> I thought more about this and don't think it can be made to work without
>> a lot of hassle. Instead, to get rid of the remaining "badly" typed
>> registers in the function, we can key different type requirements on a
>> pass property (PROP_promoted_regs), thus simply changing the expectation
>> of the types of function parameters / results according to their
>> promotion.
>
> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
> details from the start (gimplification). Note that this does not only
> involve PROMOTE_MODE. Note that for what GIMPLE is concerned I'd only
> "lower" passing / returning in registers (whee, and then we have
> things like targetm.calls.split_complex_arg ... not to mention passing
> GIMPLE memory in registers).
>
> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
> to the target (to expose those redundant extensions on GIMPLE) and
> we'll end up with a bigger mess than with not doing this?

I'm leery of building this in as early as gimplification, lest we get
into trouble with splitting out bits of the current function for
off-loading. What happens when the cpu and gpu have different promotion
rules?

r~

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 18:11       ` Richard Henderson
@ 2015-10-22 12:48         ` Richard Biener
  0 siblings, 0 replies; 28+ messages in thread
From: Richard Biener @ 2015-10-22 12:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Kugan, gcc-patches

On Wed, Oct 21, 2015 at 7:55 PM, Richard Henderson <rth@redhat.com> wrote:
> On 10/21/2015 03:56 AM, Richard Biener wrote:
>>
>> On Wed, Oct 21, 2015 at 2:45 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>>
>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>
>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>
>>>>> This is a new version of the patch posted in
>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>> more testing and split the patch to make it easier to review. There
>>>>> are still a couple of issues to be addressed, and I am working on
>>>>> them.
>>>>>
>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is
>>>>> mis-compiled
>>>>> in stage2 and fwprop.c is failing. It looks to me like there is a
>>>>> latent
>>>>> issue which gets exposed by my patch. I can also reproduce this on
>>>>> x86_64
>>>>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For
>>>>> the time being, I am using patch
>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>> workaround. This needs to be fixed before the patches are ready to be
>>>>> committed.
>>>>>
>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It
>>>>> works fine if I remove the -g. I am looking into it; it needs to be
>>>>> fixed as well.
>>>>
>>>> Hi Richard,
>>>>
>>>> Now that stage 1 is going to close, I would like to get these patches
>>>> accepted for stage 1. I will try my best to address your review
>>>> comments ASAP.
>>>
>>> Ok, can you make the whole patch series available so I can poke at the
>>> implementation a bit? Please state the revision it was rebased on
>>> (or point me to a git/svn branch the work resides on).
>>>
>>>> * Issue 1 above (the AARCH64 bootstrap failure with that commit) is no
>>>> longer present, as it is fixed in trunk. Patch 6 is no longer needed.
>>>>
>>>> * Issue 2 is also reported as a known issue.
>>>>
>>>> * Promotion of PARM_DECLs and RESULT_DECLs in an IPA pass, and patterns
>>>> in match.pd for SEXT_EXPR, I would like to propose as a follow-up
>>>> patch once this is accepted.
>>>
>>> I thought more about this and don't think it can be made to work without
>>> a lot of hassle. Instead, to get rid of the remaining "badly" typed
>>> registers in the function, we can key different type requirements on a
>>> pass property (PROP_promoted_regs), thus simply changing the expectation
>>> of the types of function parameters / results according to their
>>> promotion.
>>
>> Or maybe we should simply make GIMPLE _always_ adhere to the ABI
>> details from the start (gimplification). Note that this does not only
>> involve PROMOTE_MODE. Note that for what GIMPLE is concerned I'd only
>> "lower" passing / returning in registers (whee, and then we have
>> things like targetm.calls.split_complex_arg ... not to mention passing
>> GIMPLE memory in registers).
>>
>> Maybe I'm shooting too far here in the attempt to make GIMPLE closer
>> to the target (to expose those redundant extensions on GIMPLE) and
>> we'll end up with a bigger mess than with not doing this?
>
> I'm leery of building this in as early as gimplification, lest we get
> into trouble with splitting out bits of the current function for
> off-loading. What happens when the cpu and gpu have different promotion
> rules?

Ah, of course. I tend to forget these issues.

Richard.

> r~

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-21 12:56   ` Richard Biener
  2015-10-21 13:57     ` Richard Biener
@ 2015-10-22 11:01     ` Kugan
  2015-10-22 14:24       ` Richard Biener
  1 sibling, 1 reply; 28+ messages in thread
From: Kugan @ 2015-10-22 11:01 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1751 bytes --]

On 21/10/15 23:45, Richard Biener wrote:
> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>> On 07/09/15 12:53, Kugan wrote:
>>>
>>> This is a new version of the patch posted in
>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>> more testing and split the patch to make it easier to review. There are
>>> still a couple of issues to be addressed, and I am working on them.
>>>
>>> 1. AARCH64 bootstrap now fails with the commit
>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>> in stage2 and fwprop.c is failing. It looks to me like there is a latent
>>> issue which gets exposed by my patch. I can also reproduce this on x86_64
>>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
>>> time being, I am using patch
>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>> workaround. This needs to be fixed before the patches are ready to be
>>> committed.
>>>
>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>> fine if I remove the -g. I am looking into it; it needs to be fixed as well.
>>
>> Hi Richard,
>>
>> Now that stage 1 is going to close, I would like to get these patches
>> accepted for stage 1. I will try my best to address your review comments
>> ASAP.
>
> Ok, can you make the whole patch series available so I can poke at the
> implementation a bit? Please state the revision it was rebased on
> (or point me to a git/svn branch the work resides on).
>

Thanks. Please find the patches rebased against trunk@229156. I have
skipped the test-case readjustment patches.

Thanks,
Kugan

[-- Attachment #2: 0004-debug-stmt-in-widen-mode.patch --]
[-- Type: text/x-diff, Size: 3166 bytes --]

From 2dc1cccfc59ae6967928b52396227b52a50803d9 Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:54:31 +1100
Subject: [PATCH 4/4] debug stmt in widen mode

---
 gcc/gimple-ssa-type-promote.c | 82 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 79 insertions(+), 3 deletions(-)

diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c
index e62a7c6..c0b6aa1 100644
--- a/gcc/gimple-ssa-type-promote.c
+++ b/gcc/gimple-ssa-type-promote.c
@@ -589,10 +589,86 @@ fixup_uses (tree use, tree promoted_type, tree old_type)
     {
     case GIMPLE_DEBUG:
       {
-	gsi = gsi_for_stmt (stmt);
-	gsi_remove (&gsi, true);
-	break;
+	/* Change the GIMPLE_DEBUG stmt such that the value bound is
+	   computed in promoted_type and then converted to the required
+	   type.  */
+	tree op, new_op = NULL_TREE;
+	gdebug *copy = NULL, *gs = as_a <gdebug *> (stmt);
+	enum tree_code code;
+
+	/* Get the value that is bound in the debug stmt.  */
+	switch (gs->subcode)
+	  {
+	  case GIMPLE_DEBUG_BIND:
+	    op = gimple_debug_bind_get_value (gs);
+	    break;
+	  case GIMPLE_DEBUG_SOURCE_BIND:
+	    op = gimple_debug_source_bind_get_value (gs);
+	    break;
+	  default:
+	    gcc_unreachable ();
+	  }
+
+	code = TREE_CODE (op);
+	/* Convert the value computed in promoted_type to
+	   old_type.  */
+	if (code == SSA_NAME && use == op)
+	  new_op = build1 (NOP_EXPR, old_type, use);
+	else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_unary
+		 && code != NOP_EXPR)
+	  {
+	    tree op0 = TREE_OPERAND (op, 0);
+	    if (op0 == use)
+	      {
+		tree temp = build1 (code, promoted_type, op0);
+		new_op = build1 (NOP_EXPR, old_type, temp);
+	      }
+	  }
+	else if (TREE_CODE_CLASS (TREE_CODE (op)) == tcc_binary
+		 /* Skip codes that are rejected in safe_to_promote_use_p.  */
+		 && code != LROTATE_EXPR
+		 && code != RROTATE_EXPR
+		 && code != COMPLEX_EXPR)
+	  {
+	    tree op0 = TREE_OPERAND (op, 0);
+	    tree op1 = TREE_OPERAND (op, 1);
+	    if (op0 == use || op1 == use)
+	      {
+		if (TREE_CODE (op0) == INTEGER_CST)
+		  op0 = convert_int_cst (promoted_type, op0, SIGNED);
+		if (TREE_CODE (op1) == INTEGER_CST)
+		  op1 = convert_int_cst (promoted_type, op1, SIGNED);
+		tree temp = build2 (code, promoted_type, op0, op1);
+		new_op = build1 (NOP_EXPR, old_type, temp);
+	      }
+	  }
+
+	/* Create a new GIMPLE_DEBUG stmt with the new value (new_op) to
+	   be bound, if a new value has been calculated.  */
+	if (new_op)
+	  {
+	    if (gimple_debug_bind_p (stmt))
+	      {
+		copy = gimple_build_debug_bind
+		  (gimple_debug_bind_get_var (stmt),
+		   new_op,
+		   stmt);
+	      }
+	    if (gimple_debug_source_bind_p (stmt))
+	      {
+		copy = gimple_build_debug_source_bind
+		  (gimple_debug_source_bind_get_var (stmt), new_op,
+		   stmt);
+	      }
+
+	    if (copy)
+	      {
+		gsi = gsi_for_stmt (stmt);
+		gsi_replace (&gsi, copy, false);
+	      }
+	  }
       }
+      break;

     case GIMPLE_ASM:
     case GIMPLE_CALL:
-- 
1.9.1

[-- Attachment #3: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --]
[-- Type: text/x-diff, Size: 3519 bytes --]

From 1044b1b5ebf8ad696a942207b031e3668ab2a0de Mon Sep 17 00:00:00 2001
From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Date: Thu, 22 Oct 2015 10:53:56 +1100
Subject: [PATCH 3/4] Optimize ZEXT_EXPR with tree-vrp

---
 gcc/tree-vrp.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 70 insertions(+)

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fe34ffd..cdff9c0 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr,
       && code != LSHIFT_EXPR
       && code != MIN_EXPR
       && code != MAX_EXPR
+      && code != SEXT_EXPR
       && code != BIT_AND_EXPR
       && code != BIT_IOR_EXPR
       && code != BIT_XOR_EXPR)
@@ -2801,6 +2802,52 @@ extract_range_from_binary_expr_1 (value_range *vr,
       extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1);
       return;
     }
else if (code == SEXT_EXPR) + { + gcc_assert (range_int_cst_p (&vr1)); + HOST_WIDE_INT prec = tree_to_uhwi (vr1.min); + type = vr0.type; + wide_int tmin, tmax; + wide_int may_be_nonzero, must_be_nonzero; + + wide_int type_min = wi::min_value (prec, SIGNED); + wide_int type_max = wi::max_value (prec, SIGNED); + type_min = wide_int_to_tree (expr_type, type_min); + type_max = wide_int_to_tree (expr_type, type_max); + wide_int sign_bit + = wi::set_bit_in_zero (prec - 1, + TYPE_PRECISION (TREE_TYPE (vr0.min))); + if (zero_nonzero_bits_from_vr (expr_type, &vr0, + &may_be_nonzero, + &must_be_nonzero)) + { + if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit) + { + /* If to-be-extended sign bit is one. */ + tmin = type_min; + tmax = wi::zext (may_be_nonzero, prec); + } + else if (wi::bit_and (may_be_nonzero, sign_bit) + != sign_bit) + { + /* If to-be-extended sign bit is zero. */ + tmin = wi::zext (must_be_nonzero, prec); + tmax = wi::zext (may_be_nonzero, prec); + } + else + { + tmin = type_min; + tmax = type_max; + } + } + else + { + tmin = type_min; + tmax = type_max; + } + min = wide_int_to_tree (expr_type, tmin); + max = wide_int_to_tree (expr_type, tmax); + } else if (code == RSHIFT_EXPR || code == LSHIFT_EXPR) { @@ -9166,6 +9213,28 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) break; } break; + case SEXT_EXPR: + { + unsigned int prec = tree_to_uhwi (op1); + wide_int sign_bit + = wi::set_bit_in_zero (prec - 1, + TYPE_PRECISION (TREE_TYPE (vr0.min))); + wide_int mask = wi::mask (prec, true, + TYPE_PRECISION (TREE_TYPE (vr0.min))); + if (wi::bit_and (must_be_nonzero0, sign_bit) == sign_bit) + { + /* If to-be-extended sign bit is one. */ + if (wi::bit_and (must_be_nonzero0, mask) == mask) + op = op0; + } + else if (wi::bit_and (may_be_nonzero0, sign_bit) != sign_bit) + { + /* If to-be-extended sign bit is zero. 
*/ + if (wi::bit_and (may_be_nonzero0, mask) == 0) + op = op0; + } + } + break; default: gcc_unreachable (); } @@ -9868,6 +9937,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi) case BIT_AND_EXPR: case BIT_IOR_EXPR: + case SEXT_EXPR: /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR if all the bits being cleared are already cleared or all the bits being set are already set. */ -- 1.9.1 [-- Attachment #4: 0002-Add-type-promotion-pass.patch --] [-- Type: text/x-diff, Size: 29013 bytes --] From 0cd8d75c4130639f4a3fe8294bcbfdf4f2d3e4eb Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:52:37 +1100 Subject: [PATCH 2/4] Add type promotion pass --- gcc/Makefile.in | 1 + gcc/common.opt | 4 + gcc/doc/invoke.texi | 10 + gcc/gimple-ssa-type-promote.c | 831 ++++++++++++++++++++++++++++++++++++++++++ gcc/passes.def | 1 + gcc/timevar.def | 1 + gcc/tree-pass.h | 1 + gcc/tree-ssanames.c | 3 +- 8 files changed, 851 insertions(+), 1 deletion(-) create mode 100644 gcc/gimple-ssa-type-promote.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index b91b8dc..c6aed45 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1499,6 +1499,7 @@ OBJS = \ tree-vect-slp.o \ tree-vectorizer.o \ tree-vrp.o \ + gimple-ssa-type-promote.o \ tree.o \ valtrack.o \ value-prof.o \ diff --git a/gcc/common.opt b/gcc/common.opt index 12ca0d6..f450428 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2404,6 +2404,10 @@ ftree-vrp Common Report Var(flag_tree_vrp) Init(0) Optimization Perform Value Range Propagation on trees. +ftree-type-promote +Common Report Var(flag_tree_type_promote) Init(1) Optimization +Perform Type Promotion on trees + funit-at-a-time Common Report Var(flag_unit_at_a_time) Init(1) Compile whole compilation unit at a time. 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index cd82544..bc059a0 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher. Null pointer check elimination is only done if @option{-fdelete-null-pointer-checks} is enabled. +@item -ftree-type-promote +@opindex ftree-type-promote +This pass applies type promotion to SSA names in the function and +inserts appropriate truncations to preserve the semantics. Idea of +this pass is to promote operations such a way that we can minimise +generation of subreg in RTL, that intern results in removal of +redundant zero/sign extensions. + +This optimization is enabled by default. + @item -fsplit-ivs-in-unroller @opindex fsplit-ivs-in-unroller Enables expression of values of induction variables in later iterations diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c new file mode 100644 index 0000000..e62a7c6 --- /dev/null +++ b/gcc/gimple-ssa-type-promote.c @@ -0,0 +1,831 @@ +/* Type promotion of SSA names to minimise redundant zero/sign extension. + Copyright (C) 2015 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. 
*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "hash-set.h" +#include "machmode.h" +#include "vec.h" +#include "double-int.h" +#include "input.h" +#include "symtab.h" +#include "wide-int.h" +#include "inchash.h" +#include "tree.h" +#include "fold-const.h" +#include "stor-layout.h" +#include "predict.h" +#include "function.h" +#include "dominance.h" +#include "cfg.h" +#include "basic-block.h" +#include "tree-ssa-alias.h" +#include "gimple-fold.h" +#include "tree-eh.h" +#include "gimple-expr.h" +#include "is-a.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-ssa.h" +#include "tree-phinodes.h" +#include "ssa-iterators.h" +#include "stringpool.h" +#include "tree-ssanames.h" +#include "tree-pass.h" +#include "gimple-pretty-print.h" +#include "langhooks.h" +#include "sbitmap.h" +#include "domwalk.h" +#include "tree-dfa.h" + +/* This pass applies type promotion to SSA names in the function and + inserts appropriate truncations. Idea of this pass is to promote operations + such a way that we can minimise generation of subreg in RTL, + that in turn results in removal of redundant zero/sign extensions. This pass + will run prior to The VRP and DOM such that they will be able to optimise + redundant truncations and extensions. This is based on the discussion from + https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html. + +*/ + +static unsigned n_ssa_val; +static sbitmap ssa_to_be_promoted_bitmap; +static sbitmap ssa_sets_higher_bits_bitmap; +static hash_map <tree, tree> *original_type_map; + +static bool +type_precision_ok (tree type) +{ + return (TYPE_PRECISION (type) == 8 + || TYPE_PRECISION (type) == 16 + || TYPE_PRECISION (type) == 32); +} + +/* Return the promoted type for TYPE. 
*/ +static tree +get_promoted_type (tree type) +{ + tree promoted_type; + enum machine_mode mode; + int uns; + if (POINTER_TYPE_P (type) + || !INTEGRAL_TYPE_P (type) + || !type_precision_ok (type)) + return type; + + mode = TYPE_MODE (type); +#ifdef PROMOTE_MODE + uns = TYPE_SIGN (type); + PROMOTE_MODE (mode, uns, type); +#endif + uns = TYPE_SIGN (type); + promoted_type = lang_hooks.types.type_for_mode (mode, uns); + if (promoted_type + && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type))) + type = promoted_type; + return type; +} + +/* Return true if ssa NAME is already considered for promotion. */ +static bool +ssa_promoted_p (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + return bitmap_bit_p (ssa_to_be_promoted_bitmap, index); + } + return true; +} + + +/* Set ssa NAME to be already considered for promotion. */ +static void +set_ssa_promoted (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + bitmap_set_bit (ssa_to_be_promoted_bitmap, index); + } +} + +/* Insert COPY_STMT along the edge from STMT to its successor. */ +static void +insert_stmt_on_edge (gimple *stmt, gimple *copy_stmt) +{ + edge_iterator ei; + edge e, edge = NULL; + basic_block bb = gimple_bb (stmt); + + FOR_EACH_EDGE (e, ei, bb->succs) + if (!(e->flags & EDGE_EH)) + { + gcc_assert (edge == NULL); + edge = e; + } + + gcc_assert (edge); + gsi_insert_on_edge_immediate (edge, copy_stmt); +} + +/* Return true if it is safe to promote the defined SSA_NAME in the STMT + itself. 
*/ +static bool +safe_to_promote_def_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + if (gimple_vuse (stmt) != NULL_TREE + || gimple_vdef (stmt) != NULL_TREE + || code == ARRAY_REF + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == VIEW_CONVERT_EXPR + || code == BIT_FIELD_REF + || code == REALPART_EXPR + || code == IMAGPART_EXPR + || code == REDUC_MAX_EXPR + || code == REDUC_PLUS_EXPR + || code == REDUC_MIN_EXPR) + return false; + return true; +} + +/* Return true if it is safe to promote the use in the STMT. */ +static bool +safe_to_promote_use_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + + if (gimple_vuse (stmt) != NULL_TREE + || gimple_vdef (stmt) != NULL_TREE + || code == VIEW_CONVERT_EXPR + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == CONSTRUCTOR + || code == BIT_FIELD_REF + || code == COMPLEX_EXPR + || code == ASM_EXPR + || VECTOR_TYPE_P (TREE_TYPE (lhs))) + return false; + return true; +} + +/* Return true if the SSA_NAME has to be truncated to preserve the + semantics. */ +static bool +truncate_use_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + if (TREE_CODE_CLASS (code) + == tcc_comparison + || code == TRUNC_DIV_EXPR + || code == CEIL_DIV_EXPR + || code == FLOOR_DIV_EXPR + || code == ROUND_DIV_EXPR + || code == TRUNC_MOD_EXPR + || code == CEIL_MOD_EXPR + || code == FLOOR_MOD_EXPR + || code == ROUND_MOD_EXPR + || code == LSHIFT_EXPR + || code == RSHIFT_EXPR) + return true; + return false; +} + +/* Return true if LHS will be promoted later. */ +static bool +tobe_promoted_p (tree lhs) +{ + if (TREE_CODE (lhs) == SSA_NAME + && !POINTER_TYPE_P (TREE_TYPE (lhs)) + && INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + && !VECTOR_TYPE_P (TREE_TYPE (lhs)) + && !ssa_promoted_p (lhs) + && (get_promoted_type (TREE_TYPE (lhs)) + != TREE_TYPE (lhs))) + return true; + else + return false; +} + +/* Convert constant CST to TYPE. 
*/ +static tree +convert_int_cst (tree type, tree cst, signop sign = SIGNED) +{ + wide_int wi_cons = fold_convert (type, cst); + wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign); + return wide_int_to_tree (type, wi_cons); +} + +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, + promote only the constants in conditions part of the COND_EXPR. */ +static void +promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false) +{ + tree op; + ssa_op_iter iter; + use_operand_p oprnd; + int index; + tree op0, op1; + signop sign = SIGNED; + + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + if (promote_cond + && gimple_assign_rhs_code (stmt) == COND_EXPR) + { + /* Promote INTEGER_CST that are tcc_compare arguments. */ + sign = TYPE_SIGN (type); + op = gimple_assign_rhs1 (stmt); + op0 = TREE_OPERAND (op, 0); + op1 = TREE_OPERAND (op, 1); + if (TREE_CODE (op0) == INTEGER_CST) + op0 = convert_int_cst (type, op0, sign); + if (TREE_CODE (op1) == INTEGER_CST) + op1 = convert_int_cst (type, op1, sign); + tree new_op = build2 (TREE_CODE (op), type, op0, op1); + gimple_assign_set_rhs1 (stmt, new_op); + } + else + { + /* Promote INTEGER_CST in GIMPLE_ASSIGN. */ + op = gimple_assign_rhs3 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign)); + if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) + == tcc_comparison) + sign = TYPE_SIGN (type); + op = gimple_assign_rhs1 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign)); + op = gimple_assign_rhs2 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign)); + } + break; + + case GIMPLE_PHI: + { + /* Promote INTEGER_CST arguments to GIMPLE_PHI. 
*/ + gphi *phi = as_a <gphi *> (stmt); + FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE) + { + op = USE_FROM_PTR (oprnd); + index = PHI_ARG_INDEX_FROM_USE (oprnd); + if (TREE_CODE (op) == INTEGER_CST) + SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign)); + } + } + break; + + case GIMPLE_COND: + { + /* Promote INTEGER_CST that are GIMPLE_COND arguments. */ + gcond *cond = as_a <gcond *> (stmt); + op = gimple_cond_lhs (cond); + sign = TYPE_SIGN (type); + + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign)); + op = gimple_cond_rhs (cond); + + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign)); + } + break; + + default: + gcc_unreachable (); + } +} + +/* Create an ssa with TYPE to copy ssa VAR. */ +static tree +make_promoted_copy (tree var, gimple *def_stmt, tree type) +{ + tree new_lhs = make_ssa_name (type, def_stmt); + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var)) + SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1; + return new_lhs; +} + +/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits. + Assign the zero/sign extended value in NEW_VAR. gimple statement + that performs the zero/sign extension is returned. */ +static gimple * +zero_sign_extend_stmt (tree new_var, tree var, int width) +{ + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) + == TYPE_PRECISION (TREE_TYPE (new_var))); + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width); + gcc_assert (width != 1); + gimple *stmt; + + if (TYPE_UNSIGNED (TREE_TYPE (new_var))) + { + /* Zero extend. */ + tree cst + = wide_int_to_tree (TREE_TYPE (var), + wi::mask (width, false, + TYPE_PRECISION (TREE_TYPE (var)))); + stmt = gimple_build_assign (new_var, BIT_AND_EXPR, + var, cst); + } + else + /* Sign extend. 
*/ + stmt = gimple_build_assign (new_var, + SEXT_EXPR, + var, build_int_cst (TREE_TYPE (var), width)); + return stmt; +} + + +void duplicate_default_ssa (tree to, tree from) +{ + SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from)); + SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from); + SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from); + SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE); + SSA_NAME_IS_DEFAULT_DEF (to) = 1; + SSA_NAME_IS_DEFAULT_DEF (from) = 0; +} + +/* Promote definition DEF to PROMOTED_TYPE. If the stmt that defines def + is def_stmt, make the type of def promoted_type. If the stmt is such + that, result of the def_stmt cannot be of promoted_type, create a new_def + of the original_type and make the def_stmt assign its value to newdef. + Then, create a CONVERT_EXPR to convert new_def to def of promoted type. + + For example, for stmt with original_type char and promoted_type int: + char _1 = mem; + becomes: + char _2 = mem; + int _1 = (int)_2; + + If the def_stmt allows def to be promoted, promote def in-place + (and its arguments when needed). + + For example: + char _3 = _1 + _2; + becomes: + int _3 = _1 + _2; + Here, _1 and _2 will also be promoted. */ + +static void +promote_definition (tree def, + tree promoted_type) +{ + gimple *def_stmt = SSA_NAME_DEF_STMT (def); + gimple *copy_stmt = NULL; + basic_block bb; + gimple_stmt_iterator gsi; + tree original_type = TREE_TYPE (def); + tree new_def; + bool do_not_promote = false; + + switch (gimple_code (def_stmt)) + { + case GIMPLE_PHI: + { + /* Promote def by fixing its type and make def anonymous. */ + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + break; + } + + case GIMPLE_ASM: + { + gasm *asm_stmt = as_a <gasm *> (def_stmt); + for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i) + { + /* Promote def and copy (i.e. convert) the value defined + by asm to def. 
*/ + tree link = gimple_asm_output_op (asm_stmt, i); + tree op = TREE_VALUE (link); + if (op == def) + { + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + duplicate_default_ssa (new_def, def); + TREE_VALUE (link) = new_def; + gimple_asm_set_output_op (asm_stmt, i, link); + + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, CONVERT_EXPR, + new_def, NULL_TREE); + gsi = gsi_for_stmt (def_stmt); + SSA_NAME_IS_DEFAULT_DEF (new_def) = 0; + gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT); + break; + } + } + break; + } + + case GIMPLE_NOP: + { + if (SSA_NAME_VAR (def) == NULL) + { + /* Promote def by fixing its type for anonymous def. */ + TREE_TYPE (def) = promoted_type; + } + else + { + /* Create a promoted copy of parameters. */ + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + gcc_assert (bb); + gsi = gsi_after_labels (bb); + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def); + duplicate_default_ssa (new_def, def); + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, CONVERT_EXPR, + new_def, NULL_TREE); + SSA_NAME_DEF_STMT (def) = copy_stmt; + gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT); + } + break; + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (def_stmt); + if (!safe_to_promote_def_p (def_stmt)) + { + do_not_promote = true; + } + else if (CONVERT_EXPR_CODE_P (code)) + { + tree rhs = gimple_assign_rhs1 (def_stmt); + if (!type_precision_ok (TREE_TYPE (rhs))) + { + do_not_promote = true; + } + else if (types_compatible_p (TREE_TYPE (rhs), promoted_type)) + { + /* As we travel statements in dominated order, arguments + of def_stmt will be visited before visiting def. If RHS + is already promoted and type is compatible, we can convert + them into ZERO/SIGN EXTEND stmt. 
*/ + tree &type = original_type_map->get_or_insert (rhs); + if (type == NULL_TREE) + type = TREE_TYPE (rhs); + if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type)) + type = original_type; + gcc_assert (type != NULL_TREE); + TREE_TYPE (def) = promoted_type; + gimple *copy_stmt = + zero_sign_extend_stmt (def, rhs, + TYPE_PRECISION (type)); + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + gsi = gsi_for_stmt (def_stmt); + gsi_replace (&gsi, copy_stmt, false); + } + else { + /* If RHS is not promoted OR their types are not + compatible, create CONVERT_EXPR that converts + RHS to promoted DEF type and perform a + ZERO/SIGN EXTEND to get the required value + from RHS. */ + tree s = (TYPE_PRECISION (TREE_TYPE (def)) + < TYPE_PRECISION (TREE_TYPE (rhs))) + ? TREE_TYPE (def) : TREE_TYPE (rhs); + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + TREE_TYPE (def) = promoted_type; + TREE_TYPE (new_def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE); + gimple_set_lhs (def_stmt, new_def); + gimple *copy_stmt = + zero_sign_extend_stmt (def, new_def, + TYPE_PRECISION (s)); + gsi = gsi_for_stmt (def_stmt); + if (lookup_stmt_eh_lp (def_stmt) > 0) + insert_stmt_on_edge (def_stmt, copy_stmt); + else + gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT); + } + } + else + { + /* Promote def by fixing its type and make def anonymous. */ + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + TREE_TYPE (def) = promoted_type; + } + break; + } + + default: + do_not_promote = true; + break; + } + + if (do_not_promote) + { + /* Promote def and copy (i.e. convert) the value defined + by the stmt that cannot be promoted. 
*/ + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + TREE_TYPE (def) = promoted_type; + gimple_set_lhs (def_stmt, new_def); + copy_stmt = gimple_build_assign (def, CONVERT_EXPR, + new_def, NULL_TREE); + gsi = gsi_for_stmt (def_stmt); + if (lookup_stmt_eh_lp (def_stmt) > 0) + insert_stmt_on_edge (def_stmt, copy_stmt); + else + gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT); + } + else + { + /* Type is now promoted. Due to this, some of the value ranges computed + by VRP1 will is invalid. TODO: We can be intelligent in deciding + which ranges to be invalidated instead of invalidating everything. */ + SSA_NAME_RANGE_INFO (def) = NULL; + } +} + +/* Fix the (promoted) USE in stmts where USE cannot be be promoted. */ +static unsigned int +fixup_uses (tree use, tree promoted_type, tree old_type) +{ + gimple *stmt; + imm_use_iterator ui; + gimple_stmt_iterator gsi; + use_operand_p op; + + FOR_EACH_IMM_USE_STMT (stmt, ui, use) + { + bool do_not_promote = false; + switch (gimple_code (stmt)) + { + case GIMPLE_DEBUG: + { + gsi = gsi_for_stmt (stmt); + gsi_remove (&gsi, true); + break; + } + + case GIMPLE_ASM: + case GIMPLE_CALL: + case GIMPLE_RETURN: + { + /* USE cannot be promoted here. */ + do_not_promote = true; + break; + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + if (!safe_to_promote_use_p (stmt)) + { + do_not_promote = true; + } + else if (truncate_use_p (stmt)) + { + /* In some stmts, value in USE has to be ZERO/SIGN + Extended based on the original type for correct + result. 
*/ + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_PRECISION (old_type)); + gsi = gsi_for_stmt (stmt); + gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT); + + FOR_EACH_IMM_USE_ON_STMT (op, ui) + SET_USE (op, temp); + if (TREE_CODE_CLASS (code) + == tcc_comparison) + promote_cst_in_stmt (stmt, promoted_type, true); + update_stmt (stmt); + } + else if (CONVERT_EXPR_CODE_P (code)) + { + tree rhs = gimple_assign_rhs1 (stmt); + if (!type_precision_ok (TREE_TYPE (rhs))) + { + do_not_promote = true; + } + else if (types_compatible_p (TREE_TYPE (lhs), promoted_type)) + { + /* Type of LHS and promoted RHS are compatible, we can + convert this into ZERO/SIGN EXTEND stmt. */ + gimple *copy_stmt = + zero_sign_extend_stmt (lhs, use, + TYPE_PRECISION (old_type)); + gsi = gsi_for_stmt (stmt); + set_ssa_promoted (lhs); + gsi_replace (&gsi, copy_stmt, false); + } + else if (tobe_promoted_p (lhs)) + { + /* If LHS will be promoted later, store the original + type of RHS so that we can convert it to ZERO/SIGN + EXTEND when LHS is promoted. */ + tree rhs = gimple_assign_rhs1 (stmt); + tree &type = original_type_map->get_or_insert (rhs); + type = TREE_TYPE (old_type); + } + else + { + do_not_promote = true; + } + } + break; + } + + case GIMPLE_COND: + { + /* In GIMPLE_COND, value in USE has to be ZERO/SIGN + Extended based on the original type for correct + result. */ + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_PRECISION (old_type)); + gsi = gsi_for_stmt (stmt); + gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT); + + FOR_EACH_IMM_USE_ON_STMT (op, ui) + SET_USE (op, temp); + promote_cst_in_stmt (stmt, promoted_type, true); + update_stmt (stmt); + break; + } + + default: + break; + } + + if (do_not_promote) + { + /* FOR stmts where USE canoot be promoted, create an + original type copy. 
*/ + tree temp; + temp = copy_ssa_name (use); + set_ssa_promoted (temp); + TREE_TYPE (temp) = old_type; + gimple *copy_stmt = gimple_build_assign (temp, CONVERT_EXPR, + use, NULL_TREE); + gsi = gsi_for_stmt (stmt); + gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT); + FOR_EACH_IMM_USE_ON_STMT (op, ui) + SET_USE (op, temp); + update_stmt (stmt); + } + } + return 0; +} + +/* Promote definition of NAME and adjust its uses if necessary. */ +static unsigned int +promote_def_and_uses (tree name) +{ + tree type; + if (tobe_promoted_p (name)) + { + type = get_promoted_type (TREE_TYPE (name)); + tree old_type = TREE_TYPE (name); + promote_definition (name, type); + fixup_uses (name, type, old_type); + set_ssa_promoted (name); + } + return 0; +} + +/* Promote all the stmts in the basic block. */ +static void +promote_all_stmts (basic_block bb) +{ + gimple_stmt_iterator gsi; + ssa_op_iter iter; + tree def; + + for (gphi_iterator gpi = gsi_start_phis (bb); + !gsi_end_p (gpi); gsi_next (&gpi)) + { + gphi *phi = gpi.phi (); + use_operand_p op; + + FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE) + { + def = USE_FROM_PTR (op); + promote_def_and_uses (def); + } + def = PHI_RESULT (phi); + promote_def_and_uses (def); + } + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + + FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF) + promote_def_and_uses (def); + } +} + + +class type_promotion_dom_walker : public dom_walker +{ +public: + type_promotion_dom_walker (cdi_direction direction) + : dom_walker (direction) {} + virtual void before_dom_children (basic_block bb) + { + promote_all_stmts (bb); + } +}; + +/* Main entry point to the pass. 
*/ +static unsigned int +execute_type_promotion (void) +{ + n_ssa_val = num_ssa_names; + original_type_map = new hash_map<tree, tree>; + ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val); + bitmap_clear (ssa_to_be_promoted_bitmap); + ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val); + bitmap_clear (ssa_sets_higher_bits_bitmap); + + calculate_dominance_info (CDI_DOMINATORS); + /* Walk the CFG in dominator order. */ + type_promotion_dom_walker (CDI_DOMINATORS) + .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + + sbitmap_free (ssa_to_be_promoted_bitmap); + sbitmap_free (ssa_sets_higher_bits_bitmap); + free_dominance_info (CDI_DOMINATORS); + delete original_type_map; + return 0; +} + +namespace { +const pass_data pass_data_type_promotion = +{ + GIMPLE_PASS, /* type */ + "promotion", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_TREE_TYPE_PROMOTE, /* tv_id */ + PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), +}; + +class pass_type_promotion : public gimple_opt_pass +{ +public: + pass_type_promotion (gcc::context *ctxt) + : gimple_opt_pass (pass_data_type_promotion, ctxt) + {} + + /* opt_pass methods: */ + opt_pass * clone () { return new pass_type_promotion (m_ctxt); } + virtual bool gate (function *) { return flag_tree_type_promote != 0; } + virtual unsigned int execute (function *) + { + return execute_type_promotion (); + } + +}; // class pass_type_promotion + +} // anon namespace + +gimple_opt_pass * +make_pass_type_promote (gcc::context *ctxt) +{ + return new pass_type_promotion (ctxt); +} + diff --git a/gcc/passes.def b/gcc/passes.def index 36d2b3b..78c463a 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -272,6 +272,7 @@ along with GCC; see the file COPYING3. 
If not see POP_INSERT_PASSES () NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_type_promote); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_strength_reduction); diff --git a/gcc/timevar.def b/gcc/timevar.def index b429faf..a8d40c3 100644 --- a/gcc/timevar.def +++ b/gcc/timevar.def @@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION , "vtable verification") DEFTIMEVAR (TV_TREE_UBSAN , "tree ubsan") DEFTIMEVAR (TV_INITIALIZE_RTL , "initialize rtl") DEFTIMEVAR (TV_GIMPLE_LADDRESS , "address lowering") +DEFTIMEVAR (TV_TREE_TYPE_PROMOTE , "tree type promote") /* Everything else in rest_of_compilation not included above. */ DEFTIMEVAR (TV_EARLY_LOCAL , "early local passes") diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 333b5a7..449dd19 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt); extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt); extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt); extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt); diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c index 82fd4a1..80fcf70 100644 --- a/gcc/tree-ssanames.c +++ b/gcc/tree-ssanames.c @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type, unsigned int precision = TYPE_PRECISION (TREE_TYPE (name)); /* Allocate if not available. 
*/ - if (ri == NULL) + if (ri == NULL + || (precision != ri->get_min ().get_precision ())) { size_t size = (sizeof (range_info_def) + trailing_wide_ints <3>::extra_size (precision)); -- 1.9.1 [-- Attachment #5: 0001-Add-new-SEXT_EXPR-tree-code.patch --] [-- Type: text/x-diff, Size: 5067 bytes --] From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:51:42 +1100 Subject: [PATCH 1/4] Add new SEXT_EXPR tree code --- gcc/cfgexpand.c | 12 ++++++++++++ gcc/expr.c | 20 ++++++++++++++++++++ gcc/fold-const.c | 4 ++++ gcc/tree-cfg.c | 12 ++++++++++++ gcc/tree-inline.c | 1 + gcc/tree-pretty-print.c | 11 +++++++++++ gcc/tree.def | 5 +++++ 7 files changed, 65 insertions(+) diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index eaad859..aeb64bb 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp) case FMA_EXPR: return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2); + case SEXT_EXPR: + gcc_assert (CONST_INT_P (op1)); + inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0); + gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1)); + + if (mode != inner_mode) + op0 = simplify_gen_unary (SIGN_EXTEND, + mode, + gen_lowpart_SUBREG (inner_mode, op0), + inner_mode); + return op0; + default: flag_unsupported: #ifdef ENABLE_CHECKING diff --git a/gcc/expr.c b/gcc/expr.c index da68870..c2f535f 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target); return target; + case SEXT_EXPR: + { + machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1), + MODE_INT, 0); + rtx temp, result; + rtx op0 = expand_normal (treeop0); + op0 = force_reg (mode, op0); + if (mode != inner_mode) + { + result = gen_reg_rtx (mode); + temp = simplify_gen_unary (SIGN_EXTEND, mode, + 
					    gen_lowpart_SUBREG (inner_mode, op0),
+					    inner_mode);
+	  convert_move (result, temp, 0);
+	}
+      else
+	result = op0;
+      return result;
+    }
+
     default:
       gcc_unreachable ();
     }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 602ea24..a149bad 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2,
       res = wi::bit_and (arg1, arg2);
       break;
 
+    case SEXT_EXPR:
+      res = wi::sext (arg1, arg2.to_uhwi ());
+      break;
+
     case RSHIFT_EXPR:
     case LSHIFT_EXPR:
       if (wi::neg_p (arg2))
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 8e3e810..d18b3f7 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt)
 	return false;
       }
 
+    case SEXT_EXPR:
+      {
+	if (!INTEGRAL_TYPE_P (lhs_type)
+	    || !useless_type_conversion_p (lhs_type, rhs1_type)
+	    || !tree_fits_uhwi_p (rhs2))
+	  {
+	    error ("invalid operands in sext expr");
+	    return true;
+	  }
+	return false;
+      }
+
     case VEC_WIDEN_LSHIFT_HI_EXPR:
     case VEC_WIDEN_LSHIFT_LO_EXPR:
       {
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index b8269ef..e61c200 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights,
     case BIT_XOR_EXPR:
     case BIT_AND_EXPR:
     case BIT_NOT_EXPR:
+    case SEXT_EXPR:
 
     case TRUTH_ANDIF_EXPR:
     case TRUTH_ORIF_EXPR:
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 11f90051..bec9082 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags,
 	}
       break;
 
+    case SEXT_EXPR:
+      pp_string (pp, "SEXT_EXPR <");
+      dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false);
+      pp_string (pp, ", ");
+      dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false);
+      pp_greater (pp);
+      break;
+
     case MODIFY_EXPR:
     case INIT_EXPR:
       dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags,
@@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code)
     case MIN_EXPR:
       return "min";
 
+    case SEXT_EXPR:
+      return "sext";
+
     default:
       return "<<< ??? >>>";
     }
diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)
 
+/* Sign-extend operation.  It will sign extend first operand from
+   the sign bit specified by the second operand.  The type of the
+   result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
-- 
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread
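As context for reviewers: the fold-const.c hunk above folds SEXT_EXPR <x, b> with wi::sext, i.e. sign-extension from bit position b, matching the tree.def comment. A minimal C model of that semantics (the function name and the fixed 32-bit width are illustrative assumptions, not part of the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Model of SEXT_EXPR <x, b>: keep the low B bits of X and copy bit B-1
   (the sign bit of the B-bit value) into all higher bits, which is what
   wi::sext does in the int_const_binop_1 hunk above.  Assumes 0 < b < 32.  */
static int32_t
sext_from_bit (int32_t x, unsigned int b)
{
  uint32_t m = 1u << (b - 1);                   /* sign bit of the B-bit value */
  uint32_t low = (uint32_t) x & ((m << 1) - 1); /* keep the low B bits */
  return (int32_t) ((low ^ m) - m);             /* flip-and-subtract extension */
}
```

For example, sext_from_bit (0xFFFF, 16) yields -1, while sext_from_bit (0x7FFF, 16) leaves the value unchanged.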
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-22 11:01 ` Kugan
@ 2015-10-22 14:24   ` Richard Biener
  2015-10-27  1:48     ` kugan
  0 siblings, 1 reply; 28+ messages in thread
From: Richard Biener @ 2015-10-22 14:24 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Thu, Oct 22, 2015 at 12:50 PM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
>
>
> On 21/10/15 23:45, Richard Biener wrote:
>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>>
>>> On 07/09/15 12:53, Kugan wrote:
>>>>
>>>> This is a new version of the patch posted in
>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>> more testing and split the patch to make it easier to review.
>>>> There are still a couple of issues to be addressed and I am working on them.
>>>>
>>>> 1. AARCH64 bootstrap now fails with the commit
>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>>> issue which gets exposed by my patch. I can also reproduce this on x86_64
>>>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
>>>> time being, I am using patch
>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>> workaround. This needs to be fixed before the patches are ready to be
>>>> committed.
>>>>
>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>>> fine if I remove the -g. I am looking into it; it needs to be fixed as well.
>>>
>>> Hi Richard,
>>>
>>> Now that stage 1 is going to close, I would like to get these patches
>>> accepted for stage 1. I will try my best to address your review comments
>>> ASAP.
>>
>> Ok, can you make the whole patch series available so I can poke at the
>> implementation a bit? Please state the revision it was rebased on
>> (or point me to a git/svn branch the work resides on).
>>
>
> Thanks. Please find the patches rebased against trunk@229156. I have
> skipped the test-case readjustment patches.

Some quick observations. On x86_64 when building

short bar (short y);
int foo (short x)
{
  short y = bar (x) + 15;
  return y;
}

with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
I get

<bb 2>:
_1 = (int) x_10(D);
_2 = (_1) sext (16);
_11 = bar (_2);
_5 = (int) _11;
_12 = (unsigned int) _5;
_6 = _12 & 65535;
_7 = _6 + 15;
_13 = (int) _7;
_8 = (_13) sext (16);
_9 = (_8) sext (16);
return _9;

which looks fine, but the VRP optimization doesn't trigger for the
redundant sext
(ranges are computed correctly but the 2nd extension is not removed).

This also makes me notice trivial match.pd patterns are missing, like
for example

(simplify
 (sext (sext@2 @0 @1) @3)
 (if (tree_int_cst_compare (@1, @3) <= 0)
  @2
  (sext @0 @3)))

as VRP doesn't run at -O1 we must rely on those to remove redundant extensions,
otherwise generated code might get worse compared to without the pass(?)

I also notice that the 'short' argument does not get its sign-extension removed
as redundant either, even though we have

_1 = (int) x_8(D);
Found new range for _1: [-32768, 32767]

In the end I suspect that keeping track of the "simple" cases in the promotion
pass itself (by keeping a lattice) might be a good idea (after we fix VRP to do
its work). In some way whether the ABI guarantees promoted argument
registers might need some other target hook queries.

Now onto the 0002 patch.

+static bool
+type_precision_ok (tree type)
+{
+  return (TYPE_PRECISION (type) == 8
+	  || TYPE_PRECISION (type) == 16
+	  || TYPE_PRECISION (type) == 32);
+}

that's a weird function to me. You probably want
TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
here? And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?

+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || !INTEGRAL_TYPE_P (type)
+      || !type_precision_ok (type))
+    return type;
+
+  mode = TYPE_MODE (type);
+#ifdef PROMOTE_MODE
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#endif
+  uns = TYPE_SIGN (type);
+  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
+  if (promoted_type
+      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
+    type = promoted_type;

I think what you want to verify is that TYPE_PRECISION (promoted_type)
== GET_MODE_PRECISION (mode).
And to not even bother with this simply use

promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), uns);

You use a domwalk but also might create new basic-blocks during it
(insert_on_edge_immediate), that's a
no-no, commit edge inserts after the domwalk.
ssa_sets_higher_bits_bitmap looks unused and
we generally don't free dominance info, so please don't do that.

I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc with

/abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
-B/usr/local/powerpc64-unknown-linux-gnu/bin/
-B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
/usr/local/powerpc64-unknown-linux-gnu/include -isystem
/usr/local/powerpc64-unknown-linux-gnu/sys-include -g -O2 -O2 -g
-O2 -DIN_GCC -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
-Wno-format -Wstrict-prototypes -Wmissing-prototypes
-Wold-style-definition -isystem ./include -fPIC -mlong-double-128
-mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
-fno-stack-protector -fPIC -mlong-double-128 -mno-minimal-toc -I.
-I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
-I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
-I../../../trunk/libgcc/../libdecnumber/dpd
-I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS -o _divdi3.o
-MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
../../../trunk/libgcc/libgcc2.c \
-fexceptions -fnon-call-exceptions -fvisibility=hidden -DHIDE_EXPORTS
In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
expand_debug_locations, at cfgexpand.c:5277

as hinted at above a bootstrap on i?86 (yes, 32bit) with
--with-tune=pentiumpro might be another good testing candidate.

+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
+	promote_def_and_uses (def);

it looks like you are doing some redundant work by walking both defs
and uses of each stmt. I'd say you should separate
def and use processing and use

   FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
     promote_use (use);
   FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
     promote_def (def);

this should make processing more efficient (memory local) compared to
doing the split handling in promote_def_and_uses.

I think it will be convenient to have an SSA name info structure where
you can remember the original
type a name was promoted from as well as whether it was promoted or
not. This way adjusting
debug uses should be "trivial":

+static unsigned int
+fixup_uses (tree use, tree promoted_type, tree old_type)
+{
+  gimple *stmt;
+  imm_use_iterator ui;
+  gimple_stmt_iterator gsi;
+  use_operand_p op;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
+    {
+      bool do_not_promote = false;
+      switch (gimple_code (stmt))
+	{
+	case GIMPLE_DEBUG:
+	  {
+	    gsi = gsi_for_stmt (stmt);
+	    gsi_remove (&gsi, true);

rather than doing the above you'd do sth like

  SET_USE (use, fold_convert (old_type, new_def));
  update_stmt (stmt);

note that while you may not be able to use promoted regs at all uses
(like calls or asms) you can promote all defs, if only with a compensation
statement after the original def. The SSA name info struct can be used
to note down the actual SSA name holding the promoted def.

The pass looks a lot better than last time (it's way smaller!) but
still needs some
improvements. There are some more fishy details with respect to how you
allocate/change SSA names but I think those can be dealt with once the
basic structure looks how I like it to be.

Thanks,
Richard.

>
> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
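The nested-extension pattern Richard sketches above in match.pd syntax, (sext (sext@2 @0 @1) @3), relies on the identity that two sign-extensions compose into a single extension from the smaller bit position. A small C check of that identity (the helper names are invented for illustration; this is not GCC code):

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend X from bit position B, as in SEXT_EXPR <x, b>; 0 < b < 32.  */
static int32_t
sext_from_bit (int32_t x, unsigned int b)
{
  uint32_t m = 1u << (b - 1);
  uint32_t low = (uint32_t) x & ((m << 1) - 1);
  return (int32_t) ((low ^ m) - m);
}

/* The rule behind the pattern: sext (sext (x, b1), b2) equals
   sext (x, min (b1, b2)), which is what the match.pd sketch emits
   (@2 when @1 <= @3, otherwise (sext @0 @3)).  */
static int
nested_sext_rule_holds (int32_t x, unsigned int b1, unsigned int b2)
{
  unsigned int b = b1 < b2 ? b1 : b2;
  return sext_from_bit (sext_from_bit (x, b1), b2) == sext_from_bit (x, b);
}
```

The b1 <= b2 case is an identity because the inner result already fits in b1 signed bits; the b1 > b2 case works because the outer extension only looks at the low b2 bits, which both expressions preserve.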
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-10-22 14:24 ` Richard Biener
@ 2015-10-27  1:48   ` kugan
  2015-10-28 15:51     ` Richard Biener
  0 siblings, 1 reply; 28+ messages in thread
From: kugan @ 2015-10-27 1:48 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On 23/10/15 01:23, Richard Biener wrote:
> On Thu, Oct 22, 2015 at 12:50 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>>
>>
>> On 21/10/15 23:45, Richard Biener wrote:
>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan
>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>
>>>>
>>>> On 07/09/15 12:53, Kugan wrote:
>>>>>
>>>>> This is a new version of the patch posted in
>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done
>>>>> more testing and split the patch to make it easier to review.
>>>>> There are still a couple of issues to be addressed and I am working on them.
>>>>>
>>>>> 1. AARCH64 bootstrap now fails with the commit
>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is mis-compiled
>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a latent
>>>>> issue which gets exposed by my patch. I can also reproduce this on x86_64
>>>>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the
>>>>> time being, I am using patch
>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a
>>>>> workaround. This needs to be fixed before the patches are ready to be
>>>>> committed.
>>>>>
>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with
>>>>> -O3 -g: Error: unaligned opcodes detected in executable segment. It works
>>>>> fine if I remove the -g. I am looking into it; it needs to be fixed as well.
>>>>
>>>> Hi Richard,
>>>>
>>>> Now that stage 1 is going to close, I would like to get these patches
>>>> accepted for stage 1. I will try my best to address your review comments
>>>> ASAP.
>>>
>>> Ok, can you make the whole patch series available so I can poke at the
>>> implementation a bit? Please state the revision it was rebased on
>>> (or point me to a git/svn branch the work resides on).
>>>
>>
>> Thanks. Please find the patches rebased against trunk@229156. I have
>> skipped the test-case readjustment patches.
>
> Some quick observations. On x86_64 when building

Hi Richard,

Thanks for the review.

>
> short bar (short y);
> int foo (short x)
> {
>   short y = bar (x) + 15;
>   return y;
> }
>
> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs)
> I get
>
> <bb 2>:
> _1 = (int) x_10(D);
> _2 = (_1) sext (16);
> _11 = bar (_2);
> _5 = (int) _11;
> _12 = (unsigned int) _5;
> _6 = _12 & 65535;
> _7 = _6 + 15;
> _13 = (int) _7;
> _8 = (_13) sext (16);
> _9 = (_8) sext (16);
> return _9;
>
> which looks fine but the VRP optimization doesn't trigger for the
> redundant sext
> (ranges are computed correctly but the 2nd extension is not removed).
>
> This also makes me notice trivial match.pd patterns are missing, like
> for example
>
> (simplify
>  (sext (sext@2 @0 @1) @3)
>  (if (tree_int_cst_compare (@1, @3) <= 0)
>   @2
>   (sext @0 @3)))
>
> as VRP doesn't run at -O1 we must rely on those to remove redundant
> extensions,
> otherwise generated code might get worse compared to without the pass(?)

Do you think that we should enable this pass only when VRP is enabled?
Otherwise, even when we do the simple optimizations you mentioned below, we
might not be able to remove all the redundancies.

>
> I also notice that the 'short' argument does not get its sign-extension
> removed
> as redundant either even though we have
>
> _1 = (int) x_8(D);
> Found new range for _1: [-32768, 32767]
>

I am looking into it.

> In the end I suspect that keeping track of the "simple" cases in the
> promotion
> pass itself (by keeping a lattice) might be a good idea (after we fix VRP
> to do
> its work). In some way whether the ABI guarantees promoted argument
> registers might need some other target hook queries.
>
> Now onto the 0002 patch.
>
> +static bool
> +type_precision_ok (tree type)
> +{
> +  return (TYPE_PRECISION (type) == 8
> +	  || TYPE_PRECISION (type) == 16
> +	  || TYPE_PRECISION (type) == 32);
> +}
>
> that's a weird function to me. You probably want
> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
> here? And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P?
>

I will change this. (I have a patch which I am testing with other changes
you have asked for.)

> +/* Return the promoted type for TYPE.  */
> +static tree
> +get_promoted_type (tree type)
> +{
> +  tree promoted_type;
> +  enum machine_mode mode;
> +  int uns;
> +  if (POINTER_TYPE_P (type)
> +      || !INTEGRAL_TYPE_P (type)
> +      || !type_precision_ok (type))
> +    return type;
> +
> +  mode = TYPE_MODE (type);
> +#ifdef PROMOTE_MODE
> +  uns = TYPE_SIGN (type);
> +  PROMOTE_MODE (mode, uns, type);
> +#endif
> +  uns = TYPE_SIGN (type);
> +  promoted_type = lang_hooks.types.type_for_mode (mode, uns);
> +  if (promoted_type
> +      && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type)))
> +    type = promoted_type;
>
> I think what you want to verify is that TYPE_PRECISION (promoted_type)
> == GET_MODE_PRECISION (mode).
> And to not even bother with this simply use
>
> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode),
> uns);
>

I am changing this too.

> You use a domwalk but also might create new basic-blocks during it
> (insert_on_edge_immediate), that's a
> no-no, commit edge inserts after the domwalk.

I am sorry, I don't understand "commit edge inserts after the domwalk".
Is there a way to do this in the current implementation?

> ssa_sets_higher_bits_bitmap looks unused and
> we generally don't free dominance info, so please don't do that.
>
> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc with
>
> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/
> -B/usr/local/powerpc64-unknown-linux-gnu/bin/
> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem
> /usr/local/powerpc64-unknown-linux-gnu/include -isystem
> /usr/local/powerpc64-unknown-linux-gnu/sys-include -g -O2 -O2 -g
> -O2 -DIN_GCC -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual
> -Wno-format -Wstrict-prototypes -Wmissing-prototypes
> -Wold-style-definition -isystem ./include -fPIC -mlong-double-128
> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc
> -fno-stack-protector -fPIC -mlong-double-128 -mno-minimal-toc -I.
> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/.
> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include
> -I../../../trunk/libgcc/../libdecnumber/dpd
> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS -o _divdi3.o
> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c
> ../../../trunk/libgcc/libgcc2.c \
> -fexceptions -fnon-call-exceptions -fvisibility=hidden -DHIDE_EXPORTS
> In file included from ../../../trunk/libgcc/libgcc2.c:56:0:
> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’:
> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in
> expand_debug_locations, at cfgexpand.c:5277
>

I am testing on gcc computefarm. I will get it to bootstrap and will do the
regression testing before posting the next version.

> as hinted at above a bootstrap on i?86 (yes, 32bit) with
> --with-tune=pentiumpro might be another good testing candidate.
>
> +      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF)
> +	promote_def_and_uses (def);
>
> it looks like you are doing some redundant work by walking both defs
> and uses of each stmt. I'd say you should separate
> def and use processing and use
>
>    FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE)
>      promote_use (use);
>    FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF)
>      promote_def (def);
>

The name promote_def_and_uses in my implementation is a bit confusing. It is
promoting the SSA_NAMEs. We only have to do that for the definitions if we
can do the SSA_NAMEs defined by parameters.

I also have a bitmap to see if we have promoted a variable and avoid doing
it again. I will try to improve this.

> this should make processing more efficient (memory local) compared to
> doing the split handling
> in promote_def_and_uses.
>
> I think it will be convenient to have an SSA name info structure where
> you can remember the original
> type a name was promoted from as well as whether it was promoted or
> not. This way adjusting
> debug uses should be "trivial":
>
> +static unsigned int
> +fixup_uses (tree use, tree promoted_type, tree old_type)
> +{
> +  gimple *stmt;
> +  imm_use_iterator ui;
> +  gimple_stmt_iterator gsi;
> +  use_operand_p op;
> +
> +  FOR_EACH_IMM_USE_STMT (stmt, ui, use)
> +    {
> +      bool do_not_promote = false;
> +      switch (gimple_code (stmt))
> +	{
> +	case GIMPLE_DEBUG:
> +	  {
> +	    gsi = gsi_for_stmt (stmt);
> +	    gsi_remove (&gsi, true);
>
> rather than doing the above you'd do sth like
>
>   SET_USE (use, fold_convert (old_type, new_def));
>   update_stmt (stmt);
>

We do have this information (the original type a name was promoted from as
well as whether it was promoted or not). To make it easy to review, in the
patch that adds the pass, I am removing these debug stmts. But in patch 4,
I am trying to handle this properly. Maybe I should combine them.

> note that while you may not be able to use promoted regs at all uses
> (like calls or asms) you can promote all defs, if only with a compensation
> statement after the original def. The SSA name info struct can be used
> to note down the actual SSA name holding the promoted def.
>
> The pass looks a lot better than last time (it's way smaller!) but
> still needs some
> improvements. There are some more fishy details with respect to how you
> allocate/change SSA names but I think those can be dealt with once the
> basic structure looks how I like it to be.
>

I will post an updated patch in a day or two.

Thanks again,
Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
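A recurring theme in this exchange is when VRP can prove a sext redundant: an extension from bit b is a no-op exactly when the computed value range already fits in the b-bit signed range, as with the [-32768, 32767] range quoted above for the 'short' argument. A hedged sketch of that containment test (invented names; this is a model of the idea, not GCC's actual VRP code):

```c
#include <assert.h>
#include <stdint.h>

/* A sext from bit B is redundant when every value in [MIN, MAX] is
   already a valid B-bit signed value; e.g. the range [-32768, 32767]
   that VRP finds for the 'short' argument makes (_1) sext (16) dead.
   Assumes 0 < b < 64.  */
static int
sext_is_redundant (int64_t min, int64_t max, unsigned int b)
{
  int64_t lo = -((int64_t) 1 << (b - 1));      /* smallest B-bit signed value */
  int64_t hi = ((int64_t) 1 << (b - 1)) - 1;   /* largest B-bit signed value */
  return min >= lo && max <= hi;
}
```

Under this test, widening the range by even one value past the B-bit bounds keeps the extension live, which is why the second sext in the dump survives until the range information actually reaches it.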
* Re: [0/7] Type promotion pass and elimination of zext/sext 2015-10-27 1:48 ` kugan @ 2015-10-28 15:51 ` Richard Biener 2015-11-02 9:17 ` Kugan 0 siblings, 1 reply; 28+ messages in thread From: Richard Biener @ 2015-10-28 15:51 UTC (permalink / raw) To: kugan; +Cc: gcc-patches On Tue, Oct 27, 2015 at 1:50 AM, kugan <kugan.vivekanandarajah@linaro.org> wrote: > > > On 23/10/15 01:23, Richard Biener wrote: >> >> On Thu, Oct 22, 2015 at 12:50 PM, Kugan >> <kugan.vivekanandarajah@linaro.org> wrote: >>> >>> >>> >>> On 21/10/15 23:45, Richard Biener wrote: >>>> >>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan >>>> <kugan.vivekanandarajah@linaro.org> wrote: >>>>> >>>>> >>>>> >>>>> On 07/09/15 12:53, Kugan wrote: >>>>>> >>>>>> >>>>>> This a new version of the patch posted in >>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done >>>>>> more testing and spitted the patch to make it more easier to review. >>>>>> There are still couple of issues to be addressed and I am working on >>>>>> them. >>>>>> >>>>>> 1. AARCH64 bootstrap now fails with the commit >>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is >>>>>> mis-compiled >>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a >>>>>> latent >>>>>> issue which gets exposed my patch. I can also reproduce this in x86_64 >>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the >>>>>> time being, I am using patch >>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a >>>>>> workaround. This meeds to be fixed before the patches are ready to be >>>>>> committed. >>>>>> >>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with >>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It >>>>>> works >>>>>> fine if I remove the -g. I am looking into it and needs to be fixed as >>>>>> well. 
>>>>> >>>>> >>>>> Hi Richard, >>>>> >>>>> Now that stage 1 is going to close, I would like to get these patches >>>>> accepted for stage1. I will try my best to address your review comments >>>>> ASAP. >>>> >>>> >>>> Ok, can you make the whole patch series available so I can poke at the >>>> implementation a bit? Please state the revision it was rebased on >>>> (or point me to a git/svn branch the work resides on). >>>> >>> >>> Thanks. Please find the patched rebated against trunk@229156. I have >>> skipped the test-case readjustment patches. >> >> >> Some quick observations. On x86_64 when building > > > Hi Richard, > > Thanks for the review. > >> >> short bar (short y); >> int foo (short x) >> { >> short y = bar (x) + 15; >> return y; >> } >> >> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs) >> I get >> >> <bb 2>: >> _1 = (int) x_10(D); >> _2 = (_1) sext (16); >> _11 = bar (_2); >> _5 = (int) _11; >> _12 = (unsigned int) _5; >> _6 = _12 & 65535; >> _7 = _6 + 15; >> _13 = (int) _7; >> _8 = (_13) sext (16); >> _9 = (_8) sext (16); >> return _9; >> >> which looks fine but the VRP optimization doesn't trigger for the >> redundant sext >> (ranges are computed correctly but the 2nd extension is not removed). >> >> This also makes me notice trivial match.pd patterns are missing, like >> for example >> >> (simplify >> (sext (sext@2 @0 @1) @3) >> (if (tree_int_cst_compare (@1, @3) <= 0) >> @2 >> (sext @0 @3))) >> >> as VRP doesn't run at -O1 we must rely on those to remove rendudant >> extensions, >> otherwise generated code might get worse compared to without the pass(?) > > > Do you think that we should enable this pass only when vrp is enabled. > Otherwise, even when we do the simple optimizations you mentioned below, we > might not be able to remove all the redundancies. 
> >> >> I also notice that the 'short' argument does not get it's sign-extension >> removed >> as redundand either even though we have >> >> _1 = (int) x_8(D); >> Found new range for _1: [-32768, 32767] >> > > I am looking into it. > >> In the end I suspect that keeping track of the "simple" cases in the >> promotion >> pass itself (by keeping a lattice) might be a good idea (after we fix VRP >> to do >> its work). In some way whether the ABI guarantees promoted argument >> registers might need some other target hook queries. >> >> Now onto the 0002 patch. >> >> +static bool >> +type_precision_ok (tree type) >> +{ >> + return (TYPE_PRECISION (type) == 8 >> + || TYPE_PRECISION (type) == 16 >> + || TYPE_PRECISION (type) == 32); >> +} >> >> that's a weird function to me. You probably want >> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type)) >> here? And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P? >> > > I will change this. (I have a patch which I am testing with other changes > you have asked for) > > >> +/* Return the promoted type for TYPE. */ >> +static tree >> +get_promoted_type (tree type) >> +{ >> + tree promoted_type; >> + enum machine_mode mode; >> + int uns; >> + if (POINTER_TYPE_P (type) >> + || !INTEGRAL_TYPE_P (type) >> + || !type_precision_ok (type)) >> + return type; >> + >> + mode = TYPE_MODE (type); >> +#ifdef PROMOTE_MODE >> + uns = TYPE_SIGN (type); >> + PROMOTE_MODE (mode, uns, type); >> +#endif >> + uns = TYPE_SIGN (type); >> + promoted_type = lang_hooks.types.type_for_mode (mode, uns); >> + if (promoted_type >> + && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type))) >> + type = promoted_type; >> >> I think what you want to verify is that TYPE_PRECISION (promoted_type) >> == GET_MODE_PRECISION (mode). >> And to not even bother with this simply use >> >> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), >> uns); >> > > I am changing this too. 
> >> You use a domwalk but also might create new basic-blocks during it >> (insert_on_edge_immediate), that's a >> no-no, commit edge inserts after the domwalk. > > > I am sorry, I dont understand "commit edge inserts after the domwalk" Is > there a way to do this in the current implementation? Yes, simply use gsi_insert_on_edge () and after the domwalk is done do gsi_commit_edge_inserts (). >> ssa_sets_higher_bits_bitmap looks unused and >> we generally don't free dominance info, so please don't do that. >> >> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc >> with >> >> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/ >> -B/usr/local/powerpc64-unknown-linux-gnu/bin/ >> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem >> /usr/local/powerpc64-unknown-linux-gnu/include -isystem >> /usr/local/powerpc64-unknown-linux-gnu/sys-include -g -O2 -O2 -g >> -O2 -DIN_GCC -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual >> -Wno-format -Wstrict-prototypes -Wmissing-prototypes >> -Wold-style-definition -isystem ./include -fPIC -mlong-double-128 >> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc >> -fno-stack-protector -fPIC -mlong-double-128 -mno-minimal-toc -I. >> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/. >> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include >> -I../../../trunk/libgcc/../libdecnumber/dpd >> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS -o _divdi3.o >> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c >> ../../../trunk/libgcc/libgcc2.c \ >> -fexceptions -fnon-call-exceptions -fvisibility=hidden >> -DHIDE_EXPORTS >> In file included from ../../../trunk/libgcc/libgcc2.c:56:0: >> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’: >> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in >> expand_debug_locations, at cfgexpand.c:5277 >> > > I am testing on gcc computefarm. 
I will get it to bootstrap and will do the > regression testing before posting the next version. > >> as hinted at above a bootstrap on i?86 (yes, 32bit) with >> --with-tune=pentiumpro might be another good testing candidate. >> >> + FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | >> SSA_OP_DEF) >> + promote_def_and_uses (def); >> >> it looks like you are doing some redundant work by walking both defs >> and uses of each stmt. I'd say you should separate >> def and use processing and use >> >> FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE) >> promote_use (use); >> FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF) >> promote_def (def); >> > > Name promote_def_and_uses in my implementation is a bit confusing. It is > promoting the SSA_NAMEs. We only have to do that for the definitions if we > can do the SSA_NAMEs defined by parameters. > > I also have a bitmap to see if we have promoted a variable and avoid doing > it again. I will try to improve this. > > > >> this should make processing more efficient (memory local) compared to >> doing the split handling >> in promote_def_and_uses. >> >> I think it will be convenient to have a SSA name info structure where >> you can remember the original >> type a name was promoted from as well as whether it was promoted or >> not. 
This way adjusting >> debug uses should be "trivial": >> >> +static unsigned int >> +fixup_uses (tree use, tree promoted_type, tree old_type) >> +{ >> + gimple *stmt; >> + imm_use_iterator ui; >> + gimple_stmt_iterator gsi; >> + use_operand_p op; >> + >> + FOR_EACH_IMM_USE_STMT (stmt, ui, use) >> + { >> + bool do_not_promote = false; >> + switch (gimple_code (stmt)) >> + { >> + case GIMPLE_DEBUG: >> + { >> + gsi = gsi_for_stmt (stmt); >> + gsi_remove (&gsi, true); >> >> rather than doing the above you'd do sth like >> >> SET_USE (use, fold_convert (old_type, new_def)); >> update_stmt (stmt); >> > > We do have these information (original type a name was promoted from as well > as whether it was promoted or not). To make it easy to review, in the patch > that adds the pass,I am removing these debug stmts. But in patch 4, I am > trying to handle this properly. Maybe I should combine them. Yeah, it's a bit confusing otherwise. >> note that while you may not be able to use promoted regs at all uses >> (like calls or asms) you can promote all defs, if only with a compensation >> statement after the original def. The SSA name info struct can be used >> to note down the actual SSA name holding the promoted def. >> >> The pass looks a lot better than last time (it's way smaller!) but >> still needs some >> improvements. There are some more fishy details with respect to how you >> allocate/change SSA names but I think those can be dealt with once the >> basic structure looks how I like it to be. >> > > I will post an updated patch in a day or two. Thanks, Richard. > Thanks again, > Kugan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext 2015-10-28 15:51 ` Richard Biener @ 2015-11-02 9:17 ` Kugan 2015-11-03 14:40 ` Richard Biener 0 siblings, 1 reply; 28+ messages in thread From: Kugan @ 2015-11-02 9:17 UTC (permalink / raw) To: Richard Biener; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 12098 bytes --] On 29/10/15 02:45, Richard Biener wrote: > On Tue, Oct 27, 2015 at 1:50 AM, kugan > <kugan.vivekanandarajah@linaro.org> wrote: >> >> >> On 23/10/15 01:23, Richard Biener wrote: >>> >>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan >>> <kugan.vivekanandarajah@linaro.org> wrote: >>>> >>>> >>>> >>>> On 21/10/15 23:45, Richard Biener wrote: >>>>> >>>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan >>>>> <kugan.vivekanandarajah@linaro.org> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 07/09/15 12:53, Kugan wrote: >>>>>>> >>>>>>> >>>>>>> This a new version of the patch posted in >>>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done >>>>>>> more testing and spitted the patch to make it more easier to review. >>>>>>> There are still couple of issues to be addressed and I am working on >>>>>>> them. >>>>>>> >>>>>>> 1. AARCH64 bootstrap now fails with the commit >>>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is >>>>>>> mis-compiled >>>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a >>>>>>> latent >>>>>>> issue which gets exposed my patch. I can also reproduce this in x86_64 >>>>>>> if I use the same PROMOTE_MODE which is used in aarch64 port. For the >>>>>>> time being, I am using patch >>>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a >>>>>>> workaround. This meeds to be fixed before the patches are ready to be >>>>>>> committed. >>>>>>> >>>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with >>>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. It >>>>>>> works >>>>>>> fine if I remove the -g. 
I am looking into it and needs to be fixed as >>>>>>> well. >>>>>> >>>>>> >>>>>> Hi Richard, >>>>>> >>>>>> Now that stage 1 is going to close, I would like to get these patches >>>>>> accepted for stage1. I will try my best to address your review comments >>>>>> ASAP. >>>>> >>>>> >>>>> Ok, can you make the whole patch series available so I can poke at the >>>>> implementation a bit? Please state the revision it was rebased on >>>>> (or point me to a git/svn branch the work resides on). >>>>> >>>> >>>> Thanks. Please find the patched rebated against trunk@229156. I have >>>> skipped the test-case readjustment patches. >>> >>> >>> Some quick observations. On x86_64 when building >> >> >> Hi Richard, >> >> Thanks for the review. >> >>> >>> short bar (short y); >>> int foo (short x) >>> { >>> short y = bar (x) + 15; >>> return y; >>> } >>> >>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs) >>> I get >>> >>> <bb 2>: >>> _1 = (int) x_10(D); >>> _2 = (_1) sext (16); >>> _11 = bar (_2); >>> _5 = (int) _11; >>> _12 = (unsigned int) _5; >>> _6 = _12 & 65535; >>> _7 = _6 + 15; >>> _13 = (int) _7; >>> _8 = (_13) sext (16); >>> _9 = (_8) sext (16); >>> return _9; >>> >>> which looks fine but the VRP optimization doesn't trigger for the >>> redundant sext >>> (ranges are computed correctly but the 2nd extension is not removed). Thanks for the comments. Please fond the attached patches with which I am now getting cat .192t.optimized ;; Function foo (foo, funcdef_no=0, decl_uid=1406, cgraph_uid=0, symbol_order=0) foo (short int x) { signed int _1; int _2; signed int _5; unsigned int _6; unsigned int _7; signed int _8; int _9; short int _11; unsigned int _12; signed int _13; <bb 2>: _1 = (signed int) x_10(D); _2 = _1; _11 = bar (_2); _5 = (signed int) _11; _12 = (unsigned int) _11; _6 = _12 & 65535; _7 = _6 + 15; _13 = (signed int) _7; _8 = (_13) sext (16); _9 = _8; return _9; } There are still some redundancies. 
The asm difference after RTL optimizations is - addl $15, %eax + addw $15, %ax >>> >>> This also makes me notice trivial match.pd patterns are missing, like >>> for example >>> >>> (simplify >>> (sext (sext@2 @0 @1) @3) >>> (if (tree_int_cst_compare (@1, @3) <= 0) >>> @2 >>> (sext @0 @3))) >>> >>> as VRP doesn't run at -O1 we must rely on those to remove redundant >>> extensions, >>> otherwise generated code might get worse compared to without the pass(?) >> >> >> Do you think that we should enable this pass only when vrp is enabled? >> Otherwise, even when we do the simple optimizations you mentioned below, we >> might not be able to remove all the redundancies. >> >>> >>> I also notice that the 'short' argument does not get its sign-extension >>> removed >>> as redundant either even though we have >>> >>> _1 = (int) x_8(D); >>> Found new range for _1: [-32768, 32767] >>> >> >> I am looking into it. >> >>> In the end I suspect that keeping track of the "simple" cases in the >>> promotion >>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP >>> to do >>> its work). In some way whether the ABI guarantees promoted argument >>> registers might need some other target hook queries. I tried adding it in the attached patch with record_visit_stmt to track whether an ssa would have value overflow or be properly zero/sign extended in promoted mode. We can use this to eliminate some of the zero/sign extensions at the gimple level. As it is, it doesn't do much. If this is what you had in mind, I will extend it based on your feedback. >>> >>> Now onto the 0002 patch. >>> >>> +static bool >>> +type_precision_ok (tree type) >>> +{ >>> + return (TYPE_PRECISION (type) == 8 >>> + || TYPE_PRECISION (type) == 16 >>> + || TYPE_PRECISION (type) == 32); >>> +} >>> >>> that's a weird function to me. You probably want >>> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type)) >>> here? And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P? 
>>> >> >> I will change this. (I have a patch which I am testing with other changes >> you have asked for) >> >> >>> +/* Return the promoted type for TYPE. */ >>> +static tree >>> +get_promoted_type (tree type) >>> +{ >>> + tree promoted_type; >>> + enum machine_mode mode; >>> + int uns; >>> + if (POINTER_TYPE_P (type) >>> + || !INTEGRAL_TYPE_P (type) >>> + || !type_precision_ok (type)) >>> + return type; >>> + >>> + mode = TYPE_MODE (type); >>> +#ifdef PROMOTE_MODE >>> + uns = TYPE_SIGN (type); >>> + PROMOTE_MODE (mode, uns, type); >>> +#endif >>> + uns = TYPE_SIGN (type); >>> + promoted_type = lang_hooks.types.type_for_mode (mode, uns); >>> + if (promoted_type >>> + && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type))) >>> + type = promoted_type; >>> >>> I think what you want to verify is that TYPE_PRECISION (promoted_type) >>> == GET_MODE_PRECISION (mode). >>> And to not even bother with this simply use >>> >>> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), >>> uns); >>> >> >> I am changing this too. >> >>> You use a domwalk but also might create new basic-blocks during it >>> (insert_on_edge_immediate), that's a >>> no-no, commit edge inserts after the domwalk. >> >> >> I am sorry, I dont understand "commit edge inserts after the domwalk" Is >> there a way to do this in the current implementation? > > Yes, simply use gsi_insert_on_edge () and after the domwalk is done do > gsi_commit_edge_inserts (). > >>> ssa_sets_higher_bits_bitmap looks unused and >>> we generally don't free dominance info, so please don't do that. 
>>> >>> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc >>> with >>> >>> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/ >>> -B/usr/local/powerpc64-unknown-linux-gnu/bin/ >>> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem >>> /usr/local/powerpc64-unknown-linux-gnu/include -isystem >>> /usr/local/powerpc64-unknown-linux-gnu/sys-include -g -O2 -O2 -g >>> -O2 -DIN_GCC -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual >>> -Wno-format -Wstrict-prototypes -Wmissing-prototypes >>> -Wold-style-definition -isystem ./include -fPIC -mlong-double-128 >>> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc >>> -fno-stack-protector -fPIC -mlong-double-128 -mno-minimal-toc -I. >>> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/. >>> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include >>> -I../../../trunk/libgcc/../libdecnumber/dpd >>> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS -o _divdi3.o >>> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c >>> ../../../trunk/libgcc/libgcc2.c \ >>> -fexceptions -fnon-call-exceptions -fvisibility=hidden >>> -DHIDE_EXPORTS >>> In file included from ../../../trunk/libgcc/libgcc2.c:56:0: >>> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’: >>> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in >>> expand_debug_locations, at cfgexpand.c:5277 >>> With the attached patch, I am now running into a bootstrap comparison failure. I am looking into it. Please review this version so that I can address review comments while fixing this issue. Thanks, Kugan >> >> I am testing on gcc computefarm. I will get it to bootstrap and will do the >> regression testing before posting the next version. >> >>> as hinted at above a bootstrap on i?86 (yes, 32bit) with >>> --with-tune=pentiumpro might be another good testing candidate. 
>>> >>> + FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | >>> SSA_OP_DEF) >>> + promote_def_and_uses (def); >>> >>> it looks like you are doing some redundant work by walking both defs >>> and uses of each stmt. I'd say you should separate >>> def and use processing and use >>> >>> FOR_EACH_SSA_USE_OPERAND (use, stmt, iter, SSA_OP_USE) >>> promote_use (use); >>> FOR_EACH_SSA_DEF_OPERAND (def, stmt, iter, SSA_OP_DEF) >>> promote_def (def); >>> >> >> Name promote_def_and_uses in my implementation is a bit confusing. It is >> promoting the SSA_NAMEs. We only have to do that for the definitions if we >> can do the SSA_NAMEs defined by parameters. >> >> I also have a bitmap to see if we have promoted a variable and avoid doing >> it again. I will try to improve this. >> >> >> >>> this should make processing more efficient (memory local) compared to >>> doing the split handling >>> in promote_def_and_uses. >>> >>> I think it will be convenient to have a SSA name info structure where >>> you can remember the original >>> type a name was promoted from as well as whether it was promoted or >>> not. This way adjusting >>> debug uses should be "trivial": >>> >>> +static unsigned int >>> +fixup_uses (tree use, tree promoted_type, tree old_type) >>> +{ >>> + gimple *stmt; >>> + imm_use_iterator ui; >>> + gimple_stmt_iterator gsi; >>> + use_operand_p op; >>> + >>> + FOR_EACH_IMM_USE_STMT (stmt, ui, use) >>> + { >>> + bool do_not_promote = false; >>> + switch (gimple_code (stmt)) >>> + { >>> + case GIMPLE_DEBUG: >>> + { >>> + gsi = gsi_for_stmt (stmt); >>> + gsi_remove (&gsi, true); >>> >>> rather than doing the above you'd do sth like >>> >>> SET_USE (use, fold_convert (old_type, new_def)); >>> update_stmt (stmt); >>> >> >> We do have these information (original type a name was promoted from as well >> as whether it was promoted or not). To make it easy to review, in the patch >> that adds the pass,I am removing these debug stmts. 
But in patch 4, I am >> trying to handle this properly. Maybe I should combine them. > > Yeah, it's a bit confusing otherwise. > >>> note that while you may not be able to use promoted regs at all uses >>> (like calls or asms) you can promote all defs, if only with a compensation >>> statement after the original def. The SSA name info struct can be used >>> to note down the actual SSA name holding the promoted def. >>> >>> The pass looks a lot better than last time (it's way smaller!) but >>> still needs some >>> improvements. There are some more fishy details with respect to how you >>> allocate/change SSA names but I think those can be dealt with once the >>> basic structure looks how I like it to be. >>> >> >> I will post an updated patch in a day or two. > > Thanks, > Richard. > >> Thanks again, >> Kugan [-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --] [-- Type: text/x-diff, Size: 3519 bytes --] From 355a6ebe7cc2548417e2e4976b842fbbf5e93224 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:53:56 +1100 Subject: [PATCH 3/3] Optimize ZEXT_EXPR with tree-vrp --- gcc/match.pd | 6 ++++++ gcc/tree-vrp.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 65 insertions(+) diff --git a/gcc/match.pd b/gcc/match.pd index 0a9598e..1b152f1 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3. 
If not see (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) (op @0 (ext @1 @2))))) +(simplify + (sext (sext@2 @0 @1) @3) + (if (tree_int_cst_compare (@1, @3) <= 0) + @2 + (sext @0 @3))) + diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index fe34ffd..671a388 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr, && code != LSHIFT_EXPR && code != MIN_EXPR && code != MAX_EXPR + && code != SEXT_EXPR && code != BIT_AND_EXPR && code != BIT_IOR_EXPR && code != BIT_XOR_EXPR) @@ -2801,6 +2802,52 @@ extract_range_from_binary_expr_1 (value_range *vr, extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1); return; } + else if (code == SEXT_EXPR) + { + gcc_assert (range_int_cst_p (&vr1)); + HOST_WIDE_INT prec = tree_to_uhwi (vr1.min); + type = vr0.type; + wide_int tmin, tmax; + wide_int may_be_nonzero, must_be_nonzero; + + wide_int type_min = wi::min_value (prec, SIGNED); + wide_int type_max = wi::max_value (prec, SIGNED); + type_min = wide_int_to_tree (expr_type, type_min); + type_max = wide_int_to_tree (expr_type, type_max); + wide_int sign_bit + = wi::set_bit_in_zero (prec - 1, + TYPE_PRECISION (TREE_TYPE (vr0.min))); + if (zero_nonzero_bits_from_vr (expr_type, &vr0, + &may_be_nonzero, + &must_be_nonzero)) + { + if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit) + { + /* If to-be-extended sign bit is one. */ + tmin = type_min; + tmax = wi::zext (may_be_nonzero, prec); + } + else if (wi::bit_and (may_be_nonzero, sign_bit) + != sign_bit) + { + /* If to-be-extended sign bit is zero. 
*/ + tmin = wi::zext (must_be_nonzero, prec); + tmax = wi::zext (may_be_nonzero, prec); + } + else + { + tmin = type_min; + tmax = type_max; + } + } + else + { + tmin = type_min; + tmax = type_max; + } + min = wide_int_to_tree (expr_type, tmin); + max = wide_int_to_tree (expr_type, tmax); + } else if (code == RSHIFT_EXPR || code == LSHIFT_EXPR) { @@ -9166,6 +9213,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) break; } break; + case SEXT_EXPR: + { + unsigned int prec = tree_to_uhwi (op1); + wide_int min = vr0.min; + wide_int max = vr0.max; + wide_int sext_min = wi::sext (min, prec); + wide_int sext_max = wi::sext (max, prec); + if (min == sext_min && max == sext_max) + op = op0; + } + break; default: gcc_unreachable (); } @@ -9868,6 +9926,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi) case BIT_AND_EXPR: case BIT_IOR_EXPR: + case SEXT_EXPR: /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR if all the bits being cleared are already cleared or all the bits being set are already set. 
*/ -- 1.9.1 [-- Attachment #3: 0002-Add-type-promotion-pass.patch --] [-- Type: text/x-diff, Size: 33011 bytes --] From 8b2256e4787adb05ac9c439ef54d5befe035915d Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:52:37 +1100 Subject: [PATCH 2/3] Add type promotion pass --- gcc/Makefile.in | 1 + gcc/common.opt | 4 + gcc/doc/invoke.texi | 10 + gcc/gimple-ssa-type-promote.c | 997 ++++++++++++++++++++++++++++++++++++++++++ gcc/passes.def | 1 + gcc/timevar.def | 1 + gcc/tree-pass.h | 1 + gcc/tree-ssanames.c | 3 +- 8 files changed, 1017 insertions(+), 1 deletion(-) create mode 100644 gcc/gimple-ssa-type-promote.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index b91b8dc..c6aed45 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1499,6 +1499,7 @@ OBJS = \ tree-vect-slp.o \ tree-vectorizer.o \ tree-vrp.o \ + gimple-ssa-type-promote.o \ tree.o \ valtrack.o \ value-prof.o \ diff --git a/gcc/common.opt b/gcc/common.opt index 12ca0d6..f450428 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2404,6 +2404,10 @@ ftree-vrp Common Report Var(flag_tree_vrp) Init(0) Optimization Perform Value Range Propagation on trees. +ftree-type-promote +Common Report Var(flag_tree_type_promote) Init(1) Optimization +Perform Type Promotion on trees + funit-at-a-time Common Report Var(flag_unit_at_a_time) Init(1) Compile whole compilation unit at a time. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index cd82544..bc059a0 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher. Null pointer check elimination is only done if @option{-fdelete-null-pointer-checks} is enabled. +@item -ftree-type-promote +@opindex ftree-type-promote +This pass applies type promotion to SSA names in the function and +inserts appropriate truncations to preserve the semantics. 
Idea of +this pass is to promote operations such a way that we can minimise +generation of subreg in RTL, that intern results in removal of +redundant zero/sign extensions. + +This optimization is enabled by default. + @item -fsplit-ivs-in-unroller @opindex fsplit-ivs-in-unroller Enables expression of values of induction variables in later iterations diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c new file mode 100644 index 0000000..2831fec --- /dev/null +++ b/gcc/gimple-ssa-type-promote.c @@ -0,0 +1,997 @@ +/* Type promotion of SSA names to minimise redundant zero/sign extension. + Copyright (C) 2015 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. 
*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "hash-set.h" +#include "machmode.h" +#include "vec.h" +#include "double-int.h" +#include "input.h" +#include "symtab.h" +#include "wide-int.h" +#include "inchash.h" +#include "tree.h" +#include "fold-const.h" +#include "stor-layout.h" +#include "predict.h" +#include "function.h" +#include "dominance.h" +#include "cfg.h" +#include "basic-block.h" +#include "tree-ssa-alias.h" +#include "gimple-fold.h" +#include "tree-eh.h" +#include "gimple-expr.h" +#include "is-a.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-ssa.h" +#include "tree-phinodes.h" +#include "ssa-iterators.h" +#include "stringpool.h" +#include "tree-ssanames.h" +#include "tree-pass.h" +#include "gimple-pretty-print.h" +#include "langhooks.h" +#include "sbitmap.h" +#include "domwalk.h" +#include "tree-dfa.h" + +/* This pass applies type promotion to SSA names in the function and + inserts appropriate truncations. Idea of this pass is to promote operations + such a way that we can minimise generation of subreg in RTL, + that in turn results in removal of redundant zero/sign extensions. This pass + will run prior to The VRP and DOM such that they will be able to optimise + redundant truncations and extensions. This is based on the discussion from + https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html. + +*/ + +static unsigned n_ssa_val; +static sbitmap ssa_to_be_promoted_bitmap; +static sbitmap ssa_sets_higher_bits_bitmap; +static hash_map <tree, tree> *original_type_map; + +static bool +type_precision_ok (tree type) +{ + return (TYPE_PRECISION (type) + == GET_MODE_PRECISION (TYPE_MODE (type))); +} + +/* Return the promoted type for TYPE. 
*/ +static tree +get_promoted_type (tree type) +{ + tree promoted_type; + enum machine_mode mode; + int uns; + + if (POINTER_TYPE_P (type) + || !INTEGRAL_TYPE_P (type) + || !type_precision_ok (type)) + return type; + + mode = TYPE_MODE (type); +#ifdef PROMOTE_MODE + uns = TYPE_SIGN (type); + PROMOTE_MODE (mode, uns, type); +#endif + uns = TYPE_SIGN (type); + if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode)) + return type; + promoted_type + = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), + uns); + gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode)); + return promoted_type; +} + +/* Return true if ssa NAME is already considered for promotion. */ +static bool +ssa_promoted_p (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + return bitmap_bit_p (ssa_to_be_promoted_bitmap, index); + } + return true; +} + +/* Set ssa NAME to be already considered for promotion. */ +static void +set_ssa_promoted (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + bitmap_set_bit (ssa_to_be_promoted_bitmap, index); + } +} + +/* Set ssa NAME will have higher bits if promoted. */ +static void +set_ssa_overflows (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + bitmap_set_bit (ssa_sets_higher_bits_bitmap, index); + } +} + + +/* Return true if ssa NAME will have higher bits if promoted. */ +static bool +ssa_overflows_p (tree name ATTRIBUTE_UNUSED) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + return bitmap_bit_p (ssa_sets_higher_bits_bitmap, index); + } + return true; +} + +/* Visit PHI stmt and record if variables might have higher bits set if + promoted. 
*/ +static bool +record_visit_phi_node (gimple *stmt) +{ + tree def; + ssa_op_iter i; + use_operand_p op; + bool high_bits_set = false; + gphi *phi = as_a <gphi *> (stmt); + tree lhs = PHI_RESULT (phi); + + if (TREE_CODE (lhs) != SSA_NAME + || POINTER_TYPE_P (TREE_TYPE (lhs)) + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + || ssa_overflows_p (lhs)) + return false; + + FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE) + { + def = USE_FROM_PTR (op); + if (ssa_overflows_p (def)) + high_bits_set = true; + } + + if (high_bits_set) + { + set_ssa_overflows (lhs); + return true; + } + else + return false; +} + +/* Visit STMT and record if variables might have higher bits set if + promoted. */ +static bool +record_visit_stmt (gimple *stmt) +{ + bool changed = false; + gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN); + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + tree rhs1 = gimple_assign_rhs1 (stmt); + + if (TREE_CODE (lhs) != SSA_NAME + || POINTER_TYPE_P (TREE_TYPE (lhs)) + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))) + return false; + + switch (code) + { + case SSA_NAME: + if (!ssa_overflows_p (lhs) + && ssa_overflows_p (rhs1)) + { + set_ssa_overflows (lhs); + changed = true; + } + break; + + default: + if (!ssa_overflows_p (lhs)) + { + set_ssa_overflows (lhs); + changed = true; + } + break; + } + return changed; +} + +static void +process_all_stmts_for_unsafe_promotion () +{ + basic_block bb; + gimple_stmt_iterator gsi; + auto_vec<gimple *> work_list; + + FOR_EACH_BB_FN (bb, cfun) + { + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *phi = gsi_stmt (gsi); + work_list.safe_push (phi); + } + + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + if (gimple_code (stmt) == GIMPLE_ASSIGN) + work_list.safe_push (stmt); + } + } + + while (work_list.length () > 0) + { + bool changed; + gimple *stmt = work_list.pop (); + tree lhs; + + switch (gimple_code (stmt)) + { + 
+ case GIMPLE_ASSIGN: + changed = record_visit_stmt (stmt); + lhs = gimple_assign_lhs (stmt); + break; + + case GIMPLE_PHI: + changed = record_visit_phi_node (stmt); + lhs = PHI_RESULT (stmt); + break; + + default: + gcc_assert (false); + break; + } + + if (changed) + { + gimple *use_stmt; + imm_use_iterator ui; + + FOR_EACH_IMM_USE_STMT (use_stmt, ui, lhs) + { + if (gimple_code (use_stmt) == GIMPLE_ASSIGN + || gimple_code (use_stmt) == GIMPLE_PHI) + work_list.safe_push (use_stmt); + } + } + } +} + +/* Return true if it is safe to promote the defined SSA_NAME in the STMT + itself. */ +static bool +safe_to_promote_def_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + if (gimple_vuse (stmt) != NULL_TREE + || gimple_vdef (stmt) != NULL_TREE + || code == ARRAY_REF + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == VIEW_CONVERT_EXPR + || code == BIT_FIELD_REF + || code == REALPART_EXPR + || code == IMAGPART_EXPR + || code == REDUC_MAX_EXPR + || code == REDUC_PLUS_EXPR + || code == REDUC_MIN_EXPR) + return false; + return true; +} + +/* Return true if it is safe to promote the use in the STMT. */ +static bool +safe_to_promote_use_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + + if (gimple_vuse (stmt) != NULL_TREE + || gimple_vdef (stmt) != NULL_TREE + || code == VIEW_CONVERT_EXPR + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == CONSTRUCTOR + || code == BIT_FIELD_REF + || code == COMPLEX_EXPR + || code == ASM_EXPR + || VECTOR_TYPE_P (TREE_TYPE (lhs))) + return false; + return true; +} + +/* Return true if the SSA_NAME has to be truncated to preserve the + semantics. 
*/ +static bool +truncate_use_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + if (TREE_CODE_CLASS (code) + == tcc_comparison + || code == TRUNC_DIV_EXPR + || code == CEIL_DIV_EXPR + || code == FLOOR_DIV_EXPR + || code == ROUND_DIV_EXPR + || code == TRUNC_MOD_EXPR + || code == CEIL_MOD_EXPR + || code == FLOOR_MOD_EXPR + || code == ROUND_MOD_EXPR + || code == LSHIFT_EXPR + || code == RSHIFT_EXPR) + return true; + return false; +} + +/* Return true if LHS will be promoted later. */ +static bool +tobe_promoted_p (tree lhs) +{ + if (TREE_CODE (lhs) == SSA_NAME + && !POINTER_TYPE_P (TREE_TYPE (lhs)) + && INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + && !VECTOR_TYPE_P (TREE_TYPE (lhs)) + && !ssa_promoted_p (lhs) + && (get_promoted_type (TREE_TYPE (lhs)) + != TREE_TYPE (lhs))) + return true; + else + return false; +} + +/* Convert constant CST to TYPE. */ +static tree +convert_int_cst (tree type, tree cst, signop sign = SIGNED) +{ + wide_int wi_cons = fold_convert (type, cst); + wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign); + return wide_int_to_tree (type, wi_cons); +} + +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, + promote only the constants in conditions part of the COND_EXPR. */ +static void +promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false) +{ + tree op; + ssa_op_iter iter; + use_operand_p oprnd; + int index; + tree op0, op1; + signop sign = SIGNED; + + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + if (promote_cond + && gimple_assign_rhs_code (stmt) == COND_EXPR) + { + /* Promote INTEGER_CST that are tcc_compare arguments. 
*/ + sign = TYPE_SIGN (type); + op = gimple_assign_rhs1 (stmt); + op0 = TREE_OPERAND (op, 0); + op1 = TREE_OPERAND (op, 1); + if (TREE_CODE (op0) == INTEGER_CST) + op0 = convert_int_cst (type, op0, sign); + if (TREE_CODE (op1) == INTEGER_CST) + op1 = convert_int_cst (type, op1, sign); + tree new_op = build2 (TREE_CODE (op), type, op0, op1); + gimple_assign_set_rhs1 (stmt, new_op); + } + else + { + /* Promote INTEGER_CST in GIMPLE_ASSIGN. */ + op = gimple_assign_rhs3 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign)); + if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) + == tcc_comparison) + sign = TYPE_SIGN (type); + op = gimple_assign_rhs1 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign)); + op = gimple_assign_rhs2 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign)); + } + break; + + case GIMPLE_PHI: + { + /* Promote INTEGER_CST arguments to GIMPLE_PHI. */ + gphi *phi = as_a <gphi *> (stmt); + FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE) + { + op = USE_FROM_PTR (oprnd); + index = PHI_ARG_INDEX_FROM_USE (oprnd); + if (TREE_CODE (op) == INTEGER_CST) + SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign)); + } + } + break; + + case GIMPLE_COND: + { + /* Promote INTEGER_CST that are GIMPLE_COND arguments. */ + gcond *cond = as_a <gcond *> (stmt); + op = gimple_cond_lhs (cond); + sign = TYPE_SIGN (type); + + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign)); + op = gimple_cond_rhs (cond); + + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign)); + } + break; + + default: + gcc_unreachable (); + } +} + +/* Create an ssa with TYPE to copy ssa VAR. 
*/ +static tree +make_promoted_copy (tree var, gimple *def_stmt, tree type) +{ + tree new_lhs = make_ssa_name (type, def_stmt); + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var)) + SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1; + return new_lhs; +} + +/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits. + Assign the zero/sign extended value in NEW_VAR. gimple statement + that performs the zero/sign extension is returned. */ +static gimple * +zero_sign_extend_stmt (tree new_var, tree var, int width) +{ + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) + == TYPE_PRECISION (TREE_TYPE (new_var))); + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width); + gimple *stmt; + + if (TYPE_UNSIGNED (TREE_TYPE (new_var))) + { + /* Zero extend. */ + tree cst + = wide_int_to_tree (TREE_TYPE (var), + wi::mask (width, false, + TYPE_PRECISION (TREE_TYPE (var)))); + stmt = gimple_build_assign (new_var, BIT_AND_EXPR, + var, cst); + } + else + /* Sign extend. */ + stmt = gimple_build_assign (new_var, + SEXT_EXPR, + var, build_int_cst (TREE_TYPE (var), width)); + return stmt; +} + + +void duplicate_default_ssa (tree to, tree from) +{ + SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from)); + SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from); + SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from); + SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE); + SSA_NAME_IS_DEFAULT_DEF (to) = 1; + SSA_NAME_IS_DEFAULT_DEF (from) = 0; +} + +/* Promote definition DEF to PROMOTED_TYPE. If the stmt that defines def + is def_stmt, make the type of def promoted_type. If the stmt is such + that, result of the def_stmt cannot be of promoted_type, create a new_def + of the original_type and make the def_stmt assign its value to newdef. + Then, create a CONVERT_EXPR to convert new_def to def of promoted type. 
+ + For example, for stmt with original_type char and promoted_type int: + char _1 = mem; + becomes: + char _2 = mem; + int _1 = (int)_2; + + If the def_stmt allows def to be promoted, promote def in-place + (and its arguments when needed). + + For example: + char _3 = _1 + _2; + becomes: + int _3 = _1 + _2; + Here, _1 and _2 will also be promoted. */ + +static void +promote_ssa (tree def, + tree promoted_type) +{ + gimple *def_stmt = SSA_NAME_DEF_STMT (def); + gimple *copy_stmt = NULL; + basic_block bb; + gimple_stmt_iterator gsi; + tree original_type = TREE_TYPE (def); + tree new_def; + bool do_not_promote = false; + + switch (gimple_code (def_stmt)) + { + case GIMPLE_PHI: + { + /* Promote def by fixing its type and make def anonymous. */ + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + break; + } + + case GIMPLE_ASM: + { + gasm *asm_stmt = as_a <gasm *> (def_stmt); + for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i) + { + /* Promote def and copy (i.e. convert) the value defined + by asm to def. */ + tree link = gimple_asm_output_op (asm_stmt, i); + tree op = TREE_VALUE (link); + if (op == def) + { + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + duplicate_default_ssa (new_def, def); + TREE_VALUE (link) = new_def; + gimple_asm_set_output_op (asm_stmt, i, link); + + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, CONVERT_EXPR, + new_def, NULL_TREE); + gsi = gsi_for_stmt (def_stmt); + SSA_NAME_IS_DEFAULT_DEF (new_def) = 0; + gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT); + break; + } + } + break; + } + + case GIMPLE_NOP: + { + if (SSA_NAME_VAR (def) == NULL) + { + /* Promote def by fixing its type for anonymous def. */ + TREE_TYPE (def) = promoted_type; + } + else + { + /* Create a promoted copy of parameters. 
*/ + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + gcc_assert (bb); + gsi = gsi_after_labels (bb); + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def); + duplicate_default_ssa (new_def, def); + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, CONVERT_EXPR, + new_def, NULL_TREE); + SSA_NAME_DEF_STMT (def) = copy_stmt; + gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT); + } + break; + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (def_stmt); + if (!safe_to_promote_def_p (def_stmt)) + { + do_not_promote = true; + } + else if (CONVERT_EXPR_CODE_P (code)) + { + tree rhs = gimple_assign_rhs1 (def_stmt); + if (!type_precision_ok (TREE_TYPE (rhs))) + { + do_not_promote = true; + } + else if (types_compatible_p (TREE_TYPE (rhs), promoted_type)) + { + /* As we travel statements in dominated order, arguments + of def_stmt will be visited before visiting def. If RHS + is already promoted and type is compatible, we can convert + them into ZERO/SIGN EXTEND stmt. 
*/ + tree &type = original_type_map->get_or_insert (rhs); + if (type == NULL_TREE) + type = TREE_TYPE (rhs); + if ((TYPE_PRECISION (original_type) > TYPE_PRECISION (type)) + || (TYPE_UNSIGNED (original_type) != TYPE_UNSIGNED (type))) + { + tree &type = original_type_map->get_or_insert (rhs); + if (type == NULL_TREE) + type = TREE_TYPE (rhs); + if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type)) + type = original_type; + gcc_assert (type != NULL_TREE); + TREE_TYPE (def) = promoted_type; + gimple *copy_stmt = + zero_sign_extend_stmt (def, rhs, + TYPE_PRECISION (type)); + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + gsi = gsi_for_stmt (def_stmt); + gsi_replace (&gsi, copy_stmt, false); + } + else + { + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + } + } + else + { + /* If RHS is not promoted OR their types are not + compatible, create CONVERT_EXPR that converts + RHS to promoted DEF type and perform a + ZERO/SIGN EXTEND to get the required value + from RHS. */ + tree s = (TYPE_PRECISION (TREE_TYPE (def)) + < TYPE_PRECISION (TREE_TYPE (rhs))) + ? TREE_TYPE (def) : TREE_TYPE (rhs); + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + TREE_TYPE (def) = promoted_type; + TREE_TYPE (new_def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE); + gimple_set_lhs (def_stmt, new_def); + gimple *copy_stmt = + zero_sign_extend_stmt (def, new_def, + TYPE_PRECISION (s)); + gsi = gsi_for_stmt (def_stmt); + if (lookup_stmt_eh_lp (def_stmt) > 0 + || (gimple_code (def_stmt) == GIMPLE_CALL + && gimple_call_ctrl_altering_p (def_stmt))) + gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)), + copy_stmt); + else + gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT); + } + } + else + { + /* Promote def by fixing its type and make def anonymous. 
*/ + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + TREE_TYPE (def) = promoted_type; + } + break; + } + + default: + do_not_promote = true; + break; + } + + if (do_not_promote) + { + /* Promote def and copy (i.e. convert) the value defined + by the stmt that cannot be promoted. */ + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + TREE_TYPE (def) = promoted_type; + gimple_set_lhs (def_stmt, new_def); + copy_stmt = gimple_build_assign (def, CONVERT_EXPR, + new_def, NULL_TREE); + gsi = gsi_for_stmt (def_stmt); + if (lookup_stmt_eh_lp (def_stmt) > 0 + || (gimple_code (def_stmt) == GIMPLE_CALL + && gimple_call_ctrl_altering_p (def_stmt))) + gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)), + copy_stmt); + else + gsi_insert_after (&gsi, copy_stmt, GSI_NEW_STMT); + } + else + { + /* Type is now promoted. Due to this, some of the value ranges computed + by VRP1 will be invalid. TODO: We can be intelligent in deciding + which ranges to invalidate instead of invalidating everything. */ + SSA_NAME_RANGE_INFO (def) = NULL; + } +} + +/* Fix the (promoted) USE in stmts where USE cannot be promoted. */ +static unsigned int +fixup_uses (tree use, tree promoted_type, tree old_type) +{ + gimple *stmt; + imm_use_iterator ui; + gimple_stmt_iterator gsi; + use_operand_p op; + + FOR_EACH_IMM_USE_STMT (stmt, ui, use) + { + bool do_not_promote = false; + switch (gimple_code (stmt)) + { + case GIMPLE_DEBUG: + { + FOR_EACH_IMM_USE_ON_STMT (op, ui) + SET_USE (op, fold_convert (old_type, use)); + update_stmt (stmt); + } + break; + + case GIMPLE_ASM: + case GIMPLE_CALL: + case GIMPLE_RETURN: + { + /* USE cannot be promoted here. 
*/ + do_not_promote = true; + break; + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + if (!safe_to_promote_use_p (stmt)) + { + do_not_promote = true; + } + else if (truncate_use_p (stmt) + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))) + { + if (TREE_CODE_CLASS (code) + == tcc_comparison) + promote_cst_in_stmt (stmt, promoted_type, true); + if (!ssa_overflows_p (use)) + break; + /* In some stmts, value in USE has to be ZERO/SIGN + Extended based on the original type for correct + result. */ + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_PRECISION (old_type)); + gsi = gsi_for_stmt (stmt); + gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT); + + FOR_EACH_IMM_USE_ON_STMT (op, ui) + SET_USE (op, temp); + update_stmt (stmt); + } + else if (CONVERT_EXPR_CODE_P (code)) + { + tree rhs = gimple_assign_rhs1 (stmt); + if (!type_precision_ok (TREE_TYPE (rhs))) + { + do_not_promote = true; + } + else if (types_compatible_p (TREE_TYPE (lhs), promoted_type)) + { + /* Type of LHS and promoted RHS are compatible, we can + convert this into ZERO/SIGN EXTEND stmt. */ + gimple *copy_stmt = + zero_sign_extend_stmt (lhs, use, + TYPE_PRECISION (old_type)); + gsi = gsi_for_stmt (stmt); + set_ssa_promoted (lhs); + gsi_replace (&gsi, copy_stmt, false); + } + else if (tobe_promoted_p (lhs)) + { + /* If LHS will be promoted later, store the original + type of RHS so that we can convert it to ZERO/SIGN + EXTEND when LHS is promoted. */ + tree rhs = gimple_assign_rhs1 (stmt); + tree &type = original_type_map->get_or_insert (rhs); + type = TREE_TYPE (old_type); + } + else + { + do_not_promote = true; + } + } + break; + } + + case GIMPLE_COND: + if (ssa_overflows_p (use)) + { + /* In GIMPLE_COND, value in USE has to be ZERO/SIGN + Extended based on the original type for correct + result. 
*/ + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_PRECISION (old_type)); + gsi = gsi_for_stmt (stmt); + gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT); + + FOR_EACH_IMM_USE_ON_STMT (op, ui) + SET_USE (op, temp); + } + promote_cst_in_stmt (stmt, promoted_type, true); + update_stmt (stmt); + break; + + default: + break; + } + + if (do_not_promote) + { + /* For stmts where USE cannot be promoted, create an + original type copy. */ + tree temp; + temp = copy_ssa_name (use); + set_ssa_promoted (temp); + TREE_TYPE (temp) = old_type; + gimple *copy_stmt = gimple_build_assign (temp, CONVERT_EXPR, + use, NULL_TREE); + gsi = gsi_for_stmt (stmt); + gsi_insert_before (&gsi, copy_stmt, GSI_NEW_STMT); + FOR_EACH_IMM_USE_ON_STMT (op, ui) + SET_USE (op, temp); + update_stmt (stmt); + } + } + return 0; +} +void debug_tree (tree); + +/* Promote definition of NAME and adjust its uses if necessary. */ +static unsigned int +promote_ssa_if_not_promoted (tree name) +{ + tree type; + if (tobe_promoted_p (name)) + { + type = get_promoted_type (TREE_TYPE (name)); + tree old_type = TREE_TYPE (name); + promote_ssa (name, type); + set_ssa_promoted (name); + fixup_uses (name, type, old_type); + } + return 0; +} + +/* Promote all the stmts in the basic block. 
*/ +static void +promote_all_stmts (basic_block bb) +{ + gimple_stmt_iterator gsi; + ssa_op_iter iter; + tree def; + + for (gphi_iterator gpi = gsi_start_phis (bb); + !gsi_end_p (gpi); gsi_next (&gpi)) + { + gphi *phi = gpi.phi (); + use_operand_p op; + + FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE) + { + def = USE_FROM_PTR (op); + promote_ssa_if_not_promoted (def); + } + def = PHI_RESULT (phi); + promote_ssa_if_not_promoted (def); + } + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + + FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_USE | SSA_OP_DEF) + promote_ssa_if_not_promoted (def); + } +} + + +class type_promotion_dom_walker : public dom_walker +{ +public: + type_promotion_dom_walker (cdi_direction direction) + : dom_walker (direction) {} + virtual void before_dom_children (basic_block bb) + { + promote_all_stmts (bb); + } +}; + +/* Main entry point to the pass. */ +static unsigned int +execute_type_promotion (void) +{ + n_ssa_val = num_ssa_names; + original_type_map = new hash_map<tree, tree>; + ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val); + bitmap_clear (ssa_to_be_promoted_bitmap); + ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val); + bitmap_clear (ssa_sets_higher_bits_bitmap); + + calculate_dominance_info (CDI_DOMINATORS); + process_all_stmts_for_unsafe_promotion (); + /* Walk the CFG in dominator order. 
*/ + type_promotion_dom_walker (CDI_DOMINATORS) + .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + gsi_commit_edge_inserts (); + sbitmap_free (ssa_to_be_promoted_bitmap); + sbitmap_free (ssa_sets_higher_bits_bitmap); + delete original_type_map; + return 0; +} + +namespace { +const pass_data pass_data_type_promotion = +{ + GIMPLE_PASS, /* type */ + "promotion", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_TREE_TYPE_PROMOTE, /* tv_id */ + PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), +}; + +class pass_type_promotion : public gimple_opt_pass +{ +public: + pass_type_promotion (gcc::context *ctxt) + : gimple_opt_pass (pass_data_type_promotion, ctxt) + {} + + /* opt_pass methods: */ + opt_pass * clone () { return new pass_type_promotion (m_ctxt); } + virtual bool gate (function *) { return flag_tree_type_promote != 0; } + virtual unsigned int execute (function *) + { + return execute_type_promotion (); + } + +}; // class pass_type_promotion + +} // anon namespace + +gimple_opt_pass * +make_pass_type_promote (gcc::context *ctxt) +{ + return new pass_type_promotion (ctxt); +} + diff --git a/gcc/passes.def b/gcc/passes.def index 36d2b3b..78c463a 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -272,6 +272,7 @@ along with GCC; see the file COPYING3. 
If not see POP_INSERT_PASSES () NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_type_promote); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_strength_reduction); diff --git a/gcc/timevar.def b/gcc/timevar.def index b429faf..a8d40c3 100644 --- a/gcc/timevar.def +++ b/gcc/timevar.def @@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION , "vtable verification") DEFTIMEVAR (TV_TREE_UBSAN , "tree ubsan") DEFTIMEVAR (TV_INITIALIZE_RTL , "initialize rtl") DEFTIMEVAR (TV_GIMPLE_LADDRESS , "address lowering") +DEFTIMEVAR (TV_TREE_TYPE_PROMOTE , "tree type promote") /* Everything else in rest_of_compilation not included above. */ DEFTIMEVAR (TV_EARLY_LOCAL , "early local passes") diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 333b5a7..449dd19 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt); extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt); extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt); extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt); diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c index 82fd4a1..80fcf70 100644 --- a/gcc/tree-ssanames.c +++ b/gcc/tree-ssanames.c @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type, unsigned int precision = TYPE_PRECISION (TREE_TYPE (name)); /* Allocate if not available. 
*/ - if (ri == NULL) + if (ri == NULL + || (precision != ri->get_min ().get_precision ())) { size_t size = (sizeof (range_info_def) + trailing_wide_ints <3>::extra_size (precision)); -- 1.9.1 [-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --] [-- Type: text/x-diff, Size: 5067 bytes --] From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:51:42 +1100 Subject: [PATCH 1/3] Add new SEXT_EXPR tree code --- gcc/cfgexpand.c | 12 ++++++++++++ gcc/expr.c | 20 ++++++++++++++++++++ gcc/fold-const.c | 4 ++++ gcc/tree-cfg.c | 12 ++++++++++++ gcc/tree-inline.c | 1 + gcc/tree-pretty-print.c | 11 +++++++++++ gcc/tree.def | 5 +++++ 7 files changed, 65 insertions(+) diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index eaad859..aeb64bb 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp) case FMA_EXPR: return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2); + case SEXT_EXPR: + gcc_assert (CONST_INT_P (op1)); + inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0); + gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1)); + + if (mode != inner_mode) + op0 = simplify_gen_unary (SIGN_EXTEND, + mode, + gen_lowpart_SUBREG (inner_mode, op0), + inner_mode); + return op0; + default: flag_unsupported: #ifdef ENABLE_CHECKING diff --git a/gcc/expr.c b/gcc/expr.c index da68870..c2f535f 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target); return target; + case SEXT_EXPR: + { + machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1), + MODE_INT, 0); + rtx temp, result; + rtx op0 = expand_normal (treeop0); + op0 = force_reg (mode, op0); + if (mode != inner_mode) + { + result = gen_reg_rtx (mode); + temp = simplify_gen_unary (SIGN_EXTEND, mode, + 
gen_lowpart_SUBREG (inner_mode, op0), + inner_mode); + convert_move (result, temp, 0); + } + else + result = op0; + return result; + } + default: gcc_unreachable (); } diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 602ea24..a149bad 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2, res = wi::bit_and (arg1, arg2); break; + case SEXT_EXPR: + res = wi::sext (arg1, arg2.to_uhwi ()); + break; + case RSHIFT_EXPR: case LSHIFT_EXPR: if (wi::neg_p (arg2)) diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c index 8e3e810..d18b3f7 100644 --- a/gcc/tree-cfg.c +++ b/gcc/tree-cfg.c @@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt) return false; } + case SEXT_EXPR: + { + if (!INTEGRAL_TYPE_P (lhs_type) + || !useless_type_conversion_p (lhs_type, rhs1_type) + || !tree_fits_uhwi_p (rhs2)) + { + error ("invalid operands in sext expr"); + return true; + } + return false; + } + case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: { diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index b8269ef..e61c200 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case BIT_XOR_EXPR: case BIT_AND_EXPR: case BIT_NOT_EXPR: + case SEXT_EXPR: case TRUTH_ANDIF_EXPR: case TRUTH_ORIF_EXPR: diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c index 11f90051..bec9082 100644 --- a/gcc/tree-pretty-print.c +++ b/gcc/tree-pretty-print.c @@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags, } break; + case SEXT_EXPR: + pp_string (pp, "SEXT_EXPR <"); + dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false); + pp_string (pp, ", "); + dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false); + pp_greater (pp); + break; + case MODIFY_EXPR: case INIT_EXPR: dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, @@ -3561,6 +3569,9 
@@ op_symbol_code (enum tree_code code) case MIN_EXPR: return "min"; + case SEXT_EXPR: + return "sext"; + default: return "<<< ??? >>>"; } diff --git a/gcc/tree.def b/gcc/tree.def index d0a3bd6..789cfdd 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2) DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2) DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1) +/* Sign-extend operation. It will sign extend first operand from + the sign bit specified by the second operand. The type of the + result is that of the first operand. */ +DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2) + /* ANDIF and ORIF allow the second operand not to be computed if the value of the expression is determined from the first operand. AND, OR, and XOR always compute the second operand whether its value is -- 1.9.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
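As a cross-check on the semantics the tree.def comment above specifies — sign-extend the first operand from the bit position named by the second, result in the first operand's type — the operation can be modelled in plain C with the usual shift pair. The helper name below is illustrative only, not part of the patch; it also assumes an arithmetic right shift on negative values, which ISO C leaves implementation-defined but which holds on the targets discussed in this thread:

```c
#include <assert.h>
#include <stdint.h>

/* Model of SEXT_EXPR <x, prec>: take bit prec-1 of X as the sign bit
   and replicate it through the upper bits.  Relies on >> of a negative
   int32_t being an arithmetic shift.  */
static int32_t
sext_expr (int32_t x, unsigned prec)
{
  unsigned shift = 32 - prec;
  return (int32_t) ((uint32_t) x << shift) >> shift;
}
```

For example, `sext_expr (0x8000, 16)` yields -32768, matching what the fold-const.c hunk computes via `wi::sext`.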
* Re: [0/7] Type promotion pass and elimination of zext/sext 2015-11-02 9:17 ` Kugan @ 2015-11-03 14:40 ` Richard Biener 2015-11-08 9:43 ` Kugan 0 siblings, 1 reply; 28+ messages in thread From: Richard Biener @ 2015-11-03 14:40 UTC (permalink / raw) To: Kugan; +Cc: gcc-patches On Mon, Nov 2, 2015 at 10:17 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote: > > > On 29/10/15 02:45, Richard Biener wrote: >> On Tue, Oct 27, 2015 at 1:50 AM, kugan >> <kugan.vivekanandarajah@linaro.org> wrote: >>> >>> >>> On 23/10/15 01:23, Richard Biener wrote: >>>> >>>> On Thu, Oct 22, 2015 at 12:50 PM, Kugan >>>> <kugan.vivekanandarajah@linaro.org> wrote: >>>>> >>>>> >>>>> >>>>> On 21/10/15 23:45, Richard Biener wrote: >>>>>> >>>>>> On Tue, Oct 20, 2015 at 10:03 PM, Kugan >>>>>> <kugan.vivekanandarajah@linaro.org> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 07/09/15 12:53, Kugan wrote: >>>>>>>> >>>>>>>> >>>>>>>> This is a new version of the patch posted in >>>>>>>> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg00226.html. I have done >>>>>>>> more testing and split the patch to make it easier to review. >>>>>>>> There are still a couple of issues to be addressed and I am working on >>>>>>>> them. >>>>>>>> >>>>>>>> 1. AARCH64 bootstrap now fails with the commit >>>>>>>> 94f92c36a83d66a893c3bc6f00a038ba3dbe2a6f. simplify-rtx.c is >>>>>>>> mis-compiled >>>>>>>> in stage2 and fwprop.c is failing. It looks to me that there is a >>>>>>>> latent >>>>>>>> issue which gets exposed by my patch. I can also reproduce this in x86_64 >>>>>>>> if I use the same PROMOTE_MODE which is used in the aarch64 port. For the >>>>>>>> time being, I am using patch >>>>>>>> 0006-temporary-workaround-for-bootstrap-failure-due-to-co.patch as a >>>>>>>> workaround. This needs to be fixed before the patches are ready to be >>>>>>>> committed. >>>>>>>> >>>>>>>> 2. vector-compare-1.c from c-c++-common/torture fails to assemble with >>>>>>>> -O3 -g Error: unaligned opcodes detected in executable segment. 
It >>>>>>>> works >>>>>>>> fine if I remove the -g. I am looking into it and it needs to be fixed as >>>>>>>> well. >>>>>>> >>>>>>> >>>>>>> Hi Richard, >>>>>>> >>>>>>> Now that stage 1 is going to close, I would like to get these patches >>>>>>> accepted for stage1. I will try my best to address your review comments >>>>>>> ASAP. >>>>>> >>>>>> >>>>>> Ok, can you make the whole patch series available so I can poke at the >>>>>> implementation a bit? Please state the revision it was rebased on >>>>>> (or point me to a git/svn branch the work resides on). >>>>>> >>>>> >>>>> Thanks. Please find the patches rebased against trunk@229156. I have >>>>> skipped the test-case readjustment patches. >>>> >>>> >>>> Some quick observations. On x86_64 when building >>> >>> >>> Hi Richard, >>> >>> Thanks for the review. >>> >>>> >>>> short bar (short y); >>>> int foo (short x) >>>> { >>>> short y = bar (x) + 15; >>>> return y; >>>> } >>>> >>>> with -m32 -O2 -mtune=pentiumpro (which ends up promoting HImode regs) >>>> I get >>>> >>>> <bb 2>: >>>> _1 = (int) x_10(D); >>>> _2 = (_1) sext (16); >>>> _11 = bar (_2); >>>> _5 = (int) _11; >>>> _12 = (unsigned int) _5; >>>> _6 = _12 & 65535; >>>> _7 = _6 + 15; >>>> _13 = (int) _7; >>>> _8 = (_13) sext (16); >>>> _9 = (_8) sext (16); >>>> return _9; >>>> >>>> which looks fine but the VRP optimization doesn't trigger for the >>>> redundant sext >>>> (ranges are computed correctly but the 2nd extension is not removed). > > Thanks for the comments. 
Please find the attached patches with which I > am now getting > cat .192t.optimized > > ;; Function foo (foo, funcdef_no=0, decl_uid=1406, cgraph_uid=0, > symbol_order=0) > > foo (short int x) > { > signed int _1; > int _2; > signed int _5; > unsigned int _6; > unsigned int _7; > signed int _8; > int _9; > short int _11; > unsigned int _12; > signed int _13; > > <bb 2>: > _1 = (signed int) x_10(D); > _2 = _1; > _11 = bar (_2); > _5 = (signed int) _11; > _12 = (unsigned int) _11; > _6 = _12 & 65535; > _7 = _6 + 15; > _13 = (signed int) _7; > _8 = (_13) sext (16); > _9 = _8; > return _9; > > } > > > There are still some redundancies. The asm difference after RTL > optimizations is > > - addl $15, %eax > + addw $15, %ax > > >>>> >>>> This also makes me notice trivial match.pd patterns are missing, like >>>> for example >>>> >>>> (simplify >>>> (sext (sext@2 @0 @1) @3) >>>> (if (tree_int_cst_compare (@1, @3) <= 0) >>>> @2 >>>> (sext @0 @3))) >>>> >>>> as VRP doesn't run at -O1 we must rely on those to remove redundant >>>> extensions, >>>> otherwise generated code might get worse compared to without the pass(?) >>> >>> >>> Do you think that we should enable this pass only when vrp is enabled? >>> Otherwise, even when we do the simple optimizations you mentioned below, we >>> might not be able to remove all the redundancies. >>> >>>> >>>> I also notice that the 'short' argument does not get its sign-extension >>>> removed >>>> as redundant either even though we have >>>> >>>> _1 = (int) x_8(D); >>>> Found new range for _1: [-32768, 32767] >>>> >>> >>> I am looking into it. >>> >>>> In the end I suspect that keeping track of the "simple" cases in the >>>> promotion >>>> pass itself (by keeping a lattice) might be a good idea (after we fix VRP >>>> to do >>>> its work). In some way whether the ABI guarantees promoted argument >>>> registers might need some other target hook queries. 
> > I tried adding it in the attached patch with record_visit_stmt to track > whether an ssa would have value overflow or properly zero/sign extended > in promoted mode. We can use this to eliminate some of the zero/sign > extension at gimple level. As it is, it doesn't do much. If this is what > you had in mind, I will extend it based on your feedback. > > >>>> >>>> Now onto the 0002 patch. >>>> >>>> +static bool >>>> +type_precision_ok (tree type) >>>> +{ >>>> + return (TYPE_PRECISION (type) == 8 >>>> + || TYPE_PRECISION (type) == 16 >>>> + || TYPE_PRECISION (type) == 32); >>>> +} >>>> >>>> that's a weird function to me. You probably want >>>> TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type)) >>>> here? And guard that thing with POINTER_TYPE_P || INTEGRAL_TYPE_P? >>>> >>> >>> I will change this. (I have a patch which I am testing with other changes >>> you have asked for) >>> >>> >>>> +/* Return the promoted type for TYPE. */ >>>> +static tree >>>> +get_promoted_type (tree type) >>>> +{ >>>> + tree promoted_type; >>>> + enum machine_mode mode; >>>> + int uns; >>>> + if (POINTER_TYPE_P (type) >>>> + || !INTEGRAL_TYPE_P (type) >>>> + || !type_precision_ok (type)) >>>> + return type; >>>> + >>>> + mode = TYPE_MODE (type); >>>> +#ifdef PROMOTE_MODE >>>> + uns = TYPE_SIGN (type); >>>> + PROMOTE_MODE (mode, uns, type); >>>> +#endif >>>> + uns = TYPE_SIGN (type); >>>> + promoted_type = lang_hooks.types.type_for_mode (mode, uns); >>>> + if (promoted_type >>>> + && (TYPE_PRECISION (promoted_type) > TYPE_PRECISION (type))) >>>> + type = promoted_type; >>>> >>>> I think what you want to verify is that TYPE_PRECISION (promoted_type) >>>> == GET_MODE_PRECISION (mode). >>>> And to not even bother with this simply use >>>> >>>> promoted_type = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), >>>> uns); >>>> >>> >>> I am changing this too. 
>>> >>>> You use a domwalk but also might create new basic-blocks during it >>>> (insert_on_edge_immediate), that's a >>>> no-no, commit edge inserts after the domwalk. >>> >>> >>> I am sorry, I dont understand "commit edge inserts after the domwalk" Is >>> there a way to do this in the current implementation? >> >> Yes, simply use gsi_insert_on_edge () and after the domwalk is done do >> gsi_commit_edge_inserts (). >> >>>> ssa_sets_higher_bits_bitmap looks unused and >>>> we generally don't free dominance info, so please don't do that. >>>> >>>> I fired off a bootstrap on ppc64-linux which fails building stage1 libgcc >>>> with >>>> >>>> /abuild/rguenther/obj/./gcc/xgcc -B/abuild/rguenther/obj/./gcc/ >>>> -B/usr/local/powerpc64-unknown-linux-gnu/bin/ >>>> -B/usr/local/powerpc64-unknown-linux-gnu/lib/ -isystem >>>> /usr/local/powerpc64-unknown-linux-gnu/include -isystem >>>> /usr/local/powerpc64-unknown-linux-gnu/sys-include -g -O2 -O2 -g >>>> -O2 -DIN_GCC -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual >>>> -Wno-format -Wstrict-prototypes -Wmissing-prototypes >>>> -Wold-style-definition -isystem ./include -fPIC -mlong-double-128 >>>> -mno-minimal-toc -g -DIN_LIBGCC2 -fbuilding-libgcc >>>> -fno-stack-protector -fPIC -mlong-double-128 -mno-minimal-toc -I. >>>> -I. -I../.././gcc -I../../../trunk/libgcc -I../../../trunk/libgcc/. 
>>>> -I../../../trunk/libgcc/../gcc -I../../../trunk/libgcc/../include >>>> -I../../../trunk/libgcc/../libdecnumber/dpd >>>> -I../../../trunk/libgcc/../libdecnumber -DHAVE_CC_TLS -o _divdi3.o >>>> -MT _divdi3.o -MD -MP -MF _divdi3.dep -DL_divdi3 -c >>>> ../../../trunk/libgcc/libgcc2.c \ >>>> -fexceptions -fnon-call-exceptions -fvisibility=hidden >>>> -DHIDE_EXPORTS >>>> In file included from ../../../trunk/libgcc/libgcc2.c:56:0: >>>> ../../../trunk/libgcc/libgcc2.c: In function ‘__divti3’: >>>> ../../../trunk/libgcc/libgcc2.h:193:20: internal compiler error: in >>>> expand_debug_locations, at cfgexpand.c:5277 >>>> > > With the attached patch, now I am running into Bootstrap comparison > failure. I am looking into it. Please review this version so that I can > address them while fixing this issue. I notice diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c index 82fd4a1..80fcf70 100644 --- a/gcc/tree-ssanames.c +++ b/gcc/tree-ssanames.c @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type, unsigned int precision = TYPE_PRECISION (TREE_TYPE (name)); /* Allocate if not available. */ - if (ri == NULL) + if (ri == NULL + || (precision != ri->get_min ().get_precision ())) and I think you need to clear range info on promoted SSA vars in the promotion pass. The basic "structure" thing still remains. You walk over all uses and defs in all stmts in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all uses and defs which in turn promotes (the "def") and then fixes up all uses in all stmts. Instead of this you should, in promote_all_stmts, walk over all uses doing what fixup_uses does and then walk over all defs, doing what promote_ssa does. + case GIMPLE_NOP: + { + if (SSA_NAME_VAR (def) == NULL) + { + /* Promote def by fixing its type for anonymous def. */ + TREE_TYPE (def) = promoted_type; + } + else + { + /* Create a promoted copy of parameters. 
*/ + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); I think the uninitialized vars are somewhat tricky and it would be best to create a new uninit anonymous SSA name for them. You can have SSA_NAME_VAR != NULL and def _not_ being a parameter btw. +/* Return true if it is safe to promote the defined SSA_NAME in the STMT + itself. */ +static bool +safe_to_promote_def_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + if (gimple_vuse (stmt) != NULL_TREE + || gimple_vdef (stmt) != NULL_TREE + || code == ARRAY_REF + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == VIEW_CONVERT_EXPR + || code == BIT_FIELD_REF + || code == REALPART_EXPR + || code == IMAGPART_EXPR + || code == REDUC_MAX_EXPR + || code == REDUC_PLUS_EXPR + || code == REDUC_MIN_EXPR) + return false; + return true; huh, I think this function has an odd name, maybe can_promote_operation ()? Please use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees. Note that as followup things like the rotates should be "expanded" like we'd do on RTL (open-coding the thing). And we'd need a way to specify zero-/sign-extended loads. +/* Return true if it is safe to promote the use in the STMT. */ +static bool +safe_to_promote_use_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + + if (gimple_vuse (stmt) != NULL_TREE + || gimple_vdef (stmt) != NULL_TREE I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say _2 = a[i_3]; + || code == VIEW_CONVERT_EXPR + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == CONSTRUCTOR + || code == BIT_FIELD_REF + || code == COMPLEX_EXPR + || code == ASM_EXPR + || VECTOR_TYPE_P (TREE_TYPE (lhs))) + return false; + return true; ASM_EXPR can never appear here. I think PROMOTE_MODE never promotes vector types - what cases did you need to add VECTOR_TYPE_P for? +/* Return true if the SSA_NAME has to be truncated to preserve the + semantics. 
*/ +static bool +truncate_use_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); I think the description can be improved. This is about stray bits set beyond the original type, correct? Please use NOP_EXPR wherever you use CONVERT_EXPR right how. + if (TREE_CODE_CLASS (code) + == tcc_comparison) + promote_cst_in_stmt (stmt, promoted_type, true); don't you always need to promote constant operands? Richard. ^ permalink raw reply [flat|nested] 28+ messages in thread
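To make the "stray bits set beyond the original type" point in the truncate_use_p discussion concrete: once a narrow operation is carried out in a promoted register, the bits above the original precision can hold values a genuine narrow operation could never produce, so uses that observe the full register value must first be truncated back. A minimal C sketch — the function names are illustrative and not from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit unsigned arithmetic carried out in a 32-bit register: the raw
   promoted result may set bits 16..31 that a genuine uint16_t operation
   would never have.  */
static uint32_t
add_in_promoted_reg (uint32_t a, uint32_t b)
{
  return a + b;                 /* may leave stray high bits */
}

/* The ZERO_EXTEND the pass has to insert before such uses: drop
   everything beyond the original precision.  */
static uint32_t
fixup_u16 (uint32_t x)
{
  return x & 0xffffu;
}
```

Here `add_in_promoted_reg (0xffff, 1)` leaves bit 16 set; only after `fixup_u16` does the value match the wrap-around a real uint16_t addition produces.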
* Re: [0/7] Type promotion pass and elimination of zext/sext 2015-11-03 14:40 ` Richard Biener @ 2015-11-08 9:43 ` Kugan 2015-11-10 14:13 ` Richard Biener 0 siblings, 1 reply; 28+ messages in thread From: Kugan @ 2015-11-08 9:43 UTC (permalink / raw) To: Richard Biener; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 5762 bytes --] Thanks Richard for the comments. Please find the attached patches which now passes bootstrap with x86_64-none-linux-gnu, aarch64-linux-gnu and ppc64-linux-gnu. Regression testing is ongoing. Please find the comments for your questions/suggestions below. > > I notice > > diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c > index 82fd4a1..80fcf70 100644 > --- a/gcc/tree-ssanames.c > +++ b/gcc/tree-ssanames.c > @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type, > unsigned int precision = TYPE_PRECISION (TREE_TYPE (name)); > > /* Allocate if not available. */ > - if (ri == NULL) > + if (ri == NULL > + || (precision != ri->get_min ().get_precision ())) > > and I think you need to clear range info on promoted SSA vars in the > promotion pass. Done. > > The basic "structure" thing still remains. You walk over all uses and > defs in all stmts > in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all > uses and defs which in turn promotes (the "def") and then fixes up all > uses in all stmts. Done. > > Instead of this you should, in promote_all_stmts, walk over all uses doing what > fixup_uses does and then walk over all defs, doing what promote_ssa does. > > + case GIMPLE_NOP: > + { > + if (SSA_NAME_VAR (def) == NULL) > + { > + /* Promote def by fixing its type for anonymous def. */ > + TREE_TYPE (def) = promoted_type; > + } > + else > + { > + /* Create a promoted copy of parameters. */ > + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); > > I think the uninitialized vars are somewhat tricky and it would be best > to create a new uninit anonymous SSA name for them. 
You can > have SSA_NAME_VAR != NULL and def _not_ being a parameter > btw. Done. I also had to do some changes to in couple of other places to reflect this. They are: --- a/gcc/tree-ssa-reassoc.c +++ b/gcc/tree-ssa-reassoc.c @@ -302,6 +302,7 @@ phi_rank (gimple *stmt) { tree arg = gimple_phi_arg_def (stmt, i); if (TREE_CODE (arg) == SSA_NAME + && SSA_NAME_VAR (arg) && !SSA_NAME_IS_DEFAULT_DEF (arg)) { gimple *def_stmt = SSA_NAME_DEF_STMT (arg); @@ -434,7 +435,8 @@ get_rank (tree e) if (gimple_code (stmt) == GIMPLE_PHI) return phi_rank (stmt); - if (!is_gimple_assign (stmt)) + if (!is_gimple_assign (stmt) + && !gimple_nop_p (stmt)) return bb_rank[gimple_bb (stmt)->index]; and --- a/gcc/tree-ssa.c +++ b/gcc/tree-ssa.c @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb, use_operand_p use_p, TREE_VISITED (ssa_name) = 1; if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name)) - && SSA_NAME_IS_DEFAULT_DEF (ssa_name)) + && (SSA_NAME_IS_DEFAULT_DEF (ssa_name) + || SSA_NAME_VAR (ssa_name) == NULL)) ; /* Default definitions have empty statements. Nothing to do. */ else if (!def_bb) { Does this look OK? > > +/* Return true if it is safe to promote the defined SSA_NAME in the STMT > + itself. */ > +static bool > +safe_to_promote_def_p (gimple *stmt) > +{ > + enum tree_code code = gimple_assign_rhs_code (stmt); > + if (gimple_vuse (stmt) != NULL_TREE > + || gimple_vdef (stmt) != NULL_TREE > + || code == ARRAY_REF > + || code == LROTATE_EXPR > + || code == RROTATE_EXPR > + || code == VIEW_CONVERT_EXPR > + || code == BIT_FIELD_REF > + || code == REALPART_EXPR > + || code == IMAGPART_EXPR > + || code == REDUC_MAX_EXPR > + || code == REDUC_PLUS_EXPR > + || code == REDUC_MIN_EXPR) > + return false; > + return true; > > huh, I think this function has an odd name, maybe > can_promote_operation ()? Please > use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees. Done. 
> > Note that as followup things like the rotates should be "expanded" like > we'd do on RTL (open-coding the thing). And we'd need a way to > specify zero-/sign-extended loads. > > +/* Return true if it is safe to promote the use in the STMT. */ > +static bool > +safe_to_promote_use_p (gimple *stmt) > +{ > + enum tree_code code = gimple_assign_rhs_code (stmt); > + tree lhs = gimple_assign_lhs (stmt); > + > + if (gimple_vuse (stmt) != NULL_TREE > + || gimple_vdef (stmt) != NULL_TREE > > I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say > _2 = a[i_3]; > When I remove this, I see errors in stmts like: unsigned char unsigned int # .MEM_197 = VDEF <.MEM_187> fs_9(D)->fde_encoding = _154; > + || code == VIEW_CONVERT_EXPR > + || code == LROTATE_EXPR > + || code == RROTATE_EXPR > + || code == CONSTRUCTOR > + || code == BIT_FIELD_REF > + || code == COMPLEX_EXPR > + || code == ASM_EXPR > + || VECTOR_TYPE_P (TREE_TYPE (lhs))) > + return false; > + return true; > > ASM_EXPR can never appear here. I think PROMOTE_MODE never > promotes vector types - what cases did you need to add VECTOR_TYPE_P for? Done > > +/* Return true if the SSA_NAME has to be truncated to preserve the > + semantics. */ > +static bool > +truncate_use_p (gimple *stmt) > +{ > + enum tree_code code = gimple_assign_rhs_code (stmt); > > I think the description can be improved. This is about stray bits set > beyond the original type, correct? > > Please use NOP_EXPR wherever you use CONVERT_EXPR right how. > > + if (TREE_CODE_CLASS (code) > + == tcc_comparison) > + promote_cst_in_stmt (stmt, promoted_type, true); > > don't you always need to promote constant operands? I am promoting all the constants. Here, I am promoting the the constants that are part of the conditions. 
Thanks, Kugan [-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --] [-- Type: text/x-diff, Size: 3519 bytes --] From a25f711713778cd3ed3d0976cc3f37d541479afb Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:53:56 +1100 Subject: [PATCH 3/4] Optimize ZEXT_EXPR with tree-vrp --- gcc/match.pd | 6 ++++++ gcc/tree-vrp.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 65 insertions(+) diff --git a/gcc/match.pd b/gcc/match.pd index 0a9598e..1b152f1 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3. If not see (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) (op @0 (ext @1 @2))))) +(simplify + (sext (sext@2 @0 @1) @3) + (if (tree_int_cst_compare (@1, @3) <= 0) + @2 + (sext @0 @3))) + diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index fe34ffd..671a388 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr, && code != LSHIFT_EXPR && code != MIN_EXPR && code != MAX_EXPR + && code != SEXT_EXPR && code != BIT_AND_EXPR && code != BIT_IOR_EXPR && code != BIT_XOR_EXPR) @@ -2801,6 +2802,52 @@ extract_range_from_binary_expr_1 (value_range *vr, extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1); return; } + else if (code == SEXT_EXPR) + { + gcc_assert (range_int_cst_p (&vr1)); + HOST_WIDE_INT prec = tree_to_uhwi (vr1.min); + type = vr0.type; + wide_int tmin, tmax; + wide_int may_be_nonzero, must_be_nonzero; + + wide_int type_min = wi::min_value (prec, SIGNED); + wide_int type_max = wi::max_value (prec, SIGNED); + type_min = wide_int_to_tree (expr_type, type_min); + type_max = wide_int_to_tree (expr_type, type_max); + wide_int sign_bit + = wi::set_bit_in_zero (prec - 1, + TYPE_PRECISION (TREE_TYPE (vr0.min))); + if (zero_nonzero_bits_from_vr (expr_type, &vr0, + &may_be_nonzero, + &must_be_nonzero)) + { + if (wi::bit_and (must_be_nonzero, sign_bit) == 
sign_bit) + { + /* If to-be-extended sign bit is one. */ + tmin = type_min; + tmax = wi::zext (may_be_nonzero, prec); + } + else if (wi::bit_and (may_be_nonzero, sign_bit) + != sign_bit) + { + /* If to-be-extended sign bit is zero. */ + tmin = wi::zext (must_be_nonzero, prec); + tmax = wi::zext (may_be_nonzero, prec); + } + else + { + tmin = type_min; + tmax = type_max; + } + } + else + { + tmin = type_min; + tmax = type_max; + } + min = wide_int_to_tree (expr_type, tmin); + max = wide_int_to_tree (expr_type, tmax); + } else if (code == RSHIFT_EXPR || code == LSHIFT_EXPR) { @@ -9166,6 +9213,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) break; } break; + case SEXT_EXPR: + { + unsigned int prec = tree_to_uhwi (op1); + wide_int min = vr0.min; + wide_int max = vr0.max; + wide_int sext_min = wi::sext (min, prec); + wide_int sext_max = wi::sext (max, prec); + if (min == sext_min && max == sext_max) + op = op0; + } + break; default: gcc_unreachable (); } @@ -9868,6 +9926,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi) case BIT_AND_EXPR: case BIT_IOR_EXPR: + case SEXT_EXPR: /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR if all the bits being cleared are already cleared or all the bits being set are already set. 
*/ -- 1.9.1 [-- Attachment #3: 0002-Add-type-promotion-pass.patch --] [-- Type: text/x-diff, Size: 37083 bytes --] From f1b226443b63eda75f38f204a0befa5578e6df0f Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:52:37 +1100 Subject: [PATCH 2/4] Add type promotion pass --- gcc/Makefile.in | 1 + gcc/auto-profile.c | 2 +- gcc/common.opt | 4 + gcc/doc/invoke.texi | 10 + gcc/gimple-ssa-type-promote.c | 1026 +++++++++++++++++++++++++++++++++++++++++ gcc/passes.def | 1 + gcc/timevar.def | 1 + gcc/tree-pass.h | 1 + gcc/tree-ssa-reassoc.c | 4 +- gcc/tree-ssa-uninit.c | 23 +- gcc/tree-ssa.c | 3 +- libiberty/cp-demangle.c | 2 +- 12 files changed, 1064 insertions(+), 14 deletions(-) create mode 100644 gcc/gimple-ssa-type-promote.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index b91b8dc..c6aed45 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1499,6 +1499,7 @@ OBJS = \ tree-vect-slp.o \ tree-vectorizer.o \ tree-vrp.o \ + gimple-ssa-type-promote.o \ tree.o \ valtrack.o \ value-prof.o \ diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c index 25202c5..d32c3b6 100644 --- a/gcc/auto-profile.c +++ b/gcc/auto-profile.c @@ -1266,7 +1266,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge) FOR_EACH_EDGE (e, ei, bb->succs) { unsigned i, total = 0; - edge only_one; + edge only_one = NULL; bool check_value_one = (((integer_onep (cmp_rhs)) ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR)) ^ ((e->flags & EDGE_TRUE_VALUE) != 0)); diff --git a/gcc/common.opt b/gcc/common.opt index 12ca0d6..f450428 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2404,6 +2404,10 @@ ftree-vrp Common Report Var(flag_tree_vrp) Init(0) Optimization Perform Value Range Propagation on trees. +ftree-type-promote +Common Report Var(flag_tree_type_promote) Init(1) Optimization +Perform Type Promotion on trees + funit-at-a-time Common Report Var(flag_unit_at_a_time) Init(1) Compile whole compilation unit at a time. 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index cd82544..bc059a0 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher. Null pointer check elimination is only done if @option{-fdelete-null-pointer-checks} is enabled. +@item -ftree-type-promote +@opindex ftree-type-promote +This pass applies type promotion to SSA names in the function and +inserts appropriate truncations to preserve the semantics. The idea of +this pass is to promote operations in such a way that we can minimise +the generation of subregs in RTL, which in turn results in the removal +of redundant zero/sign extensions. + +This optimization is enabled by default. + @item -fsplit-ivs-in-unroller @opindex fsplit-ivs-in-unroller Enables expression of values of induction variables in later iterations diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c new file mode 100644 index 0000000..1d24566 --- /dev/null +++ b/gcc/gimple-ssa-type-promote.c @@ -0,0 +1,1026 @@ +/* Type promotion of SSA names to minimise redundant zero/sign extension. + Copyright (C) 2015 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. 
*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "hash-set.h" +#include "machmode.h" +#include "vec.h" +#include "double-int.h" +#include "input.h" +#include "symtab.h" +#include "wide-int.h" +#include "inchash.h" +#include "tree.h" +#include "fold-const.h" +#include "stor-layout.h" +#include "predict.h" +#include "function.h" +#include "dominance.h" +#include "cfg.h" +#include "basic-block.h" +#include "tree-ssa-alias.h" +#include "gimple-fold.h" +#include "tree-eh.h" +#include "gimple-expr.h" +#include "is-a.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-ssa.h" +#include "tree-phinodes.h" +#include "ssa-iterators.h" +#include "stringpool.h" +#include "tree-ssanames.h" +#include "tree-pass.h" +#include "gimple-pretty-print.h" +#include "langhooks.h" +#include "sbitmap.h" +#include "domwalk.h" +#include "tree-dfa.h" + +/* This pass applies type promotion to SSA names in the function and + inserts appropriate truncations. The idea of this pass is to promote + operations in such a way that we can minimise the generation of subregs + in RTL, which in turn results in the removal of redundant zero/sign + extensions. This pass will run prior to VRP and DOM so that they will be + able to optimise redundant truncations and extensions. This is based on + the discussion from + https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html. + +*/ + +struct ssa_name_info +{ + tree ssa; + tree type; + tree promoted_type; +}; + +/* Obstack for ssa_name_info. */ +static struct obstack ssa_name_info_obstack; + +static unsigned n_ssa_val; +static sbitmap ssa_to_be_promoted_bitmap; +static sbitmap ssa_sets_higher_bits_bitmap; +static hash_map <tree, ssa_name_info *> *ssa_name_info_map; + +static bool +type_precision_ok (tree type) +{ + return (TYPE_PRECISION (type) + == GET_MODE_PRECISION (TYPE_MODE (type))); +} + +/* Return the promoted type for TYPE. 
*/ +static tree +get_promoted_type (tree type) +{ + tree promoted_type; + enum machine_mode mode; + int uns; + + if (POINTER_TYPE_P (type) + || !INTEGRAL_TYPE_P (type) + || !type_precision_ok (type)) + return type; + + mode = TYPE_MODE (type); +#ifdef PROMOTE_MODE + uns = TYPE_SIGN (type); + PROMOTE_MODE (mode, uns, type); +#endif + uns = TYPE_SIGN (type); + if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode)) + return type; + promoted_type + = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), + uns); + gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode)); + return promoted_type; +} + +/* Return true if ssa NAME is already considered for promotion. */ +static bool +ssa_promoted_p (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + return bitmap_bit_p (ssa_to_be_promoted_bitmap, index); + } + return true; +} + +/* Set ssa NAME to be already considered for promotion. */ +static void +set_ssa_promoted (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + bitmap_set_bit (ssa_to_be_promoted_bitmap, index); + } +} + +/* Set ssa NAME will have higher bits if promoted. */ +static void +set_ssa_overflows (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + bitmap_set_bit (ssa_sets_higher_bits_bitmap, index); + } +} + + +/* Return true if ssa NAME will have higher bits if promoted. 
*/ +static bool +ssa_overflows_p (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + gimple *def_stmt = SSA_NAME_DEF_STMT (name); + + if (gimple_code (def_stmt) == GIMPLE_NOP + && SSA_NAME_VAR (name) + && TREE_CODE (SSA_NAME_VAR (name)) != PARM_DECL) + return true; + if (index < n_ssa_val) + return bitmap_bit_p (ssa_sets_higher_bits_bitmap, index); + } + return true; +} + +/* Visit PHI stmt and record if variables might have higher bits set if + promoted. */ +static bool +record_visit_phi_node (gimple *stmt) +{ + tree def; + ssa_op_iter i; + use_operand_p op; + bool high_bits_set = false; + gphi *phi = as_a <gphi *> (stmt); + tree lhs = PHI_RESULT (phi); + + if (TREE_CODE (lhs) != SSA_NAME + || POINTER_TYPE_P (TREE_TYPE (lhs)) + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + || ssa_overflows_p (lhs)) + return false; + + FOR_EACH_PHI_ARG (op, phi, i, SSA_OP_USE) + { + def = USE_FROM_PTR (op); + if (ssa_overflows_p (def)) + high_bits_set = true; + } + + if (high_bits_set) + { + set_ssa_overflows (lhs); + return true; + } + else + return false; +} + +/* Visit STMT and record if variables might have higher bits set if + promoted. 
*/ +static bool +record_visit_stmt (gimple *stmt) +{ + bool changed = false; + gcc_assert (gimple_code (stmt) == GIMPLE_ASSIGN); + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + tree rhs1 = gimple_assign_rhs1 (stmt); + + if (TREE_CODE (lhs) != SSA_NAME + || POINTER_TYPE_P (TREE_TYPE (lhs)) + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))) + return false; + + switch (code) + { + case SSA_NAME: + if (!ssa_overflows_p (lhs) + && ssa_overflows_p (rhs1)) + { + set_ssa_overflows (lhs); + changed = true; + } + break; + + default: + if (!ssa_overflows_p (lhs)) + { + set_ssa_overflows (lhs); + changed = true; + } + break; + } + return changed; +} + +static void +process_all_stmts_for_unsafe_promotion () +{ + basic_block bb; + gimple_stmt_iterator gsi; + auto_vec<gimple *> work_list; + + FOR_EACH_BB_FN (bb, cfun) + { + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *phi = gsi_stmt (gsi); + work_list.safe_push (phi); + } + + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + if (gimple_code (stmt) == GIMPLE_ASSIGN) + work_list.safe_push (stmt); + } + } + + while (work_list.length () > 0) + { + bool changed; + gimple *stmt = work_list.pop (); + tree lhs; + + switch (gimple_code (stmt)) + { + + case GIMPLE_ASSIGN: + changed = record_visit_stmt (stmt); + lhs = gimple_assign_lhs (stmt); + break; + + case GIMPLE_PHI: + changed = record_visit_phi_node (stmt); + lhs = PHI_RESULT (stmt); + break; + + default: + gcc_assert (false); + break; + } + + if (changed) + { + gimple *use_stmt; + imm_use_iterator ui; + + FOR_EACH_IMM_USE_STMT (use_stmt, ui, lhs) + { + if (gimple_code (use_stmt) == GIMPLE_ASSIGN + || gimple_code (use_stmt) == GIMPLE_PHI) + work_list.safe_push (use_stmt); + } + } + } +} + +/* Return true if it is safe to promote the defined SSA_NAME in the STMT + itself. 
*/ +static bool +can_promote_operation_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + if (gimple_vuse (stmt) != NULL_TREE + || gimple_vdef (stmt) != NULL_TREE + || TREE_CODE_CLASS (code) == tcc_reference + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == VIEW_CONVERT_EXPR + || code == REALPART_EXPR + || code == IMAGPART_EXPR + || code == REDUC_MAX_EXPR + || code == REDUC_PLUS_EXPR + || code == REDUC_MIN_EXPR) + return false; + return true; +} + +/* Return true if it is safe to promote the use in the STMT. */ +static bool +safe_to_promote_use_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + + if (gimple_vuse (stmt) != NULL_TREE + || gimple_vdef (stmt) != NULL_TREE + || code == VIEW_CONVERT_EXPR + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == CONSTRUCTOR + || code == BIT_FIELD_REF + || code == COMPLEX_EXPR + || VECTOR_TYPE_P (TREE_TYPE (lhs))) + return false; + return true; +} + +/* Return true if the SSA_NAME has to be truncated when (stray bits are set + beyond the original type in promoted mode) to preserve the semantics. */ +static bool +truncate_use_p (gimple *stmt) +{ + enum tree_code code = gimple_assign_rhs_code (stmt); + if (TREE_CODE_CLASS (code) == tcc_comparison + || code == TRUNC_DIV_EXPR + || code == CEIL_DIV_EXPR + || code == FLOOR_DIV_EXPR + || code == ROUND_DIV_EXPR + || code == TRUNC_MOD_EXPR + || code == CEIL_MOD_EXPR + || code == FLOOR_MOD_EXPR + || code == ROUND_MOD_EXPR + || code == LSHIFT_EXPR + || code == RSHIFT_EXPR) + return true; + return false; +} + +/* Return true if LHS will be promoted later. 
*/ +static bool +tobe_promoted_p (tree lhs) +{ + if (TREE_CODE (lhs) == SSA_NAME + && !POINTER_TYPE_P (TREE_TYPE (lhs)) + && INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + && !VECTOR_TYPE_P (TREE_TYPE (lhs)) + && !ssa_promoted_p (lhs) + && (get_promoted_type (TREE_TYPE (lhs)) + != TREE_TYPE (lhs))) + return true; + else + return false; +} + +/* Convert constant CST to TYPE. */ +static tree +convert_int_cst (tree type, tree cst, signop sign = SIGNED) +{ + wide_int wi_cons = fold_convert (type, cst); + wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign); + return wide_int_to_tree (type, wi_cons); +} + +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, + promote only the constants in conditions part of the COND_EXPR. */ +static void +promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false) +{ + tree op; + ssa_op_iter iter; + use_operand_p oprnd; + int index; + tree op0, op1; + signop sign = SIGNED; + + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + if (promote_cond + && gimple_assign_rhs_code (stmt) == COND_EXPR) + { + /* Promote INTEGER_CST that are tcc_compare arguments. */ + sign = TYPE_SIGN (type); + op = gimple_assign_rhs1 (stmt); + op0 = TREE_OPERAND (op, 0); + op1 = TREE_OPERAND (op, 1); + if (TREE_CODE (op0) == INTEGER_CST) + op0 = convert_int_cst (type, op0, sign); + if (TREE_CODE (op1) == INTEGER_CST) + op1 = convert_int_cst (type, op1, sign); + tree new_op = build2 (TREE_CODE (op), type, op0, op1); + gimple_assign_set_rhs1 (stmt, new_op); + } + else + { + /* Promote INTEGER_CST in GIMPLE_ASSIGN. 
*/ + op = gimple_assign_rhs3 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign)); + if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) + == tcc_comparison) + sign = TYPE_SIGN (type); + op = gimple_assign_rhs1 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign)); + op = gimple_assign_rhs2 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign)); + } + break; + + case GIMPLE_PHI: + { + /* Promote INTEGER_CST arguments to GIMPLE_PHI. */ + gphi *phi = as_a <gphi *> (stmt); + FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE) + { + op = USE_FROM_PTR (oprnd); + index = PHI_ARG_INDEX_FROM_USE (oprnd); + if (TREE_CODE (op) == INTEGER_CST) + SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign)); + } + } + break; + + case GIMPLE_COND: + { + /* Promote INTEGER_CST that are GIMPLE_COND arguments. */ + gcond *cond = as_a <gcond *> (stmt); + op = gimple_cond_lhs (cond); + sign = TYPE_SIGN (type); + + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign)); + op = gimple_cond_rhs (cond); + + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign)); + } + break; + + default: + gcc_unreachable (); + } +} + +/* Create an ssa with TYPE to copy ssa VAR. */ +static tree +make_promoted_copy (tree var, gimple *def_stmt, tree type) +{ + tree new_lhs = make_ssa_name (type, def_stmt); + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var)) + SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1; + return new_lhs; +} + +/* Zero/sign extend (depending on type) VAR and truncate to WIDTH bits. + Assign the zero/sign extended value in NEW_VAR. gimple statement + that performs the zero/sign extension is returned. 
*/ +static gimple * +zero_sign_extend_stmt (tree new_var, tree var, int width) +{ + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) + == TYPE_PRECISION (TREE_TYPE (new_var))); + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width); + gimple *stmt; + + if (TYPE_UNSIGNED (TREE_TYPE (new_var))) + { + /* Zero extend. */ + tree cst + = wide_int_to_tree (TREE_TYPE (var), + wi::mask (width, false, + TYPE_PRECISION (TREE_TYPE (var)))); + stmt = gimple_build_assign (new_var, BIT_AND_EXPR, + var, cst); + } + else + /* Sign extend. */ + stmt = gimple_build_assign (new_var, + SEXT_EXPR, + var, build_int_cst (TREE_TYPE (var), width)); + return stmt; +} + + +void duplicate_default_ssa (tree to, tree from) +{ + SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from)); + SSA_NAME_IS_DEFAULT_DEF (to) = SSA_NAME_IS_DEFAULT_DEF (from); + SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from); + SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE); + SSA_NAME_IS_DEFAULT_DEF (to) = 1; + SSA_NAME_IS_DEFAULT_DEF (from) = 0; +} + +/* Promote definition DEF to PROMOTED_TYPE. If the stmt that defines def + is def_stmt, make the type of def promoted_type. If the stmt is such + that, result of the def_stmt cannot be of promoted_type, create a new_def + of the original_type and make the def_stmt assign its value to newdef. + Then, create a CONVERT_EXPR to convert new_def to def of promoted type. + + For example, for stmt with original_type char and promoted_type int: + char _1 = mem; + becomes: + char _2 = mem; + int _1 = (int)_2; + + If the def_stmt allows def to be promoted, promote def in-place + (and its arguments when needed). + + For example: + char _3 = _1 + _2; + becomes: + int _3 = _1 + _2; + Here, _1 and _2 will also be promoted. 
*/ +static void +promote_ssa (tree def, gimple_stmt_iterator *gsi) +{ + gimple *def_stmt = SSA_NAME_DEF_STMT (def); + gimple *copy_stmt = NULL; + basic_block bb; + gimple_stmt_iterator gsi2; + tree original_type = TREE_TYPE (def); + tree new_def; + bool do_not_promote = false; + if (!tobe_promoted_p (def)) + return; + tree promoted_type = get_promoted_type (TREE_TYPE (def)); + ssa_name_info *info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack, + sizeof (ssa_name_info)); + info->type = original_type; + info->promoted_type = promoted_type; + info->ssa = def; + ssa_name_info_map->put (def, info); + + switch (gimple_code (def_stmt)) + { + case GIMPLE_PHI: + { + /* Promote def by fixing its type and make def anonymous. */ + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + break; + } + + case GIMPLE_ASM: + { + gasm *asm_stmt = as_a <gasm *> (def_stmt); + for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i) + { + /* Promote def and copy (i.e. convert) the value defined + by asm to def. */ + tree link = gimple_asm_output_op (asm_stmt, i); + tree op = TREE_VALUE (link); + if (op == def) + { + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + duplicate_default_ssa (new_def, def); + TREE_VALUE (link) = new_def; + gimple_asm_set_output_op (asm_stmt, i, link); + + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, NOP_EXPR, + new_def, NULL_TREE); + SSA_NAME_IS_DEFAULT_DEF (new_def) = 0; + gsi2 = gsi_for_stmt (def_stmt); + gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT); + break; + } + } + break; + } + + case GIMPLE_NOP: + { + if (SSA_NAME_VAR (def) == NULL + || TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL) + { + /* Promote def by fixing its type for anonymous def. 
*/ + if (SSA_NAME_VAR (def)) + { + set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE); + SSA_NAME_IS_DEFAULT_DEF (def) = 0; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + } + TREE_TYPE (def) = promoted_type; + } + else + { + /* Create a promoted copy of parameters. */ + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + gcc_assert (bb); + gsi2 = gsi_after_labels (bb); + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def); + duplicate_default_ssa (new_def, def); + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, NOP_EXPR, + new_def, NULL_TREE); + SSA_NAME_DEF_STMT (def) = copy_stmt; + gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT); + } + break; + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (def_stmt); + if (!can_promote_operation_p (def_stmt)) + { + do_not_promote = true; + } + else if (CONVERT_EXPR_CODE_P (code)) + { + tree rhs = gimple_assign_rhs1 (def_stmt); + if (!type_precision_ok (TREE_TYPE (rhs))) + { + do_not_promote = true; + } + else if (types_compatible_p (TREE_TYPE (rhs), promoted_type)) + { + /* As we travel statements in dominated order, arguments + of def_stmt will be visited before visiting def. If RHS + is already promoted and type is compatible, we can convert + them into ZERO/SIGN EXTEND stmt. 
*/ + ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs); + tree type; + if (info == NULL) + type = TREE_TYPE (rhs); + else + type = info->type; + if ((TYPE_PRECISION (original_type) + > TYPE_PRECISION (type)) + || (TYPE_UNSIGNED (original_type) + != TYPE_UNSIGNED (type))) + { + if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type)) + type = original_type; + gcc_assert (type != NULL_TREE); + TREE_TYPE (def) = promoted_type; + gimple *copy_stmt = + zero_sign_extend_stmt (def, rhs, + TYPE_PRECISION (type)); + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + gsi_replace (gsi, copy_stmt, false); + } + else + { + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + } + } + else + { + /* If RHS is not promoted OR their types are not + compatible, create CONVERT_EXPR that converts + RHS to promoted DEF type and perform a + ZERO/SIGN EXTEND to get the required value + from RHS. */ + tree s = (TYPE_PRECISION (TREE_TYPE (def)) + < TYPE_PRECISION (TREE_TYPE (rhs))) + ? TREE_TYPE (def) : TREE_TYPE (rhs); + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + TREE_TYPE (def) = promoted_type; + TREE_TYPE (new_def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE); + gimple_set_lhs (def_stmt, new_def); + gimple *copy_stmt = + zero_sign_extend_stmt (def, new_def, + TYPE_PRECISION (s)); + gsi2 = gsi_for_stmt (def_stmt); + if (lookup_stmt_eh_lp (def_stmt) > 0 + || (gimple_code (def_stmt) == GIMPLE_CALL + && gimple_call_ctrl_altering_p (def_stmt))) + gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)), + copy_stmt); + else + gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT); + } + } + else + { + /* Promote def by fixing its type and make def anonymous. 
*/ + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + TREE_TYPE (def) = promoted_type; + } + break; + } + + default: + do_not_promote = true; + break; + } + + if (do_not_promote) + { + /* Promote def and copy (i.e. convert) the value defined + by the stmt that cannot be promoted. */ + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + TREE_TYPE (def) = promoted_type; + gimple_set_lhs (def_stmt, new_def); + copy_stmt = gimple_build_assign (def, NOP_EXPR, + new_def, NULL_TREE); + gsi2 = gsi_for_stmt (def_stmt); + if (lookup_stmt_eh_lp (def_stmt) > 0 + || (gimple_code (def_stmt) == GIMPLE_CALL + && gimple_call_ctrl_altering_p (def_stmt))) + gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)), + copy_stmt); + else + gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT); + } + + SSA_NAME_RANGE_INFO (def) = NULL; +} + +/* Fix the (promoted) USE in stmts where USE cannot be promoted. */ +static unsigned int +fixup_uses (gimple *stmt, gimple_stmt_iterator *gsi, + use_operand_p op, tree use) +{ + ssa_name_info *info = ssa_name_info_map->get_or_insert (use); + if (!info) + return 0; + + tree promoted_type = info->promoted_type; + tree old_type = info->type; + bool do_not_promote = false; + + switch (gimple_code (stmt)) + { + case GIMPLE_DEBUG: + { + SET_USE (op, fold_convert (old_type, use)); + update_stmt (stmt); + } + break; + + case GIMPLE_ASM: + case GIMPLE_CALL: + case GIMPLE_RETURN: + { + /* USE cannot be promoted here. */ + do_not_promote = true; + break; + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + if (!safe_to_promote_use_p (stmt)) + { + do_not_promote = true; + } + else if (truncate_use_p (stmt) + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))) + { + /* Promote the constant in comparison when other comparison 
operand is promoted. All other constants are promoted as + part of promoting definition in promote_ssa. */ + if (TREE_CODE_CLASS (code) == tcc_comparison) + promote_cst_in_stmt (stmt, promoted_type, true); + if (!ssa_overflows_p (use)) + break; + /* In some stmts, value in USE has to be ZERO/SIGN + Extended based on the original type for correct + result. */ + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_PRECISION (old_type)); + gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT); + + SET_USE (op, temp); + update_stmt (stmt); + } + else if (CONVERT_EXPR_CODE_P (code)) + { + if (types_compatible_p (TREE_TYPE (lhs), promoted_type)) + { + /* Type of LHS and promoted RHS are compatible, we can + convert this into ZERO/SIGN EXTEND stmt. */ + gimple *copy_stmt = + zero_sign_extend_stmt (lhs, use, + TYPE_PRECISION (old_type)); + set_ssa_promoted (lhs); + gsi_replace (gsi, copy_stmt, false); + } + else if (tobe_promoted_p (lhs)); + else + { + do_not_promote = true; + } + } + break; + } + + case GIMPLE_COND: + if (ssa_overflows_p (use)) + { + /* In GIMPLE_COND, value in USE has to be ZERO/SIGN + Extended based on the original type for correct + result. */ + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_PRECISION (old_type)); + gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT); + SET_USE (op, temp); + } + promote_cst_in_stmt (stmt, promoted_type, true); + update_stmt (stmt); + break; + + default: + break; + } + + if (do_not_promote) + { + /* For stmts where USE cannot be promoted, create an + original type copy. 
*/ + tree temp; + temp = copy_ssa_name (use); + set_ssa_promoted (temp); + TREE_TYPE (temp) = old_type; + gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR, + use, NULL_TREE); + gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT); + SET_USE (op, temp); + update_stmt (stmt); + } + return 0; +} + + +/* Promote all the stmts in the basic block. */ +static void +promote_all_stmts (basic_block bb) +{ + gimple_stmt_iterator gsi; + ssa_op_iter iter; + tree def, use; + use_operand_p op; + + for (gphi_iterator gpi = gsi_start_phis (bb); + !gsi_end_p (gpi); gsi_next (&gpi)) + { + gphi *phi = gpi.phi (); + def = PHI_RESULT (phi); + promote_ssa (def, &gsi); + + FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE) + { + use = USE_FROM_PTR (op); + if (TREE_CODE (use) == SSA_NAME + && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) + promote_ssa (use, &gsi); + fixup_uses (phi, &gsi, op, use); + } + } + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + if (is_gimple_debug (stmt)) + continue; + + FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF) + promote_ssa (def, &gsi); + + FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE) + { + use = USE_FROM_PTR (op); + if (TREE_CODE (use) == SSA_NAME + && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) + promote_ssa (use, &gsi); + fixup_uses (stmt, &gsi, op, use); + } + } +} + +void promote_debug_stmts () +{ + basic_block bb; + gimple_stmt_iterator gsi; + ssa_op_iter iter; + tree use; + use_operand_p op; + + FOR_EACH_BB_FN (bb, cfun) + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + if (!is_gimple_debug (stmt)) + continue; + FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE) + { + use = USE_FROM_PTR (op); + fixup_uses (stmt, &gsi, op, use); + } + } +} + + +class type_promotion_dom_walker : public dom_walker +{ +public: + type_promotion_dom_walker (cdi_direction direction) + : dom_walker (direction) {} + virtual void 
before_dom_children (basic_block bb) + { + promote_all_stmts (bb); + } +}; + +/* Main entry point to the pass. */ +static unsigned int +execute_type_promotion (void) +{ + n_ssa_val = num_ssa_names; + ssa_name_info_map = new hash_map<tree, ssa_name_info *>; + ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val); + bitmap_clear (ssa_to_be_promoted_bitmap); + ssa_sets_higher_bits_bitmap = sbitmap_alloc (n_ssa_val); + bitmap_clear (ssa_sets_higher_bits_bitmap); + + /* Create the obstack where ssa_name_info will reside. */ + gcc_obstack_init (&ssa_name_info_obstack); + + calculate_dominance_info (CDI_DOMINATORS); + process_all_stmts_for_unsafe_promotion (); + /* Walk the CFG in dominator order. */ + type_promotion_dom_walker (CDI_DOMINATORS) + .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + promote_debug_stmts (); + gsi_commit_edge_inserts (); + + obstack_free (&ssa_name_info_obstack, NULL); + sbitmap_free (ssa_to_be_promoted_bitmap); + sbitmap_free (ssa_sets_higher_bits_bitmap); + delete ssa_name_info_map; + return 0; +} + +namespace { +const pass_data pass_data_type_promotion = +{ + GIMPLE_PASS, /* type */ + "promotion", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_TREE_TYPE_PROMOTE, /* tv_id */ + PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), +}; + +class pass_type_promotion : public gimple_opt_pass +{ +public: + pass_type_promotion (gcc::context *ctxt) + : gimple_opt_pass (pass_data_type_promotion, ctxt) + {} + + /* opt_pass methods: */ + opt_pass * clone () { return new pass_type_promotion (m_ctxt); } + virtual bool gate (function *) { return flag_tree_type_promote != 0; } + virtual unsigned int execute (function *) + { + return execute_type_promotion (); + } + +}; // class pass_type_promotion + +} // anon namespace + +gimple_opt_pass * +make_pass_type_promote (gcc::context *ctxt) +{ + return new pass_type_promotion (ctxt); +} 
+ diff --git a/gcc/passes.def b/gcc/passes.def index 36d2b3b..78c463a 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -272,6 +272,7 @@ along with GCC; see the file COPYING3. If not see POP_INSERT_PASSES () NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_type_promote); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_strength_reduction); diff --git a/gcc/timevar.def b/gcc/timevar.def index b429faf..a8d40c3 100644 --- a/gcc/timevar.def +++ b/gcc/timevar.def @@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION , "vtable verification") DEFTIMEVAR (TV_TREE_UBSAN , "tree ubsan") DEFTIMEVAR (TV_INITIALIZE_RTL , "initialize rtl") DEFTIMEVAR (TV_GIMPLE_LADDRESS , "address lowering") +DEFTIMEVAR (TV_TREE_TYPE_PROMOTE , "tree type promote") /* Everything else in rest_of_compilation not included above. */ DEFTIMEVAR (TV_EARLY_LOCAL , "early local passes") diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 333b5a7..449dd19 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt); extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt); extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt); extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt); diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c index 45b8d46..07845e3 100644 --- a/gcc/tree-ssa-reassoc.c +++ b/gcc/tree-ssa-reassoc.c @@ -302,6 +302,7 @@ phi_rank (gimple *stmt) { tree arg = gimple_phi_arg_def (stmt, i); if (TREE_CODE (arg) == SSA_NAME + && SSA_NAME_VAR (arg) && !SSA_NAME_IS_DEFAULT_DEF (arg)) { gimple *def_stmt = SSA_NAME_DEF_STMT (arg); @@ -434,7 +435,8 
@@ get_rank (tree e) if (gimple_code (stmt) == GIMPLE_PHI) return phi_rank (stmt); - if (!is_gimple_assign (stmt)) + if (!is_gimple_assign (stmt) + && !gimple_nop_p (stmt)) return bb_rank[gimple_bb (stmt)->index]; /* If we already have a rank for this expression, use that. */ diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c index 3f7dbcf..93422ac 100644 --- a/gcc/tree-ssa-uninit.c +++ b/gcc/tree-ssa-uninit.c @@ -201,16 +201,19 @@ warn_uninitialized_vars (bool warn_possibly_uninitialized) FOR_EACH_SSA_USE_OPERAND (use_p, stmt, op_iter, SSA_OP_USE) { use = USE_FROM_PTR (use_p); - if (always_executed) - warn_uninit (OPT_Wuninitialized, use, - SSA_NAME_VAR (use), SSA_NAME_VAR (use), - "%qD is used uninitialized in this function", - stmt, UNKNOWN_LOCATION); - else if (warn_possibly_uninitialized) - warn_uninit (OPT_Wmaybe_uninitialized, use, - SSA_NAME_VAR (use), SSA_NAME_VAR (use), - "%qD may be used uninitialized in this function", - stmt, UNKNOWN_LOCATION); + if (SSA_NAME_VAR (use)) + { + if (always_executed) + warn_uninit (OPT_Wuninitialized, use, + SSA_NAME_VAR (use), SSA_NAME_VAR (use), + "%qD is used uninitialized in this function", + stmt, UNKNOWN_LOCATION); + else if (warn_possibly_uninitialized) + warn_uninit (OPT_Wmaybe_uninitialized, use, + SSA_NAME_VAR (use), SSA_NAME_VAR (use), + "%qD may be used uninitialized in this function", + stmt, UNKNOWN_LOCATION); + } } /* For memory the only cheap thing we can do is see if we diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c index 4b869be..3e520fc 100644 --- a/gcc/tree-ssa.c +++ b/gcc/tree-ssa.c @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb, use_operand_p use_p, TREE_VISITED (ssa_name) = 1; if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name)) - && SSA_NAME_IS_DEFAULT_DEF (ssa_name)) + && (SSA_NAME_IS_DEFAULT_DEF (ssa_name) + || SSA_NAME_VAR (ssa_name) == NULL)) ; /* Default definitions have empty statements. Nothing to do. 
*/ else if (!def_bb) { diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c index ff608a3..6722331 100644 --- a/libiberty/cp-demangle.c +++ b/libiberty/cp-demangle.c @@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options, /* Variable used to store the current templates while a previously captured scope is used. */ - struct d_print_template *saved_templates; + struct d_print_template *saved_templates = NULL; /* Nonzero if templates have been stored in the above variable. */ int need_template_restore = 0; -- 1.9.1 [-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --] [-- Type: text/x-diff, Size: 5067 bytes --] From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:51:42 +1100 Subject: [PATCH 1/4] Add new SEXT_EXPR tree code --- gcc/cfgexpand.c | 12 ++++++++++++ gcc/expr.c | 20 ++++++++++++++++++++ gcc/fold-const.c | 4 ++++ gcc/tree-cfg.c | 12 ++++++++++++ gcc/tree-inline.c | 1 + gcc/tree-pretty-print.c | 11 +++++++++++ gcc/tree.def | 5 +++++ 7 files changed, 65 insertions(+) diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index eaad859..aeb64bb 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp) case FMA_EXPR: return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2); + case SEXT_EXPR: + gcc_assert (CONST_INT_P (op1)); + inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0); + gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1)); + + if (mode != inner_mode) + op0 = simplify_gen_unary (SIGN_EXTEND, + mode, + gen_lowpart_SUBREG (inner_mode, op0), + inner_mode); + return op0; + default: flag_unsupported: #ifdef ENABLE_CHECKING diff --git a/gcc/expr.c b/gcc/expr.c index da68870..c2f535f 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, target = expand_vec_cond_expr (type, treeop0, 
treeop1, treeop2, target); return target; + case SEXT_EXPR: + { + machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1), + MODE_INT, 0); + rtx temp, result; + rtx op0 = expand_normal (treeop0); + op0 = force_reg (mode, op0); + if (mode != inner_mode) + { + result = gen_reg_rtx (mode); + temp = simplify_gen_unary (SIGN_EXTEND, mode, + gen_lowpart_SUBREG (inner_mode, op0), + inner_mode); + convert_move (result, temp, 0); + } + else + result = op0; + return result; + } + default: gcc_unreachable (); } diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 602ea24..a149bad 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2, res = wi::bit_and (arg1, arg2); break; + case SEXT_EXPR: + res = wi::sext (arg1, arg2.to_uhwi ()); + break; + case RSHIFT_EXPR: case LSHIFT_EXPR: if (wi::neg_p (arg2)) diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c index 8e3e810..d18b3f7 100644 --- a/gcc/tree-cfg.c +++ b/gcc/tree-cfg.c @@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt) return false; } + case SEXT_EXPR: + { + if (!INTEGRAL_TYPE_P (lhs_type) + || !useless_type_conversion_p (lhs_type, rhs1_type) + || !tree_fits_uhwi_p (rhs2)) + { + error ("invalid operands in sext expr"); + return true; + } + return false; + } + case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: { diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index b8269ef..e61c200 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case BIT_XOR_EXPR: case BIT_AND_EXPR: case BIT_NOT_EXPR: + case SEXT_EXPR: case TRUTH_ANDIF_EXPR: case TRUTH_ORIF_EXPR: diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c index 11f90051..bec9082 100644 --- a/gcc/tree-pretty-print.c +++ b/gcc/tree-pretty-print.c @@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags, } break; + case 
SEXT_EXPR: + pp_string (pp, "SEXT_EXPR <"); + dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false); + pp_string (pp, ", "); + dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false); + pp_greater (pp); + break; + case MODIFY_EXPR: case INIT_EXPR: dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, @@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code) case MIN_EXPR: return "min"; + case SEXT_EXPR: + return "sext"; + default: return "<<< ??? >>>"; } diff --git a/gcc/tree.def b/gcc/tree.def index d0a3bd6..789cfdd 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2) DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2) DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1) +/* Sign-extend operation. It will sign extend first operand from + the sign bit specified by the second operand. The type of the + result is that of the first operand. */ +DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2) + /* ANDIF and ORIF allow the second operand not to be computed if the value of the expression is determined from the first operand. AND, OR, and XOR always compute the second operand whether its value is -- 1.9.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext 2015-11-08 9:43 ` Kugan @ 2015-11-10 14:13 ` Richard Biener 2015-11-12 6:08 ` Kugan 2015-11-14 1:15 ` Kugan 0 siblings, 2 replies; 28+ messages in thread From: Richard Biener @ 2015-11-10 14:13 UTC (permalink / raw) To: Kugan; +Cc: gcc-patches On Sun, Nov 8, 2015 at 10:43 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote: > > Thanks Richard for the comments. Please find the attached patches which > now passes bootstrap with x86_64-none-linux-gnu, aarch64-linux-gnu and > ppc64-linux-gnu. Regression testing is ongoing. Please find the comments > for your questions/suggestions below. > >> >> I notice >> >> diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c >> index 82fd4a1..80fcf70 100644 >> --- a/gcc/tree-ssanames.c >> +++ b/gcc/tree-ssanames.c >> @@ -207,7 +207,8 @@ set_range_info (tree name, enum value_range_type range_type, >> unsigned int precision = TYPE_PRECISION (TREE_TYPE (name)); >> >> /* Allocate if not available. */ >> - if (ri == NULL) >> + if (ri == NULL >> + || (precision != ri->get_min ().get_precision ())) >> >> and I think you need to clear range info on promoted SSA vars in the >> promotion pass. > > Done. > >> >> The basic "structure" thing still remains. You walk over all uses and >> defs in all stmts >> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all >> uses and defs which in turn promotes (the "def") and then fixes up all >> uses in all stmts. > > Done. Not exactly. I still see /* Promote all the stmts in the basic block. 
*/ static void promote_all_stmts (basic_block bb) { gimple_stmt_iterator gsi; ssa_op_iter iter; tree def, use; use_operand_p op; for (gphi_iterator gpi = gsi_start_phis (bb); !gsi_end_p (gpi); gsi_next (&gpi)) { gphi *phi = gpi.phi (); def = PHI_RESULT (phi); promote_ssa (def, &gsi); FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE) { use = USE_FROM_PTR (op); if (TREE_CODE (use) == SSA_NAME && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) promote_ssa (use, &gsi); fixup_uses (phi, &gsi, op, use); } you still call promote_ssa on both DEFs and USEs and promote_ssa looks at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just on DEFs and fixup_uses on USEs. Any reason you do not promote debug stmts during the DOM walk? So for each DEF you record in ssa_name_info struct ssa_name_info { tree ssa; tree type; tree promoted_type; }; (the fields need documenting). Add a tree promoted_def to it which you can replace any use of the DEF with. Currently as you call promote_ssa for DEFs and USEs you repeatedly overwrite the entry in ssa_name_info_map with a new copy. So you should assert it wasn't already there. switch (gimple_code (def_stmt)) { case GIMPLE_PHI: { the last { is indented too much it should be indented 2 spaces relative to the 'case' SSA_NAME_RANGE_INFO (def) = NULL; only needed in the case 'def' was promoted itself. Please use reset_flow_sensitive_info (def). >> >> Instead of this you should, in promote_all_stmts, walk over all uses doing what >> fixup_uses does and then walk over all defs, doing what promote_ssa does. >> >> + case GIMPLE_NOP: >> + { >> + if (SSA_NAME_VAR (def) == NULL) >> + { >> + /* Promote def by fixing its type for anonymous def. */ >> + TREE_TYPE (def) = promoted_type; >> + } >> + else >> + { >> + /* Create a promoted copy of parameters. */ >> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); >> >> I think the uninitialized vars are somewhat tricky and it would be best >> to create a new uninit anonymous SSA name for them. 
You can >> have SSA_NAME_VAR != NULL and def _not_ being a parameter >> btw. > > Done. I also had to do some changes to in couple of other places to > reflect this. > They are: > --- a/gcc/tree-ssa-reassoc.c > +++ b/gcc/tree-ssa-reassoc.c > @@ -302,6 +302,7 @@ phi_rank (gimple *stmt) > { > tree arg = gimple_phi_arg_def (stmt, i); > if (TREE_CODE (arg) == SSA_NAME > + && SSA_NAME_VAR (arg) > && !SSA_NAME_IS_DEFAULT_DEF (arg)) > { > gimple *def_stmt = SSA_NAME_DEF_STMT (arg); > @@ -434,7 +435,8 @@ get_rank (tree e) > if (gimple_code (stmt) == GIMPLE_PHI) > return phi_rank (stmt); > > - if (!is_gimple_assign (stmt)) > + if (!is_gimple_assign (stmt) > + && !gimple_nop_p (stmt)) > return bb_rank[gimple_bb (stmt)->index]; > > and > > --- a/gcc/tree-ssa.c > +++ b/gcc/tree-ssa.c > @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb, > use_operand_p use_p, > TREE_VISITED (ssa_name) = 1; > > if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name)) > - && SSA_NAME_IS_DEFAULT_DEF (ssa_name)) > + && (SSA_NAME_IS_DEFAULT_DEF (ssa_name) > + || SSA_NAME_VAR (ssa_name) == NULL)) > ; /* Default definitions have empty statements. Nothing to do. */ > else if (!def_bb) > { > > Does this look OK? Hmm, no, this looks bogus. I think the best thing to do is not promoting default defs at all and instead promote at the uses. /* Create a promoted copy of parameters. */ bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); gcc_assert (bb); gsi2 = gsi_after_labels (bb); new_def = copy_ssa_name (def); set_ssa_promoted (new_def); set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def); duplicate_default_ssa (new_def, def); TREE_TYPE (def) = promoted_type; AFAIK this is just an awkward way of replacing all uses by a new DEF, sth that should be supported by the machinery so that other default defs can just do new_def = get_or_create_default_def (create_tmp_reg (promoted_type)); and have all uses ('def') replaced by new_def. 
>> >> +/* Return true if it is safe to promote the defined SSA_NAME in the STMT >> + itself. */ >> +static bool >> +safe_to_promote_def_p (gimple *stmt) >> +{ >> + enum tree_code code = gimple_assign_rhs_code (stmt); >> + if (gimple_vuse (stmt) != NULL_TREE >> + || gimple_vdef (stmt) != NULL_TREE >> + || code == ARRAY_REF >> + || code == LROTATE_EXPR >> + || code == RROTATE_EXPR >> + || code == VIEW_CONVERT_EXPR >> + || code == BIT_FIELD_REF >> + || code == REALPART_EXPR >> + || code == IMAGPART_EXPR >> + || code == REDUC_MAX_EXPR >> + || code == REDUC_PLUS_EXPR >> + || code == REDUC_MIN_EXPR) >> + return false; >> + return true; >> >> huh, I think this function has an odd name, maybe >> can_promote_operation ()? Please >> use TREE_CODE_CLASS (code) == tcc_reference for all _REF trees. > > Done. > >> >> Note that as followup things like the rotates should be "expanded" like >> we'd do on RTL (open-coding the thing). And we'd need a way to >> specify zero-/sign-extended loads. >> >> +/* Return true if it is safe to promote the use in the STMT. */ >> +static bool >> +safe_to_promote_use_p (gimple *stmt) >> +{ >> + enum tree_code code = gimple_assign_rhs_code (stmt); >> + tree lhs = gimple_assign_lhs (stmt); >> + >> + if (gimple_vuse (stmt) != NULL_TREE >> + || gimple_vdef (stmt) != NULL_TREE >> >> I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say >> _2 = a[i_3]; >> > When I remove this, I see errors in stmts like: > > unsigned char > unsigned int > # .MEM_197 = VDEF <.MEM_187> > fs_9(D)->fde_encoding = _154; Yeah, as said a stmt based check is really bogus without context. As the predicate is only used in a single place it's better to inline it there. In this case you want to handle loads/stores differently. 
From this context it looks like not iterating over uses in the caller but rather iterating over uses here makes most sense as you then can do if (gimple_store_p (stmt)) { promote all uses that are not gimple_assign_rhs1 () } you can also transparently handle constants for the cases where promoting is required. At the moment their handling is intertwined with the def promotion code. That makes the whole thing hard to follow. Thanks, Richard. > >> + || code == VIEW_CONVERT_EXPR >> + || code == LROTATE_EXPR >> + || code == RROTATE_EXPR >> + || code == CONSTRUCTOR >> + || code == BIT_FIELD_REF >> + || code == COMPLEX_EXPR >> + || code == ASM_EXPR >> + || VECTOR_TYPE_P (TREE_TYPE (lhs))) >> + return false; >> + return true; >> >> ASM_EXPR can never appear here. I think PROMOTE_MODE never >> promotes vector types - what cases did you need to add VECTOR_TYPE_P for? > > Done >> >> +/* Return true if the SSA_NAME has to be truncated to preserve the >> + semantics. */ >> +static bool >> +truncate_use_p (gimple *stmt) >> +{ >> + enum tree_code code = gimple_assign_rhs_code (stmt); >> >> I think the description can be improved. This is about stray bits set >> beyond the original type, correct? >> >> Please use NOP_EXPR wherever you use CONVERT_EXPR right now. >> >> + if (TREE_CODE_CLASS (code) >> + == tcc_comparison) >> + promote_cst_in_stmt (stmt, promoted_type, true); >> >> don't you always need to promote constant operands? > > I am promoting all the constants. Here, I am promoting the constants > that are part of the conditions. > > > Thanks, > Kugan ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext 2015-11-10 14:13 ` Richard Biener @ 2015-11-12 6:08 ` Kugan 2015-11-14 1:15 ` Kugan 1 sibling, 0 replies; 28+ messages in thread From: Kugan @ 2015-11-12 6:08 UTC (permalink / raw) To: Richard Biener; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 9053 bytes --] Hi Richard, Thanks for the review. >>> >>> The basic "structure" thing still remains. You walk over all uses and >>> defs in all stmts >>> in promote_all_stmts which ends up calling promote_ssa_if_not_promoted on all >>> uses and defs which in turn promotes (the "def") and then fixes up all >>> uses in all stmts. >> >> Done. > > Not exactly. I still see > > /* Promote all the stmts in the basic block. */ > static void > promote_all_stmts (basic_block bb) > { > gimple_stmt_iterator gsi; > ssa_op_iter iter; > tree def, use; > use_operand_p op; > > for (gphi_iterator gpi = gsi_start_phis (bb); > !gsi_end_p (gpi); gsi_next (&gpi)) > { > gphi *phi = gpi.phi (); > def = PHI_RESULT (phi); > promote_ssa (def, &gsi); > > FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE) > { > use = USE_FROM_PTR (op); > if (TREE_CODE (use) == SSA_NAME > && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) > promote_ssa (use, &gsi); > fixup_uses (phi, &gsi, op, use); > } > > you still call promote_ssa on both DEFs and USEs and promote_ssa looks > at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just > on DEFs and fixup_uses on USEs. I am doing this to promote SSA names that are defined with GIMPLE_NOP. Is there any way to iterate over these? I have added a gcc_assert to make sure that promote_ssa is called only once. > > Any reason you do not promote debug stmts during the DOM walk? > > So for each DEF you record in ssa_name_info > > struct ssa_name_info > { > tree ssa; > tree type; > tree promoted_type; > }; > > (the fields need documenting). Add a tree promoted_def to it which you > can replace any use of the DEF with.
In this version of the patch, I am promoting the def in place. If we decide to change, I will add it. If I understand you correctly, this is to be used in iterating over uses and fixing. > > Currently as you call promote_ssa for DEFs and USEs you repeatedly > overwrite the entry in ssa_name_info_map with a new copy. So you > should assert it wasn't already there. > > switch (gimple_code (def_stmt)) > { > case GIMPLE_PHI: > { > > the last { is indented too much it should be indented 2 spaces > relative to the 'case' Done. > > > SSA_NAME_RANGE_INFO (def) = NULL; > > only needed in the case 'def' was promoted itself. Please use > reset_flow_sensitive_info (def). We are promoting all the defs. In some cases we can however use the value ranges in SSA just by promoting to the new type (as the values will be the same). Shall I do it as a follow-up?
>> They are: >> --- a/gcc/tree-ssa-reassoc.c >> +++ b/gcc/tree-ssa-reassoc.c >> @@ -302,6 +302,7 @@ phi_rank (gimple *stmt) >> { >> tree arg = gimple_phi_arg_def (stmt, i); >> if (TREE_CODE (arg) == SSA_NAME >> + && SSA_NAME_VAR (arg) >> && !SSA_NAME_IS_DEFAULT_DEF (arg)) >> { >> gimple *def_stmt = SSA_NAME_DEF_STMT (arg); >> @@ -434,7 +435,8 @@ get_rank (tree e) >> if (gimple_code (stmt) == GIMPLE_PHI) >> return phi_rank (stmt); >> >> - if (!is_gimple_assign (stmt)) >> + if (!is_gimple_assign (stmt) >> + && !gimple_nop_p (stmt)) >> return bb_rank[gimple_bb (stmt)->index]; >> >> and >> >> --- a/gcc/tree-ssa.c >> +++ b/gcc/tree-ssa.c >> @@ -752,7 +752,8 @@ verify_use (basic_block bb, basic_block def_bb, >> use_operand_p use_p, >> TREE_VISITED (ssa_name) = 1; >> >> if (gimple_nop_p (SSA_NAME_DEF_STMT (ssa_name)) >> - && SSA_NAME_IS_DEFAULT_DEF (ssa_name)) >> + && (SSA_NAME_IS_DEFAULT_DEF (ssa_name) >> + || SSA_NAME_VAR (ssa_name) == NULL)) >> ; /* Default definitions have empty statements. Nothing to do. */ >> else if (!def_bb) >> { >> >> Does this look OK? > > Hmm, no, this looks bogus. I have removed all the above. > > I think the best thing to do is not promoting default defs at all and instead > promote at the uses. > > /* Create a promoted copy of parameters. */ > bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); > gcc_assert (bb); > gsi2 = gsi_after_labels (bb); > new_def = copy_ssa_name (def); > set_ssa_promoted (new_def); > set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def); > duplicate_default_ssa (new_def, def); > TREE_TYPE (def) = promoted_type; > > AFAIK this is just an awkward way of replacing all uses by a new DEF, sth > that should be supported by the machinery so that other default defs can just > do > > new_def = get_or_create_default_def (create_tmp_reg > (promoted_type)); > > and have all uses ('def') replaced by new_def. I experimented with get_or_create_default_def. Here we have to have a SSA_NAME_VAR (def) of promoted type. 
In the attached patch I am doing the following and it seems to work. Does this look OK? + } + else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL) + { + tree var = copy_node (SSA_NAME_VAR (def)); + TREE_TYPE (var) = promoted_type; + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); + } I prefer to promote the def as otherwise iterating over the uses and promoting can look complicated (we would have to look at all the different types of stmts again and do the right thing, as it was in the earlier version of this before we moved to this approach). >>> >>> Note that as followup things like the rotates should be "expanded" like >>> we'd do on RTL (open-coding the thing). And we'd need a way to >>> specify zero-/sign-extended loads. >>> >>> +/* Return true if it is safe to promote the use in the STMT. */ >>> +static bool >>> +safe_to_promote_use_p (gimple *stmt) >>> +{ >>> + enum tree_code code = gimple_assign_rhs_code (stmt); >>> + tree lhs = gimple_assign_lhs (stmt); >>> + >>> + if (gimple_vuse (stmt) != NULL_TREE >>> + || gimple_vdef (stmt) != NULL_TREE >>> >>> I think the vuse/vdef check is bogus, you can have a use of 'i_3' in say >>> _2 = a[i_3]; >>> >> When I remove this, I see errors in stmts like: >> >> unsigned char >> unsigned int >> # .MEM_197 = VDEF <.MEM_187> >> fs_9(D)->fde_encoding = _154; > > Yeah, as said a stmt based check is really bogus without context. As the > predicate is only used in a single place it's better to inline it > there. In this > case you want to handle loads/stores differently. From this context it > looks like not iterating over uses in the caller but rather iterating over > uses here makes most sense as you then can do > > if (gimple_store_p (stmt)) > { > promote all uses that are not gimple_assign_rhs1 () > } > > you can also transparently handle constants for the cases where promoting > is required. At the moment their handling is intertwined with the def promotion > code. That makes the whole thing hard to follow.
I have updated the comments with: +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, + promote only the constants in conditions part of the COND_EXPR. + + We promote the constants when the associated operands are promoted. + This usually means that we promote the constants when we promote the + defining stmts (as part of promote_ssa). However for COND_EXPR, we + can promote only when we promote the other operand. Therefore, this + is done during fixup_use. */ I am handling gimple_debug separately to avoid any code difference with and without the -g option. I have updated the comments for this. Tested the attached patch on ppc64, aarch64 and x86-none-linux-gnu. Regression testing for ppc64 is in progress. I also noticed that tree-ssa-uninit sometimes gives false positives due to the assumptions it makes. Is it OK to move this pass before type promotion? I can do the testing and post a separate patch for this if it is OK. I also removed the optimization that prevents some of the redundant truncations/extensions from the type promotion pass, as it doesn't do much as of now. I can send a proper follow-up patch. Is that OK? Thanks, Kugan [-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --] [-- Type: text/x-patch, Size: 3609 bytes --] From 0eb41ec18322484cf0ae8ca6631ac9dc913576fb Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:53:56 +1100 Subject: [PATCH 3/5] Optimize ZEXT_EXPR with tree-vrp --- gcc/match.pd | 6 ++++++ gcc/tree-vrp.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 67 insertions(+) diff --git a/gcc/match.pd b/gcc/match.pd index 0a9598e..1b152f1 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3.
If not see (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) (op @0 (ext @1 @2))))) +(simplify + (sext (sext@2 @0 @1) @3) + (if (tree_int_cst_compare (@1, @3) <= 0) + @2 + (sext @0 @3))) + diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index fe34ffd..024c8ef 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr, && code != LSHIFT_EXPR && code != MIN_EXPR && code != MAX_EXPR + && code != SEXT_EXPR && code != BIT_AND_EXPR && code != BIT_IOR_EXPR && code != BIT_XOR_EXPR) @@ -2801,6 +2802,54 @@ extract_range_from_binary_expr_1 (value_range *vr, extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1); return; } + else if (code == SEXT_EXPR) + { + gcc_assert (range_int_cst_p (&vr1)); + HOST_WIDE_INT prec = tree_to_uhwi (vr1.min); + type = vr0.type; + wide_int tmin, tmax; + wide_int may_be_nonzero, must_be_nonzero; + + wide_int type_min = wi::min_value (prec, SIGNED); + wide_int type_max = wi::max_value (prec, SIGNED); + type_min = wide_int_to_tree (expr_type, type_min); + type_max = wide_int_to_tree (expr_type, type_max); + type_min = wi::sext (type_min, prec); + type_max = wi::sext (type_max, prec); + wide_int sign_bit + = wi::set_bit_in_zero (prec - 1, + TYPE_PRECISION (TREE_TYPE (vr0.min))); + if (zero_nonzero_bits_from_vr (expr_type, &vr0, + &may_be_nonzero, + &must_be_nonzero)) + { + if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit) + { + /* If to-be-extended sign bit is one. */ + tmin = type_min; + tmax = wi::zext (may_be_nonzero, prec); + } + else if (wi::bit_and (may_be_nonzero, sign_bit) + != sign_bit) + { + /* If to-be-extended sign bit is zero. 
*/ + tmin = wi::zext (must_be_nonzero, prec); + tmax = wi::zext (may_be_nonzero, prec); + } + else + { + tmin = type_min; + tmax = type_max; + } + } + else + { + tmin = type_min; + tmax = type_max; + } + min = wide_int_to_tree (expr_type, tmin); + max = wide_int_to_tree (expr_type, tmax); + } else if (code == RSHIFT_EXPR || code == LSHIFT_EXPR) { @@ -9166,6 +9215,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) break; } break; + case SEXT_EXPR: + { + unsigned int prec = tree_to_uhwi (op1); + wide_int min = vr0.min; + wide_int max = vr0.max; + wide_int sext_min = wi::sext (min, prec); + wide_int sext_max = wi::sext (max, prec); + if (min == sext_min && max == sext_max) + op = op0; + } + break; default: gcc_unreachable (); } @@ -9868,6 +9928,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi) case BIT_AND_EXPR: case BIT_IOR_EXPR: + case SEXT_EXPR: /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR if all the bits being cleared are already cleared or all the bits being set are already set. 
*/ -- 1.9.1 [-- Attachment #3: 0002-Add-type-promotion-pass.patch --] [-- Type: text/x-patch, Size: 30437 bytes --] From 31c9caf7b239827ed6ac7ad7f4fe05e0ba4197e2 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:52:37 +1100 Subject: [PATCH 2/5] Add type promotion pass --- gcc/Makefile.in | 1 + gcc/auto-profile.c | 2 +- gcc/common.opt | 4 + gcc/doc/invoke.texi | 10 + gcc/gimple-ssa-type-promote.c | 845 ++++++++++++++++++++++++++++++++++++++++++ gcc/passes.def | 1 + gcc/timevar.def | 1 + gcc/tree-pass.h | 1 + libiberty/cp-demangle.c | 2 +- 9 files changed, 865 insertions(+), 2 deletions(-) create mode 100644 gcc/gimple-ssa-type-promote.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index b91b8dc..c6aed45 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1499,6 +1499,7 @@ OBJS = \ tree-vect-slp.o \ tree-vectorizer.o \ tree-vrp.o \ + gimple-ssa-type-promote.o \ tree.o \ valtrack.o \ value-prof.o \ diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c index 25202c5..d32c3b6 100644 --- a/gcc/auto-profile.c +++ b/gcc/auto-profile.c @@ -1266,7 +1266,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge) FOR_EACH_EDGE (e, ei, bb->succs) { unsigned i, total = 0; - edge only_one; + edge only_one = NULL; bool check_value_one = (((integer_onep (cmp_rhs)) ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR)) ^ ((e->flags & EDGE_TRUE_VALUE) != 0)); diff --git a/gcc/common.opt b/gcc/common.opt index 12ca0d6..f450428 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2404,6 +2404,10 @@ ftree-vrp Common Report Var(flag_tree_vrp) Init(0) Optimization Perform Value Range Propagation on trees. +ftree-type-promote +Common Report Var(flag_tree_type_promote) Init(1) Optimization +Perform Type Promotion on trees + funit-at-a-time Common Report Var(flag_unit_at_a_time) Init(1) Compile whole compilation unit at a time. 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index cd82544..bc059a0 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher. Null pointer check elimination is only done if @option{-fdelete-null-pointer-checks} is enabled. +@item -ftree-type-promote +@opindex ftree-type-promote +This pass applies type promotion to SSA names in the function and +inserts appropriate truncations to preserve the semantics. Idea of +this pass is to promote operations such a way that we can minimise +generation of subreg in RTL, that intern results in removal of +redundant zero/sign extensions. + +This optimization is enabled by default. + @item -fsplit-ivs-in-unroller @opindex fsplit-ivs-in-unroller Enables expression of values of induction variables in later iterations diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c new file mode 100644 index 0000000..6a8cc06 --- /dev/null +++ b/gcc/gimple-ssa-type-promote.c @@ -0,0 +1,845 @@ +/* Type promotion of SSA names to minimise redundant zero/sign extension. + Copyright (C) 2015 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. 
*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "hash-set.h" +#include "machmode.h" +#include "vec.h" +#include "double-int.h" +#include "input.h" +#include "symtab.h" +#include "wide-int.h" +#include "inchash.h" +#include "tree.h" +#include "fold-const.h" +#include "stor-layout.h" +#include "predict.h" +#include "function.h" +#include "dominance.h" +#include "cfg.h" +#include "basic-block.h" +#include "tree-ssa-alias.h" +#include "gimple-fold.h" +#include "tree-eh.h" +#include "gimple-expr.h" +#include "is-a.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-ssa.h" +#include "tree-phinodes.h" +#include "ssa-iterators.h" +#include "stringpool.h" +#include "tree-ssanames.h" +#include "tree-pass.h" +#include "gimple-pretty-print.h" +#include "langhooks.h" +#include "sbitmap.h" +#include "domwalk.h" +#include "tree-dfa.h" + +/* This pass applies type promotion to SSA names in the function and + inserts appropriate truncations. Idea of this pass is to promote operations + such a way that we can minimise generation of subreg in RTL, + that in turn results in removal of redundant zero/sign extensions. This pass + will run prior to The VRP and DOM such that they will be able to optimise + redundant truncations and extensions. This is based on the discussion from + https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html. +*/ + +/* Structure to hold the type and promoted type for promoted ssa variables. */ +struct ssa_name_info +{ + tree ssa; /* Name of the SSA_NAME. */ + tree type; /* Original type of ssa. */ + tree promoted_type; /* Promoted type of ssa. */ +}; + +/* Obstack for ssa_name_info. 
*/ +static struct obstack ssa_name_info_obstack; + +static unsigned n_ssa_val; +static sbitmap ssa_to_be_promoted_bitmap; +static hash_map <tree, ssa_name_info *> *ssa_name_info_map; + +static bool +type_precision_ok (tree type) +{ + return (TYPE_PRECISION (type) + == GET_MODE_PRECISION (TYPE_MODE (type))); +} + +/* Return the promoted type for TYPE. */ +static tree +get_promoted_type (tree type) +{ + tree promoted_type; + enum machine_mode mode; + int uns; + + if (POINTER_TYPE_P (type) + || !INTEGRAL_TYPE_P (type) + || !type_precision_ok (type)) + return type; + + mode = TYPE_MODE (type); +#ifdef PROMOTE_MODE + uns = TYPE_SIGN (type); + PROMOTE_MODE (mode, uns, type); +#endif + uns = TYPE_SIGN (type); + if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode)) + return type; + promoted_type + = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), + uns); + gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode)); + return promoted_type; +} + +/* Return true if ssa NAME is already considered for promotion. */ +static bool +ssa_promoted_p (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + return bitmap_bit_p (ssa_to_be_promoted_bitmap, index); + } + return true; +} + +/* Set ssa NAME to be already considered for promotion. */ +static void +set_ssa_promoted (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + bitmap_set_bit (ssa_to_be_promoted_bitmap, index); + } +} + +/* Return true if LHS will be promoted later. */ +static bool +tobe_promoted_p (tree lhs) +{ + if (TREE_CODE (lhs) == SSA_NAME + && !POINTER_TYPE_P (TREE_TYPE (lhs)) + && INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + && !VECTOR_TYPE_P (TREE_TYPE (lhs)) + && !ssa_promoted_p (lhs) + && (get_promoted_type (TREE_TYPE (lhs)) + != TREE_TYPE (lhs))) + return true; + else + return false; +} + +/* Convert constant CST to TYPE. 
*/ +static tree +convert_int_cst (tree type, tree cst, signop sign = SIGNED) +{ + wide_int wi_cons = fold_convert (type, cst); + wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign); + return wide_int_to_tree (type, wi_cons); +} + +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, + promote only the constants in conditions part of the COND_EXPR. + + We promote the constants when the associated operands are promoted. + This usually means that we promote the constants when we promote the + defining stmnts (as part of promote_ssa). However for COND_EXPR, we + can promote only when we promote the other operand. Therefore, this + is done during fixup_use. */ + +static void +promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false) +{ + tree op; + ssa_op_iter iter; + use_operand_p oprnd; + int index; + tree op0, op1; + signop sign = SIGNED; + + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + if (promote_cond + && gimple_assign_rhs_code (stmt) == COND_EXPR) + { + /* Promote INTEGER_CST that are tcc_compare arguments. */ + sign = TYPE_SIGN (type); + op = gimple_assign_rhs1 (stmt); + op0 = TREE_OPERAND (op, 0); + op1 = TREE_OPERAND (op, 1); + if (TREE_CODE (op0) == INTEGER_CST) + op0 = convert_int_cst (type, op0, sign); + if (TREE_CODE (op1) == INTEGER_CST) + op1 = convert_int_cst (type, op1, sign); + tree new_op = build2 (TREE_CODE (op), type, op0, op1); + gimple_assign_set_rhs1 (stmt, new_op); + } + else + { + /* Promote INTEGER_CST in GIMPLE_ASSIGN. 
*/ + op = gimple_assign_rhs3 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign)); + if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) + == tcc_comparison) + sign = TYPE_SIGN (type); + op = gimple_assign_rhs1 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign)); + op = gimple_assign_rhs2 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign)); + } + break; + + case GIMPLE_PHI: + { + /* Promote INTEGER_CST arguments to GIMPLE_PHI. */ + gphi *phi = as_a <gphi *> (stmt); + FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE) + { + op = USE_FROM_PTR (oprnd); + index = PHI_ARG_INDEX_FROM_USE (oprnd); + if (TREE_CODE (op) == INTEGER_CST) + SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign)); + } + } + break; + + case GIMPLE_COND: + { + /* Promote INTEGER_CST that are GIMPLE_COND arguments. */ + gcond *cond = as_a <gcond *> (stmt); + sign = TYPE_SIGN (type); + op = gimple_cond_lhs (cond); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign)); + + op = gimple_cond_rhs (cond); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign)); + } + break; + + default: + gcc_unreachable (); + } +} + +/* Create an ssa with TYPE to copy ssa VAR. */ +static tree +make_promoted_copy (tree var, gimple *def_stmt, tree type) +{ + tree new_lhs = make_ssa_name (type, def_stmt); + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var)) + SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1; + return new_lhs; +} + +/* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits. + Assign the zero/sign extended value in NEW_VAR. gimple statement + that performs the zero/sign extension is returned. 
*/ +static gimple * +zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width) +{ + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) + == TYPE_PRECISION (TREE_TYPE (new_var))); + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width); + gimple *stmt; + + if (unsigned_p) + { + /* Zero extend. */ + tree cst + = wide_int_to_tree (TREE_TYPE (var), + wi::mask (width, false, + TYPE_PRECISION (TREE_TYPE (var)))); + stmt = gimple_build_assign (new_var, BIT_AND_EXPR, + var, cst); + } + else + /* Sign extend. */ + stmt = gimple_build_assign (new_var, + SEXT_EXPR, + var, build_int_cst (TREE_TYPE (var), width)); + return stmt; +} + + +static void +copy_default_ssa (tree to, tree from) +{ + SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from)); + SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from); + SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE); + SSA_NAME_IS_DEFAULT_DEF (to) = 1; + SSA_NAME_IS_DEFAULT_DEF (from) = 0; +} + +/* Promote definition DEF to PROMOTED_TYPE. If the stmt that defines def + is def_stmt, make the type of def promoted_type. If the stmt is such + that, result of the def_stmt cannot be of promoted_type, create a new_def + of the original_type and make the def_stmt assign its value to newdef. + Then, create a NOP_EXPR to convert new_def to def of promoted type. + + For example, for stmt with original_type char and promoted_type int: + char _1 = mem; + becomes: + char _2 = mem; + int _1 = (int)_2; + + If the def_stmt allows def to be promoted, promote def in-place + (and its arguments when needed). + + For example: + char _3 = _1 + _2; + becomes: + int _3 = _1 + _2; + Here, _1 and _2 will also be promoted. 
*/ +static void +promote_ssa (tree def, gimple_stmt_iterator *gsi) +{ + gimple *def_stmt = SSA_NAME_DEF_STMT (def); + gimple *copy_stmt = NULL; + basic_block bb; + gimple_stmt_iterator gsi2; + tree original_type = TREE_TYPE (def); + tree new_def; + ssa_name_info *info; + bool do_not_promote = false; + tree promoted_type = get_promoted_type (TREE_TYPE (def)); + + if (!tobe_promoted_p (def)) + return; + + info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack, + sizeof (ssa_name_info)); + info->type = original_type; + info->promoted_type = promoted_type; + info->ssa = def; + gcc_assert (!ssa_name_info_map->get_or_insert (def)); + ssa_name_info_map->put (def, info); + + switch (gimple_code (def_stmt)) + { + case GIMPLE_PHI: + { + /* Promote def by fixing its type and make def anonymous. */ + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + break; + } + + case GIMPLE_ASM: + { + gasm *asm_stmt = as_a <gasm *> (def_stmt); + for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i) + { + /* Promote def and copy (i.e. convert) the value defined + by asm to def. */ + tree link = gimple_asm_output_op (asm_stmt, i); + tree op = TREE_VALUE (link); + if (op == def) + { + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + copy_default_ssa (new_def, def); + TREE_VALUE (link) = new_def; + gimple_asm_set_output_op (asm_stmt, i, link); + + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, NOP_EXPR, + new_def, NULL_TREE); + SSA_NAME_IS_DEFAULT_DEF (new_def) = 0; + gsi2 = gsi_for_stmt (def_stmt); + gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT); + break; + } + } + break; + } + + case GIMPLE_NOP: + { + if (SSA_NAME_VAR (def) == NULL) + { + /* Promote def by fixing its type for anonymous def. 
*/ + TREE_TYPE (def) = promoted_type; + } + else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL) + { + tree var = copy_node (SSA_NAME_VAR (def)); + TREE_TYPE (var) = promoted_type; + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); + } + else + { + /* Create a promoted copy of parameters. */ + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + gcc_assert (bb); + gsi2 = gsi_after_labels (bb); + /* Create new_def of the original type and set that to be the + parameter. */ + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def); + copy_default_ssa (new_def, def); + + /* Now promote the def and copy the value from parameter. */ + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, NOP_EXPR, + new_def, NULL_TREE); + SSA_NAME_DEF_STMT (def) = copy_stmt; + gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT); + } + break; + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (def_stmt); + if (gimple_vuse (def_stmt) != NULL_TREE + || gimple_vdef (def_stmt) != NULL_TREE + || TREE_CODE_CLASS (code) == tcc_reference + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == VIEW_CONVERT_EXPR + || code == REALPART_EXPR + || code == IMAGPART_EXPR + || code == REDUC_MAX_EXPR + || code == REDUC_PLUS_EXPR + || code == REDUC_MIN_EXPR) + { + do_not_promote = true; + } + else if (CONVERT_EXPR_CODE_P (code)) + { + tree rhs = gimple_assign_rhs1 (def_stmt); + if (!type_precision_ok (TREE_TYPE (rhs)) + || !INTEGRAL_TYPE_P (TREE_TYPE (rhs)) + || (TYPE_UNSIGNED (TREE_TYPE (rhs)) != TYPE_UNSIGNED (promoted_type))) + { + do_not_promote = true; + } + else if (types_compatible_p (TREE_TYPE (rhs), promoted_type)) + { + /* As we travel statements in dominated order, arguments + of def_stmt will be visited before visiting def. If RHS + is already promoted and type is compatible, we can convert + them into ZERO/SIGN EXTEND stmt. 
*/ + ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs); + tree type; + if (info == NULL) + type = TREE_TYPE (rhs); + else + type = info->type; + if ((TYPE_PRECISION (original_type) + > TYPE_PRECISION (type)) + || (TYPE_UNSIGNED (original_type) + != TYPE_UNSIGNED (type))) + { + if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type)) + type = original_type; + gcc_assert (type != NULL_TREE); + TREE_TYPE (def) = promoted_type; + gimple *copy_stmt = + zero_sign_extend_stmt (def, rhs, + TYPE_UNSIGNED (type), + TYPE_PRECISION (type)); + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + gsi_replace (gsi, copy_stmt, false); + } + else + { + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + } + } + else + { + /* If RHS is not promoted OR their types are not + compatible, create NOP_EXPR that converts + RHS to promoted DEF type and perform a + ZERO/SIGN EXTEND to get the required value + from RHS. */ + ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs); + if (info != NULL) + { + tree type = info->type; + new_def = copy_ssa_name (rhs); + SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE); + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + gimple *copy_stmt = + zero_sign_extend_stmt (new_def, rhs, + TYPE_UNSIGNED (type), + TYPE_PRECISION (type)); + gsi2 = gsi_for_stmt (def_stmt); + gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT); + gassign *new_def_stmt = gimple_build_assign (def, code, + new_def, NULL_TREE); + gsi_replace (gsi, new_def_stmt, false); + } + else + { + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + } + } + } + else + { + /* Promote def by fixing its type and make def anonymous. 
*/ + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + TREE_TYPE (def) = promoted_type; + } + break; + } + + default: + do_not_promote = true; + break; + } + + if (do_not_promote) + { + /* Promote def and copy (i.e. convert) the value defined + by the stmt that cannot be promoted. */ + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + TREE_TYPE (def) = promoted_type; + gimple_set_lhs (def_stmt, new_def); + copy_stmt = gimple_build_assign (def, NOP_EXPR, + new_def, NULL_TREE); + gsi2 = gsi_for_stmt (def_stmt); + if (lookup_stmt_eh_lp (def_stmt) > 0 + || (gimple_code (def_stmt) == GIMPLE_CALL + && gimple_call_ctrl_altering_p (def_stmt))) + gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)), + copy_stmt); + else + gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT); + } + reset_flow_sensitive_info (def); +} + +/* Fix the (promoted) USE in stmts where USE cannot be be promoted. */ +static unsigned int +fixup_use (gimple *stmt, gimple_stmt_iterator *gsi, + use_operand_p op, tree use) +{ + ssa_name_info *info = ssa_name_info_map->get_or_insert (use); + /* If USE is not promoted, nothing to do. */ + if (!info) + return 0; + + tree promoted_type = info->promoted_type; + tree old_type = info->type; + bool do_not_promote = false; + + switch (gimple_code (stmt)) + { + case GIMPLE_DEBUG: + { + SET_USE (op, fold_convert (old_type, use)); + update_stmt (stmt); + break; + } + + case GIMPLE_ASM: + case GIMPLE_CALL: + case GIMPLE_RETURN: + { + /* USE cannot be promoted here. 
*/ + do_not_promote = true; + break; + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + if (gimple_vuse (stmt) != NULL_TREE + || gimple_vdef (stmt) != NULL_TREE + || code == VIEW_CONVERT_EXPR + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == CONSTRUCTOR + || code == BIT_FIELD_REF + || code == COMPLEX_EXPR + || VECTOR_TYPE_P (TREE_TYPE (lhs))) + { + do_not_promote = true; + } + else if (TREE_CODE_CLASS (code) == tcc_comparison + || code == TRUNC_DIV_EXPR + || code == CEIL_DIV_EXPR + || code == FLOOR_DIV_EXPR + || code == ROUND_DIV_EXPR + || code == TRUNC_MOD_EXPR + || code == CEIL_MOD_EXPR + || code == FLOOR_MOD_EXPR + || code == ROUND_MOD_EXPR + || code == LSHIFT_EXPR + || code == RSHIFT_EXPR + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))) + { + /* Promote the constant in comparison when other comparison + operand is promoted. All other constants are promoted as + part of promoting definition in promote_ssa. */ + if (TREE_CODE_CLASS (code) == tcc_comparison) + promote_cst_in_stmt (stmt, promoted_type, true); + /* In some stmts, value in USE has to be ZERO/SIGN + Extended based on the original type for correct + result. */ + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_UNSIGNED (old_type), + TYPE_PRECISION (old_type)); + gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT); + + SET_USE (op, temp); + update_stmt (stmt); + } + else if (CONVERT_EXPR_CODE_P (code)) + { + if (types_compatible_p (TREE_TYPE (lhs), promoted_type)) + { + /* Type of LHS and promoted RHS are compatible, we can + convert this into ZERO/SIGN EXTEND stmt. 
*/ + gimple *copy_stmt = + zero_sign_extend_stmt (lhs, use, + TYPE_UNSIGNED (old_type), + TYPE_PRECISION (old_type)); + set_ssa_promoted (lhs); + gsi_replace (gsi, copy_stmt, false); + } + else if (tobe_promoted_p (lhs)); + else + { + do_not_promote = true; + } + } + break; + } + + case GIMPLE_COND: + { + /* In GIMPLE_COND, value in USE has to be ZERO/SIGN + Extended based on the original type for correct + result. */ + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_UNSIGNED (old_type), + TYPE_PRECISION (old_type)); + gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT); + SET_USE (op, temp); + promote_cst_in_stmt (stmt, promoted_type); + update_stmt (stmt); + break; + } + + default: + break; + } + + if (do_not_promote) + { + /* FOR stmts where USE cannot be promoted, create an + original type copy. */ + tree temp; + temp = copy_ssa_name (use); + SET_SSA_NAME_VAR_OR_IDENTIFIER (temp, NULL_TREE); + set_ssa_promoted (temp); + TREE_TYPE (temp) = old_type; + gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR, + use, NULL_TREE); + gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT); + SET_USE (op, temp); + update_stmt (stmt); + } + return 0; +} + + +/* Promote all the stmts in the basic block. 
*/ +static void +promote_all_stmts (basic_block bb) +{ + gimple_stmt_iterator gsi; + ssa_op_iter iter; + tree def, use; + use_operand_p op; + + for (gphi_iterator gpi = gsi_start_phis (bb); + !gsi_end_p (gpi); gsi_next (&gpi)) + { + gphi *phi = gpi.phi (); + FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE) + { + use = USE_FROM_PTR (op); + if (TREE_CODE (use) == SSA_NAME + && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) + promote_ssa (use, &gsi); + fixup_use (phi, &gsi, op, use); + } + + def = PHI_RESULT (phi); + promote_ssa (def, &gsi); + } + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + if (is_gimple_debug (stmt)) + continue; + + FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE) + { + use = USE_FROM_PTR (op); + if (TREE_CODE (use) == SSA_NAME + && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) + promote_ssa (use, &gsi); + fixup_use (stmt, &gsi, op, use); + } + + FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF) + promote_ssa (def, &gsi); + } +} + +/* Promote use in GIMPLE_DEBUG stmts. Do this separately to avoid generating + different sequence with and without -g. This can happen when promoting + SSA that are defined with GIMPLE_NOP. */ +static void +promote_debug_stmts () +{ + basic_block bb; + gimple_stmt_iterator gsi; + ssa_op_iter iter; + tree use; + use_operand_p op; + + FOR_EACH_BB_FN (bb, cfun) + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + if (!is_gimple_debug (stmt)) + continue; + FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE) + { + use = USE_FROM_PTR (op); + fixup_use (stmt, &gsi, op, use); + } + } +} + + +class type_promotion_dom_walker : public dom_walker +{ +public: + type_promotion_dom_walker (cdi_direction direction) + : dom_walker (direction) {} + virtual void before_dom_children (basic_block bb) + { + promote_all_stmts (bb); + } +}; + +/* Main entry point to the pass. 
*/ +static unsigned int +execute_type_promotion (void) +{ + n_ssa_val = num_ssa_names; + ssa_name_info_map = new hash_map<tree, ssa_name_info *>; + ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val); + bitmap_clear (ssa_to_be_promoted_bitmap); + + /* Create the obstack where ssa_name_info will reside. */ + gcc_obstack_init (&ssa_name_info_obstack); + + calculate_dominance_info (CDI_DOMINATORS); + /* Walk the CFG in dominator order. */ + type_promotion_dom_walker (CDI_DOMINATORS) + .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + promote_debug_stmts (); + gsi_commit_edge_inserts (); + + obstack_free (&ssa_name_info_obstack, NULL); + sbitmap_free (ssa_to_be_promoted_bitmap); + delete ssa_name_info_map; + return 0; +} + +namespace { +const pass_data pass_data_type_promotion = +{ + GIMPLE_PASS, /* type */ + "promotion", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_TREE_TYPE_PROMOTE, /* tv_id */ + PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), +}; + +class pass_type_promotion : public gimple_opt_pass +{ +public: + pass_type_promotion (gcc::context *ctxt) + : gimple_opt_pass (pass_data_type_promotion, ctxt) + {} + + /* opt_pass methods: */ + opt_pass * clone () { return new pass_type_promotion (m_ctxt); } + virtual bool gate (function *) { return flag_tree_type_promote != 0; } + virtual unsigned int execute (function *) + { + return execute_type_promotion (); + } + +}; // class pass_type_promotion + +} // anon namespace + +gimple_opt_pass * +make_pass_type_promote (gcc::context *ctxt) +{ + return new pass_type_promotion (ctxt); +} + diff --git a/gcc/passes.def b/gcc/passes.def index 36d2b3b..78c463a 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -272,6 +272,7 @@ along with GCC; see the file COPYING3. 
If not see POP_INSERT_PASSES () NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_type_promote); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_strength_reduction); diff --git a/gcc/timevar.def b/gcc/timevar.def index b429faf..a8d40c3 100644 --- a/gcc/timevar.def +++ b/gcc/timevar.def @@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION , "vtable verification") DEFTIMEVAR (TV_TREE_UBSAN , "tree ubsan") DEFTIMEVAR (TV_INITIALIZE_RTL , "initialize rtl") DEFTIMEVAR (TV_GIMPLE_LADDRESS , "address lowering") +DEFTIMEVAR (TV_TREE_TYPE_PROMOTE , "tree type promote") /* Everything else in rest_of_compilation not included above. */ DEFTIMEVAR (TV_EARLY_LOCAL , "early local passes") diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 333b5a7..449dd19 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt); extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt); extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt); extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt); diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c index ff608a3..6722331 100644 --- a/libiberty/cp-demangle.c +++ b/libiberty/cp-demangle.c @@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options, /* Variable used to store the current templates while a previously captured scope is used. */ - struct d_print_template *saved_templates; + struct d_print_template *saved_templates = NULL; /* Nonzero if templates have been stored in the above variable. 
*/ int need_template_restore = 0; -- 1.9.1 [-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --] [-- Type: text/x-patch, Size: 5067 bytes --] From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:51:42 +1100 Subject: [PATCH 1/5] Add new SEXT_EXPR tree code --- gcc/cfgexpand.c | 12 ++++++++++++ gcc/expr.c | 20 ++++++++++++++++++++ gcc/fold-const.c | 4 ++++ gcc/tree-cfg.c | 12 ++++++++++++ gcc/tree-inline.c | 1 + gcc/tree-pretty-print.c | 11 +++++++++++ gcc/tree.def | 5 +++++ 7 files changed, 65 insertions(+) diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index eaad859..aeb64bb 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp) case FMA_EXPR: return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2); + case SEXT_EXPR: + gcc_assert (CONST_INT_P (op1)); + inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0); + gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1)); + + if (mode != inner_mode) + op0 = simplify_gen_unary (SIGN_EXTEND, + mode, + gen_lowpart_SUBREG (inner_mode, op0), + inner_mode); + return op0; + default: flag_unsupported: #ifdef ENABLE_CHECKING diff --git a/gcc/expr.c b/gcc/expr.c index da68870..c2f535f 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target); return target; + case SEXT_EXPR: + { + machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1), + MODE_INT, 0); + rtx temp, result; + rtx op0 = expand_normal (treeop0); + op0 = force_reg (mode, op0); + if (mode != inner_mode) + { + result = gen_reg_rtx (mode); + temp = simplify_gen_unary (SIGN_EXTEND, mode, + gen_lowpart_SUBREG (inner_mode, op0), + inner_mode); + convert_move (result, temp, 0); + } + else + result = op0; + return result; + } + default: gcc_unreachable 
(); } diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 602ea24..a149bad 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2, res = wi::bit_and (arg1, arg2); break; + case SEXT_EXPR: + res = wi::sext (arg1, arg2.to_uhwi ()); + break; + case RSHIFT_EXPR: case LSHIFT_EXPR: if (wi::neg_p (arg2)) diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c index 8e3e810..d18b3f7 100644 --- a/gcc/tree-cfg.c +++ b/gcc/tree-cfg.c @@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt) return false; } + case SEXT_EXPR: + { + if (!INTEGRAL_TYPE_P (lhs_type) + || !useless_type_conversion_p (lhs_type, rhs1_type) + || !tree_fits_uhwi_p (rhs2)) + { + error ("invalid operands in sext expr"); + return true; + } + return false; + } + case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: { diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index b8269ef..e61c200 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case BIT_XOR_EXPR: case BIT_AND_EXPR: case BIT_NOT_EXPR: + case SEXT_EXPR: case TRUTH_ANDIF_EXPR: case TRUTH_ORIF_EXPR: diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c index 11f90051..bec9082 100644 --- a/gcc/tree-pretty-print.c +++ b/gcc/tree-pretty-print.c @@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags, } break; + case SEXT_EXPR: + pp_string (pp, "SEXT_EXPR <"); + dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false); + pp_string (pp, ", "); + dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false); + pp_greater (pp); + break; + case MODIFY_EXPR: case INIT_EXPR: dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, @@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code) case MIN_EXPR: return "min"; + case SEXT_EXPR: + return "sext"; + default: return "<<< ??? 
>>>"; } diff --git a/gcc/tree.def b/gcc/tree.def index d0a3bd6..789cfdd 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2) DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2) DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1) +/* Sign-extend operation. It will sign extend first operand from + the sign bit specified by the second operand. The type of the + result is that of the first operand. */ +DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2) + /* ANDIF and ORIF allow the second operand not to be computed if the value of the expression is determined from the first operand. AND, OR, and XOR always compute the second operand whether its value is -- 1.9.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext 2015-11-10 14:13 ` Richard Biener 2015-11-12 6:08 ` Kugan @ 2015-11-14 1:15 ` Kugan 2015-11-18 14:04 ` Richard Biener 1 sibling, 1 reply; 28+ messages in thread From: Kugan @ 2015-11-14 1:15 UTC (permalink / raw) To: Richard Biener; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 5293 bytes --] Attached is the latest version of the patch. With the patches 0001-Add-new-SEXT_EXPR-tree-code.patch, 0002-Add-type-promotion-pass.patch and 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch. I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and x64-64-linux-gnu and regression testing on ppc64-linux-gnu, aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three issues in ppc64-linux-gnu regression testing. There are some other test cases which needs adjustment for scanning for some patterns that are not valid now. 1. rtl fwprop was going into infinite loop. Works with the following patch: diff --git a/gcc/fwprop.c b/gcc/fwprop.c index 16c7981..9cf4f43 100644 --- a/gcc/fwprop.c +++ b/gcc/fwprop.c @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx new_rtx, rtx_insn *def_insn, int old_cost = 0; bool ok; + /* Value to be substituted is the same, nothing to do. */ + if (rtx_equal_p (*loc, new_rtx)) + return false; + update_df_init (def_insn, insn); /* forward_propagate_subreg may be operating on an instruction with 2. gcc.dg/torture/ftrapv-1.c fails This is because we are checking for the SImode trapping. With the promotion of the operation to wider mode, this is i think expected. I think the testcase needs updating. 3. gcc.dg/sms-3.c fails It fails with -fmodulo-sched-allow-regmoves and OK when I remove it. I am looking into it. I also have the following issues based on the previous review (as posted in the previous patch). Copying again for the review purpose. 1. > you still call promote_ssa on both DEFs and USEs and promote_ssa looks > at SSA_NAME_DEF_STMT of the passed arg. 
> Please call promote_ssa just
> on DEFs and fixup_uses on USEs.

I am doing this to promote SSA names that are defined with GIMPLE_NOP.
Is there any way to iterate over these? I have added a gcc_assert to
make sure that promote_ssa is called only once.

2.
> Instead of this you should, in promote_all_stmts, walk over all uses doing what
> fixup_uses does and then walk over all defs, doing what promote_ssa does.
>
> +    case GIMPLE_NOP:
> +      {
> +        if (SSA_NAME_VAR (def) == NULL)
> +          {
> +            /* Promote def by fixing its type for anonymous def.  */
> +            TREE_TYPE (def) = promoted_type;
> +          }
> +        else
> +          {
> +            /* Create a promoted copy of parameters.  */
> +            bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>
> I think the uninitialized vars are somewhat tricky and it would be best
> to create a new uninit anonymous SSA name for them.  You can
> have SSA_NAME_VAR != NULL and def _not_ being a parameter
> btw.

I experimented with get_or_create_default_def. Here we have to have an
SSA_NAME_VAR (def) of the promoted type. In the attached patch I am
doing the following, and it seems to work. Does this look OK?

+        }
+      else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
+        {
+          tree var = copy_node (SSA_NAME_VAR (def));
+          TREE_TYPE (var) = promoted_type;
+          TREE_TYPE (def) = promoted_type;
+          SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
+        }

I prefer to promote the def, as otherwise iterating over the uses and
promoting can get complicated (we would have to look at all the
different kinds of stmts again and do the right thing, as in the
earlier version of this before we moved to this approach).

3)
> you can also transparently handle constants for the cases where promoting
> is required.  At the moment their handling is interwinded with the def promotion
> code.  That makes the whole thing hard to follow.

I have updated the comments with:

+/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
+   promote only the constants in conditions part of the COND_EXPR.
+
+   We promote the constants when the associated operands are promoted.
+   This usually means that we promote the constants when we promote the
+   defining stmts (as part of promote_ssa).  However, for COND_EXPR, we
+   can promote only when we promote the other operand.  Therefore, this
+   is done during fixup_use.  */

4) I am handling gimple_debug separately to avoid any code differences
with and without the -g option. I have updated the comments for this.

5) I also noticed that tree-ssa-uninit sometimes gives false positives
due to the assumptions it makes. Is it OK to move this pass before type
promotion? I can do the testing and post a separate patch with this if
that is OK.

6) I also removed the optimization that prevents some of the redundant
truncations/extensions from the type promotion pass, as it doesn't do
much as of now. I can send a proper follow-up patch. Is that OK?

I also did a simple test with coremark for the latest patch. I compared
the code size of coremark for linux-gcc with -Os. Results are as
reported by the "size" utility. I know this doesn't mean much, but it
can give some indication.

          Base     with pass   Percentage improvement
==============================================================
arm       10476    10372       0.9927453226
aarch64    9545     9521       0.2514405448
ppc64     12236    12052       1.5037593985

After resolving the above issues, I would like to propose that we
commit the pass as not enabled by default (even though the patch as it
stands is enabled by default - I am doing it for testing purposes).
Thanks, Kugan [-- Attachment #2: 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch --] [-- Type: text/x-diff, Size: 3609 bytes --] From 8e71ea17eaf6f282325076f588dbdf4f53c8b865 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:53:56 +1100 Subject: [PATCH 3/5] Optimize ZEXT_EXPR with tree-vrp --- gcc/match.pd | 6 ++++++ gcc/tree-vrp.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 67 insertions(+) diff --git a/gcc/match.pd b/gcc/match.pd index 0a9598e..1b152f1 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -2585,3 +2585,9 @@ along with GCC; see the file COPYING3. If not see (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))) (op @0 (ext @1 @2))))) +(simplify + (sext (sext@2 @0 @1) @3) + (if (tree_int_cst_compare (@1, @3) <= 0) + @2 + (sext @0 @3))) + diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index fe34ffd..024c8ef 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -2241,6 +2241,7 @@ extract_range_from_binary_expr_1 (value_range *vr, && code != LSHIFT_EXPR && code != MIN_EXPR && code != MAX_EXPR + && code != SEXT_EXPR && code != BIT_AND_EXPR && code != BIT_IOR_EXPR && code != BIT_XOR_EXPR) @@ -2801,6 +2802,54 @@ extract_range_from_binary_expr_1 (value_range *vr, extract_range_from_multiplicative_op_1 (vr, code, &vr0, &vr1); return; } + else if (code == SEXT_EXPR) + { + gcc_assert (range_int_cst_p (&vr1)); + HOST_WIDE_INT prec = tree_to_uhwi (vr1.min); + type = vr0.type; + wide_int tmin, tmax; + wide_int may_be_nonzero, must_be_nonzero; + + wide_int type_min = wi::min_value (prec, SIGNED); + wide_int type_max = wi::max_value (prec, SIGNED); + type_min = wide_int_to_tree (expr_type, type_min); + type_max = wide_int_to_tree (expr_type, type_max); + type_min = wi::sext (type_min, prec); + type_max = wi::sext (type_max, prec); + wide_int sign_bit + = wi::set_bit_in_zero (prec - 1, + TYPE_PRECISION (TREE_TYPE (vr0.min))); + if (zero_nonzero_bits_from_vr (expr_type, &vr0, + 
&may_be_nonzero, + &must_be_nonzero)) + { + if (wi::bit_and (must_be_nonzero, sign_bit) == sign_bit) + { + /* If to-be-extended sign bit is one. */ + tmin = type_min; + tmax = wi::zext (may_be_nonzero, prec); + } + else if (wi::bit_and (may_be_nonzero, sign_bit) + != sign_bit) + { + /* If to-be-extended sign bit is zero. */ + tmin = wi::zext (must_be_nonzero, prec); + tmax = wi::zext (may_be_nonzero, prec); + } + else + { + tmin = type_min; + tmax = type_max; + } + } + else + { + tmin = type_min; + tmax = type_max; + } + min = wide_int_to_tree (expr_type, tmin); + max = wide_int_to_tree (expr_type, tmax); + } else if (code == RSHIFT_EXPR || code == LSHIFT_EXPR) { @@ -9166,6 +9215,17 @@ simplify_bit_ops_using_ranges (gimple_stmt_iterator *gsi, gimple *stmt) break; } break; + case SEXT_EXPR: + { + unsigned int prec = tree_to_uhwi (op1); + wide_int min = vr0.min; + wide_int max = vr0.max; + wide_int sext_min = wi::sext (min, prec); + wide_int sext_max = wi::sext (max, prec); + if (min == sext_min && max == sext_max) + op = op0; + } + break; default: gcc_unreachable (); } @@ -9868,6 +9928,7 @@ simplify_stmt_using_ranges (gimple_stmt_iterator *gsi) case BIT_AND_EXPR: case BIT_IOR_EXPR: + case SEXT_EXPR: /* Optimize away BIT_AND_EXPR and BIT_IOR_EXPR if all the bits being cleared are already cleared or all the bits being set are already set. 
*/ -- 1.9.1 [-- Attachment #3: 0002-Add-type-promotion-pass.patch --] [-- Type: text/x-diff, Size: 31165 bytes --] From 42128668393c32c3860d346ead7b3118a090ffa4 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:52:37 +1100 Subject: [PATCH 2/5] Add type promotion pass --- gcc/Makefile.in | 1 + gcc/auto-profile.c | 2 +- gcc/common.opt | 4 + gcc/doc/invoke.texi | 10 + gcc/gimple-ssa-type-promote.c | 867 ++++++++++++++++++++++++++++++++++++++++++ gcc/passes.def | 1 + gcc/timevar.def | 1 + gcc/tree-pass.h | 1 + libiberty/cp-demangle.c | 2 +- 9 files changed, 887 insertions(+), 2 deletions(-) create mode 100644 gcc/gimple-ssa-type-promote.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index b91b8dc..c6aed45 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1499,6 +1499,7 @@ OBJS = \ tree-vect-slp.o \ tree-vectorizer.o \ tree-vrp.o \ + gimple-ssa-type-promote.o \ tree.o \ valtrack.o \ value-prof.o \ diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c index 25202c5..d32c3b6 100644 --- a/gcc/auto-profile.c +++ b/gcc/auto-profile.c @@ -1266,7 +1266,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge) FOR_EACH_EDGE (e, ei, bb->succs) { unsigned i, total = 0; - edge only_one; + edge only_one = NULL; bool check_value_one = (((integer_onep (cmp_rhs)) ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR)) ^ ((e->flags & EDGE_TRUE_VALUE) != 0)); diff --git a/gcc/common.opt b/gcc/common.opt index 12ca0d6..f450428 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2404,6 +2404,10 @@ ftree-vrp Common Report Var(flag_tree_vrp) Init(0) Optimization Perform Value Range Propagation on trees. +ftree-type-promote +Common Report Var(flag_tree_type_promote) Init(1) Optimization +Perform Type Promotion on trees + funit-at-a-time Common Report Var(flag_unit_at_a_time) Init(1) Compile whole compilation unit at a time. 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index cd82544..bc059a0 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -9093,6 +9093,16 @@ enabled by default at @option{-O2} and higher. Null pointer check elimination is only done if @option{-fdelete-null-pointer-checks} is enabled. +@item -ftree-type-promote +@opindex ftree-type-promote +This pass applies type promotion to SSA names in the function and +inserts appropriate truncations to preserve the semantics. Idea of +this pass is to promote operations such a way that we can minimise +generation of subreg in RTL, that intern results in removal of +redundant zero/sign extensions. + +This optimization is enabled by default. + @item -fsplit-ivs-in-unroller @opindex fsplit-ivs-in-unroller Enables expression of values of induction variables in later iterations diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c new file mode 100644 index 0000000..735e7ee --- /dev/null +++ b/gcc/gimple-ssa-type-promote.c @@ -0,0 +1,867 @@ +/* Type promotion of SSA names to minimise redundant zero/sign extension. + Copyright (C) 2015 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. 
*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "hash-set.h" +#include "machmode.h" +#include "vec.h" +#include "double-int.h" +#include "input.h" +#include "symtab.h" +#include "wide-int.h" +#include "inchash.h" +#include "tree.h" +#include "fold-const.h" +#include "stor-layout.h" +#include "predict.h" +#include "function.h" +#include "dominance.h" +#include "cfg.h" +#include "basic-block.h" +#include "tree-ssa-alias.h" +#include "gimple-fold.h" +#include "tree-eh.h" +#include "gimple-expr.h" +#include "is-a.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-ssa.h" +#include "tree-phinodes.h" +#include "ssa-iterators.h" +#include "stringpool.h" +#include "tree-ssanames.h" +#include "tree-pass.h" +#include "gimple-pretty-print.h" +#include "langhooks.h" +#include "sbitmap.h" +#include "domwalk.h" +#include "tree-dfa.h" + +/* This pass applies type promotion to SSA names in the function and + inserts appropriate truncations. Idea of this pass is to promote operations + such a way that we can minimise generation of subreg in RTL, + that in turn results in removal of redundant zero/sign extensions. This pass + will run prior to The VRP and DOM such that they will be able to optimise + redundant truncations and extensions. This is based on the discussion from + https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html. +*/ + +/* Structure to hold the type and promoted type for promoted ssa variables. */ +struct ssa_name_info +{ + tree ssa; /* Name of the SSA_NAME. */ + tree type; /* Original type of ssa. */ + tree promoted_type; /* Promoted type of ssa. */ +}; + +/* Obstack for ssa_name_info. 
*/ +static struct obstack ssa_name_info_obstack; + +static unsigned n_ssa_val; +static sbitmap ssa_to_be_promoted_bitmap; +static hash_map <tree, ssa_name_info *> *ssa_name_info_map; + +static bool +type_precision_ok (tree type) +{ + return (TYPE_PRECISION (type) + == GET_MODE_PRECISION (TYPE_MODE (type))); +} + +/* Return the promoted type for TYPE. */ +static tree +get_promoted_type (tree type) +{ + tree promoted_type; + enum machine_mode mode; + int uns; + + if (POINTER_TYPE_P (type) + || !INTEGRAL_TYPE_P (type) + || !type_precision_ok (type)) + return type; + + mode = TYPE_MODE (type); +#ifdef PROMOTE_MODE + uns = TYPE_SIGN (type); + PROMOTE_MODE (mode, uns, type); +#endif + uns = TYPE_SIGN (type); + if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode)) + return type; + promoted_type + = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), + uns); + gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode)); + return promoted_type; +} + +/* Return true if ssa NAME is already considered for promotion. */ +static bool +ssa_promoted_p (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + return bitmap_bit_p (ssa_to_be_promoted_bitmap, index); + } + return true; +} + +/* Set ssa NAME to be already considered for promotion. */ +static void +set_ssa_promoted (tree name) +{ + if (TREE_CODE (name) == SSA_NAME) + { + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + bitmap_set_bit (ssa_to_be_promoted_bitmap, index); + } +} + +/* Return true if LHS will be promoted later. 
*/ +static bool +tobe_promoted_p (tree lhs) +{ + if (TREE_CODE (lhs) == SSA_NAME + && !POINTER_TYPE_P (TREE_TYPE (lhs)) + && INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + && !VECTOR_TYPE_P (TREE_TYPE (lhs)) + && !ssa_promoted_p (lhs) + && (get_promoted_type (TREE_TYPE (lhs)) + != TREE_TYPE (lhs))) + return true; + else + return false; +} + +/* Return true if the tree CODE needs the propmoted operand to be + truncated (when stray bits are set beyond the original type in + promoted mode) to preserve the semantics. */ +static bool +truncate_use_p (enum tree_code code) +{ + if (code == TRUNC_DIV_EXPR + || code == CEIL_DIV_EXPR + || code == FLOOR_DIV_EXPR + || code == ROUND_DIV_EXPR + || code == TRUNC_MOD_EXPR + || code == CEIL_MOD_EXPR + || code == FLOOR_MOD_EXPR + || code == ROUND_MOD_EXPR + || code == LSHIFT_EXPR + || code == RSHIFT_EXPR + || code == MAX_EXPR + || code == MIN_EXPR) + return true; + else + return false; +} + +/* Convert constant CST to TYPE. */ +static tree +convert_int_cst (tree type, tree cst, signop sign = SIGNED) +{ + wide_int wi_cons = fold_convert (type, cst); + wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign); + return wide_int_to_tree (type, wi_cons); +} + +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, + promote only the constants in conditions part of the COND_EXPR. + + We promote the constants when the associated operands are promoted. + This usually means that we promote the constants when we promote the + defining stmnts (as part of promote_ssa). However for COND_EXPR, we + can promote only when we promote the other operand. Therefore, this + is done during fixup_use. 
*/ + +static void +promote_cst_in_stmt (gimple *stmt, tree type, bool promote_cond = false) +{ + tree op; + ssa_op_iter iter; + use_operand_p oprnd; + int index; + tree op0, op1; + signop sign = SIGNED; + + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + if (promote_cond + && gimple_assign_rhs_code (stmt) == COND_EXPR) + { + /* Promote INTEGER_CST that are tcc_compare arguments. */ + sign = TYPE_SIGN (type); + op = gimple_assign_rhs1 (stmt); + op0 = TREE_OPERAND (op, 0); + op1 = TREE_OPERAND (op, 1); + if (TREE_CODE (op0) == INTEGER_CST) + op0 = convert_int_cst (type, op0, sign); + if (TREE_CODE (op1) == INTEGER_CST) + op1 = convert_int_cst (type, op1, sign); + tree new_op = build2 (TREE_CODE (op), type, op0, op1); + gimple_assign_set_rhs1 (stmt, new_op); + } + else + { + /* Promote INTEGER_CST in GIMPLE_ASSIGN. */ + op = gimple_assign_rhs3 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs3 (stmt, convert_int_cst (type, op, sign)); + if ((TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) + == tcc_comparison) + || truncate_use_p (gimple_assign_rhs_code (stmt))) + sign = TYPE_SIGN (type); + op = gimple_assign_rhs1 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs1 (stmt, convert_int_cst (type, op, sign)); + op = gimple_assign_rhs2 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs2 (stmt, convert_int_cst (type, op, sign)); + } + break; + + case GIMPLE_PHI: + { + /* Promote INTEGER_CST arguments to GIMPLE_PHI. */ + gphi *phi = as_a <gphi *> (stmt); + FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE) + { + op = USE_FROM_PTR (oprnd); + index = PHI_ARG_INDEX_FROM_USE (oprnd); + if (TREE_CODE (op) == INTEGER_CST) + SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign)); + } + } + break; + + case GIMPLE_COND: + { + /* Promote INTEGER_CST that are GIMPLE_COND arguments. 
*/ + gcond *cond = as_a <gcond *> (stmt); + sign = TYPE_SIGN (type); + op = gimple_cond_lhs (cond); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_lhs (cond, convert_int_cst (type, op, sign)); + + op = gimple_cond_rhs (cond); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_rhs (cond, convert_int_cst (type, op, sign)); + } + break; + + default: + gcc_unreachable (); + } +} + +/* Create an ssa with TYPE to copy ssa VAR. */ +static tree +make_promoted_copy (tree var, gimple *def_stmt, tree type) +{ + tree new_lhs = make_ssa_name (type, def_stmt); + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var)) + SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1; + return new_lhs; +} + +/* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits. + Assign the zero/sign extended value in NEW_VAR. gimple statement + that performs the zero/sign extension is returned. */ +static gimple * +zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width) +{ + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) + == TYPE_PRECISION (TREE_TYPE (new_var))); + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > width); + gimple *stmt; + + if (unsigned_p) + { + /* Zero extend. */ + tree cst + = wide_int_to_tree (TREE_TYPE (var), + wi::mask (width, false, + TYPE_PRECISION (TREE_TYPE (var)))); + stmt = gimple_build_assign (new_var, BIT_AND_EXPR, + var, cst); + } + else + /* Sign extend. */ + stmt = gimple_build_assign (new_var, + SEXT_EXPR, + var, build_int_cst (TREE_TYPE (var), width)); + return stmt; +} + + +static void +copy_default_ssa (tree to, tree from) +{ + SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from)); + SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from); + SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE); + SSA_NAME_IS_DEFAULT_DEF (to) = 1; + SSA_NAME_IS_DEFAULT_DEF (from) = 0; +} + +/* Promote definition DEF to PROMOTED_TYPE. If the stmt that defines def + is def_stmt, make the type of def promoted_type. 
If the stmt is such + that, result of the def_stmt cannot be of promoted_type, create a new_def + of the original_type and make the def_stmt assign its value to newdef. + Then, create a NOP_EXPR to convert new_def to def of promoted type. + + For example, for stmt with original_type char and promoted_type int: + char _1 = mem; + becomes: + char _2 = mem; + int _1 = (int)_2; + + If the def_stmt allows def to be promoted, promote def in-place + (and its arguments when needed). + + For example: + char _3 = _1 + _2; + becomes: + int _3 = _1 + _2; + Here, _1 and _2 will also be promoted. */ + +static void +promote_ssa (tree def, gimple_stmt_iterator *gsi) +{ + gimple *def_stmt = SSA_NAME_DEF_STMT (def); + gimple *copy_stmt = NULL; + basic_block bb; + gimple_stmt_iterator gsi2; + tree original_type = TREE_TYPE (def); + tree new_def; + ssa_name_info *info; + bool do_not_promote = false; + tree promoted_type = get_promoted_type (TREE_TYPE (def)); + + if (!tobe_promoted_p (def)) + return; + + info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack, + sizeof (ssa_name_info)); + info->type = original_type; + info->promoted_type = promoted_type; + info->ssa = def; + gcc_assert (!ssa_name_info_map->get_or_insert (def)); + ssa_name_info_map->put (def, info); + + switch (gimple_code (def_stmt)) + { + case GIMPLE_PHI: + { + /* Promote def by fixing its type and make def anonymous. */ + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + break; + } + + case GIMPLE_ASM: + { + gasm *asm_stmt = as_a <gasm *> (def_stmt); + for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i) + { + /* Promote def and copy (i.e. convert) the value defined + by asm to def. 
*/ + tree link = gimple_asm_output_op (asm_stmt, i); + tree op = TREE_VALUE (link); + if (op == def) + { + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + copy_default_ssa (new_def, def); + TREE_VALUE (link) = new_def; + gimple_asm_set_output_op (asm_stmt, i, link); + + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, NOP_EXPR, + new_def, NULL_TREE); + SSA_NAME_IS_DEFAULT_DEF (new_def) = 0; + gsi2 = gsi_for_stmt (def_stmt); + gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT); + break; + } + } + break; + } + + case GIMPLE_NOP: + { + if (SSA_NAME_VAR (def) == NULL) + { + /* Promote def by fixing its type for anonymous def. */ + TREE_TYPE (def) = promoted_type; + } + else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL) + { + tree var = copy_node (SSA_NAME_VAR (def)); + TREE_TYPE (var) = promoted_type; + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); + } + else + { + /* Create a promoted copy of parameters. */ + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + gcc_assert (bb); + gsi2 = gsi_after_labels (bb); + /* Create new_def of the original type and set that to be the + parameter. */ + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + set_ssa_default_def (cfun, SSA_NAME_VAR (def), new_def); + copy_default_ssa (new_def, def); + + /* Now promote the def and copy the value from parameter. 
*/ + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, NOP_EXPR, + new_def, NULL_TREE); + SSA_NAME_DEF_STMT (def) = copy_stmt; + gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT); + } + break; + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (def_stmt); + tree rhs = gimple_assign_rhs1 (def_stmt); + if (gimple_vuse (def_stmt) != NULL_TREE + || gimple_vdef (def_stmt) != NULL_TREE + || TREE_CODE_CLASS (code) == tcc_reference + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == VIEW_CONVERT_EXPR + || code == REALPART_EXPR + || code == IMAGPART_EXPR + || code == REDUC_PLUS_EXPR + || code == REDUC_MAX_EXPR + || code == REDUC_MIN_EXPR + || !INTEGRAL_TYPE_P (TREE_TYPE (rhs))) + { + do_not_promote = true; + } + else if (CONVERT_EXPR_CODE_P (code)) + { + if (!type_precision_ok (TREE_TYPE (rhs))) + { + do_not_promote = true; + } + else if (types_compatible_p (TREE_TYPE (rhs), promoted_type)) + { + /* As we travel statements in dominated order, arguments + of def_stmt will be visited before visiting def. If RHS + is already promoted and type is compatible, we can convert + them into ZERO/SIGN EXTEND stmt. 
*/ + ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs); + tree type; + if (info == NULL) + type = TREE_TYPE (rhs); + else + type = info->type; + if ((TYPE_PRECISION (original_type) + > TYPE_PRECISION (type)) + || (TYPE_UNSIGNED (original_type) + != TYPE_UNSIGNED (type))) + { + if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type)) + type = original_type; + gcc_assert (type != NULL_TREE); + TREE_TYPE (def) = promoted_type; + gimple *copy_stmt = + zero_sign_extend_stmt (def, rhs, + TYPE_UNSIGNED (type), + TYPE_PRECISION (type)); + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + gsi_replace (gsi, copy_stmt, false); + } + else + { + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + } + } + else + { + /* If RHS is not promoted OR their types are not + compatible, create NOP_EXPR that converts + RHS to promoted DEF type and perform a + ZERO/SIGN EXTEND to get the required value + from RHS. */ + ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs); + if (info != NULL) + { + tree type = info->type; + new_def = copy_ssa_name (rhs); + SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE); + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + gimple *copy_stmt = + zero_sign_extend_stmt (new_def, rhs, + TYPE_UNSIGNED (type), + TYPE_PRECISION (type)); + gsi2 = gsi_for_stmt (def_stmt); + gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT); + gassign *new_def_stmt = gimple_build_assign (def, code, + new_def, NULL_TREE); + gsi_replace (gsi, new_def_stmt, false); + } + else + { + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + } + } + } + else + { + /* Promote def by fixing its type and make def anonymous. 
*/ + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + TREE_TYPE (def) = promoted_type; + } + break; + } + + default: + do_not_promote = true; + break; + } + + if (do_not_promote) + { + /* Promote def and copy (i.e. convert) the value defined + by the stmt that cannot be promoted. */ + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + TREE_TYPE (def) = promoted_type; + gimple_set_lhs (def_stmt, new_def); + copy_stmt = gimple_build_assign (def, NOP_EXPR, + new_def, NULL_TREE); + gsi2 = gsi_for_stmt (def_stmt); + if (lookup_stmt_eh_lp (def_stmt) > 0 + || (gimple_code (def_stmt) == GIMPLE_CALL + && gimple_call_ctrl_altering_p (def_stmt))) + gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)), + copy_stmt); + else + gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT); + } + reset_flow_sensitive_info (def); +} + +/* Fix the (promoted) USE in stmts where USE cannot be be promoted. */ +static unsigned int +fixup_use (gimple *stmt, gimple_stmt_iterator *gsi, + use_operand_p op, tree use) +{ + ssa_name_info *info = ssa_name_info_map->get_or_insert (use); + /* If USE is not promoted, nothing to do. */ + if (!info) + return 0; + + tree promoted_type = info->promoted_type; + tree old_type = info->type; + bool do_not_promote = false; + + switch (gimple_code (stmt)) + { + case GIMPLE_DEBUG: + { + SET_USE (op, fold_convert (old_type, use)); + update_stmt (stmt); + break; + } + + case GIMPLE_ASM: + case GIMPLE_CALL: + case GIMPLE_RETURN: + { + /* USE cannot be promoted here. 
*/ + do_not_promote = true; + break; + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (stmt); + tree lhs = gimple_assign_lhs (stmt); + if (gimple_vuse (stmt) != NULL_TREE + || gimple_vdef (stmt) != NULL_TREE + || code == VIEW_CONVERT_EXPR + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == CONSTRUCTOR + || code == BIT_FIELD_REF + || code == COMPLEX_EXPR + || VECTOR_TYPE_P (TREE_TYPE (lhs))) + { + do_not_promote = true; + } + else if (TREE_CODE_CLASS (code) == tcc_comparison + || truncate_use_p (code)) + { + /* Promote the constant in comparison when other comparison + operand is promoted. All other constants are promoted as + part of promoting definition in promote_ssa. */ + if (TREE_CODE_CLASS (code) == tcc_comparison) + promote_cst_in_stmt (stmt, promoted_type, true); + /* In some stmts, value in USE has to be ZERO/SIGN + Extended based on the original type for correct + result. */ + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_UNSIGNED (old_type), + TYPE_PRECISION (old_type)); + gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT); + + SET_USE (op, temp); + update_stmt (stmt); + } + else if (CONVERT_EXPR_CODE_P (code)) + { + if (types_compatible_p (TREE_TYPE (lhs), promoted_type)) + { + /* Type of LHS and promoted RHS are compatible, we can + convert this into ZERO/SIGN EXTEND stmt. 
*/ + gimple *copy_stmt = + zero_sign_extend_stmt (lhs, use, + TYPE_UNSIGNED (old_type), + TYPE_PRECISION (old_type)); + set_ssa_promoted (lhs); + gsi_replace (gsi, copy_stmt, false); + } + else if (!tobe_promoted_p (lhs) + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + || (TYPE_UNSIGNED (TREE_TYPE (use)) != TYPE_UNSIGNED (TREE_TYPE (lhs)))) + { + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_UNSIGNED (old_type), + TYPE_PRECISION (old_type)); + gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT); + SET_USE (op, temp); + update_stmt (stmt); + } + } + break; + } + + case GIMPLE_COND: + { + /* In GIMPLE_COND, value in USE has to be ZERO/SIGN + Extended based on the original type for correct + result. */ + tree temp = make_promoted_copy (use, NULL, TREE_TYPE (use)); + gimple *copy_stmt = + zero_sign_extend_stmt (temp, use, + TYPE_UNSIGNED (old_type), + TYPE_PRECISION (old_type)); + gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT); + SET_USE (op, temp); + promote_cst_in_stmt (stmt, promoted_type); + update_stmt (stmt); + break; + } + + default: + break; + } + + if (do_not_promote) + { + /* FOR stmts where USE cannot be promoted, create an + original type copy. */ + tree temp; + temp = copy_ssa_name (use); + SET_SSA_NAME_VAR_OR_IDENTIFIER (temp, NULL_TREE); + set_ssa_promoted (temp); + TREE_TYPE (temp) = old_type; + gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR, + use, NULL_TREE); + gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT); + SET_USE (op, temp); + update_stmt (stmt); + } + return 0; +} + + +/* Promote all the stmts in the basic block. 
*/ +static void +promote_all_stmts (basic_block bb) +{ + gimple_stmt_iterator gsi; + ssa_op_iter iter; + tree def, use; + use_operand_p op; + + for (gphi_iterator gpi = gsi_start_phis (bb); + !gsi_end_p (gpi); gsi_next (&gpi)) + { + gphi *phi = gpi.phi (); + FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE) + { + use = USE_FROM_PTR (op); + if (TREE_CODE (use) == SSA_NAME + && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) + promote_ssa (use, &gsi); + fixup_use (phi, &gsi, op, use); + } + + def = PHI_RESULT (phi); + promote_ssa (def, &gsi); + } + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + if (is_gimple_debug (stmt)) + continue; + + FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE) + { + use = USE_FROM_PTR (op); + if (TREE_CODE (use) == SSA_NAME + && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) + promote_ssa (use, &gsi); + fixup_use (stmt, &gsi, op, use); + } + + FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF) + promote_ssa (def, &gsi); + } +} + +/* Promote use in GIMPLE_DEBUG stmts. Do this separately to avoid generating + different sequence with and without -g. This can happen when promoting + SSA that are defined with GIMPLE_NOP. */ +static void +promote_debug_stmts () +{ + basic_block bb; + gimple_stmt_iterator gsi; + ssa_op_iter iter; + tree use; + use_operand_p op; + + FOR_EACH_BB_FN (bb, cfun) + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + if (!is_gimple_debug (stmt)) + continue; + FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE) + { + use = USE_FROM_PTR (op); + fixup_use (stmt, &gsi, op, use); + } + } +} + + +class type_promotion_dom_walker : public dom_walker +{ +public: + type_promotion_dom_walker (cdi_direction direction) + : dom_walker (direction) {} + virtual void before_dom_children (basic_block bb) + { + promote_all_stmts (bb); + } +}; + +/* Main entry point to the pass. 
*/ +static unsigned int +execute_type_promotion (void) +{ + n_ssa_val = num_ssa_names; + ssa_name_info_map = new hash_map<tree, ssa_name_info *>; + ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val); + bitmap_clear (ssa_to_be_promoted_bitmap); + + /* Create the obstack where ssa_name_info will reside. */ + gcc_obstack_init (&ssa_name_info_obstack); + + calculate_dominance_info (CDI_DOMINATORS); + /* Walk the CFG in dominator order. */ + type_promotion_dom_walker (CDI_DOMINATORS) + .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + promote_debug_stmts (); + gsi_commit_edge_inserts (); + + obstack_free (&ssa_name_info_obstack, NULL); + sbitmap_free (ssa_to_be_promoted_bitmap); + delete ssa_name_info_map; + return 0; +} + +namespace { +const pass_data pass_data_type_promotion = +{ + GIMPLE_PASS, /* type */ + "promotion", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_TREE_TYPE_PROMOTE, /* tv_id */ + PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), +}; + +class pass_type_promotion : public gimple_opt_pass +{ +public: + pass_type_promotion (gcc::context *ctxt) + : gimple_opt_pass (pass_data_type_promotion, ctxt) + {} + + /* opt_pass methods: */ + opt_pass * clone () { return new pass_type_promotion (m_ctxt); } + virtual bool gate (function *) { return flag_tree_type_promote != 0; } + virtual unsigned int execute (function *) + { + return execute_type_promotion (); + } + +}; // class pass_type_promotion + +} // anon namespace + +gimple_opt_pass * +make_pass_type_promote (gcc::context *ctxt) +{ + return new pass_type_promotion (ctxt); +} + diff --git a/gcc/passes.def b/gcc/passes.def index 36d2b3b..78c463a 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -272,6 +272,7 @@ along with GCC; see the file COPYING3. 
If not see POP_INSERT_PASSES () NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_type_promote); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_strength_reduction); diff --git a/gcc/timevar.def b/gcc/timevar.def index b429faf..a8d40c3 100644 --- a/gcc/timevar.def +++ b/gcc/timevar.def @@ -278,6 +278,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION , "vtable verification") DEFTIMEVAR (TV_TREE_UBSAN , "tree ubsan") DEFTIMEVAR (TV_INITIALIZE_RTL , "initialize rtl") DEFTIMEVAR (TV_GIMPLE_LADDRESS , "address lowering") +DEFTIMEVAR (TV_TREE_TYPE_PROMOTE , "tree type promote") /* Everything else in rest_of_compilation not included above. */ DEFTIMEVAR (TV_EARLY_LOCAL , "early local passes") diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 333b5a7..449dd19 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -436,6 +436,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt); extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt); extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt); extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt); diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c index ff608a3..6722331 100644 --- a/libiberty/cp-demangle.c +++ b/libiberty/cp-demangle.c @@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options, /* Variable used to store the current templates while a previously captured scope is used. */ - struct d_print_template *saved_templates; + struct d_print_template *saved_templates = NULL; /* Nonzero if templates have been stored in the above variable. 
*/ int need_template_restore = 0; -- 1.9.1 [-- Attachment #4: 0001-Add-new-SEXT_EXPR-tree-code.patch --] [-- Type: text/x-diff, Size: 5067 bytes --] From c0ce364e3a422912a08189645efde46c36583753 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Thu, 22 Oct 2015 10:51:42 +1100 Subject: [PATCH 1/5] Add new SEXT_EXPR tree code --- gcc/cfgexpand.c | 12 ++++++++++++ gcc/expr.c | 20 ++++++++++++++++++++ gcc/fold-const.c | 4 ++++ gcc/tree-cfg.c | 12 ++++++++++++ gcc/tree-inline.c | 1 + gcc/tree-pretty-print.c | 11 +++++++++++ gcc/tree.def | 5 +++++ 7 files changed, 65 insertions(+) diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index eaad859..aeb64bb 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -5054,6 +5054,18 @@ expand_debug_expr (tree exp) case FMA_EXPR: return simplify_gen_ternary (FMA, mode, inner_mode, op0, op1, op2); + case SEXT_EXPR: + gcc_assert (CONST_INT_P (op1)); + inner_mode = mode_for_size (INTVAL (op1), MODE_INT, 0); + gcc_assert (GET_MODE_BITSIZE (inner_mode) == INTVAL (op1)); + + if (mode != inner_mode) + op0 = simplify_gen_unary (SIGN_EXTEND, + mode, + gen_lowpart_SUBREG (inner_mode, op0), + inner_mode); + return op0; + default: flag_unsupported: #ifdef ENABLE_CHECKING diff --git a/gcc/expr.c b/gcc/expr.c index da68870..c2f535f 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -9318,6 +9318,26 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target); return target; + case SEXT_EXPR: + { + machine_mode inner_mode = mode_for_size (tree_to_uhwi (treeop1), + MODE_INT, 0); + rtx temp, result; + rtx op0 = expand_normal (treeop0); + op0 = force_reg (mode, op0); + if (mode != inner_mode) + { + result = gen_reg_rtx (mode); + temp = simplify_gen_unary (SIGN_EXTEND, mode, + gen_lowpart_SUBREG (inner_mode, op0), + inner_mode); + convert_move (result, temp, 0); + } + else + result = op0; + return result; + } + default: gcc_unreachable (); 
} diff --git a/gcc/fold-const.c b/gcc/fold-const.c index 602ea24..a149bad 100644 --- a/gcc/fold-const.c +++ b/gcc/fold-const.c @@ -987,6 +987,10 @@ int_const_binop_1 (enum tree_code code, const_tree arg1, const_tree parg2, res = wi::bit_and (arg1, arg2); break; + case SEXT_EXPR: + res = wi::sext (arg1, arg2.to_uhwi ()); + break; + case RSHIFT_EXPR: case LSHIFT_EXPR: if (wi::neg_p (arg2)) diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c index 8e3e810..d18b3f7 100644 --- a/gcc/tree-cfg.c +++ b/gcc/tree-cfg.c @@ -3752,6 +3752,18 @@ verify_gimple_assign_binary (gassign *stmt) return false; } + case SEXT_EXPR: + { + if (!INTEGRAL_TYPE_P (lhs_type) + || !useless_type_conversion_p (lhs_type, rhs1_type) + || !tree_fits_uhwi_p (rhs2)) + { + error ("invalid operands in sext expr"); + return true; + } + return false; + } + case VEC_WIDEN_LSHIFT_HI_EXPR: case VEC_WIDEN_LSHIFT_LO_EXPR: { diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index b8269ef..e61c200 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -3893,6 +3893,7 @@ estimate_operator_cost (enum tree_code code, eni_weights *weights, case BIT_XOR_EXPR: case BIT_AND_EXPR: case BIT_NOT_EXPR: + case SEXT_EXPR: case TRUTH_ANDIF_EXPR: case TRUTH_ORIF_EXPR: diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c index 11f90051..bec9082 100644 --- a/gcc/tree-pretty-print.c +++ b/gcc/tree-pretty-print.c @@ -1923,6 +1923,14 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, int flags, } break; + case SEXT_EXPR: + pp_string (pp, "SEXT_EXPR <"); + dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, false); + pp_string (pp, ", "); + dump_generic_node (pp, TREE_OPERAND (node, 1), spc, flags, false); + pp_greater (pp); + break; + case MODIFY_EXPR: case INIT_EXPR: dump_generic_node (pp, TREE_OPERAND (node, 0), spc, flags, @@ -3561,6 +3569,9 @@ op_symbol_code (enum tree_code code) case MIN_EXPR: return "min"; + case SEXT_EXPR: + return "sext"; + default: return "<<< ??? 
>>>";
 }

diff --git a/gcc/tree.def b/gcc/tree.def
index d0a3bd6..789cfdd 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -760,6 +760,11 @@ DEFTREECODE (BIT_XOR_EXPR, "bit_xor_expr", tcc_binary, 2)
 DEFTREECODE (BIT_AND_EXPR, "bit_and_expr", tcc_binary, 2)
 DEFTREECODE (BIT_NOT_EXPR, "bit_not_expr", tcc_unary, 1)

+/* Sign-extend operation.  It will sign extend first operand from
+   the sign bit specified by the second operand.  The type of the
+   result is that of the first operand.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)
+
 /* ANDIF and ORIF allow the second operand not to be computed if the
    value of the expression is determined from the first operand.  AND,
    OR, and XOR always compute the second operand whether its value is
--
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread
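The SEXT_EXPR semantics introduced in the patch above (sign-extend the first operand from the sign-bit position named by the second operand; the result keeps the type of the first operand) can be illustrated outside the compiler. This is a standalone sketch of the same operation, not GCC code; `sext64` is an invented name for the example:

```c
#include <stdint.h>

/* Sign-extend the low WIDTH bits of X to a full 64-bit value,
   mirroring what SEXT_EXPR <x, width> computes and what wi::sext
   does when folding constants.  Valid for 1 <= width <= 64.  */
static inline int64_t
sext64 (uint64_t x, unsigned width)
{
  uint64_t sign = UINT64_C (1) << (width - 1); /* sign bit of the narrow type */
  x &= (sign << 1) - 1;                        /* drop stray high bits */
  return (int64_t) ((x ^ sign) - sign);        /* flip-and-subtract trick */
}
```

For instance, sext64 (0xff, 8) yields -1, matching a QImode value of -1 promoted to a wider mode.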
* Re: [0/7] Type promotion pass and elimination of zext/sext 2015-11-14 1:15 ` Kugan @ 2015-11-18 14:04 ` Richard Biener 2015-11-18 15:06 ` Richard Biener 0 siblings, 1 reply; 28+ messages in thread From: Richard Biener @ 2015-11-18 14:04 UTC (permalink / raw) To: Kugan; +Cc: gcc-patches On Sat, Nov 14, 2015 at 2:15 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote: > > Attached is the latest version of the patch. With the patches > 0001-Add-new-SEXT_EXPR-tree-code.patch, > 0002-Add-type-promotion-pass.patch and > 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch. > > I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and > x64-64-linux-gnu and regression testing on ppc64-linux-gnu, > aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three > issues in ppc64-linux-gnu regression testing. There are some other test > cases which needs adjustment for scanning for some patterns that are not > valid now. > > 1. rtl fwprop was going into infinite loop. Works with the following patch: > diff --git a/gcc/fwprop.c b/gcc/fwprop.c > index 16c7981..9cf4f43 100644 > --- a/gcc/fwprop.c > +++ b/gcc/fwprop.c > @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx > new_rtx, rtx_insn *def_insn, > int old_cost = 0; > bool ok; > > + /* Value to be substituted is the same, nothing to do. */ > + if (rtx_equal_p (*loc, new_rtx)) > + return false; > + > update_df_init (def_insn, insn); > > /* forward_propagate_subreg may be operating on an instruction with Which testcase was this on? > 2. gcc.dg/torture/ftrapv-1.c fails > This is because we are checking for the SImode trapping. With the > promotion of the operation to wider mode, this is i think expected. I > think the testcase needs updating. No, it is not expected. As said earlier you need to refrain from promoting integer operations that trap. You can use ! operation_no_trapping_overflow for this. > 3. gcc.dg/sms-3.c fails > It fails with -fmodulo-sched-allow-regmoves and OK when I remove it. 
I > am looking into it. > > > I also have the following issues based on the previous review (as posted > in the previous patch). Copying again for the review purpose. > > 1. >> you still call promote_ssa on both DEFs and USEs and promote_ssa looks >> at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just >> on DEFs and fixup_uses on USEs. > > I am doing this to promote SSA that are defined with GIMPLE_NOP. Is > there anyway to iterate over this. I have added gcc_assert to make sure > that promote_ssa is called only once. gcc_assert (!ssa_name_info_map->get_or_insert (def)); with --disable-checking this will be compiled away so you need to do the assert in a separate statement. > 2. >> Instead of this you should, in promote_all_stmts, walk over all uses > doing what >> fixup_uses does and then walk over all defs, doing what promote_ssa does. >> >> + case GIMPLE_NOP: >> + { >> + if (SSA_NAME_VAR (def) == NULL) >> + { >> + /* Promote def by fixing its type for anonymous def. */ >> + TREE_TYPE (def) = promoted_type; >> + } >> + else >> + { >> + /* Create a promoted copy of parameters. */ >> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); >> >> I think the uninitialized vars are somewhat tricky and it would be best >> to create a new uninit anonymous SSA name for them. You can >> have SSA_NAME_VAR != NULL and def _not_ being a parameter >> btw. > > I experimented with get_or_create_default_def. Here we have to have a > SSA_NAME_VAR (def) of promoted type. > > In the attached patch I am doing the following and seems to work. Does > this looks OK? 
>
> +   }
> +  else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL)
> +    {
> +      tree var = copy_node (SSA_NAME_VAR (def));
> +      TREE_TYPE (var) = promoted_type;
> +      TREE_TYPE (def) = promoted_type;
> +      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
> +    }

I believe this will wreck the SSA default-def map so you should do

  set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE);
  tree var = create_tmp_reg (promoted_type);
  TREE_TYPE (def) = promoted_type;
  SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var);
  set_ssa_default_def (cfun, var, def);

instead.

> I prefer to promote the def, as otherwise iterating over the uses and
> promoting can look complicated (we would have to look at all the
> different kinds of stmts again and do the right thing, as it was in the
> earlier version of this before we moved to this approach).
>
> 3)
>> you can also transparently handle constants for the cases where promoting
>> is required. At the moment their handling is intertwined with the def
>> promotion code. That makes the whole thing hard to follow.
>
> I have updated the comments with:
>
> +/* Promote constants in STMT to TYPE.  If PROMOTE_COND_EXPR is true,
> +   promote only the constants in the condition part of the COND_EXPR.
> +
> +   We promote the constants when the associated operands are promoted.
> +   This usually means that we promote the constants when we promote the
> +   defining stmts (as part of promote_ssa).  However for COND_EXPR, we
> +   can promote only when we promote the other operand.  Therefore, this
> +   is done during fixup_use.  */
>
> 4)
> I am handling gimple_debug separately to avoid any code difference with
> and without the -g option. I have updated the comments for this.
>
> 5)
> I also noticed that tree-ssa-uninit sometimes gives false positives due
> to the assumptions it makes. Is it OK to move this pass before type
> promotion? I can do the testing and post a separate patch with this if
> this is OK.
> 6) > I also removed the optimization that prevents some of the redundant > truncation/extensions from type promotion pass, as it dosent do much as > of now. I can send a proper follow up patch. Is that OK? Yeah, that sounds fine. > I also did a simple test with coremark for the latest patch. I compared > the code size for coremark for linux-gcc with -Os. Results are as > reported by the "size" utility. I know this doesn't mean much but can > give some indication. > Base with pass Percentage improvement > ============================================================== > arm 10476 10372 0.9927453226 > aarch64 9545 9521 0.2514405448 > ppc64 12236 12052 1.5037593985 > > > After resolving the above issues, I would like propose that we commit > the pass as not enabled by default (even though the patch as it stands > enabled by default - I am doing it for testing purposes). Hmm, we don't like to have passes that are not enabled by default with any optimization level or for any target. Those tend to bitrot quickly :( Did you do any performance measurements yet? Looking over the pass in detail now (again). Thanks, Richard. > Thanks, > Kugan > > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext 2015-11-18 14:04 ` Richard Biener @ 2015-11-18 15:06 ` Richard Biener 2015-11-24 2:52 ` Kugan 0 siblings, 1 reply; 28+ messages in thread From: Richard Biener @ 2015-11-18 15:06 UTC (permalink / raw) To: Kugan; +Cc: gcc-patches On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener <richard.guenther@gmail.com> wrote: > On Sat, Nov 14, 2015 at 2:15 AM, Kugan > <kugan.vivekanandarajah@linaro.org> wrote: >> >> Attached is the latest version of the patch. With the patches >> 0001-Add-new-SEXT_EXPR-tree-code.patch, >> 0002-Add-type-promotion-pass.patch and >> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch. >> >> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and >> x64-64-linux-gnu and regression testing on ppc64-linux-gnu, >> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three >> issues in ppc64-linux-gnu regression testing. There are some other test >> cases which needs adjustment for scanning for some patterns that are not >> valid now. >> >> 1. rtl fwprop was going into infinite loop. Works with the following patch: >> diff --git a/gcc/fwprop.c b/gcc/fwprop.c >> index 16c7981..9cf4f43 100644 >> --- a/gcc/fwprop.c >> +++ b/gcc/fwprop.c >> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx >> new_rtx, rtx_insn *def_insn, >> int old_cost = 0; >> bool ok; >> >> + /* Value to be substituted is the same, nothing to do. */ >> + if (rtx_equal_p (*loc, new_rtx)) >> + return false; >> + >> update_df_init (def_insn, insn); >> >> /* forward_propagate_subreg may be operating on an instruction with > > Which testcase was this on? > >> 2. gcc.dg/torture/ftrapv-1.c fails >> This is because we are checking for the SImode trapping. With the >> promotion of the operation to wider mode, this is i think expected. I >> think the testcase needs updating. > > No, it is not expected. As said earlier you need to refrain from promoting > integer operations that trap. You can use ! 
operation_no_trapping_overflow > for this. > >> 3. gcc.dg/sms-3.c fails >> It fails with -fmodulo-sched-allow-regmoves and OK when I remove it. I >> am looking into it. >> >> >> I also have the following issues based on the previous review (as posted >> in the previous patch). Copying again for the review purpose. >> >> 1. >>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks >>> at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just >>> on DEFs and fixup_uses on USEs. >> >> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is >> there anyway to iterate over this. I have added gcc_assert to make sure >> that promote_ssa is called only once. > > gcc_assert (!ssa_name_info_map->get_or_insert (def)); > > with --disable-checking this will be compiled away so you need to do > the assert in a separate statement. > >> 2. >>> Instead of this you should, in promote_all_stmts, walk over all uses >> doing what >>> fixup_uses does and then walk over all defs, doing what promote_ssa does. >>> >>> + case GIMPLE_NOP: >>> + { >>> + if (SSA_NAME_VAR (def) == NULL) >>> + { >>> + /* Promote def by fixing its type for anonymous def. */ >>> + TREE_TYPE (def) = promoted_type; >>> + } >>> + else >>> + { >>> + /* Create a promoted copy of parameters. */ >>> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); >>> >>> I think the uninitialized vars are somewhat tricky and it would be best >>> to create a new uninit anonymous SSA name for them. You can >>> have SSA_NAME_VAR != NULL and def _not_ being a parameter >>> btw. >> >> I experimented with get_or_create_default_def. Here we have to have a >> SSA_NAME_VAR (def) of promoted type. >> >> In the attached patch I am doing the following and seems to work. Does >> this looks OK? 
>> >> + } >> + else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL) >> + { >> + tree var = copy_node (SSA_NAME_VAR (def)); >> + TREE_TYPE (var) = promoted_type; >> + TREE_TYPE (def) = promoted_type; >> + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); >> + } > > I believe this will wreck the SSA default-def map so you should do > > set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE); > tree var = create_tmp_reg (promoted_type); > TREE_TYPE (def) = promoted_type; > SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); > set_ssa_default_def (cfun, var, def); > > instead. > >> I prefer to promote def as otherwise iterating over the uses and >> promoting can look complicated (have to look at all the different types >> of stmts again and do the right thing as It was in the earlier version >> of this before we move to this approach) >> >> 3) >>> you can also transparently handle constants for the cases where promoting >>> is required. At the moment their handling is interwinded with the def >> promotion >>> code. That makes the whole thing hard to follow. >> >> >> I have updated the comments with: >> >> +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, >> + promote only the constants in conditions part of the COND_EXPR. >> + >> + We promote the constants when the associated operands are promoted. >> + This usually means that we promote the constants when we promote the >> + defining stmnts (as part of promote_ssa). However for COND_EXPR, we >> + can promote only when we promote the other operand. Therefore, this >> + is done during fixup_use. */ >> >> >> 4) >> I am handling gimple_debug separately to avoid any code difference with >> and without -g option. I have updated the comments for this. >> >> 5) >> I also noticed that tree-ssa-uninit sometimes gives false positives due >> to the assumptions >> it makes. Is it OK to move this pass before type promotion? I can do the >> testings and post a separate patch with this if this OK. 
> > Hmm, no, this needs more explanation (like a testcase). > >> 6) >> I also removed the optimization that prevents some of the redundant >> truncation/extensions from type promotion pass, as it dosent do much as >> of now. I can send a proper follow up patch. Is that OK? > > Yeah, that sounds fine. > >> I also did a simple test with coremark for the latest patch. I compared >> the code size for coremark for linux-gcc with -Os. Results are as >> reported by the "size" utility. I know this doesn't mean much but can >> give some indication. >> Base with pass Percentage improvement >> ============================================================== >> arm 10476 10372 0.9927453226 >> aarch64 9545 9521 0.2514405448 >> ppc64 12236 12052 1.5037593985 >> >> >> After resolving the above issues, I would like propose that we commit >> the pass as not enabled by default (even though the patch as it stands >> enabled by default - I am doing it for testing purposes). > > Hmm, we don't like to have passes that are not enabled by default with any > optimization level or for any target. Those tend to bitrot quickly :( > > Did you do any performance measurements yet? > > Looking over the pass in detail now (again). Ok, so still looking at the basic operation scheme. FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE) { use = USE_FROM_PTR (op); if (TREE_CODE (use) == SSA_NAME && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) promote_ssa (use, &gsi); fixup_use (stmt, &gsi, op, use); } FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF) promote_ssa (def, &gsi); the GIMPLE_NOP handling in promote_ssa but when processing uses looks backwards. As those are implicitely defined in the entry block you may better just iterate over all default defs before the dominator walk like so unsigned n = num_ssa_names; for (i = 1; i < n; ++i) { tree name = ssa_name (i); if (name && SSA_NAME_IS_DEFAULT_DEF && ! 
has_zero_uses (name)) promote_default_def (name); } I see promote_cst_in_stmt in both promote_ssa and fixup_use. Logically it belongs to use processing, but on a stmt granularity. Thus between iterating over all uses and iteration over all defs call promote_cst_in_stmt on all stmts. It's a bit awkward as it expects to be called from context that knows whether promotion is necessary or not. /* Create an ssa with TYPE to copy ssa VAR. */ static tree make_promoted_copy (tree var, gimple *def_stmt, tree type) { tree new_lhs = make_ssa_name (type, def_stmt); if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var)) SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1; return new_lhs; } as you are generating a copy statement I don't see why you need to copy SSA_NAME_OCCURS_IN_ABNORMAL_PHI (in no case new_lhs will be used in a PHI node directly AFAICS). Merging make_promoted_copy and the usually following extension stmt generation plus insertion into a single helper would make that obvious. static unsigned int fixup_use (gimple *stmt, gimple_stmt_iterator *gsi, use_operand_p op, tree use) { ssa_name_info *info = ssa_name_info_map->get_or_insert (use); /* If USE is not promoted, nothing to do. */ if (!info) return 0; You should use ->get (), not ->get_or_insert here. gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR, use, NULL_TREE); you can avoid the trailing NULL_TREE here. gimple *copy_stmt = zero_sign_extend_stmt (temp, use, TYPE_UNSIGNED (old_type), TYPE_PRECISION (old_type)); coding style says the '=' goes to the next line, thus gimple *copy_stmt = zero_sign_extend_stmt ... /* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits. Assign the zero/sign extended value in NEW_VAR. gimple statement that performs the zero/sign extension is returned. */ static gimple * zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width) { looks like instead of unsigned_p/width you can pass in a type instead. /* Sign extend. 
*/ stmt = gimple_build_assign (new_var, SEXT_EXPR, var, build_int_cst (TREE_TYPE (var), width)); use size_int (width) instead. /* Convert constant CST to TYPE. */ static tree convert_int_cst (tree type, tree cst, signop sign = SIGNED) no need for a default argument { wide_int wi_cons = fold_convert (type, cst); wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign); return wide_int_to_tree (type, wi_cons); } I wonder why this function is needed at all and you don't just call fold_convert (type, cst)? /* Return true if the tree CODE needs the propmoted operand to be truncated (when stray bits are set beyond the original type in promoted mode) to preserve the semantics. */ static bool truncate_use_p (enum tree_code code) { a conservatively correct predicate would implement the inversion, not_truncated_use_p because if you miss any tree code the result will be unnecessary rather than missed truncations. static bool type_precision_ok (tree type) { return (TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))); } /* Return the promoted type for TYPE. */ static tree get_promoted_type (tree type) { tree promoted_type; enum machine_mode mode; int uns; if (POINTER_TYPE_P (type) || !INTEGRAL_TYPE_P (type) || !type_precision_ok (type)) the type_precision_ok check is because SEXT doesn't work properly for bitfield types? I think we want to promote those to their mode precision anyway. We just need to use sth different than SEXT here (the bitwise-and works of course) or expand SEXT from non-mode precision differently (see expr.c REDUCE_BIT_FIELD which expands it as a lshift/rshift combo). Eventually this can be left for a followup though it might get you some extra testing coverage on non-promote-mode targets. /* Return true if ssa NAME is already considered for promotion. 
*/ static bool ssa_promoted_p (tree name) { if (TREE_CODE (name) == SSA_NAME) { unsigned int index = SSA_NAME_VERSION (name); if (index < n_ssa_val) return bitmap_bit_p (ssa_to_be_promoted_bitmap, index); } return true; better than this default assert you pass in an SSA name. isn't the bitmap somewhat redundant with the hash-map? And you could combine both by using a vec<ssa_name_info *> indexed by SSA_NAME_VERSION ()? if ((TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_comparison) || truncate_use_p (gimple_assign_rhs_code (stmt))) you always check for tcc_omparison when checking for truncate_use_p so just handle it there (well, as said above, implement conservative predicates). switch (gimple_code (stmt)) { case GIMPLE_ASSIGN: if (promote_cond && gimple_assign_rhs_code (stmt) == COND_EXPR) { looking at all callers this condition is never true. tree new_op = build2 (TREE_CODE (op), type, op0, op1); as tcc_comparison class trees are not shareable you don't need to build2 but can directly set TREE_OPERAND (op, ..) to the promoted value. Note that rhs1 may still just be an SSA name and not a comparison. case GIMPLE_PHI: { /* Promote INTEGER_CST arguments to GIMPLE_PHI. */ gphi *phi = as_a <gphi *> (stmt); FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE) { op = USE_FROM_PTR (oprnd); index = PHI_ARG_INDEX_FROM_USE (oprnd); if (TREE_CODE (op) == INTEGER_CST) SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign)); } static unsigned int fixup_use (gimple *stmt, gimple_stmt_iterator *gsi, use_operand_p op, tree use) { ssa_name_info *info = ssa_name_info_map->get_or_insert (use); /* If USE is not promoted, nothing to do. */ if (!info) return 0; tree promoted_type = info->promoted_type; tree old_type = info->type; bool do_not_promote = false; switch (gimple_code (stmt)) { .... default: break; } do_not_promote = false is not conservative. Please place a gcc_unreachable () in the default case. I see you handle debug stmts here but that case cannot be reached. 
/* Promote use in GIMPLE_DEBUG stmts. Do this separately to avoid generating different sequence with and without -g. This can happen when promoting SSA that are defined with GIMPLE_NOP. */ but that's only because you choose to unconditionally handle GIMPLE_NOP uses... Richard. > Thanks, > Richard. > >> Thanks, >> Kugan >> >> ^ permalink raw reply [flat|nested] 28+ messages in thread
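The lshift/rshift combo mentioned in the review above (what expr.c's REDUCE_BIT_FIELD emits for non-mode-precision values) is equivalent to a sign extension. A standalone sketch of that equivalence follows; `sext_via_shifts` is an invented name, and the code assumes the near-universal arithmetic right shift of signed values (implementation-defined in ISO C, but what GCC's supported targets provide):

```c
#include <stdint.h>

/* Sign extension expressed as the shift pair REDUCE_BIT_FIELD uses:
   shift left so the narrow sign bit lands in the mode's sign bit,
   then arithmetic-shift back down.  Valid for 1 <= width <= 64;
   assumes arithmetic right shift of signed values.  */
static inline int64_t
sext_via_shifts (uint64_t x, unsigned width)
{
  unsigned shift = 64 - width;
  return (int64_t) (x << shift) >> shift;
}
```

This produces the same results as the mask-based SEXT_EXPR folding, which is why either lowering is acceptable for non-mode-precision types.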
* Re: [0/7] Type promotion pass and elimination of zext/sext 2015-11-18 15:06 ` Richard Biener @ 2015-11-24 2:52 ` Kugan 2015-12-10 0:27 ` Kugan 0 siblings, 1 reply; 28+ messages in thread From: Kugan @ 2015-11-24 2:52 UTC (permalink / raw) To: Richard Biener; +Cc: gcc-patches [-- Attachment #1: Type: text/plain, Size: 17718 bytes --] Hi Richard, Thanks for you comments. I am attaching an updated patch with details below. On 19/11/15 02:06, Richard Biener wrote: > On Wed, Nov 18, 2015 at 3:04 PM, Richard Biener > <richard.guenther@gmail.com> wrote: >> On Sat, Nov 14, 2015 at 2:15 AM, Kugan >> <kugan.vivekanandarajah@linaro.org> wrote: >>> >>> Attached is the latest version of the patch. With the patches >>> 0001-Add-new-SEXT_EXPR-tree-code.patch, >>> 0002-Add-type-promotion-pass.patch and >>> 0003-Optimize-ZEXT_EXPR-with-tree-vrp.patch. >>> >>> I did bootstrap on ppc64-linux-gnu, aarch64-linux-gnu and >>> x64-64-linux-gnu and regression testing on ppc64-linux-gnu, >>> aarch64-linux-gnu arm64-linux-gnu and x64-64-linux-gnu. I ran into three >>> issues in ppc64-linux-gnu regression testing. There are some other test >>> cases which needs adjustment for scanning for some patterns that are not >>> valid now. >>> >>> 1. rtl fwprop was going into infinite loop. Works with the following patch: >>> diff --git a/gcc/fwprop.c b/gcc/fwprop.c >>> index 16c7981..9cf4f43 100644 >>> --- a/gcc/fwprop.c >>> +++ b/gcc/fwprop.c >>> @@ -948,6 +948,10 @@ try_fwprop_subst (df_ref use, rtx *loc, rtx >>> new_rtx, rtx_insn *def_insn, >>> int old_cost = 0; >>> bool ok; >>> >>> + /* Value to be substituted is the same, nothing to do. */ >>> + if (rtx_equal_p (*loc, new_rtx)) >>> + return false; >>> + >>> update_df_init (def_insn, insn); >>> >>> /* forward_propagate_subreg may be operating on an instruction with >> >> Which testcase was this on? After re-basing the trunk, I cannot reproduce it anymore. >> >>> 2. 
gcc.dg/torture/ftrapv-1.c fails >>> This is because we are checking for the SImode trapping. With the >>> promotion of the operation to wider mode, this is i think expected. I >>> think the testcase needs updating. >> >> No, it is not expected. As said earlier you need to refrain from promoting >> integer operations that trap. You can use ! operation_no_trapping_overflow >> for this. >> I have changed this. >>> 3. gcc.dg/sms-3.c fails >>> It fails with -fmodulo-sched-allow-regmoves and OK when I remove it. I >>> am looking into it. >>> >>> >>> I also have the following issues based on the previous review (as posted >>> in the previous patch). Copying again for the review purpose. >>> >>> 1. >>>> you still call promote_ssa on both DEFs and USEs and promote_ssa looks >>>> at SSA_NAME_DEF_STMT of the passed arg. Please call promote_ssa just >>>> on DEFs and fixup_uses on USEs. >>> >>> I am doing this to promote SSA that are defined with GIMPLE_NOP. Is >>> there anyway to iterate over this. I have added gcc_assert to make sure >>> that promote_ssa is called only once. >> >> gcc_assert (!ssa_name_info_map->get_or_insert (def)); >> >> with --disable-checking this will be compiled away so you need to do >> the assert in a separate statement. >> >>> 2. >>>> Instead of this you should, in promote_all_stmts, walk over all uses >>> doing what >>>> fixup_uses does and then walk over all defs, doing what promote_ssa does. >>>> >>>> + case GIMPLE_NOP: >>>> + { >>>> + if (SSA_NAME_VAR (def) == NULL) >>>> + { >>>> + /* Promote def by fixing its type for anonymous def. */ >>>> + TREE_TYPE (def) = promoted_type; >>>> + } >>>> + else >>>> + { >>>> + /* Create a promoted copy of parameters. */ >>>> + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); >>>> >>>> I think the uninitialized vars are somewhat tricky and it would be best >>>> to create a new uninit anonymous SSA name for them. You can >>>> have SSA_NAME_VAR != NULL and def _not_ being a parameter >>>> btw. 
>>> >>> I experimented with get_or_create_default_def. Here we have to have a >>> SSA_NAME_VAR (def) of promoted type. >>> >>> In the attached patch I am doing the following and seems to work. Does >>> this looks OK? >>> >>> + } >>> + else if (TREE_CODE (SSA_NAME_VAR (def)) != PARM_DECL) >>> + { >>> + tree var = copy_node (SSA_NAME_VAR (def)); >>> + TREE_TYPE (var) = promoted_type; >>> + TREE_TYPE (def) = promoted_type; >>> + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); >>> + } >> >> I believe this will wreck the SSA default-def map so you should do >> >> set_ssa_default_def (cfun, SSA_NAME_VAR (def), NULL_TREE); >> tree var = create_tmp_reg (promoted_type); >> TREE_TYPE (def) = promoted_type; >> SET_SSA_NAME_VAR_OR_IDENTIFIER (def, var); >> set_ssa_default_def (cfun, var, def); >> >> instead. I have changed this. >> >>> I prefer to promote def as otherwise iterating over the uses and >>> promoting can look complicated (have to look at all the different types >>> of stmts again and do the right thing as It was in the earlier version >>> of this before we move to this approach) >>> >>> 3) >>>> you can also transparently handle constants for the cases where promoting >>>> is required. At the moment their handling is interwinded with the def >>> promotion >>>> code. That makes the whole thing hard to follow. >>> >>> >>> I have updated the comments with: >>> >>> +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, >>> + promote only the constants in conditions part of the COND_EXPR. >>> + >>> + We promote the constants when the associated operands are promoted. >>> + This usually means that we promote the constants when we promote the >>> + defining stmnts (as part of promote_ssa). However for COND_EXPR, we >>> + can promote only when we promote the other operand. Therefore, this >>> + is done during fixup_use. */ >>> >>> >>> 4) >>> I am handling gimple_debug separately to avoid any code difference with >>> and without -g option. 
I have updated the comments for this. >>> >>> 5) >>> I also noticed that tree-ssa-uninit sometimes gives false positives due >>> to the assumptions >>> it makes. Is it OK to move this pass before type promotion? I can do the >>> testing and post a separate patch with this if this is OK. >> >> Hmm, no, this needs more explanation (like a testcase). There are a few issues I ran into. I will send a list with more info. For example: /* Test we do not warn about initializing variable with self. */ /* { dg-do compile } */ /* { dg-options "-O -Wuninitialized" } */ int f() { int i = i; return i; } I now get: kugan@kugan-desktop:~$ /home/kugan/work/builds/gcc-fsf-linaro/tools/bin/ppc64-none-linux-gnu-gcc -O -Wuninitialized /home/kugan/work/SVN/gcc/trunk/gcc/testsuite/c-c++-common/uninit-D.c -fdump-tree-all /home/kugan/work/SVN/gcc/trunk/gcc/testsuite/c-c++-common/uninit-D.c: In function ‘f’: /home/kugan/work/SVN/gcc/trunk/gcc/testsuite/c-c++-common/uninit-D.c:8:10: warning: ‘i’ is used uninitialized in this function [-Wuninitialized] return i; diff -u uninit-D.c.146t.veclower21 uninit-D.c.147t.promotion is: --- uninit-D.c.146t.veclower21 2015-11-24 11:30:04.374203197 +1100 +++ uninit-D.c.147t.promotion 2015-11-24 11:30:04.374203197 +1100 @@ -1,13 +1,16 @@ ;; Function f (f, funcdef_no=0, decl_uid=2271, cgraph_uid=0, symbol_order=0) f () { + signed long i; int i; + int _3; <bb 2>: - return i_1(D); + _3 = (int) i_1(D); + return _3; } >> >>> 6) >>> I also removed the optimization that prevents some of the redundant >>> truncation/extensions from the type promotion pass, as it doesn't do much as >>> of now. I can send a proper follow-up patch. Is that OK? >> >> Yeah, that sounds fine. >> >>> I also did a simple test with coremark for the latest patch. I compared >>> the code size for coremark for linux-gcc with -Os. Results are as >>> reported by the "size" utility. I know this doesn't mean much but can >>> give some indication. 
>>> Base with pass Percentage improvement >>> ============================================================== >>> arm 10476 10372 0.9927453226 >>> aarch64 9545 9521 0.2514405448 >>> ppc64 12236 12052 1.5037593985 >>> >>> >>> After resolving the above issues, I would like propose that we commit >>> the pass as not enabled by default (even though the patch as it stands >>> enabled by default - I am doing it for testing purposes). >> >> Hmm, we don't like to have passes that are not enabled by default with any >> optimization level or for any target. Those tend to bitrot quickly :( >> >> Did you do any performance measurements yet? Ok, I understand. I did performance testing on AARch64 and saw some good improvement for the earlier version. I will do it again for more targets after getting it reviewed. >> >> Looking over the pass in detail now (again). > > Ok, so still looking at the basic operation scheme. > > FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE) > { > use = USE_FROM_PTR (op); > if (TREE_CODE (use) == SSA_NAME > && gimple_code (SSA_NAME_DEF_STMT (use)) == GIMPLE_NOP) > promote_ssa (use, &gsi); > fixup_use (stmt, &gsi, op, use); > } > > FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF) > promote_ssa (def, &gsi); > > the GIMPLE_NOP handling in promote_ssa but when processing uses looks > backwards. As those are implicitely defined in the entry block you may > better just iterate over all default defs before the dominator walk like so > > unsigned n = num_ssa_names; > for (i = 1; i < n; ++i) > { > tree name = ssa_name (i); > if (name > && SSA_NAME_IS_DEFAULT_DEF > && ! has_zero_uses (name)) > promote_default_def (name); > } > I have Changed this. > I see promote_cst_in_stmt in both promote_ssa and fixup_use. Logically > it belongs to use processing, but on a stmt granularity. Thus between > iterating over all uses and iteration over all defs call promote_cst_in_stmt > on all stmts. 
It's a bit awkward as it expects to be called from context > that knows whether promotion is necessary or not. > > /* Create an ssa with TYPE to copy ssa VAR. */ > static tree > make_promoted_copy (tree var, gimple *def_stmt, tree type) > { > tree new_lhs = make_ssa_name (type, def_stmt); > if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (var)) > SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_lhs) = 1; > return new_lhs; > } > > as you are generating a copy statement I don't see why you need to copy > SSA_NAME_OCCURS_IN_ABNORMAL_PHI (in no case new_lhs will > be used in a PHI node directly AFAICS). Merging make_promoted_copy > and the usually following extension stmt generation plus insertion into > a single helper would make that obvious. > I have changed this. > static unsigned int > fixup_use (gimple *stmt, gimple_stmt_iterator *gsi, > use_operand_p op, tree use) > { > ssa_name_info *info = ssa_name_info_map->get_or_insert (use); > /* If USE is not promoted, nothing to do. */ > if (!info) > return 0; > > You should use ->get (), not ->get_or_insert here. > > gimple *copy_stmt = gimple_build_assign (temp, NOP_EXPR, > use, NULL_TREE); > Changed this. > you can avoid the trailing NULL_TREE here. > > gimple *copy_stmt = > zero_sign_extend_stmt (temp, use, > TYPE_UNSIGNED (old_type), > TYPE_PRECISION (old_type)); > > coding style says the '=' goes to the next line, thus > > gimple *copy_stmt > = zero_sign_extend_stmt ... Changed this. > > /* Zero/sign extend (depending on UNSIGNED_P) VAR and truncate to WIDTH bits. > Assign the zero/sign extended value in NEW_VAR. gimple statement > that performs the zero/sign extension is returned. */ > static gimple * > zero_sign_extend_stmt (tree new_var, tree var, bool unsigned_p, int width) > { > > looks like instead of unsigned_p/width you can pass in a type instead. > > /* Sign extend. */ > stmt = gimple_build_assign (new_var, > SEXT_EXPR, > var, build_int_cst (TREE_TYPE (var), width)); > > use size_int (width) instead. 
> > /* Convert constant CST to TYPE. */ > static tree > convert_int_cst (tree type, tree cst, signop sign = SIGNED) > > no need for a default argument > > { > wide_int wi_cons = fold_convert (type, cst); > wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), sign); > return wide_int_to_tree (type, wi_cons); > } For some of the operations, sign extended constants are created. For example: short unPack( unsigned char c ) { /* Only want lower four bit nibble */ c = c & (unsigned char)0x0F ; if( c > 7 ) { /* Negative nibble */ return( ( short )( c - 5 ) ) ; } else { /* positive nibble */ return( ( short )c ) ; } } - 5 above becomes + (-5). Therefore, If I sign extend the constant in promotion (even though it is unsigned) results in better code. There is no correctness issue. I have now changed it based on your suggestions. Is this look better? > > I wonder why this function is needed at all and you don't just call > fold_convert (type, cst)? > > /* Return true if the tree CODE needs the propmoted operand to be > truncated (when stray bits are set beyond the original type in > promoted mode) to preserve the semantics. */ > static bool > truncate_use_p (enum tree_code code) > { > > a conservatively correct predicate would implement the inversion, > not_truncated_use_p because if you miss any tree code the > result will be unnecessary rather than missed truncations. > Changed it. > static bool > type_precision_ok (tree type) > { > return (TYPE_PRECISION (type) > == GET_MODE_PRECISION (TYPE_MODE (type))); > } > > /* Return the promoted type for TYPE. */ > static tree > get_promoted_type (tree type) > { > tree promoted_type; > enum machine_mode mode; > int uns; > > if (POINTER_TYPE_P (type) > || !INTEGRAL_TYPE_P (type) > || !type_precision_ok (type)) > > the type_precision_ok check is because SEXT doesn't work > properly for bitfield types? I think we want to promote those > to their mode precision anyway. 
We just need to use > something different than SEXT here (the bitwise-and works of course) > or expand SEXT from non-mode precision differently (see > expr.c REDUCE_BIT_FIELD which expands it as a > lshift/rshift combo). Eventually this can be left for a follow-up > though it might get you some extra testing coverage on > non-promote-mode targets. I will have a look at it. > > /* Return true if ssa NAME is already considered for promotion. */ > static bool > ssa_promoted_p (tree name) > { > if (TREE_CODE (name) == SSA_NAME) > { > unsigned int index = SSA_NAME_VERSION (name); > if (index < n_ssa_val) > return bitmap_bit_p (ssa_to_be_promoted_bitmap, index); > } > return true; > > better than this default, assert that you pass in an SSA name. Changed it. > > isn't the bitmap somewhat redundant with the hash-map? > And you could combine both by using a vec<ssa_name_info *> indexed > by SSA_NAME_VERSION ()? > > if ((TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) > == tcc_comparison) > || truncate_use_p (gimple_assign_rhs_code (stmt))) > > you always check for tcc_comparison when checking for truncate_use_p > so just handle it there (well, as said above, implement conservative > predicates). > > switch (gimple_code (stmt)) > { > case GIMPLE_ASSIGN: > if (promote_cond > && gimple_assign_rhs_code (stmt) == COND_EXPR) > { > > looking at all callers this condition is never true. > > tree new_op = build2 (TREE_CODE (op), type, op0, op1); > > as tcc_comparison class trees are not shareable you don't > need to build2 but can directly set TREE_OPERAND (op, ..) to the > promoted value. Note that rhs1 may still just be an SSA name > and not a comparison. Changed this. > > case GIMPLE_PHI: > { > /* Promote INTEGER_CST arguments to GIMPLE_PHI. 
*/ > gphi *phi = as_a <gphi *> (stmt); > FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE) > { > op = USE_FROM_PTR (oprnd); > index = PHI_ARG_INDEX_FROM_USE (oprnd); > if (TREE_CODE (op) == INTEGER_CST) > SET_PHI_ARG_DEF (phi, index, convert_int_cst (type, op, sign)); > } > > static unsigned int > fixup_use (gimple *stmt, gimple_stmt_iterator *gsi, > use_operand_p op, tree use) > { > ssa_name_info *info = ssa_name_info_map->get_or_insert (use); > /* If USE is not promoted, nothing to do. */ > if (!info) > return 0; > > tree promoted_type = info->promoted_type; > tree old_type = info->type; > bool do_not_promote = false; > > switch (gimple_code (stmt)) > { > .... > default: > break; > } > > do_not_promote = false is not conservative. Please place a > gcc_unreachable () in the default case. We will have valid statements (which are not handled in switch) for which we don't have to do any fix ups. > > I see you handle debug stmts here but that case cannot be reached. > > /* Promote use in GIMPLE_DEBUG stmts. Do this separately to avoid generating > different sequence with and without -g. This can happen when promoting > SSA that are defined with GIMPLE_NOP. */ > > but that's only because you choose to unconditionally handle GIMPLE_NOP uses... I have removed this. Thanks, Kugan > > Richard. > > >> Thanks, >> Richard. 
>> >>> Thanks, >>> Kugan >>> >>> [-- Attachment #2: 0002-Add-type-promotion-pass.patch --] [-- Type: text/x-patch, Size: 31362 bytes --] From 89f526ea6f7878879fa65a2b869cac4c21dc7df0 Mon Sep 17 00:00:00 2001 From: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Date: Fri, 20 Nov 2015 14:14:52 +1100 Subject: [PATCH 2/3] Add type promotion pass --- gcc/Makefile.in | 1 + gcc/auto-profile.c | 2 +- gcc/common.opt | 4 + gcc/doc/invoke.texi | 10 + gcc/gimple-ssa-type-promote.c | 849 ++++++++++++++++++++++++++++++++++++++++++ gcc/passes.def | 1 + gcc/timevar.def | 1 + gcc/tree-pass.h | 1 + libiberty/cp-demangle.c | 2 +- 9 files changed, 869 insertions(+), 2 deletions(-) create mode 100644 gcc/gimple-ssa-type-promote.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 0fd8d99..4e1444c 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1512,6 +1512,7 @@ OBJS = \ tree-vect-slp.o \ tree-vectorizer.o \ tree-vrp.o \ + gimple-ssa-type-promote.o \ tree.o \ valtrack.o \ value-prof.o \ diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c index c7aab42..f214331 100644 --- a/gcc/auto-profile.c +++ b/gcc/auto-profile.c @@ -1257,7 +1257,7 @@ afdo_propagate_circuit (const bb_set &annotated_bb, edge_set *annotated_edge) FOR_EACH_EDGE (e, ei, bb->succs) { unsigned i, total = 0; - edge only_one; + edge only_one = NULL; bool check_value_one = (((integer_onep (cmp_rhs)) ^ (gimple_cond_code (cmp_stmt) == EQ_EXPR)) ^ ((e->flags & EDGE_TRUE_VALUE) != 0)); diff --git a/gcc/common.opt b/gcc/common.opt index 3eb520e..582e8ee 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2407,6 +2407,10 @@ fsplit-paths Common Report Var(flag_split_paths) Init(0) Optimization Split paths leading to loop backedges. +ftree-type-promote +Common Report Var(flag_tree_type_promote) Init(1) Optimization +Perform Type Promotion on trees + funit-at-a-time Common Report Var(flag_unit_at_a_time) Init(1) Compile whole compilation unit at a time. 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 7cef176..21f94a6 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -9142,6 +9142,16 @@ Split paths leading to loop backedges. This can improve dead code elimination and common subexpression elimination. This is enabled by default at @option{-O2} and above. +@item -ftree-type-promote +@opindex ftree-type-promote +This pass applies type promotion to SSA names in the function and +inserts appropriate truncations to preserve the semantics. The idea of +this pass is to promote operations in such a way that we can minimise +generation of subregs in RTL, which in turn results in removal of +redundant zero/sign extensions. + +This optimization is enabled by default. + @item -fsplit-ivs-in-unroller @opindex fsplit-ivs-in-unroller Enables expression of values of induction variables in later iterations diff --git a/gcc/gimple-ssa-type-promote.c b/gcc/gimple-ssa-type-promote.c new file mode 100644 index 0000000..5993e89 --- /dev/null +++ b/gcc/gimple-ssa-type-promote.c @@ -0,0 +1,849 @@ +/* Type promotion of SSA names to minimise redundant zero/sign extension. + Copyright (C) 2015 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. 
*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "hash-set.h" +#include "machmode.h" +#include "vec.h" +#include "double-int.h" +#include "input.h" +#include "symtab.h" +#include "wide-int.h" +#include "inchash.h" +#include "tree.h" +#include "fold-const.h" +#include "stor-layout.h" +#include "predict.h" +#include "function.h" +#include "dominance.h" +#include "cfg.h" +#include "basic-block.h" +#include "tree-ssa-alias.h" +#include "gimple-fold.h" +#include "tree-eh.h" +#include "gimple-expr.h" +#include "is-a.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimple-ssa.h" +#include "tree-phinodes.h" +#include "ssa-iterators.h" +#include "stringpool.h" +#include "tree-ssanames.h" +#include "tree-pass.h" +#include "gimple-pretty-print.h" +#include "langhooks.h" +#include "sbitmap.h" +#include "domwalk.h" +#include "tree-dfa.h" + +/* This pass applies type promotion to SSA names in the function and + inserts appropriate truncations. Idea of this pass is to promote operations + such a way that we can minimise generation of subreg in RTL, + that in turn results in removal of redundant zero/sign extensions. This pass + will run prior to The VRP and DOM such that they will be able to optimise + redundant truncations and extensions. This is based on the discussion from + https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00472.html. +*/ + +/* Structure to hold the type and promoted type for promoted ssa variables. */ +struct ssa_name_info +{ + tree ssa; /* Name of the SSA_NAME. */ + tree type; /* Original type of ssa. */ + tree promoted_type; /* Promoted type of ssa. */ +}; + +/* Obstack for ssa_name_info. 
*/ +static struct obstack ssa_name_info_obstack; + +static unsigned n_ssa_val; +static sbitmap ssa_to_be_promoted_bitmap; +static hash_map <tree, ssa_name_info *> *ssa_name_info_map; + +static bool +type_precision_ok (tree type) +{ + return (TYPE_PRECISION (type) + == GET_MODE_PRECISION (TYPE_MODE (type))); +} + +/* Return the promoted type for TYPE. */ +static tree +get_promoted_type (tree type) +{ + tree promoted_type; + enum machine_mode mode; + int uns; + + if (POINTER_TYPE_P (type) + || !INTEGRAL_TYPE_P (type) + || !type_precision_ok (type)) + return type; + + mode = TYPE_MODE (type); +#ifdef PROMOTE_MODE + uns = TYPE_SIGN (type); + PROMOTE_MODE (mode, uns, type); +#endif + uns = TYPE_SIGN (type); + if (TYPE_PRECISION (type) == GET_MODE_PRECISION (mode)) + return type; + promoted_type + = build_nonstandard_integer_type (GET_MODE_PRECISION (mode), + uns); + gcc_assert (TYPE_PRECISION (promoted_type) == GET_MODE_PRECISION (mode)); + return promoted_type; +} + +/* Return true if ssa NAME is already considered for promotion. */ +static bool +ssa_promoted_p (tree name) +{ + gcc_assert (TREE_CODE (name) == SSA_NAME); + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + return bitmap_bit_p (ssa_to_be_promoted_bitmap, index); + return true; +} + +/* Set ssa NAME to be already considered for promotion. */ +static void +set_ssa_promoted (tree name) +{ + gcc_assert (TREE_CODE (name) == SSA_NAME); + unsigned int index = SSA_NAME_VERSION (name); + if (index < n_ssa_val) + bitmap_set_bit (ssa_to_be_promoted_bitmap, index); +} + +/* Return true if the tree CODE needs the promoted operand to be + truncated (when stray bits are set beyond the original type in + promoted mode) to preserve the semantics. 
*/ +static bool +not_truncated_use_p (enum tree_code code) +{ + if (TREE_CODE_CLASS (code) == tcc_comparison + || code == TRUNC_DIV_EXPR + || code == CEIL_DIV_EXPR + || code == FLOOR_DIV_EXPR + || code == ROUND_DIV_EXPR + || code == TRUNC_MOD_EXPR + || code == CEIL_MOD_EXPR + || code == FLOOR_MOD_EXPR + || code == ROUND_MOD_EXPR + || code == LSHIFT_EXPR + || code == RSHIFT_EXPR + || code == MAX_EXPR + || code == MIN_EXPR) + return false; + else + return true; +} + + +/* Return true if LHS will be promoted later. */ +static bool +tobe_promoted_p (tree lhs) +{ + if (TREE_CODE (lhs) == SSA_NAME + && INTEGRAL_TYPE_P (TREE_TYPE (lhs)) + && !VECTOR_TYPE_P (TREE_TYPE (lhs)) + && !POINTER_TYPE_P (TREE_TYPE (lhs)) + && !ssa_promoted_p (lhs) + && (get_promoted_type (TREE_TYPE (lhs)) + != TREE_TYPE (lhs))) + return true; + else + return false; +} + +/* Convert and sign-extend constant CST to TYPE. */ +static tree +fold_convert_sext (tree type, tree cst) +{ + wide_int wi_cons = fold_convert (type, cst); + wi_cons = wi::ext (wi_cons, TYPE_PRECISION (TREE_TYPE (cst)), SIGNED); + return wide_int_to_tree (type, wi_cons); +} + +/* Promote constants in STMT to TYPE. If PROMOTE_COND_EXPR is true, + promote only the constants in the condition part of the COND_EXPR. + + We promote the constants when the associated operands are promoted. + This usually means that we promote the constants when we promote the + defining stmts (as part of promote_ssa). However, for COND_EXPR, we + can promote only when we promote the other operand. Therefore, this + is done during fixup_use. */ + +static void +promote_cst_in_stmt (gimple *stmt, tree type) +{ + tree op; + ssa_op_iter iter; + use_operand_p oprnd; + int index; + tree op0, op1; + + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + if (gimple_assign_rhs_code (stmt) == COND_EXPR + && TREE_OPERAND_LENGTH (gimple_assign_rhs1 (stmt)) == 2) + { + /* Promote INTEGER_CST that are tcc_comparison arguments. 
*/ + op = gimple_assign_rhs1 (stmt); + op0 = TREE_OPERAND (op, 0); + op1 = TREE_OPERAND (op, 1); + if (TREE_TYPE (op0) != TREE_TYPE (op1)) + { + if (TREE_CODE (op0) == INTEGER_CST) + TREE_OPERAND (op, 0) = fold_convert (type, op0); + if (TREE_CODE (op1) == INTEGER_CST) + TREE_OPERAND (op, 1) = fold_convert (type, op1); + } + } + /* Promote INTEGER_CST in GIMPLE_ASSIGN. */ + if (not_truncated_use_p (gimple_assign_rhs_code (stmt))) + { + op = gimple_assign_rhs3 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs3 (stmt, fold_convert_sext (type, op)); + op = gimple_assign_rhs1 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs1 (stmt, fold_convert_sext (type, op)); + op = gimple_assign_rhs2 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs2 (stmt, fold_convert_sext (type, op)); + } + else + { + op = gimple_assign_rhs3 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs3 (stmt, fold_convert (type, op)); + op = gimple_assign_rhs1 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs1 (stmt, fold_convert (type, op)); + op = gimple_assign_rhs2 (stmt); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_assign_set_rhs2 (stmt, fold_convert (type, op)); + } + break; + + case GIMPLE_PHI: + { + /* Promote INTEGER_CST arguments to GIMPLE_PHI. */ + gphi *phi = as_a <gphi *> (stmt); + FOR_EACH_PHI_ARG (oprnd, phi, iter, SSA_OP_USE) + { + op = USE_FROM_PTR (oprnd); + index = PHI_ARG_INDEX_FROM_USE (oprnd); + if (TREE_CODE (op) == INTEGER_CST) + SET_PHI_ARG_DEF (phi, index, fold_convert (type, op)); + } + } + break; + + case GIMPLE_COND: + { + /* Promote INTEGER_CST that are GIMPLE_COND arguments. 
*/ + gcond *cond = as_a <gcond *> (stmt); + op = gimple_cond_lhs (cond); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_lhs (cond, fold_convert (type, op)); + + op = gimple_cond_rhs (cond); + if (op && TREE_CODE (op) == INTEGER_CST) + gimple_cond_set_rhs (cond, fold_convert (type, op)); + } + break; + + default: + gcc_unreachable (); + } +} + +/* Zero/sign extend VAR and truncate to INNER_TYPE. + Assign the zero/sign extended value in NEW_VAR. gimple statement + that performs the zero/sign extension is returned. */ + +static gimple * +zero_sign_extend_stmt (tree new_var, tree var, tree inner_type) +{ + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) + == TYPE_PRECISION (TREE_TYPE (new_var))); + gcc_assert (TYPE_PRECISION (TREE_TYPE (var)) > TYPE_PRECISION (inner_type)); + gimple *stmt; + + if (TYPE_UNSIGNED (inner_type)) + { + /* Zero extend. */ + tree cst + = wide_int_to_tree (TREE_TYPE (var), + wi::mask (TYPE_PRECISION (inner_type), false, + TYPE_PRECISION (TREE_TYPE (var)))); + stmt = gimple_build_assign (new_var, BIT_AND_EXPR, + var, cst); + } + else + /* Sign extend. */ + stmt = gimple_build_assign (new_var, + SEXT_EXPR, + var, + build_int_cst (TREE_TYPE (var), + TYPE_PRECISION (inner_type))); + return stmt; +} + +static void +copy_default_ssa (tree to, tree from) +{ + SET_SSA_NAME_VAR_OR_IDENTIFIER (to, SSA_NAME_VAR (from)); + SSA_NAME_DEF_STMT (to) = SSA_NAME_DEF_STMT (from); + SET_SSA_NAME_VAR_OR_IDENTIFIER (from, NULL_TREE); + SSA_NAME_IS_DEFAULT_DEF (to) = 1; + SSA_NAME_IS_DEFAULT_DEF (from) = 0; +} + +/* Promote definition DEF to PROMOTED_TYPE. If the stmt that defines def + is def_stmt, make the type of def promoted_type. If the stmt is such + that, result of the def_stmt cannot be of promoted_type, create a new_def + of the original_type and make the def_stmt assign its value to newdef. + Then, create a NOP_EXPR to convert new_def to def of promoted type. 
+ + For example, for stmt with original_type char and promoted_type int: + char _1 = mem; + becomes: + char _2 = mem; + int _1 = (int)_2; + + If the def_stmt allows def to be promoted, promote def in-place + (and its arguments when needed). + + For example: + char _3 = _1 + _2; + becomes: + int _3 = _1 + _2; + Here, _1 and _2 will also be promoted. */ + +static void +promote_ssa (tree def, gimple_stmt_iterator *gsi) +{ + gimple *def_stmt = SSA_NAME_DEF_STMT (def); + gimple *copy_stmt = NULL; + gimple_stmt_iterator gsi2; + tree original_type = TREE_TYPE (def); + tree new_def; + ssa_name_info *info; + bool do_not_promote = false; + tree promoted_type = get_promoted_type (TREE_TYPE (def)); + + if (!tobe_promoted_p (def)) + return; + + info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack, + sizeof (ssa_name_info)); + info->type = original_type; + info->promoted_type = promoted_type; + info->ssa = def; + ssa_name_info_map->put (def, info); + + switch (gimple_code (def_stmt)) + { + case GIMPLE_PHI: + { + /* Promote def by fixing its type and make def anonymous. */ + TREE_TYPE (def) = promoted_type; + SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE); + promote_cst_in_stmt (def_stmt, promoted_type); + break; + } + + case GIMPLE_ASM: + { + gasm *asm_stmt = as_a <gasm *> (def_stmt); + for (unsigned int i = 0; i < gimple_asm_noutputs (asm_stmt); ++i) + { + /* Promote def and copy (i.e. convert) the value defined + by asm to def. 
*/ + tree link = gimple_asm_output_op (asm_stmt, i); + tree op = TREE_VALUE (link); + if (op == def) + { + new_def = copy_ssa_name (def); + set_ssa_promoted (new_def); + copy_default_ssa (new_def, def); + TREE_VALUE (link) = new_def; + gimple_asm_set_output_op (asm_stmt, i, link); + + TREE_TYPE (def) = promoted_type; + copy_stmt = gimple_build_assign (def, NOP_EXPR, new_def); + SSA_NAME_IS_DEFAULT_DEF (new_def) = 0; + gimple_set_location (copy_stmt, gimple_location (def_stmt)); + gsi2 = gsi_for_stmt (def_stmt); + gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT); + break; + } + } + break; + } + + case GIMPLE_NOP: + { + gcc_unreachable (); + } + + case GIMPLE_ASSIGN: + { + enum tree_code code = gimple_assign_rhs_code (def_stmt); + tree rhs = gimple_assign_rhs1 (def_stmt); + if (gimple_vuse (def_stmt) != NULL_TREE + || gimple_vdef (def_stmt) != NULL_TREE + || (ANY_INTEGRAL_TYPE_P (TREE_TYPE (def)) + && !operation_no_trapping_overflow (TREE_TYPE (def), code)) + || TREE_CODE_CLASS (code) == tcc_reference + || TREE_CODE_CLASS (code) == tcc_comparison + || code == LROTATE_EXPR + || code == RROTATE_EXPR + || code == VIEW_CONVERT_EXPR + || code == REALPART_EXPR + || code == IMAGPART_EXPR + || code == REDUC_PLUS_EXPR + || code == REDUC_MAX_EXPR + || code == REDUC_MIN_EXPR + || !INTEGRAL_TYPE_P (TREE_TYPE (rhs))) + { + do_not_promote = true; + } + else if (CONVERT_EXPR_CODE_P (code)) + { + if (!type_precision_ok (TREE_TYPE (rhs))) + { + do_not_promote = true; + } + else if (types_compatible_p (TREE_TYPE (rhs), promoted_type)) + { + /* As we travel statements in dominated order, arguments + of def_stmt will be visited before visiting def. If RHS + is already promoted and type is compatible, we can convert + them into ZERO/SIGN EXTEND stmt. 
	 */
+	  ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+	  tree type;
+	  if (info == NULL)
+	    type = TREE_TYPE (rhs);
+	  else
+	    type = info->type;
+	  if ((TYPE_PRECISION (original_type)
+	       > TYPE_PRECISION (type))
+	      || (TYPE_UNSIGNED (original_type)
+		  != TYPE_UNSIGNED (type)))
+	    {
+	      if (TYPE_PRECISION (original_type) < TYPE_PRECISION (type))
+		type = original_type;
+	      gcc_assert (type != NULL_TREE);
+	      TREE_TYPE (def) = promoted_type;
+	      copy_stmt = zero_sign_extend_stmt (def, rhs, type);
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      gsi_replace (gsi, copy_stmt, false);
+	    }
+	  else
+	    {
+	      TREE_TYPE (def) = promoted_type;
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	    }
+	}
+      else
+	{
+	  /* If RHS is not promoted OR their types are not
+	     compatible, create a NOP_EXPR that converts
+	     RHS to the promoted DEF type and perform a
+	     ZERO/SIGN EXTEND to get the required value
+	     from RHS.  */
+	  ssa_name_info *info = ssa_name_info_map->get_or_insert (rhs);
+	  if (info != NULL)
+	    {
+	      tree type = info->type;
+	      new_def = copy_ssa_name (rhs);
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (new_def, NULL_TREE);
+	      TREE_TYPE (def) = promoted_type;
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	      copy_stmt = zero_sign_extend_stmt (new_def, rhs, type);
+	      gimple_set_location (copy_stmt, gimple_location (def_stmt));
+	      gsi2 = gsi_for_stmt (def_stmt);
+	      gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+	      gassign *new_def_stmt = gimple_build_assign (def, code, new_def);
+	      gsi_replace (gsi, new_def_stmt, false);
+	    }
+	  else
+	    {
+	      TREE_TYPE (def) = promoted_type;
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	    }
+	}
+      }
+    else
+      {
+	/* Promote def by fixing its type and make def anonymous.  */
+	promote_cst_in_stmt (def_stmt, promoted_type);
+	SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+	TREE_TYPE (def) = promoted_type;
+      }
+    break;
+  }
+
+    default:
+      do_not_promote = true;
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* Promote def and copy (i.e. convert) the value defined
+	 by the stmt that cannot be promoted.  */
+      new_def = copy_ssa_name (def);
+      set_ssa_promoted (new_def);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (def, NULL_TREE);
+      TREE_TYPE (def) = promoted_type;
+      gimple_set_lhs (def_stmt, new_def);
+      copy_stmt = gimple_build_assign (def, NOP_EXPR, new_def);
+      gimple_set_location (copy_stmt, gimple_location (def_stmt));
+      gsi2 = gsi_for_stmt (def_stmt);
+      if (lookup_stmt_eh_lp (def_stmt) > 0
+	  || (gimple_code (def_stmt) == GIMPLE_CALL
+	      && gimple_call_ctrl_altering_p (def_stmt)))
+	gsi_insert_on_edge (FALLTHRU_EDGE (gimple_bb (def_stmt)),
+			    copy_stmt);
+      else
+	gsi_insert_after (&gsi2, copy_stmt, GSI_NEW_STMT);
+    }
+  reset_flow_sensitive_info (def);
+}
+
+/* Fix the (promoted) USE in stmts where USE cannot be promoted.  */
+static unsigned int
+fixup_use (gimple *stmt, gimple_stmt_iterator *gsi,
+	   use_operand_p op, tree use)
+{
+  gimple *copy_stmt;
+  ssa_name_info **info = ssa_name_info_map->get (use);
+  /* If USE is not promoted, nothing to do.  */
+  if (!info || *info == NULL)
+    return 0;
+
+  tree promoted_type = (*info)->promoted_type;
+  tree old_type = (*info)->type;
+  bool do_not_promote = false;
+
+  switch (gimple_code (stmt))
+    {
+    case GIMPLE_DEBUG:
+      {
+	SET_USE (op, fold_convert (old_type, use));
+	update_stmt (stmt);
+	break;
+      }
+
+    case GIMPLE_ASM:
+    case GIMPLE_CALL:
+    case GIMPLE_RETURN:
+      {
+	/* USE cannot be promoted here.  */
+	do_not_promote = true;
+	break;
+      }
+
+    case GIMPLE_ASSIGN:
+      {
+	enum tree_code code = gimple_assign_rhs_code (stmt);
+	tree lhs = gimple_assign_lhs (stmt);
+	if (gimple_vuse (stmt) != NULL_TREE
+	    || gimple_vdef (stmt) != NULL_TREE
+	    || (ANY_INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+		&& !operation_no_trapping_overflow (TREE_TYPE (lhs), code))
+	    || code == VIEW_CONVERT_EXPR
+	    || code == LROTATE_EXPR
+	    || code == RROTATE_EXPR
+	    || code == CONSTRUCTOR
+	    || code == BIT_FIELD_REF
+	    || code == COMPLEX_EXPR
+	    || VECTOR_TYPE_P (TREE_TYPE (lhs)))
+	  {
+	    do_not_promote = true;
+	  }
+	else if (!not_truncated_use_p (code))
+	  {
+	    /* Promote the constant in a comparison when the other
+	       comparison operand is promoted.  All other constants are
+	       promoted as part of promoting the definition in
+	       promote_ssa.  */
+	    if (TREE_CODE_CLASS (code) == tcc_comparison)
+	      promote_cst_in_stmt (stmt, promoted_type);
+	    /* In some stmts, the value in USE has to be zero/sign
+	       extended based on the original type for a correct
+	       result.  */
+	    tree temp = make_ssa_name (TREE_TYPE (use), NULL);
+	    copy_stmt = zero_sign_extend_stmt (temp, use, old_type);
+	    gimple_set_location (copy_stmt, gimple_location (stmt));
+	    gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+
+	    SET_USE (op, temp);
+	    update_stmt (stmt);
+	  }
+	else if (CONVERT_EXPR_CODE_P (code)
+		 || code == FLOAT_EXPR)
+	  {
+	    if (types_compatible_p (TREE_TYPE (lhs), promoted_type))
+	      {
+		/* The types of LHS and the promoted RHS are
+		   compatible, so we can convert this into a
+		   ZERO/SIGN EXTEND stmt.  */
+		copy_stmt = zero_sign_extend_stmt (lhs, use, old_type);
+		gimple_set_location (copy_stmt, gimple_location (stmt));
+		set_ssa_promoted (lhs);
+		gsi_replace (gsi, copy_stmt, false);
+	      }
+	    else if (!tobe_promoted_p (lhs)
+		     || !INTEGRAL_TYPE_P (TREE_TYPE (lhs))
+		     || (TYPE_UNSIGNED (TREE_TYPE (use))
+			 != TYPE_UNSIGNED (TREE_TYPE (lhs))))
+	      {
+		tree temp = make_ssa_name (TREE_TYPE (use), NULL);
+		copy_stmt = zero_sign_extend_stmt (temp, use, old_type);
+		gimple_set_location (copy_stmt, gimple_location (stmt));
+		gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+		SET_USE (op, temp);
+		update_stmt (stmt);
+	      }
+	  }
+	break;
+      }
+
+    case GIMPLE_COND:
+      {
+	/* In a GIMPLE_COND, the value in USE has to be zero/sign
+	   extended based on the original type for a correct
+	   result.  */
+	tree temp = make_ssa_name (TREE_TYPE (use), NULL);
+	copy_stmt = zero_sign_extend_stmt (temp, use, old_type);
+	gimple_set_location (copy_stmt, gimple_location (stmt));
+	gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+	SET_USE (op, temp);
+	promote_cst_in_stmt (stmt, promoted_type);
+	update_stmt (stmt);
+	break;
+      }
+
+    default:
+      break;
+    }
+
+  if (do_not_promote)
+    {
+      /* For stmts where USE cannot be promoted, create an
+	 original type copy.  */
+      tree temp;
+      temp = copy_ssa_name (use);
+      SET_SSA_NAME_VAR_OR_IDENTIFIER (temp, NULL_TREE);
+      set_ssa_promoted (temp);
+      TREE_TYPE (temp) = old_type;
+      copy_stmt = gimple_build_assign (temp, NOP_EXPR, use);
+      gimple_set_location (copy_stmt, gimple_location (stmt));
+      gsi_insert_before (gsi, copy_stmt, GSI_NEW_STMT);
+      SET_USE (op, temp);
+      update_stmt (stmt);
+    }
+  return 0;
+}
+
+/* Promote all SSA names defined with GIMPLE_NOP (i.e. default
+   definitions such as incoming parameters).  */
+static void
+promote_all_ssa_defined_with_nop ()
+{
+  unsigned n = num_ssa_names, i;
+  gimple_stmt_iterator gsi2;
+  tree new_def;
+  basic_block bb;
+  gimple *copy_stmt;
+
+  for (i = 1; i < n; ++i)
+    {
+      tree name = ssa_name (i);
+      if (name
+	  && gimple_code (SSA_NAME_DEF_STMT (name)) == GIMPLE_NOP
+	  && tobe_promoted_p (name)
+	  && !has_zero_uses (name))
+	{
+	  tree promoted_type = get_promoted_type (TREE_TYPE (name));
+	  ssa_name_info *info;
+	  set_ssa_promoted (name);
+	  info = (ssa_name_info *) obstack_alloc (&ssa_name_info_obstack,
+						  sizeof (ssa_name_info));
+	  info->type = TREE_TYPE (name);
+	  info->promoted_type = promoted_type;
+	  info->ssa = name;
+	  ssa_name_info_map->put (name, info);
+
+	  if (SSA_NAME_VAR (name) == NULL)
+	    {
+	      /* Promote the def by fixing its type for an anonymous
+		 def.  */
+	      TREE_TYPE (name) = promoted_type;
+	    }
+	  else if (TREE_CODE (SSA_NAME_VAR (name)) != PARM_DECL)
+	    {
+	      tree var = create_tmp_reg (promoted_type);
+	      DECL_NAME (var) = DECL_NAME (SSA_NAME_VAR (name));
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (name), NULL_TREE);
+	      TREE_TYPE (name) = promoted_type;
+	      SET_SSA_NAME_VAR_OR_IDENTIFIER (name, var);
+	      set_ssa_default_def (cfun, var, name);
+	    }
+	  else
+	    {
+	      /* Create a promoted copy of parameters.  */
+	      bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	      gcc_assert (bb);
+	      gsi2 = gsi_after_labels (bb);
+	      /* Create new_def of the original type and set that to
+		 be the parameter.  */
+	      new_def = copy_ssa_name (name);
+	      set_ssa_promoted (new_def);
+	      set_ssa_default_def (cfun, SSA_NAME_VAR (name), new_def);
+	      copy_default_ssa (new_def, name);
+
+	      /* Now promote the def and copy the value from the
+		 parameter.  */
+	      TREE_TYPE (name) = promoted_type;
+	      copy_stmt = gimple_build_assign (name, NOP_EXPR, new_def);
+	      SSA_NAME_DEF_STMT (name) = copy_stmt;
+	      gsi_insert_before (&gsi2, copy_stmt, GSI_NEW_STMT);
+	    }
+	  reset_flow_sensitive_info (name);
+	}
+    }
+}
+
+/* Promote all the stmts in the basic block.  */
+static void
+promote_all_stmts (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  ssa_op_iter iter;
+  tree def, use;
+  use_operand_p op;
+
+  for (gphi_iterator gpi = gsi_start_phis (bb);
+       !gsi_end_p (gpi); gsi_next (&gpi))
+    {
+      gphi *phi = gpi.phi ();
+      FOR_EACH_PHI_ARG (op, phi, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  fixup_use (phi, &gsi, op, use);
+	}
+
+      def = PHI_RESULT (phi);
+      promote_ssa (def, &gsi);
+    }
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      FOR_EACH_SSA_USE_OPERAND (op, stmt, iter, SSA_OP_USE)
+	{
+	  use = USE_FROM_PTR (op);
+	  fixup_use (stmt, &gsi, op, use);
+	}
+
+      FOR_EACH_SSA_TREE_OPERAND (def, stmt, iter, SSA_OP_DEF)
+	promote_ssa (def, &gsi);
+    }
+}
+
+class type_promotion_dom_walker : public dom_walker
+{
+public:
+  type_promotion_dom_walker (cdi_direction direction)
+    : dom_walker (direction) {}
+  virtual void before_dom_children (basic_block bb)
+    {
+      promote_all_stmts (bb);
+    }
+};
+
+/* Main entry point to the pass.  */
+static unsigned int
+execute_type_promotion (void)
+{
+  n_ssa_val = num_ssa_names;
+  ssa_name_info_map = new hash_map<tree, ssa_name_info *>;
+  ssa_to_be_promoted_bitmap = sbitmap_alloc (n_ssa_val);
+  bitmap_clear (ssa_to_be_promoted_bitmap);
+
+  /* Create the obstack where ssa_name_info will reside.  */
+  gcc_obstack_init (&ssa_name_info_obstack);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  promote_all_ssa_defined_with_nop ();
+  /* Walk the CFG in dominator order.  */
+  type_promotion_dom_walker (CDI_DOMINATORS)
+    .walk (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  gsi_commit_edge_inserts ();
+
+  obstack_free (&ssa_name_info_obstack, NULL);
+  sbitmap_free (ssa_to_be_promoted_bitmap);
+  delete ssa_name_info_map;
+  return 0;
+}
+
+namespace {
+const pass_data pass_data_type_promotion =
+{
+  GIMPLE_PASS, /* type */
+  "promotion", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_TREE_TYPE_PROMOTE, /* tv_id */
+  PROP_ssa, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  (TODO_cleanup_cfg | TODO_update_ssa | TODO_verify_all), /* todo_flags_finish */
+};
+
+class pass_type_promotion : public gimple_opt_pass
+{
+public:
+  pass_type_promotion (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_type_promotion, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  opt_pass * clone () { return new pass_type_promotion (m_ctxt); }
+  virtual bool gate (function *) { return flag_tree_type_promote != 0; }
+  virtual unsigned int execute (function *)
+    {
+      return execute_type_promotion ();
+    }
+
+}; // class pass_type_promotion
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_type_promote (gcc::context *ctxt)
+{
+  return new pass_type_promotion (ctxt);
+}
+
diff --git a/gcc/passes.def b/gcc/passes.def
index 1702778..26838f3 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -276,6 +276,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_simduid_cleanup);
       NEXT_PASS (pass_lower_vector_ssa);
+      NEXT_PASS (pass_type_promote);
       NEXT_PASS (pass_split_paths);
       NEXT_PASS (pass_cse_reciprocals);
       NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 45e3b70..da7f2d5 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -279,6 +279,7 @@ DEFTIMEVAR (TV_VTABLE_VERIFICATION   , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN            , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL        , "initialize rtl")
 DEFTIMEVAR (TV_GIMPLE_LADDRESS       , "address lowering")
+DEFTIMEVAR (TV_TREE_TYPE_PROMOTE     , "tree type promote")
 
 /* Everything else in rest_of_compilation not included above.  */
 DEFTIMEVAR (TV_EARLY_LOCAL           , "early local passes")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index dcd2d5e..376ad7d 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -441,6 +441,7 @@ extern gimple_opt_pass *make_pass_fre (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_check_data_deps (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_copy_prop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_type_promote (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index ff608a3..6722331 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -4353,7 +4353,7 @@ d_print_comp_inner (struct d_print_info *dpi, int options,
   /* Variable used to store the current templates while a
      previously captured scope is used.  */
-  struct d_print_template *saved_templates;
+  struct d_print_template *saved_templates = NULL;
   /* Nonzero if templates have been stored in the above
      variable.  */
   int need_template_restore = 0;
-- 
1.9.1

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-11-24  2:52                             ` Kugan
@ 2015-12-10  0:27                               ` Kugan
  2015-12-16 13:18                                 ` Richard Biener
  0 siblings, 1 reply; 28+ messages in thread
From: Kugan @ 2015-12-10  0:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hi Richard,

Thanks for the reviews.

I think since we have some unresolved issues here, it is best to aim for
the next stage1.  I would, however, like any feedback so that I can
continue to improve this.

https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01063.html is also related
to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714.  I don't think
there is any agreement on this.  Or is there a better place to fix it?

Thanks,
Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
* Re: [0/7] Type promotion pass and elimination of zext/sext
  2015-12-10  0:27                               ` Kugan
@ 2015-12-16 13:18                                 ` Richard Biener
  0 siblings, 0 replies; 28+ messages in thread
From: Richard Biener @ 2015-12-16 13:18 UTC (permalink / raw)
  To: Kugan; +Cc: gcc-patches

On Thu, Dec 10, 2015 at 1:27 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
> Thanks for the reviews.
>
> I think since we have some unresolved issues here, it is best to aim for
> the next stage1.  I would, however, like any feedback so that I can
> continue to improve this.

Yeah, sorry, I've been distracted lately and am not sure I'll get to the
patch before the Christmas break.

> https://gcc.gnu.org/ml/gcc-patches/2015-09/msg01063.html is also related
> to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67714.  I don't think
> there is any agreement on this.  Or is there a better place to fix it?

I don't know enough in this area to suggest anything.

Richard.

> Thanks,
> Kugan

^ permalink raw reply	[flat|nested] 28+ messages in thread
end of thread, other threads:[~2015-12-16 13:18 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <A610E03AD50BFC4D95529A36D37FA55E8A7AB808CC@GEORGE.Emea.Arm.com>
2015-09-07 10:51 ` [0/7] Type promotion pass and elimination of zext/sext Wilco Dijkstra
2015-09-07 11:31   ` Kugan
2015-09-07 12:17     ` pinskia
2015-09-07 12:49       ` Wilco Dijkstra
2015-09-08  8:03         ` Renlin Li
2015-09-08 12:37           ` Wilco Dijkstra
2015-09-07  2:55 Kugan
2015-10-20 20:13 ` Kugan
2015-10-21 12:56   ` Richard Biener
2015-10-21 13:57     ` Richard Biener
2015-10-21 17:17       ` Joseph Myers
2015-10-21 18:11       ` Richard Henderson
2015-10-22 12:48         ` Richard Biener
2015-10-22 11:01     ` Kugan
2015-10-22 14:24       ` Richard Biener
2015-10-27  1:48         ` kugan
2015-10-28 15:51           ` Richard Biener
2015-11-02  9:17             ` Kugan
2015-11-03 14:40               ` Richard Biener
2015-11-08  9:43                 ` Kugan
2015-11-10 14:13                   ` Richard Biener
2015-11-12  6:08                     ` Kugan
2015-11-14  1:15                       ` Kugan
2015-11-18 14:04                         ` Richard Biener
2015-11-18 15:06                           ` Richard Biener
2015-11-24  2:52                             ` Kugan
2015-12-10  0:27                               ` Kugan
2015-12-16 13:18                                 ` Richard Biener