* Question about GCC benchmarks and uninitialized variables @ 2018-07-24 8:48 Fredrik Hederstierna 2018-07-24 9:04 ` Alexander Monakov ` (2 more replies) 0 siblings, 3 replies; 19+ messages in thread From: Fredrik Hederstierna @ 2018-07-24 8:48 UTC (permalink / raw) To: gcc Hi This is a general question to all you working with GCC benchmarking. I have been working with code benchmarks like CSiBE for ARM. From time to time unpredicted results appears where numbers gets worse by no reason. When looking into what could cause this unpredictable behaviour, I found that there are (at least for CSiBE), tons of code warnings that could cause unpredictable code generation like -Wuninitialized and -Wmaybe-uninitialized. Alot of these warning indicates real bugs. I added this issue in bugzilla, #Bug 85880 - "Different code generation for uninitialized variables" Though it got (correctly) rejected, since its not a bug. But still I'm thinking how this apply to benchmarking code, and to how to approach and address this fact. So my question is how to approach this problems when doing benchmarking, ofcourse we want the benchmark to mirror as near as 'real life' code as possible. But if code contains real bugs, and issues that could cause unpredictable code generation, should such code be banned from benchmarking, since results might be misleading? On the other hand, the compiler should generate best code for any input? What do you think, should benchmarking code not being allowed to have eg warnings like -Wuninitialized and maybe -Wmaybe-uninitialized? Are there more warnings that indicate unpredictable code generations due to bad code, or are the root cause that these are 'bugs', and we should not allow real bugs at all in benchmarking code? I've read some about uninitialized variable issues also at this link which was interesting, its a wider discussion, and ofcourse a very hard problem to solve. https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings Thanks, Best Regards, Fredrik ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Question about GCC benchmarks and uninitialized variables 2018-07-24 8:48 Question about GCC benchmarks and uninitialized variables Fredrik Hederstierna @ 2018-07-24 9:04 ` Alexander Monakov 2018-07-24 12:17 ` David Brown 2019-05-30 9:13 ` Question about GCC not warning for some noncompliant SEI CERT C code examples Fredrik Hederstierna 2 siblings, 0 replies; 19+ messages in thread From: Alexander Monakov @ 2018-07-24 9:04 UTC (permalink / raw) To: Fredrik Hederstierna; +Cc: gcc On Tue, 24 Jul 2018, Fredrik Hederstierna wrote: > So my question is how to approach this problems when doing benchmarking, > ofcourse we want the benchmark to mirror as near as 'real life' code as > possible. But if code contains real bugs, and issues that could cause > unpredictable code generation, should such code be banned from benchmarking, > since results might be misleading? Well, all benchmarks are going to be imperfect reflections of real-life workloads in the first place, so their bugs just increase the degree to which they are misleading. When a new compiler version starts to treat some undefined piece of code differently, it can cause a range of effects from code size perturbations as in your case, to completely invalidating the benchmark as in spec2k6 x264 benchmark's case (where GCC exploited undefined behavior in a loop, turning it to an infinite loop that eventually segfaulted). Perhaps even though results on individual benchmarks can vary wildly, aggregated results across a wide range of non-toy benchmarks may be indicative of ... something, because they are unlikely to all exhibit the same "bugs". > On the other hand, the compiler should > generate best code for any input? Engineering effort is limited, so it's probably better to go for generating good code for inputs that are likely to resemble actively used code (and in actively used&maintained code, bugs can be reported and fixed) :) > What do you think, should benchmarking code not being allowed to have eg > warnings like -Wuninitialized and maybe -Wmaybe-uninitialized? Are there more > warnings that indicate unpredictable code generations due to bad code, or are > the root cause that these are 'bugs', and we should not allow real bugs at all > in benchmarking code? A blanket ban on warnings won't work, they have false positives (especially the -Wmaybe- one), and there exist code that validly uses uninitialized data. I don't have such a striking example for scalar variables, but for uninitialized arrays there's this sparse set algorithm (which GCC itself also uses): https://research.swtch.com/sparse I think good benchmarks sets should be able to evolve to account for newly discovered bugs, rather then remain frozen (which sounds like a reason to become obsolete sooner rather than later). Alexander ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Question about GCC benchmarks and uninitialized variables 2018-07-24 8:48 Question about GCC benchmarks and uninitialized variables Fredrik Hederstierna 2018-07-24 9:04 ` Alexander Monakov @ 2018-07-24 12:17 ` David Brown 2019-05-30 9:13 ` Question about GCC not warning for some noncompliant SEI CERT C code examples Fredrik Hederstierna 2 siblings, 0 replies; 19+ messages in thread From: David Brown @ 2018-07-24 12:17 UTC (permalink / raw) To: Fredrik Hederstierna, gcc On 24/07/18 09:40, Fredrik Hederstierna wrote: > Hi > > This is a general question to all you working with GCC benchmarking. > > I have been working with code benchmarks like CSiBE for ARM. From > time to time unpredicted results appears where numbers gets worse by > no reason. > > When looking into what could cause this unpredictable behaviour, I > found that there are (at least for CSiBE), tons of code warnings that > could cause unpredictable code generation like -Wuninitialized and > -Wmaybe-uninitialized. Alot of these warning indicates real bugs. > > I added this issue in bugzilla, #Bug 85880 - "Different code > generation for uninitialized variables" Though it got (correctly) > rejected, since its not a bug. But still I'm thinking how this apply > to benchmarking code, and to how to approach and address this fact. > > So my question is how to approach this problems when doing > benchmarking, ofcourse we want the benchmark to mirror as near as > 'real life' code as possible. But if code contains real bugs, and > issues that could cause unpredictable code generation, should such > code be banned from benchmarking, since results might be misleading? > On the other hand, the compiler should generate best code for any > input? > > What do you think, should benchmarking code not being allowed to have > eg warnings like -Wuninitialized and maybe -Wmaybe-uninitialized? Are > there more warnings that indicate unpredictable code generations due > to bad code, or are the root cause that these are 'bugs', and we > should not allow real bugs at all in benchmarking code? > > I've read some about uninitialized variable issues also at this link > which was interesting, its a wider discussion, and ofcourse a very > hard problem to solve. > https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings > > Thanks, Best Regards, Fredrik > AFAIUI, "-Wuninitialized" should not give false positives - if it triggers a warning, it is because you are using a variable without initialisation. The code is clearly wrong, and should not be used for real code or for benchmarking. "-Wmaybe-unitialized" indicates that there /could/ be a problem - it is code that you need to look at carefully. I'd say that in benchmarking code, you want to be very careful to generate predictable and deterministic results - and that means no use of uninitialised data. You need to get as close to perfectly repeatable results as the hardware allows - only then can you know what changes the performance of the code or hardware. If your results and timings depend on what values happen to lie in memory before the benchmark runs, you are guessing, not measuring. If you get a "-Wuninitialized" warning, it's a bug in the code, and needs fixed. A "-Wmaybe-uninitialized" warning may be a false positive, so you might want to modify code to avoid such false positives (so that you can see the real problems found by the warning). The gcc manual gives this example: { int x; switch (y) { case 1: x = 1; break; case 2: x = 4; break; case 3: x = 5; } foo (x); } A suggested fix is an "assert(0)" in the default case. You could also have a "__builtin_unreachable()" instead - this will avoid any extra run-time code, but of course you have to be very sure that x is valid or you have serious undefined behaviour. A lazy workaround would be to use "int x = x;". ^ permalink raw reply [flat|nested] 19+ messages in thread
* Question about GCC not warning for some noncompliant SEI CERT C code examples 2018-07-24 8:48 Question about GCC benchmarks and uninitialized variables Fredrik Hederstierna 2018-07-24 9:04 ` Alexander Monakov 2018-07-24 12:17 ` David Brown @ 2019-05-30 9:13 ` Fredrik Hederstierna 2019-05-30 14:28 ` Martin Sebor 2019-06-01 23:35 ` ARM peephole2 from 2003 never merged, still valid Fredrik Hederstierna 2 siblings, 2 replies; 19+ messages in thread From: Fredrik Hederstierna @ 2019-05-30 9:13 UTC (permalink / raw) To: gcc Hi When reading the SEI CERT C Coding Standard rules, looking at "DCL30-C. Declare objects with appropriate storage durations" it seem like GCC does not warn in compile-time for some noncompliant examples. I know eg AddressSanitizer and several runtime running tools finds these bugs, but it would be convenient of GCC could do some basic static analysis already in compile-time to avoid bad code generation. Some static analysers finds these bugs, but not all, and GCC does not warn. Example from DCL30-C, not warned by GCC: /* NONCOMPLIANT EXAMPLE-1 */ #include <stdio.h> const char *p; void dont_do_this(void) { const char c_str[] = "This will change"; p = c_str; /* Dangerous */ } void innocuous(void) { printf("%s\n", p); } int main(void) { dont_do_this(); innocuous(); return 0; } /* NONCOMPLIANT EXAMPLE-2 */ void squirrel_away(char **ptr_param) { char local[10]; /* Initialize array */ *ptr_param = local; } void rodent(void) { char *ptr; squirrel_away(&ptr); /* ptr is live but invalid here */ } Question, where in GCC is the most appropriate place to implements such a checker? I know there are some warnings for return-local-addr, and null-pointer-dereference in some different parts, but this seems different? Can it be found be points-to analysis, or where is it best to put this warning if being implemented? Reference: https://wiki.sei.cmu.edu/confluence/display/c/DCL30-C.+Declare+objects+with+appropriate+storage+durations ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Question about GCC not warning for some noncompliant SEI CERT C code examples 2019-05-30 9:13 ` Question about GCC not warning for some noncompliant SEI CERT C code examples Fredrik Hederstierna @ 2019-05-30 14:28 ` Martin Sebor 2019-05-30 14:48 ` Jeff Law 2019-06-01 23:35 ` ARM peephole2 from 2003 never merged, still valid Fredrik Hederstierna 1 sibling, 1 reply; 19+ messages in thread From: Martin Sebor @ 2019-05-30 14:28 UTC (permalink / raw) To: Fredrik Hederstierna, gcc On 5/30/19 3:12 AM, Fredrik Hederstierna wrote: > Hi > > When reading the SEI CERT C Coding Standard rules, looking at > "DCL30-C. Declare objects with appropriate storage durations" > it seem like GCC does not warn in compile-time for some noncompliant examples. > > I know eg AddressSanitizer and several runtime running tools finds these bugs, > but it would be convenient of GCC could do some basic static analysis already in compile-time to avoid bad code generation. > Some static analysers finds these bugs, but not all, and GCC does not warn. > > Example from DCL30-C, not warned by GCC: > > > /* NONCOMPLIANT EXAMPLE-1 */ > #include <stdio.h> > const char *p; > void dont_do_this(void) { > const char c_str[] = "This will change"; > p = c_str; /* Dangerous */ > } > void innocuous(void) { > printf("%s\n", p); > } > int main(void) { > dont_do_this(); > innocuous(); > return 0; > } > > > /* NONCOMPLIANT EXAMPLE-2 */ > void squirrel_away(char **ptr_param) { > char local[10]; > /* Initialize array */ > *ptr_param = local; > } > void rodent(void) { > char *ptr; > squirrel_away(&ptr); > /* ptr is live but invalid here */ > } Agreed. This (or a subset of the problem) is being tracked in PR 69433. > > Question, where in GCC is the most appropriate place to implements such a checker? > I know there are some warnings for return-local-addr, and null-pointer-dereference in some different parts, but this seems different? > Can it be found be points-to analysis, or where is it best to put this warning if being implemented? To me it seems close enough to -Wreturn-local-addr to be implemented in the same place, in gimple-ssa-isolate-paths.c. It's worth noting that besides warning, the pass also clears the returning address and isolates the path that returns the null in the CFG. As it happens, I recently submitted a modest enhancement to -Wreturn-local-addr to detect returning the addresses of VLAs and pointers obtained from alloca (PR 71924 and PR 90549). I'm working on extending the implementation to returning freed pointers (under a separate warning option). Besides the problems you mention, there is also a related request to diagnose dereferencing pointers to compound literals whose lifetime has ended (PR 89990), or more generally, those to any such local object. In the enhancement I submitted I chose not to use the alias oracle mainly because I didn't want to change the existing design. I'm not familiar enough with to tell with confidence if it can be used to obtain the same results. I.e., identify each instance of a local variable that a pointer may point to, rather than just answering the broad question: does this pointer point to any local variables? If it can, using it as Jeff suggests in his comments, would make sense. I don't think moving it out of gimple-ssa-isolate-paths.c is necessary (or even a good idea). Martin > > Reference: > https://wiki.sei.cmu.edu/confluence/display/c/DCL30-C.+Declare+objects+with+appropriate+storage+durations > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Question about GCC not warning for some noncompliant SEI CERT C code examples 2019-05-30 14:28 ` Martin Sebor @ 2019-05-30 14:48 ` Jeff Law 0 siblings, 0 replies; 19+ messages in thread From: Jeff Law @ 2019-05-30 14:48 UTC (permalink / raw) To: Martin Sebor, Fredrik Hederstierna, gcc On 5/30/19 8:28 AM, Martin Sebor wrote: > On 5/30/19 3:12 AM, Fredrik Hederstierna wrote: >> Hi >> >> When reading the SEI CERT C Coding Standard rules, looking at >> "DCL30-C. Declare objects with appropriate storage durations" >> it seem like GCC does not warn in compile-time for some noncompliant >> examples. >> >> I know eg AddressSanitizer and several runtime running tools finds >> these bugs, >> but it would be convenient of GCC could do some basic static analysis >> already in compile-time to avoid bad code generation. >> Some static analysers finds these bugs, but not all, and GCC does not >> warn. >> >> Example from DCL30-C, not warned by GCC: >> >> >> /* NONCOMPLIANT EXAMPLE-1 */ >> #include <stdio.h> >> const char *p; >> void dont_do_this(void) { >>   const char c_str[] = "This will change"; >>   p = c_str; /* Dangerous */ >> } >> void innocuous(void) { >>   printf("%s\n", p); >> } >> int main(void) { >>   dont_do_this(); >>   innocuous(); >>   return 0; >> } >> >> >> /* NONCOMPLIANT EXAMPLE-2 */ >> void squirrel_away(char **ptr_param) { >>   char local[10]; >>   /* Initialize array */ >>   *ptr_param = local; >> } >> void rodent(void) { >>   char *ptr; >>   squirrel_away(&ptr); >>   /* ptr is live but invalid here */ >> } > > Agreed. This (or a subset of the problem) is being tracked in > PR 69433. > >> >> Question, where in GCC is the most appropriate place to implements >> such a checker? >> I know there are some warnings for return-local-addr, and >> null-pointer-dereference in some different parts, but this seems >> different? >> Can it be found be points-to analysis, or where is it best to put this >> warning if being implemented? > > To me it seems close enough to -Wreturn-local-addr to be implemented > in the same place, in gimple-ssa-isolate-paths.c. It's worth noting > that besides warning, the pass also clears the returning address and > isolates the path that returns the null in the CFG. > > As it happens, I recently submitted a modest enhancement to > -Wreturn-local-addr to detect returning the addresses of VLAs and > pointers obtained from alloca (PR 71924 and PR 90549). I'm working > on extending the implementation to returning freed pointers (under > a separate warning option). > > Besides the problems you mention, there is also a related request > to diagnose dereferencing pointers to compound literals whose > lifetime has ended (PR 89990), or more generally, those to any > such local object. > > In the enhancement I submitted I chose not to use the alias oracle > mainly because I didn't want to change the existing design. I'm > not familiar enough with to tell with confidence if it can be used > to obtain the same results. I.e., identify each instance of > a local variable that a pointer may point to, rather than just > answering the broad question: does this pointer point to any > local variables? If it can, using it as Jeff suggests in his > comments, would make sense. I don't think moving it out of > gimple-ssa-isolate-paths.c is necessary (or even a good idea). Note that you may also be able to use the alias oracle to detect escaping locals. That's likely to have a higher false positive rate, but may still be useful for detecting this kind of stuff. jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
* ARM peephole2 from 2003 never merged, still valid 2019-05-30 9:13 ` Question about GCC not warning for some noncompliant SEI CERT C code examples Fredrik Hederstierna 2019-05-30 14:28 ` Martin Sebor @ 2019-06-01 23:35 ` Fredrik Hederstierna 2019-06-01 23:41 ` Fredrik Hederstierna 1 sibling, 1 reply; 19+ messages in thread From: Fredrik Hederstierna @ 2019-06-01 23:35 UTC (permalink / raw) To: gcc I found this patch from 2003 by Tamas Gergely that was never merged. Though it seems to still be valid, since when applying it, the CSiBE benchmark reduce total code size -170 bytes, making 35 out of 893 files smaller (from -4 to -12 bytes (1-3 instructions). Bug asd asd asd updated patch: diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 20bdc2bec37..4c924499ad9 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2019-06-01 Fredrik Hederstierna <fredrik@hederstierna.com> + Tamas Gergely <gertom@rgai.hu> + + * gcc/config/arm/arm.md: new peephole2 pattern added for extending + move and compare parallelization. Retake from 2003 patch adapted. + 2019-06-01 Martin Sebor <msebor@redhat.com> PR middle-end/90694 diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index ae582172ab9..753d18fbfc5 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -10951,6 +10951,19 @@ "" ) +; The same as the previous, but with exchanged operands +; +(define_peephole2 + [(set (match_operand:SI 0 "arm_general_register_operand" "") + (match_operand:SI 1 "arm_general_register_operand" "")) + (set (reg:CC CC_REGNUM) + (compare:CC (match_dup 0) (const_int 0)))] + "TARGET_ARM" + [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0))) + (set (match_dup 0) (match_dup 1))])] + "" +) + (define_split [(set (match_operand:SI 0 "s_register_operand" "") (and:SI (ge:SI (match_operand:SI 1 "s_register_operand" "") ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-01 23:35 ` ARM peephole2 from 2003 never merged, still valid Fredrik Hederstierna @ 2019-06-01 23:41 ` Fredrik Hederstierna 2019-06-02 12:28 ` Segher Boessenkool 0 siblings, 1 reply; 19+ messages in thread From: Fredrik Hederstierna @ 2019-06-01 23:41 UTC (permalink / raw) To: gcc (Sorry, my mail program crashed... trying repost, correct now...) I found this patch from 2003 by Tamas Gergely that was never merged for ARM. Though it seems to still be valid, since when applying it, the CSiBE benchmark reduce total code size -170 bytes, making 35 out of 893 files smaller (from -4 to -12 bytes (1-3 instructions). Could it be considered to apply it, since it abviously still is valid and reduce code size for ARM 32bit. The peephole2 pattern still is not matched. Bug Bug 9663<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663> - [arm] gcc-20030127 misses an optimization opportunity https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663 Message https://gcc.gnu.org/ml/gcc-patches/2003-02/msg00204.html Adapted and updated patch from Tamas Gergely: diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 20bdc2bec37..4c924499ad9 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,9 @@ +2019-06-01 Fredrik Hederstierna <fredrik@hederstierna.com> + Tamas Gergely <gertom@rgai.hu> + + * gcc/config/arm/arm.md: new peephole2 pattern added for extending + move and compare parallelization. Retake from 2003 patch adapted. + 2019-06-01 Martin Sebor <msebor@redhat.com> PR middle-end/90694 diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index ae582172ab9..753d18fbfc5 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -10951,6 +10951,19 @@ "" ) +; The same as the previous, but with exchanged operands +; +(define_peephole2 + [(set (match_operand:SI 0 "arm_general_register_operand" "") + (match_operand:SI 1 "arm_general_register_operand" "")) + (set (reg:CC CC_REGNUM) + (compare:CC (match_dup 0) (const_int 0)))] + "TARGET_ARM" + [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0))) + (set (match_dup 0) (match_dup 1))])] + "" +) + (define_split [(set (match_operand:SI 0 "s_register_operand" "") (and:SI (ge:SI (match_operand:SI 1 "s_register_operand" "") ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-01 23:41 ` Fredrik Hederstierna @ 2019-06-02 12:28 ` Segher Boessenkool 2019-06-05 23:03 ` Jeff Law 0 siblings, 1 reply; 19+ messages in thread From: Segher Boessenkool @ 2019-06-02 12:28 UTC (permalink / raw) To: Fredrik Hederstierna; +Cc: gcc On Sat, Jun 01, 2019 at 11:41:30PM +0000, Fredrik Hederstierna wrote: > +(define_peephole2 > + [(set (match_operand:SI 0 "arm_general_register_operand" "") > + (match_operand:SI 1 "arm_general_register_operand" "")) > + (set (reg:CC CC_REGNUM) > + (compare:CC (match_dup 0) (const_int 0)))] > + "TARGET_ARM" > + [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0))) > + (set (match_dup 0) (match_dup 1))])] > + "" > +) Hi Fredrik, Do you have a testcase for this? I wonder if it would be better handled during combine, and what that then tried; or perhaps these opportunities are created later, making a peephole a more attractive solution. Segher ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-02 12:28 ` Segher Boessenkool @ 2019-06-05 23:03 ` Jeff Law 2019-06-05 23:46 ` Segher Boessenkool 0 siblings, 1 reply; 19+ messages in thread From: Jeff Law @ 2019-06-05 23:03 UTC (permalink / raw) To: Segher Boessenkool, Fredrik Hederstierna; +Cc: gcc On 6/2/19 6:28 AM, Segher Boessenkool wrote: > On Sat, Jun 01, 2019 at 11:41:30PM +0000, Fredrik Hederstierna wrote: >> +(define_peephole2 >> + [(set (match_operand:SI 0 "arm_general_register_operand" "") >> + (match_operand:SI 1 "arm_general_register_operand" "")) >> + (set (reg:CC CC_REGNUM) >> + (compare:CC (match_dup 0) (const_int 0)))] >> + "TARGET_ARM" >> + [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0))) >> + (set (match_dup 0) (match_dup 1))])] >> + "" >> +) > > Hi Fredrik, > > Do you have a testcase for this? I wonder if it would be better handled > during combine, and what that then tried; or perhaps these opportunities > are created later, making a peephole a more attractive solution. We have two independent insns with no output/true dependency between them. So there's really not anything for combine to do here. What is more interesting to me is whether or not this could be handled by compare-elim and a define_insn rather than a define_peephole2. THe existing pattern and the new one both seem well suited for compare-elim. I do think a testcase is warranted here. Fredrik, if you could reduce one from CSiBE that might be appropriate. jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-05 23:03 ` Jeff Law @ 2019-06-05 23:46 ` Segher Boessenkool 2019-06-06 9:13 ` Richard Earnshaw (lists) 0 siblings, 1 reply; 19+ messages in thread From: Segher Boessenkool @ 2019-06-05 23:46 UTC (permalink / raw) To: Jeff Law; +Cc: Fredrik Hederstierna, gcc On Wed, Jun 05, 2019 at 05:02:53PM -0600, Jeff Law wrote: > On 6/2/19 6:28 AM, Segher Boessenkool wrote: > > On Sat, Jun 01, 2019 at 11:41:30PM +0000, Fredrik Hederstierna wrote: > >> +(define_peephole2 > >> + [(set (match_operand:SI 0 "arm_general_register_operand" "") > >> + (match_operand:SI 1 "arm_general_register_operand" "")) > >> + (set (reg:CC CC_REGNUM) > >> + (compare:CC (match_dup 0) (const_int 0)))] > >> + "TARGET_ARM" > >> + [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0))) > >> + (set (match_dup 0) (match_dup 1))])] > >> + "" > >> +) > > > > Do you have a testcase for this? I wonder if it would be better handled > > during combine, and what that then tried; or perhaps these opportunities > > are created later, making a peephole a more attractive solution. > We have two independent insns with no output/true dependency between > them. So there's really not anything for combine to do here. op0 := op1; CC := op0 cmp 0; That's a perfectly fine dependency I think? > What is more interesting to me is whether or not this could be handled > by compare-elim and a define_insn rather than a define_peephole2. THe > existing pattern and the new one both seem well suited for compare-elim. > > I do think a testcase is warranted here. Fredrik, if you could reduce > one from CSiBE that might be appropriate. Yeah, let's see what is really going on :-) Segher ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-05 23:46 ` Segher Boessenkool @ 2019-06-06 9:13 ` Richard Earnshaw (lists) 2019-06-06 14:02 ` Segher Boessenkool 0 siblings, 1 reply; 19+ messages in thread From: Richard Earnshaw (lists) @ 2019-06-06 9:13 UTC (permalink / raw) To: Segher Boessenkool, Jeff Law; +Cc: Fredrik Hederstierna, gcc On 06/06/2019 00:46, Segher Boessenkool wrote: > On Wed, Jun 05, 2019 at 05:02:53PM -0600, Jeff Law wrote: >> On 6/2/19 6:28 AM, Segher Boessenkool wrote: >>> On Sat, Jun 01, 2019 at 11:41:30PM +0000, Fredrik Hederstierna wrote: >>>> +(define_peephole2 >>>> + [(set (match_operand:SI 0 "arm_general_register_operand" "") >>>> + (match_operand:SI 1 "arm_general_register_operand" "")) >>>> + (set (reg:CC CC_REGNUM) >>>> + (compare:CC (match_dup 0) (const_int 0)))] >>>> + "TARGET_ARM" >>>> + [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0))) >>>> + (set (match_dup 0) (match_dup 1))])] >>>> + "" >>>> +) >>> >>> Do you have a testcase for this? I wonder if it would be better handled >>> during combine, and what that then tried; or perhaps these opportunities >>> are created later, making a peephole a more attractive solution. >> We have two independent insns with no output/true dependency between >> them. So there's really not anything for combine to do here. > > op0 := op1; > CC := op0 cmp 0; > > That's a perfectly fine dependency I think? But if op0 isn't dead after the second insn, combine won't consider it, right? Even though in this case the combination would be safe as we still produce op0. R. > >> What is more interesting to me is whether or not this could be handled >> by compare-elim and a define_insn rather than a define_peephole2. THe >> existing pattern and the new one both seem well suited for compare-elim. >> >> I do think a testcase is warranted here. Fredrik, if you could reduce >> one from CSiBE that might be appropriate. > > Yeah, let's see what is really going on :-) > > > Segher > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-06 9:13 ` Richard Earnshaw (lists) @ 2019-06-06 14:02 ` Segher Boessenkool 2019-06-06 14:55 ` Fredrik Hederstierna 0 siblings, 1 reply; 19+ messages in thread From: Segher Boessenkool @ 2019-06-06 14:02 UTC (permalink / raw) To: Richard Earnshaw (lists); +Cc: Jeff Law, Fredrik Hederstierna, gcc On Thu, Jun 06, 2019 at 10:12:57AM +0100, Richard Earnshaw (lists) wrote: > On 06/06/2019 00:46, Segher Boessenkool wrote: > > On Wed, Jun 05, 2019 at 05:02:53PM -0600, Jeff Law wrote: > >> On 6/2/19 6:28 AM, Segher Boessenkool wrote: > >>> Do you have a testcase for this? I wonder if it would be better handled > >>> during combine, and what that then tried; or perhaps these opportunities > >>> are created later, making a peephole a more attractive solution. > >> We have two independent insns with no output/true dependency between > >> them. So there's really not anything for combine to do here. > > > > op0 := op1; > > CC := op0 cmp 0; > > > > That's a perfectly fine dependency I think? > > But if op0 isn't dead after the second insn, combine won't consider it, > right? Even though in this case the combination would be safe as we > still produce op0. That doesn't stop combine from considering it. It does make that first SET survive, so that you get a parallel as final insn. It may not like that one of the parallel SETs is just a move. Needs testcase :-) Segher ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-06 14:02 ` Segher Boessenkool @ 2019-06-06 14:55 ` Fredrik Hederstierna 2019-06-06 16:06 ` Richard Earnshaw (lists) 0 siblings, 1 reply; 19+ messages in thread From: Fredrik Hederstierna @ 2019-06-06 14:55 UTC (permalink / raw) To: Segher Boessenkool, Richard Earnshaw (lists); +Cc: Jeff Law, gcc > From: Segher Boessenkool <segher@kernel.crashing.org> > Sent: Thursday, June 6, 2019 4:02 PM > To: Richard Earnshaw (lists) > Cc: Jeff Law; Fredrik Hederstierna; gcc@gcc.gnu.org > Subject: Re: ARM peephole2 from 2003 never merged, still valid > That doesn't stop combine from considering it. It does make that first SET > survive, so that you get a parallel as final insn. It may not like that > one of the parallel SETs is just a move. Needs testcase :-) Hi all, thanks for investigating this, I added semi-stripped testcase in original issue taken from CSiBE teem sources See attachment in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663 Tested gcc-9.1.0 for ARM 32bit targets, first without peephole2 patch: 00000000 <nrrdRangeSet>: 0: e92d407f push {r0, r1, r2, r3, r4, r5, r6, lr} 4: e2504000 subs r4, r0, #0 8: 0a00003f beq 10c <nrrdRangeSet+0x10c> c: e3510000 cmp r1, #0 10: e1a05001 mov r5, r1 then with new peephole2 patch: 00000000 <nrrdRangeSet>: 0: e92d407f push {r0, r1, r2, r3, r4, r5, r6, lr} 4: e2504000 subs r4, r0, #0 8: 0a00003e beq 108 <nrrdRangeSet+0x108> c: e2515000 subs r5, r1, #0 Thanks, Fredrik ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-06 14:55 ` Fredrik Hederstierna @ 2019-06-06 16:06 ` Richard Earnshaw (lists) 2019-06-06 16:49 ` Segher Boessenkool 2019-06-14 20:30 ` Jeff Law 0 siblings, 2 replies; 19+ messages in thread From: Richard Earnshaw (lists) @ 2019-06-06 16:06 UTC (permalink / raw) To: Fredrik Hederstierna, Segher Boessenkool; +Cc: Jeff Law, gcc On 06/06/2019 15:55, Fredrik Hederstierna wrote: >> From: Segher Boessenkool <segher@kernel.crashing.org> >> Sent: Thursday, June 6, 2019 4:02 PM >> To: Richard Earnshaw (lists) >> Cc: Jeff Law; Fredrik Hederstierna; gcc@gcc.gnu.org >> Subject: Re: ARM peephole2 from 2003 never merged, still valid >  >> That doesn't stop combine from considering it. It does make that first SET >> survive, so that you get a parallel as final insn. It may not like that >> one of the parallel SETs is just a move. Needs testcase :-) > > Hi all, thanks for investigating this, > I added semi-stripped testcase in original issue taken from CSiBE teem sources > > See attachment in > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663 > > Tested gcc-9.1.0 for ARM 32bit targets, first without peephole2 patch: > > 00000000 <nrrdRangeSet>: > 0: e92d407f push {r0, r1, r2, r3, r4, r5, r6, lr} > 4: e2504000 subs r4, r0, #0 > 8: 0a00003f beq 10c <nrrdRangeSet+0x10c> > c: e3510000 cmp r1, #0 > 10: e1a05001 mov r5, r1 > > then with new peephole2 patch: > > 00000000 <nrrdRangeSet>: > 0: e92d407f push {r0, r1, r2, r3, r4, r5, r6, lr} > 4: e2504000 subs r4, r0, #0 > 8: 0a00003e beq 108 <nrrdRangeSet+0x108> > c: e2515000 subs r5, r1, #0 > > Thanks, Fredrik > The reason combine doesn't catch this is because at the time it runs the MOV is in a different basic block. Later on it is sunk into the same basic block, but it's then too late to do the merge. R. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-06 16:06 ` Richard Earnshaw (lists) @ 2019-06-06 16:49 ` Segher Boessenkool 2019-07-08 6:41 ` Fredrik Hederstierna 2019-06-14 20:30 ` Jeff Law 1 sibling, 1 reply; 19+ messages in thread From: Segher Boessenkool @ 2019-06-06 16:49 UTC (permalink / raw) To: Richard Earnshaw (lists); +Cc: Fredrik Hederstierna, Jeff Law, gcc On Thu, Jun 06, 2019 at 05:06:35PM +0100, Richard Earnshaw (lists) wrote: > The reason combine doesn't catch this is because at the time it runs the > MOV is in a different basic block. Later on it is sunk into the same > basic block, but it's then too late to do the merge. Or you could say the MOV didn't even exist yet: the insn that is merged by the peephole is created by the prologue code, eventually. This isn't really a target problem, it is very much generic, but I don't see a better solution than a peephole either. Segher ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-06 16:49 ` Segher Boessenkool @ 2019-07-08 6:41 ` Fredrik Hederstierna 2019-07-09 19:13 ` Jeff Law 0 siblings, 1 reply; 19+ messages in thread From: Fredrik Hederstierna @ 2019-07-08 6:41 UTC (permalink / raw) To: Segher Boessenkool, Richard Earnshaw (lists); +Cc: Jeff Law, gcc > From: Segher Boessenkool <segher@kernel.crashing.org> > Sent: Thursday, June 6, 2019 6:49 PM > To: Richard Earnshaw (lists) > Cc: Fredrik Hederstierna; Jeff Law; gcc@gcc.gnu.org > Subject: Re: ARM peephole2 from 2003 never merged, still valid > > > On Thu, Jun 06, 2019 at 05:06:35PM +0100, Richard Earnshaw (lists) wrote: > > The reason combine doesn't catch this is because at the time it runs the > > MOV is in a different basic block. Later on it is sunk into the same > > basic block, but it's then too late to do the merge. > > Or you could say the MOV didn't even exist yet: the insn that is merged > by the peephole is created by the prologue code, eventually. > > This isn't really a target problem, it is very much generic, but I don't > see a better solution than a peephole either. > > Segher So what is the conclusion, should be re-open PR-9663 and try to fix the missing peephole? I think there will always be cases where the code generated target specific register optimizations are possible, that not necessarily have any direct connection of the actual basic blocks but more by chance which instructions are finally generated after each other, where maybe eg peephole-alike optimizers are the only solution to do the job? /Fredrik ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-07-08 6:41 ` Fredrik Hederstierna @ 2019-07-09 19:13 ` Jeff Law 0 siblings, 0 replies; 19+ messages in thread From: Jeff Law @ 2019-07-09 19:13 UTC (permalink / raw) To: Fredrik Hederstierna, Segher Boessenkool, Richard Earnshaw (lists); +Cc: gcc On 7/8/19 12:41 AM, Fredrik Hederstierna wrote: >> From: Segher Boessenkool <segher@kernel.crashing.org> Sent: >> Thursday, June 6, 2019 6:49 PM To: Richard Earnshaw (lists) Cc: >> Fredrik Hederstierna; Jeff Law; gcc@gcc.gnu.org Subject: Re: ARM >> peephole2 from 2003 never merged, still valid >> >> >> On Thu, Jun 06, 2019 at 05:06:35PM +0100, Richard Earnshaw (lists) >> wrote: >>> The reason combine doesn't catch this is because at the time it >>> runs the MOV is in a different basic block. Later on it is sunk >>> into the same basic block, but it's then too late to do the >>> merge. >> >> Or you could say the MOV didn't even exist yet: the insn that is >> merged by the peephole is created by the prologue code, >> eventually. >> >> This isn't really a target problem, it is very much generic, but I >> don't see a better solution than a peephole either. >> >> Segher > > So what is the conclusion, should be re-open PR-9663 and try to fix > the missing peephole? I think there will always be cases where the > code generated target specific register optimizations are possible, > that not necessarily have any direct connection of the actual basic > blocks but more by chance which instructions are finally generated > after each other, where maybe eg peephole-alike optimizers are the > only solution to do the job? /Fredrik > I think it's largely up to the ARM maintainers. Based on my understanding of the thread, I'd go with the peephole2. jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: ARM peephole2 from 2003 never merged, still valid 2019-06-06 16:06 ` Richard Earnshaw (lists) 2019-06-06 16:49 ` Segher Boessenkool @ 2019-06-14 20:30 ` Jeff Law 1 sibling, 0 replies; 19+ messages in thread From: Jeff Law @ 2019-06-14 20:30 UTC (permalink / raw) To: Richard Earnshaw (lists), Fredrik Hederstierna, Segher Boessenkool; +Cc: gcc On 6/6/19 10:06 AM, Richard Earnshaw (lists) wrote: > On 06/06/2019 15:55, Fredrik Hederstierna wrote: >>> From: Segher Boessenkool <segher@kernel.crashing.org> >>> Sent: Thursday, June 6, 2019 4:02 PM >>> To: Richard Earnshaw (lists) >>> Cc: Jeff Law; Fredrik Hederstierna; gcc@gcc.gnu.org >>> Subject: Re: ARM peephole2 from 2003 never merged, still valid >>  >>> That doesn't stop combine from considering it. It does make that first SET >>> survive, so that you get a parallel as final insn. It may not like that >>> one of the parallel SETs is just a move. Needs testcase :-) >> >> Hi all, thanks for investigating this, >> I added semi-stripped testcase in original issue taken from CSiBE teem sources >> >> See attachment in >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663 >> >> Tested gcc-9.1.0 for ARM 32bit targets, first without peephole2 patch: >> >> 00000000 <nrrdRangeSet>: >> 0: e92d407f push {r0, r1, r2, r3, r4, r5, r6, lr} >> 4: e2504000 subs r4, r0, #0 >> 8: 0a00003f beq 10c <nrrdRangeSet+0x10c> >> c: e3510000 cmp r1, #0 >> 10: e1a05001 mov r5, r1 >> >> then with new peephole2 patch: >> >> 00000000 <nrrdRangeSet>: >> 0: e92d407f push {r0, r1, r2, r3, r4, r5, r6, lr} >> 4: e2504000 subs r4, r0, #0 >> 8: 0a00003e beq 108 <nrrdRangeSet+0x108> >> c: e2515000 subs r5, r1, #0 >> >> Thanks, Fredrik >> > > The reason combine doesn't catch this is because at the time it runs the > MOV is in a different basic block. Later on it is sunk into the same > basic block, but it's then too late to do the merge. And while this looks well suited for compare-elim, nobody's done the work to enable compare-elim on ARM. It may also be the case that compare-elim is still too early, at least for the testcase in the PR. So should we just go with the peephole2? jeff ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2019-07-09 19:13 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-07-24 8:48 Question about GCC benchmarks and uninitialized variables Fredrik Hederstierna 2018-07-24 9:04 ` Alexander Monakov 2018-07-24 12:17 ` David Brown 2019-05-30 9:13 ` Question about GCC not warning for some noncompliant SEI CERT C code examples Fredrik Hederstierna 2019-05-30 14:28 ` Martin Sebor 2019-05-30 14:48 ` Jeff Law 2019-06-01 23:35 ` ARM peephole2 from 2003 never merged, still valid Fredrik Hederstierna 2019-06-01 23:41 ` Fredrik Hederstierna 2019-06-02 12:28 ` Segher Boessenkool 2019-06-05 23:03 ` Jeff Law 2019-06-05 23:46 ` Segher Boessenkool 2019-06-06 9:13 ` Richard Earnshaw (lists) 2019-06-06 14:02 ` Segher Boessenkool 2019-06-06 14:55 ` Fredrik Hederstierna 2019-06-06 16:06 ` Richard Earnshaw (lists) 2019-06-06 16:49 ` Segher Boessenkool 2019-07-08 6:41 ` Fredrik Hederstierna 2019-07-09 19:13 ` Jeff Law 2019-06-14 20:30 ` Jeff Law
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).