Question about GCC benchmarks and uninitialized variables

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Question about GCC benchmarks and uninitialized variables
@ 2018-07-24  8:48 Fredrik Hederstierna
  2018-07-24  9:04 ` Alexander Monakov
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Fredrik Hederstierna @ 2018-07-24  8:48 UTC (permalink / raw)
  To: gcc

Hi

This is a general question to all you working with GCC benchmarking.

I have been working with code benchmarks like CSiBE for ARM.
From time to time unpredicted results appears where numbers gets worse by no reason.

When looking into what could cause this unpredictable behaviour, I found that there are (at least for CSiBE),
tons of code warnings that could cause unpredictable code generation like -Wuninitialized and -Wmaybe-uninitialized.
Alot of these warning indicates real bugs.

I added this issue in bugzilla,
#Bug 85880 - "Different code generation for uninitialized variables"
Though it got (correctly) rejected, since its not a bug.
But still I'm thinking how this apply to benchmarking code, and to how to approach and address this fact.

So my question is how to approach this problems when doing benchmarking,
ofcourse we want the benchmark to mirror as near as 'real life' code as possible.
But if code contains real bugs, and issues that could cause unpredictable code generation, should such code be banned from benchmarking, since results might be misleading? On the other hand, the compiler should generate best code for any input?

What do you think, should benchmarking code not being allowed to have eg warnings like
-Wuninitialized and maybe -Wmaybe-uninitialized?
Are there more warnings that indicate unpredictable code generations due to bad code, or are the root cause that these are 'bugs', and we should not allow real bugs at all in benchmarking code?

I've read some about uninitialized variable issues also at this link which was interesting, its a wider discussion, and ofcourse a very hard problem to solve.
https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings

Thanks, Best Regards,
Fredrik

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Question about GCC benchmarks and uninitialized variables
  2018-07-24  8:48 Question about GCC benchmarks and uninitialized variables Fredrik Hederstierna
@ 2018-07-24  9:04 ` Alexander Monakov
  2018-07-24 12:17 ` David Brown
  2019-05-30  9:13 ` Question about GCC not warning for some noncompliant SEI CERT C code examples Fredrik Hederstierna
  2 siblings, 0 replies; 19+ messages in thread
From: Alexander Monakov @ 2018-07-24  9:04 UTC (permalink / raw)
  To: Fredrik Hederstierna; +Cc: gcc

On Tue, 24 Jul 2018, Fredrik Hederstierna wrote:

> So my question is how to approach this problems when doing benchmarking,
> ofcourse we want the benchmark to mirror as near as 'real life' code as
> possible.  But if code contains real bugs, and issues that could cause
> unpredictable code generation, should such code be banned from benchmarking,
> since results might be misleading?

Well, all benchmarks are going to be imperfect reflections of real-life
workloads in the first place, so their bugs just increase the degree to
which they are misleading.

When a new compiler version starts to treat some undefined piece of code
differently, it can cause a range of effects from code size perturbations
as in your case, to completely invalidating the benchmark as in spec2k6
x264 benchmark's case (where GCC exploited undefined behavior in a loop,
turning it to an infinite loop that eventually segfaulted).

Perhaps even though results on individual benchmarks can vary wildly,
aggregated results across a wide range of non-toy benchmarks may be
indicative of ... something, because they are unlikely to all exhibit
the same "bugs".

> On the other hand, the compiler should
> generate best code for any input?

Engineering effort is limited, so it's probably better to go for generating
good code for inputs that are likely to resemble actively used code (and in
actively used&maintained code, bugs can be reported and fixed) :)

> What do you think, should benchmarking code not being allowed to have eg
> warnings like -Wuninitialized and maybe -Wmaybe-uninitialized?  Are there more
> warnings that indicate unpredictable code generations due to bad code, or are
> the root cause that these are 'bugs', and we should not allow real bugs at all
> in benchmarking code?

A blanket ban on warnings won't work, they have false positives (especially the
-Wmaybe- one), and there exist code that validly uses uninitialized data. I
don't have such a striking example for scalar variables, but for uninitialized
arrays there's this sparse set algorithm (which GCC itself also uses):
https://research.swtch.com/sparse

I think good benchmarks sets should be able to evolve to account for newly
discovered bugs, rather then remain frozen (which sounds like a reason to
become obsolete sooner rather than later).

Alexander

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Question about GCC benchmarks and uninitialized variables
  2018-07-24  8:48 Question about GCC benchmarks and uninitialized variables Fredrik Hederstierna
  2018-07-24  9:04 ` Alexander Monakov
@ 2018-07-24 12:17 ` David Brown
  2019-05-30  9:13 ` Question about GCC not warning for some noncompliant SEI CERT C code examples Fredrik Hederstierna
  2 siblings, 0 replies; 19+ messages in thread
From: David Brown @ 2018-07-24 12:17 UTC (permalink / raw)
  To: Fredrik Hederstierna, gcc

On 24/07/18 09:40, Fredrik Hederstierna wrote:
> Hi
> 
> This is a general question to all you working with GCC benchmarking.
> 
> I have been working with code benchmarks like CSiBE for ARM. From
> time to time unpredicted results appears where numbers gets worse by
> no reason.
> 
> When looking into what could cause this unpredictable behaviour, I
> found that there are (at least for CSiBE), tons of code warnings that
> could cause unpredictable code generation like -Wuninitialized and
> -Wmaybe-uninitialized. Alot of these warning indicates real bugs.
> 
> I added this issue in bugzilla, #Bug 85880 - "Different code
> generation for uninitialized variables" Though it got (correctly)
> rejected, since its not a bug. But still I'm thinking how this apply
> to benchmarking code, and to how to approach and address this fact.
> 
> So my question is how to approach this problems when doing
> benchmarking, ofcourse we want the benchmark to mirror as near as
> 'real life' code as possible. But if code contains real bugs, and
> issues that could cause unpredictable code generation, should such
> code be banned from benchmarking, since results might be misleading?
> On the other hand, the compiler should generate best code for any
> input?
> 
> What do you think, should benchmarking code not being allowed to have
> eg warnings like -Wuninitialized and maybe -Wmaybe-uninitialized? Are
> there more warnings that indicate unpredictable code generations due
> to bad code, or are the root cause that these are 'bugs', and we
> should not allow real bugs at all in benchmarking code?
> 
> I've read some about uninitialized variable issues also at this link
> which was interesting, its a wider discussion, and ofcourse a very
> hard problem to solve. 
> https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings
> 
> Thanks, Best Regards, Fredrik
> 

AFAIUI, "-Wuninitialized" should not give false positives - if it
triggers a warning, it is because you are using a variable without
initialisation.  The code is clearly wrong, and should not be used for
real code or for benchmarking.

"-Wmaybe-unitialized" indicates that there /could/ be a problem - it is
code that you need to look at carefully.

I'd say that in benchmarking code, you want to be very careful to
generate predictable and deterministic results - and that means no use
of uninitialised data.  You need to get as close to perfectly repeatable
results as the hardware allows - only then can you know what changes the
performance of the code or hardware.  If your results and timings depend
on what values happen to lie in memory before the benchmark runs, you
are guessing, not measuring.

If you get a "-Wuninitialized" warning, it's a bug in the code, and
needs fixed.  A "-Wmaybe-uninitialized" warning may be a false positive,
so you might want to modify code to avoid such false positives (so that
you can see the real problems found by the warning).  The gcc manual
gives this example:

{
  int x;
  switch (y)
    {
    case 1: x = 1;
      break;
    case 2: x = 4;
      break;
    case 3: x = 5;
    }
  foo (x);
}

A suggested fix is an "assert(0)" in the default case.  You could also
have a "__builtin_unreachable()" instead - this will avoid any extra
run-time code, but of course you have to be very sure that x is valid or
you have serious undefined behaviour.

A lazy workaround would be to use "int x = x;".

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Question about GCC not warning for some noncompliant SEI CERT C code examples
  2018-07-24  8:48 Question about GCC benchmarks and uninitialized variables Fredrik Hederstierna
  2018-07-24  9:04 ` Alexander Monakov
  2018-07-24 12:17 ` David Brown
@ 2019-05-30  9:13 ` Fredrik Hederstierna
  2019-05-30 14:28   ` Martin Sebor
  2019-06-01 23:35   ` ARM peephole2 from 2003 never merged, still valid Fredrik Hederstierna
  2 siblings, 2 replies; 19+ messages in thread
From: Fredrik Hederstierna @ 2019-05-30  9:13 UTC (permalink / raw)
  To: gcc

Hi

When reading the SEI CERT C Coding Standard rules, looking at
"DCL30-C. Declare objects with appropriate storage durations"
it seem like GCC does not warn in compile-time for some noncompliant examples.

I know eg AddressSanitizer and several runtime running tools finds these bugs,
but it would be convenient of GCC could do some basic static analysis already in compile-time to avoid bad code generation.
Some static analysers finds these bugs, but not all, and GCC does not warn.

Example from DCL30-C, not warned by GCC:

/* NONCOMPLIANT EXAMPLE-1 */
#include <stdio.h>
const char *p;
void dont_do_this(void) {
  const char c_str[] = "This will change";
  p = c_str; /* Dangerous */
}
void innocuous(void) {
  printf("%s\n", p);
}
int main(void) {
  dont_do_this();
  innocuous();
  return 0;
}

/* NONCOMPLIANT EXAMPLE-2 */
void squirrel_away(char **ptr_param) {
  char local[10];
  /* Initialize array */
  *ptr_param = local;
}
void rodent(void) {
  char *ptr;
  squirrel_away(&ptr);
  /* ptr is live but invalid here */
}

Question, where in GCC is the most appropriate place to implements such a checker?
I know there are some warnings for return-local-addr, and null-pointer-dereference in some different parts, but this seems different?
Can it be found be points-to analysis, or where is it best to put this warning if being implemented?

Reference:
https://wiki.sei.cmu.edu/confluence/display/c/DCL30-C.+Declare+objects+with+appropriate+storage+durations

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Question about GCC not warning for some noncompliant SEI CERT C code examples
  2019-05-30  9:13 ` Question about GCC not warning for some noncompliant SEI CERT C code examples Fredrik Hederstierna
@ 2019-05-30 14:28   ` Martin Sebor
  2019-05-30 14:48     ` Jeff Law
  2019-06-01 23:35   ` ARM peephole2 from 2003 never merged, still valid Fredrik Hederstierna
  1 sibling, 1 reply; 19+ messages in thread
From: Martin Sebor @ 2019-05-30 14:28 UTC (permalink / raw)
  To: Fredrik Hederstierna, gcc

On 5/30/19 3:12 AM, Fredrik Hederstierna wrote:
> Hi
> 
> When reading the SEI CERT C Coding Standard rules, looking at
> "DCL30-C. Declare objects with appropriate storage durations"
> it seem like GCC does not warn in compile-time for some noncompliant examples.
> 
> I know eg AddressSanitizer and several runtime running tools finds these bugs,
> but it would be convenient of GCC could do some basic static analysis already in compile-time to avoid bad code generation.
> Some static analysers finds these bugs, but not all, and GCC does not warn.
> 
> Example from DCL30-C, not warned by GCC:
> 
> 
> /* NONCOMPLIANT EXAMPLE-1 */
> #include <stdio.h>
> const char *p;
> void dont_do_this(void) {
>    const char c_str[] = "This will change";
>    p = c_str; /* Dangerous */
> }
> void innocuous(void) {
>    printf("%s\n", p);
> }
> int main(void) {
>    dont_do_this();
>    innocuous();
>    return 0;
> }
> 
> 
> /* NONCOMPLIANT EXAMPLE-2 */
> void squirrel_away(char **ptr_param) {
>    char local[10];
>    /* Initialize array */
>    *ptr_param = local;
> }
> void rodent(void) {
>    char *ptr;
>    squirrel_away(&ptr);
>    /* ptr is live but invalid here */
> }

Agreed.  This (or a subset of the problem) is being tracked in
PR 69433.

> 
> Question, where in GCC is the most appropriate place to implements such a checker?
> I know there are some warnings for return-local-addr, and null-pointer-dereference in some different parts, but this seems different?
> Can it be found be points-to analysis, or where is it best to put this warning if being implemented?

To me it seems close enough to -Wreturn-local-addr to be implemented
in the same place, in gimple-ssa-isolate-paths.c.  It's worth noting
that besides warning, the pass also clears the returning address and
isolates the path that returns the null in the CFG.

As it happens, I recently submitted a modest enhancement to
-Wreturn-local-addr to detect returning the addresses of VLAs and
pointers obtained from alloca (PR 71924 and PR 90549).  I'm working
on extending the implementation to returning freed pointers (under
a separate warning option).

Besides the problems you mention, there is also a related request
to diagnose dereferencing pointers to compound literals whose
lifetime has ended (PR 89990), or more generally, those to any
such local object.

In the enhancement I submitted I chose not to use the alias oracle
mainly because I didn't want to change the existing design.  I'm
not familiar enough with to tell with confidence if it can be used
to obtain the same results.  I.e., identify each instance of
a local variable that a pointer may point to, rather than just
answering the broad question: does this pointer point to any
local variables?  If it can, using it as Jeff suggests in his
comments, would make sense.  I don't think moving it out of
gimple-ssa-isolate-paths.c is necessary (or even a good idea).

Martin

> 
> Reference:
> https://wiki.sei.cmu.edu/confluence/display/c/DCL30-C.+Declare+objects+with+appropriate+storage+durations
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Question about GCC not warning for some noncompliant SEI CERT C code examples
  2019-05-30 14:28   ` Martin Sebor
@ 2019-05-30 14:48     ` Jeff Law
  0 siblings, 0 replies; 19+ messages in thread
From: Jeff Law @ 2019-05-30 14:48 UTC (permalink / raw)
  To: Martin Sebor, Fredrik Hederstierna, gcc

On 5/30/19 8:28 AM, Martin Sebor wrote:
> On 5/30/19 3:12 AM, Fredrik Hederstierna wrote:
>> Hi
>>
>> When reading the SEI CERT C Coding Standard rules, looking at
>> "DCL30-C. Declare objects with appropriate storage durations"
>> it seem like GCC does not warn in compile-time for some noncompliant
>> examples.
>>
>> I know eg AddressSanitizer and several runtime running tools finds
>> these bugs,
>> but it would be convenient of GCC could do some basic static analysis
>> already in compile-time to avoid bad code generation.
>> Some static analysers finds these bugs, but not all, and GCC does not
>> warn.
>>
>> Example from DCL30-C, not warned by GCC:
>>
>>
>> /* NONCOMPLIANT EXAMPLE-1 */
>> #include <stdio.h>
>> const char *p;
>> void dont_do_this(void) {
>> Â Â  const char c_str[] = "This will change";
>> Â Â  p = c_str; /* Dangerous */
>> }
>> void innocuous(void) {
>> Â Â  printf("%s\n", p);
>> }
>> int main(void) {
>> Â Â  dont_do_this();
>> Â Â  innocuous();
>> Â Â  return 0;
>> }
>>
>>
>> /* NONCOMPLIANT EXAMPLE-2 */
>> void squirrel_away(char **ptr_param) {
>> Â Â  char local[10];
>> Â Â  /* Initialize array */
>> Â Â  *ptr_param = local;
>> }
>> void rodent(void) {
>> Â Â  char *ptr;
>> Â Â  squirrel_away(&ptr);
>> Â Â  /* ptr is live but invalid here */
>> }
> 
> Agreed.Â  This (or a subset of the problem) is being tracked in
> PR 69433.
> 
>>
>> Question, where in GCC is the most appropriate place to implements
>> such a checker?
>> I know there are some warnings for return-local-addr, and
>> null-pointer-dereference in some different parts, but this seems
>> different?
>> Can it be found be points-to analysis, or where is it best to put this
>> warning if being implemented?
> 
> To me it seems close enough to -Wreturn-local-addr to be implemented
> in the same place, in gimple-ssa-isolate-paths.c.Â  It's worth noting
> that besides warning, the pass also clears the returning address and
> isolates the path that returns the null in the CFG.
> 
> As it happens, I recently submitted a modest enhancement to
> -Wreturn-local-addr to detect returning the addresses of VLAs and
> pointers obtained from alloca (PR 71924 and PR 90549).Â  I'm working
> on extending the implementation to returning freed pointers (under
> a separate warning option).
> 
> Besides the problems you mention, there is also a related request
> to diagnose dereferencing pointers to compound literals whose
> lifetime has ended (PR 89990), or more generally, those to any
> such local object.
> 
> In the enhancement I submitted I chose not to use the alias oracle
> mainly because I didn't want to change the existing design.Â  I'm
> not familiar enough with to tell with confidence if it can be used
> to obtain the same results.Â  I.e., identify each instance of
> a local variable that a pointer may point to, rather than just
> answering the broad question: does this pointer point to any
> local variables?Â  If it can, using it as Jeff suggests in his
> comments, would make sense.Â  I don't think moving it out of
> gimple-ssa-isolate-paths.c is necessary (or even a good idea).
Note that you may also be able to use the alias oracle to detect
escaping locals.  That's likely to have a higher false positive rate,
but may still be useful for detecting this kind of stuff.

jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* ARM peephole2 from 2003 never merged, still valid
  2019-05-30  9:13 ` Question about GCC not warning for some noncompliant SEI CERT C code examples Fredrik Hederstierna
  2019-05-30 14:28   ` Martin Sebor
@ 2019-06-01 23:35   ` Fredrik Hederstierna
  2019-06-01 23:41     ` Fredrik Hederstierna
  1 sibling, 1 reply; 19+ messages in thread
From: Fredrik Hederstierna @ 2019-06-01 23:35 UTC (permalink / raw)
  To: gcc

I found this patch from 2003 by Tamas Gergely that was never merged.


Though it seems to still be valid, since when applying it, the CSiBE benchmark reduce total code size -170 bytes, making 35 out of 893 files smaller (from -4 to -12 bytes (1-3 instructions).


Bug


asd


asd


asd



updated patch:


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 20bdc2bec37..4c924499ad9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2019-06-01  Fredrik Hederstierna  <fredrik@hederstierna.com>
+            Tamas Gergely <gertom@rgai.hu>
+
+       * gcc/config/arm/arm.md: new peephole2 pattern added for extending
+       move and compare parallelization. Retake from 2003 patch adapted.
+
 2019-06-01  Martin Sebor  <msebor@redhat.com>

        PR middle-end/90694
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ae582172ab9..753d18fbfc5 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -10951,6 +10951,19 @@
   ""
 )

+; The same as the previous, but with exchanged operands
+;
+(define_peephole2
+  [(set (match_operand:SI 0 "arm_general_register_operand" "")
+       (match_operand:SI 1 "arm_general_register_operand" ""))
+   (set (reg:CC CC_REGNUM)
+       (compare:CC (match_dup 0) (const_int 0)))]
+  "TARGET_ARM"
+  [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0)))
+             (set (match_dup 0) (match_dup 1))])]
+  ""
+)
+
 (define_split
   [(set (match_operand:SI 0 "s_register_operand" "")
        (and:SI (ge:SI (match_operand:SI 1 "s_register_operand" "")

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-01 23:35   ` ARM peephole2 from 2003 never merged, still valid Fredrik Hederstierna
@ 2019-06-01 23:41     ` Fredrik Hederstierna
  2019-06-02 12:28       ` Segher Boessenkool
  0 siblings, 1 reply; 19+ messages in thread
From: Fredrik Hederstierna @ 2019-06-01 23:41 UTC (permalink / raw)
  To: gcc

(Sorry, my mail program crashed...  trying repost, correct now...)


I found this patch from 2003 by Tamas Gergely that was never merged for ARM.


Though it seems to still be valid, since when applying it, the CSiBE benchmark reduce total code size -170 bytes, making 35 out of 893 files smaller (from -4 to -12 bytes (1-3 instructions).


Could it be considered to apply it, since it abviously still is valid and reduce code size for ARM 32bit. The peephole2 pattern still is not matched.


Bug

Bug 9663<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663> - [arm] gcc-20030127 misses an optimization opportunity

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663


Message
https://gcc.gnu.org/ml/gcc-patches/2003-02/msg00204.html


Adapted and updated patch from Tamas Gergely:


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 20bdc2bec37..4c924499ad9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2019-06-01  Fredrik Hederstierna  <fredrik@hederstierna.com>
+            Tamas Gergely <gertom@rgai.hu>
+
+       * gcc/config/arm/arm.md: new peephole2 pattern added for extending
+       move and compare parallelization. Retake from 2003 patch adapted.
+
 2019-06-01  Martin Sebor  <msebor@redhat.com>

        PR middle-end/90694
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ae582172ab9..753d18fbfc5 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -10951,6 +10951,19 @@
   ""
 )

+; The same as the previous, but with exchanged operands
+;
+(define_peephole2
+  [(set (match_operand:SI 0 "arm_general_register_operand" "")
+       (match_operand:SI 1 "arm_general_register_operand" ""))
+   (set (reg:CC CC_REGNUM)
+       (compare:CC (match_dup 0) (const_int 0)))]
+  "TARGET_ARM"
+  [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0)))
+             (set (match_dup 0) (match_dup 1))])]
+  ""
+)
+
 (define_split
   [(set (match_operand:SI 0 "s_register_operand" "")
        (and:SI (ge:SI (match_operand:SI 1 "s_register_operand" "")

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-01 23:41     ` Fredrik Hederstierna
@ 2019-06-02 12:28       ` Segher Boessenkool
  2019-06-05 23:03         ` Jeff Law
  0 siblings, 1 reply; 19+ messages in thread
From: Segher Boessenkool @ 2019-06-02 12:28 UTC (permalink / raw)
  To: Fredrik Hederstierna; +Cc: gcc

On Sat, Jun 01, 2019 at 11:41:30PM +0000, Fredrik Hederstierna wrote:
> +(define_peephole2
> +  [(set (match_operand:SI 0 "arm_general_register_operand" "")
> +       (match_operand:SI 1 "arm_general_register_operand" ""))
> +   (set (reg:CC CC_REGNUM)
> +       (compare:CC (match_dup 0) (const_int 0)))]
> +  "TARGET_ARM"
> +  [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0)))
> +             (set (match_dup 0) (match_dup 1))])]
> +  ""
> +)

Hi Fredrik,

Do you have a testcase for this?  I wonder if it would be better handled
during combine, and what that then tried; or perhaps these opportunities
are created later, making a peephole a more attractive solution.


Segher

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-02 12:28       ` Segher Boessenkool
@ 2019-06-05 23:03         ` Jeff Law
  2019-06-05 23:46           ` Segher Boessenkool
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff Law @ 2019-06-05 23:03 UTC (permalink / raw)
  To: Segher Boessenkool, Fredrik Hederstierna; +Cc: gcc

On 6/2/19 6:28 AM, Segher Boessenkool wrote:
> On Sat, Jun 01, 2019 at 11:41:30PM +0000, Fredrik Hederstierna wrote:
>> +(define_peephole2
>> +  [(set (match_operand:SI 0 "arm_general_register_operand" "")
>> +       (match_operand:SI 1 "arm_general_register_operand" ""))
>> +   (set (reg:CC CC_REGNUM)
>> +       (compare:CC (match_dup 0) (const_int 0)))]
>> +  "TARGET_ARM"
>> +  [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0)))
>> +             (set (match_dup 0) (match_dup 1))])]
>> +  ""
>> +)
> 
> Hi Fredrik,
> 
> Do you have a testcase for this?  I wonder if it would be better handled
> during combine, and what that then tried; or perhaps these opportunities
> are created later, making a peephole a more attractive solution.
We have two independent insns with no output/true dependency between
them.  So there's really not anything for combine to do here.

What is more interesting to me is whether or not this could be handled
by compare-elim and a define_insn rather than a define_peephole2.  THe
existing pattern and the new one both seem well suited for compare-elim.

I do think a testcase is warranted here.  Fredrik, if you could reduce
one from CSiBE that might be appropriate.


jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-05 23:03         ` Jeff Law
@ 2019-06-05 23:46           ` Segher Boessenkool
  2019-06-06  9:13             ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 19+ messages in thread
From: Segher Boessenkool @ 2019-06-05 23:46 UTC (permalink / raw)
  To: Jeff Law; +Cc: Fredrik Hederstierna, gcc

On Wed, Jun 05, 2019 at 05:02:53PM -0600, Jeff Law wrote:
> On 6/2/19 6:28 AM, Segher Boessenkool wrote:
> > On Sat, Jun 01, 2019 at 11:41:30PM +0000, Fredrik Hederstierna wrote:
> >> +(define_peephole2
> >> +  [(set (match_operand:SI 0 "arm_general_register_operand" "")
> >> +       (match_operand:SI 1 "arm_general_register_operand" ""))
> >> +   (set (reg:CC CC_REGNUM)
> >> +       (compare:CC (match_dup 0) (const_int 0)))]
> >> +  "TARGET_ARM"
> >> +  [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0)))
> >> +             (set (match_dup 0) (match_dup 1))])]
> >> +  ""
> >> +)
> > 
> > Do you have a testcase for this?  I wonder if it would be better handled
> > during combine, and what that then tried; or perhaps these opportunities
> > are created later, making a peephole a more attractive solution.
> We have two independent insns with no output/true dependency between
> them.  So there's really not anything for combine to do here.

op0 := op1;
CC := op0 cmp 0;

That's a perfectly fine dependency I think?

> What is more interesting to me is whether or not this could be handled
> by compare-elim and a define_insn rather than a define_peephole2.  THe
> existing pattern and the new one both seem well suited for compare-elim.
> 
> I do think a testcase is warranted here.  Fredrik, if you could reduce
> one from CSiBE that might be appropriate.

Yeah, let's see what is really going on :-)


Segher

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-05 23:46           ` Segher Boessenkool
@ 2019-06-06  9:13             ` Richard Earnshaw (lists)
  2019-06-06 14:02               ` Segher Boessenkool
  0 siblings, 1 reply; 19+ messages in thread
From: Richard Earnshaw (lists) @ 2019-06-06  9:13 UTC (permalink / raw)
  To: Segher Boessenkool, Jeff Law; +Cc: Fredrik Hederstierna, gcc

On 06/06/2019 00:46, Segher Boessenkool wrote:
> On Wed, Jun 05, 2019 at 05:02:53PM -0600, Jeff Law wrote:
>> On 6/2/19 6:28 AM, Segher Boessenkool wrote:
>>> On Sat, Jun 01, 2019 at 11:41:30PM +0000, Fredrik Hederstierna wrote:
>>>> +(define_peephole2
>>>> +  [(set (match_operand:SI 0 "arm_general_register_operand" "")
>>>> +       (match_operand:SI 1 "arm_general_register_operand" ""))
>>>> +   (set (reg:CC CC_REGNUM)
>>>> +       (compare:CC (match_dup 0) (const_int 0)))]
>>>> +  "TARGET_ARM"
>>>> +  [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0)))
>>>> +             (set (match_dup 0) (match_dup 1))])]
>>>> +  ""
>>>> +)
>>>
>>> Do you have a testcase for this?  I wonder if it would be better handled
>>> during combine, and what that then tried; or perhaps these opportunities
>>> are created later, making a peephole a more attractive solution.
>> We have two independent insns with no output/true dependency between
>> them.  So there's really not anything for combine to do here.
> 
> op0 := op1;
> CC := op0 cmp 0;
> 
> That's a perfectly fine dependency I think?

But if op0 isn't dead after the second insn, combine won't consider it,
right?  Even though in this case the combination would be safe as we
still produce op0.

R.

> 
>> What is more interesting to me is whether or not this could be handled
>> by compare-elim and a define_insn rather than a define_peephole2.  THe
>> existing pattern and the new one both seem well suited for compare-elim.
>>
>> I do think a testcase is warranted here.  Fredrik, if you could reduce
>> one from CSiBE that might be appropriate.
> 
> Yeah, let's see what is really going on :-)
> 
> 
> Segher
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-06  9:13             ` Richard Earnshaw (lists)
@ 2019-06-06 14:02               ` Segher Boessenkool
  2019-06-06 14:55                 ` Fredrik Hederstierna
  0 siblings, 1 reply; 19+ messages in thread
From: Segher Boessenkool @ 2019-06-06 14:02 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: Jeff Law, Fredrik Hederstierna, gcc

On Thu, Jun 06, 2019 at 10:12:57AM +0100, Richard Earnshaw (lists) wrote:
> On 06/06/2019 00:46, Segher Boessenkool wrote:
> > On Wed, Jun 05, 2019 at 05:02:53PM -0600, Jeff Law wrote:
> >> On 6/2/19 6:28 AM, Segher Boessenkool wrote:
> >>> Do you have a testcase for this?  I wonder if it would be better handled
> >>> during combine, and what that then tried; or perhaps these opportunities
> >>> are created later, making a peephole a more attractive solution.
> >> We have two independent insns with no output/true dependency between
> >> them.  So there's really not anything for combine to do here.
> > 
> > op0 := op1;
> > CC := op0 cmp 0;
> > 
> > That's a perfectly fine dependency I think?
> 
> But if op0 isn't dead after the second insn, combine won't consider it,
> right?  Even though in this case the combination would be safe as we
> still produce op0.

That doesn't stop combine from considering it.  It does make that first SET
survive, so that you get a parallel as final insn.  It may not like that
one of the parallel SETs is just a move.  Needs testcase :-)


Segher

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-06 14:02               ` Segher Boessenkool
@ 2019-06-06 14:55                 ` Fredrik Hederstierna
  2019-06-06 16:06                   ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 19+ messages in thread
From: Fredrik Hederstierna @ 2019-06-06 14:55 UTC (permalink / raw)
  To: Segher Boessenkool, Richard Earnshaw (lists); +Cc: Jeff Law, gcc

> From: Segher Boessenkool <segher@kernel.crashing.org>
> Sent: Thursday, June 6, 2019 4:02 PM
> To: Richard Earnshaw (lists)
> Cc: Jeff Law; Fredrik Hederstierna; gcc@gcc.gnu.org
> Subject: Re: ARM peephole2 from 2003 never merged, still valid
  
> That doesn't stop combine from considering it.  It does make that first SET
> survive, so that you get a parallel as final insn.  It may not like that
> one of the parallel SETs is just a move.  Needs testcase :-)
 
Hi all, thanks for investigating this,
I added semi-stripped testcase in original issue taken from CSiBE teem sources

See attachment in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663

Tested gcc-9.1.0 for ARM 32bit targets, first without peephole2 patch:

00000000 <nrrdRangeSet>:
   0:	e92d407f 	push	{r0, r1, r2, r3, r4, r5, r6, lr}
   4:	e2504000 	subs	r4, r0, #0
   8:	0a00003f 	beq	10c <nrrdRangeSet+0x10c>
   c:	e3510000 	cmp	r1, #0
  10:	e1a05001 	mov	r5, r1

then with new peephole2 patch:

00000000 <nrrdRangeSet>:
   0:	e92d407f 	push	{r0, r1, r2, r3, r4, r5, r6, lr}
   4:	e2504000 	subs	r4, r0, #0
   8:	0a00003e 	beq	108 <nrrdRangeSet+0x108>
   c:	e2515000 	subs	r5, r1, #0

Thanks, Fredrik

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-06 14:55                 ` Fredrik Hederstierna
@ 2019-06-06 16:06                   ` Richard Earnshaw (lists)
  2019-06-06 16:49                     ` Segher Boessenkool
  2019-06-14 20:30                     ` Jeff Law
  0 siblings, 2 replies; 19+ messages in thread
From: Richard Earnshaw (lists) @ 2019-06-06 16:06 UTC (permalink / raw)
  To: Fredrik Hederstierna, Segher Boessenkool; +Cc: Jeff Law, gcc

On 06/06/2019 15:55, Fredrik Hederstierna wrote:
>> From: Segher Boessenkool <segher@kernel.crashing.org>
>> Sent: Thursday, June 6, 2019 4:02 PM
>> To: Richard Earnshaw (lists)
>> Cc: Jeff Law; Fredrik Hederstierna; gcc@gcc.gnu.org
>> Subject: Re: ARM peephole2 from 2003 never merged, still valid
> Â  
>> That doesn't stop combine from considering it.Â  It does make that first SET
>> survive, so that you get a parallel as final insn.Â  It may not like that
>> one of the parallel SETs is just a move.Â  Needs testcase :-)
>  
> Hi all, thanks for investigating this,
> I added semi-stripped testcase in original issue taken from CSiBE teem sources
> 
> See attachment in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663
> 
> Tested gcc-9.1.0 for ARM 32bit targets, first without peephole2 patch:
> 
> 00000000 <nrrdRangeSet>:
>    0:	e92d407f 	push	{r0, r1, r2, r3, r4, r5, r6, lr}
>    4:	e2504000 	subs	r4, r0, #0
>    8:	0a00003f 	beq	10c <nrrdRangeSet+0x10c>
>    c:	e3510000 	cmp	r1, #0
>   10:	e1a05001 	mov	r5, r1
> 
> then with new peephole2 patch:
> 
> 00000000 <nrrdRangeSet>:
>    0:	e92d407f 	push	{r0, r1, r2, r3, r4, r5, r6, lr}
>    4:	e2504000 	subs	r4, r0, #0
>    8:	0a00003e 	beq	108 <nrrdRangeSet+0x108>
>    c:	e2515000 	subs	r5, r1, #0
> 
> Thanks, Fredrik
> 

The reason combine doesn't catch this is because at the time it runs the
MOV is in a different basic block.  Later on it is sunk into the same
basic block, but it's then too late to do the merge.

R.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-06 16:06                   ` Richard Earnshaw (lists)
@ 2019-06-06 16:49                     ` Segher Boessenkool
  2019-07-08  6:41                       ` Fredrik Hederstierna
  2019-06-14 20:30                     ` Jeff Law
  1 sibling, 1 reply; 19+ messages in thread
From: Segher Boessenkool @ 2019-06-06 16:49 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: Fredrik Hederstierna, Jeff Law, gcc

On Thu, Jun 06, 2019 at 05:06:35PM +0100, Richard Earnshaw (lists) wrote:
> The reason combine doesn't catch this is because at the time it runs the
> MOV is in a different basic block.  Later on it is sunk into the same
> basic block, but it's then too late to do the merge.

Or you could say the MOV didn't even exist yet: the insn that is merged
by the peephole is created by the prologue code, eventually.

This isn't really a target problem, it is very much generic, but I don't
see a better solution than a peephole either.


Segher

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-06 16:49                     ` Segher Boessenkool
@ 2019-07-08  6:41                       ` Fredrik Hederstierna
  2019-07-09 19:13                         ` Jeff Law
  0 siblings, 1 reply; 19+ messages in thread
From: Fredrik Hederstierna @ 2019-07-08  6:41 UTC (permalink / raw)
  To: Segher Boessenkool, Richard Earnshaw (lists); +Cc: Jeff Law, gcc

> From: Segher Boessenkool <segher@kernel.crashing.org>
>  Sent: Thursday, June 6, 2019 6:49 PM
>  To: Richard Earnshaw (lists)
>  Cc: Fredrik Hederstierna; Jeff Law; gcc@gcc.gnu.org
>  Subject: Re: ARM peephole2 from 2003 never merged, still valid 
>    
>  
> On Thu, Jun 06, 2019 at 05:06:35PM +0100, Richard Earnshaw (lists) wrote:
>  > The reason combine doesn't catch this is because at the time it runs the
>  > MOV is in a different basic block.  Later on it is sunk into the same
>  > basic block, but it's then too late to do the merge.
>  
>  Or you could say the MOV didn't even exist yet: the insn that is merged
>  by the peephole is created by the prologue code, eventually.
>  
>  This isn't really a target problem, it is very much generic, but I don't
>  see a better solution than a peephole either.
>
>  Segher
 
So what is the conclusion, should be re-open PR-9663 and try to fix the missing peephole?
I think there will always be cases where the code generated target specific register optimizations are possible, that not necessarily have any direct connection of the actual basic blocks but more by chance which instructions are finally generated after each other, where maybe eg peephole-alike optimizers are the only solution to do the job?
/Fredrik

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-07-08  6:41                       ` Fredrik Hederstierna
@ 2019-07-09 19:13                         ` Jeff Law
  0 siblings, 0 replies; 19+ messages in thread
From: Jeff Law @ 2019-07-09 19:13 UTC (permalink / raw)
  To: Fredrik Hederstierna, Segher Boessenkool, Richard Earnshaw (lists); +Cc: gcc

On 7/8/19 12:41 AM, Fredrik Hederstierna wrote:
>> From: Segher Boessenkool <segher@kernel.crashing.org> Sent:
>> Thursday, June 6, 2019 6:49 PM To: Richard Earnshaw (lists) Cc:
>> Fredrik Hederstierna; Jeff Law; gcc@gcc.gnu.org Subject: Re: ARM
>> peephole2 from 2003 never merged, still valid
>> 
>> 
>> On Thu, Jun 06, 2019 at 05:06:35PM +0100, Richard Earnshaw (lists)
>> wrote:
>>> The reason combine doesn't catch this is because at the time it
>>> runs the MOV is in a different basic block.  Later on it is sunk
>>> into the same basic block, but it's then too late to do the
>>> merge.
>> 
>> Or you could say the MOV didn't even exist yet: the insn that is
>> merged by the peephole is created by the prologue code,
>> eventually.
>> 
>> This isn't really a target problem, it is very much generic, but I
>> don't see a better solution than a peephole either.
>> 
>> Segher
> 
> So what is the conclusion, should be re-open PR-9663 and try to fix
> the missing peephole? I think there will always be cases where the
> code generated target specific register optimizations are possible,
> that not necessarily have any direct connection of the actual basic
> blocks but more by chance which instructions are finally generated
> after each other, where maybe eg peephole-alike optimizers are the
> only solution to do the job? /Fredrik
> 

I think it's largely up to the ARM maintainers.  Based on my
understanding of the thread, I'd go with the peephole2.

jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: ARM peephole2 from 2003 never merged, still valid
  2019-06-06 16:06                   ` Richard Earnshaw (lists)
  2019-06-06 16:49                     ` Segher Boessenkool
@ 2019-06-14 20:30                     ` Jeff Law
  1 sibling, 0 replies; 19+ messages in thread
From: Jeff Law @ 2019-06-14 20:30 UTC (permalink / raw)
  To: Richard Earnshaw (lists), Fredrik Hederstierna, Segher Boessenkool; +Cc: gcc

On 6/6/19 10:06 AM, Richard Earnshaw (lists) wrote:
> On 06/06/2019 15:55, Fredrik Hederstierna wrote:
>>> From: Segher Boessenkool <segher@kernel.crashing.org>
>>> Sent: Thursday, June 6, 2019 4:02 PM
>>> To: Richard Earnshaw (lists)
>>> Cc: Jeff Law; Fredrik Hederstierna; gcc@gcc.gnu.org
>>> Subject: Re: ARM peephole2 from 2003 never merged, still valid
>> Â  
>>> That doesn't stop combine from considering it.Â  It does make that first SET
>>> survive, so that you get a parallel as final insn.Â  It may not like that
>>> one of the parallel SETs is just a move.Â  Needs testcase :-)
>>  
>> Hi all, thanks for investigating this,
>> I added semi-stripped testcase in original issue taken from CSiBE teem sources
>>
>> See attachment in
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663
>>
>> Tested gcc-9.1.0 for ARM 32bit targets, first without peephole2 patch:
>>
>> 00000000 <nrrdRangeSet>:
>>    0:	e92d407f 	push	{r0, r1, r2, r3, r4, r5, r6, lr}
>>    4:	e2504000 	subs	r4, r0, #0
>>    8:	0a00003f 	beq	10c <nrrdRangeSet+0x10c>
>>    c:	e3510000 	cmp	r1, #0
>>   10:	e1a05001 	mov	r5, r1
>>
>> then with new peephole2 patch:
>>
>> 00000000 <nrrdRangeSet>:
>>    0:	e92d407f 	push	{r0, r1, r2, r3, r4, r5, r6, lr}
>>    4:	e2504000 	subs	r4, r0, #0
>>    8:	0a00003e 	beq	108 <nrrdRangeSet+0x108>
>>    c:	e2515000 	subs	r5, r1, #0
>>
>> Thanks, Fredrik
>>
> 
> The reason combine doesn't catch this is because at the time it runs the
> MOV is in a different basic block.  Later on it is sunk into the same
> basic block, but it's then too late to do the merge.
And while this looks well suited for compare-elim, nobody's done the
work to enable compare-elim on ARM.  It may also be the case that
compare-elim is still too early, at least for the testcase in the PR.

So should we just go with the peephole2?
jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2019-07-09 19:13 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-24  8:48 Question about GCC benchmarks and uninitialized variables Fredrik Hederstierna
2018-07-24  9:04 ` Alexander Monakov
2018-07-24 12:17 ` David Brown
2019-05-30  9:13 ` Question about GCC not warning for some noncompliant SEI CERT C code examples Fredrik Hederstierna
2019-05-30 14:28   ` Martin Sebor
2019-05-30 14:48     ` Jeff Law
2019-06-01 23:35   ` ARM peephole2 from 2003 never merged, still valid Fredrik Hederstierna
2019-06-01 23:41     ` Fredrik Hederstierna
2019-06-02 12:28       ` Segher Boessenkool
2019-06-05 23:03         ` Jeff Law
2019-06-05 23:46           ` Segher Boessenkool
2019-06-06  9:13             ` Richard Earnshaw (lists)
2019-06-06 14:02               ` Segher Boessenkool
2019-06-06 14:55                 ` Fredrik Hederstierna
2019-06-06 16:06                   ` Richard Earnshaw (lists)
2019-06-06 16:49                     ` Segher Boessenkool
2019-07-08  6:41                       ` Fredrik Hederstierna
2019-07-09 19:13                         ` Jeff Law
2019-06-14 20:30                     ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).