public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [0/3] Fix PR71280, in ifcvt/rtlanal/i386.
@ 2016-11-23 18:58 Bernd Schmidt
  2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
                   ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-23 18:58 UTC (permalink / raw)
  To: GCC Patches

This is a small series of patches to fix various problems in cost 
calculations that together caused PR71280, a missed optimization 
opportunity.

A summary of the problems:

1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to 
be invalid. There seems to be no good reason that insn_rtx_cost 
shouldn't use the latter. It also makes the numbers comparable to the
ones you get from seq_cost.

2. The i386 backend mishandles SET rtxs. If you have a fairly plain 
single-insn SET, you tend to get COSTS_N_INSNS (2) out of set_rtx_cost, 
because rtx_costs has a default of COSTS_N_INSNS (1) for a SET, and you 
get the cost of the src in addition to that.

3. ifcvt computes the sum of costs for the involved blocks, but only 
makes a before/after comparison when optimizing for size. When 
optimizing for speed, it uses max_seq_cost, which is an estimate 
computed from BRANCH_COST, which in turn can be zero for predictable 
branches on x86.

It seems a little risky to tweak costs this late in the process, but all 
of these should be improvements so it would put us on a better footing 
for fixing performance issues. I'll leave it to the reviewer to decide 
whether we want this now or after gcc-7.

The series was bootstrapped and tested on x86_64-linux. There's the 
following new guality fail:

-PASS: gcc.dg/guality/pr54693-2.c   -Os  line 21 x == 10 - i
-PASS: gcc.dg/guality/pr54693-2.c   -Os  line 21 y == 20 - 2 * i
+FAIL: gcc.dg/guality/pr54693-2.c   -Os  line 21 x == 10 - i
+FAIL: gcc.dg/guality/pr54693-2.c   -Os  line 21 y == 20 - 2 * i

which appears to be caused by loss of debuginfo in ivopts:

-  # DEBUG x => (int) ((unsigned int) x_9(D) - (unsigned int) i_14)
-  # DEBUG y => (int) ((unsigned int) y_10(D) - (unsigned int) i_14 * 2)
+  # DEBUG x => NULL
+  # DEBUG y => NULL

I'd claim this is out of scope for this patch series. So, ok?


Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt
@ 2016-11-23 19:00 ` Bernd Schmidt
  2016-11-23 19:30   ` Jeff Law
  2016-11-24 14:21   ` Segher Boessenkool
  2016-11-23 19:01 ` Bernd Schmidt
  2016-11-23 19:03 ` [3/3] " Bernd Schmidt
  2 siblings, 2 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-23 19:00 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 357 bytes --]

Note that I misspelled the PR number in the 0/3 message :-/

On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
> 1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to
> be invalid. There seems to be no good reason that insn_rtx_cost
> shouldn't use the latter. It also makes the numbers comparable to the
> ones you get from seq_cost.


Bernd

[-- Attachment #2: 71280-1.diff --]
[-- Type: text/x-patch, Size: 480 bytes --]

	PR rtl-optimization/78120
	* rtlanal.c (insn_rtx_cost): Use set_rtx_cost.

Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c	(revision 242038)
+++ gcc/rtlanal.c	(working copy)
@@ -5211,7 +5211,7 @@ insn_rtx_cost (rtx pat, bool speed)
   else
     return 0;
 
-  cost = set_src_cost (SET_SRC (set), GET_MODE (SET_DEST (set)), speed);
+  cost = set_rtx_cost (set, speed);
   return cost > 0 ? cost : COSTS_N_INSNS (1);
 }
 

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt
  2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
@ 2016-11-23 19:01 ` Bernd Schmidt
  2016-11-23 21:46   ` Uros Bizjak
  2016-11-23 19:03 ` [3/3] " Bernd Schmidt
  2 siblings, 1 reply; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-23 19:01 UTC (permalink / raw)
  To: GCC Patches, Uros Bizjak

[-- Attachment #1: Type: text/plain, Size: 319 bytes --]

On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
> 2. The i386 backend mishandles SET rtxs. If you have a fairly plain
> single-insn SET, you tend to get COSTS_N_INSNS (2) out of set_rtx_cost,
> because rtx_costs has a default of COSTS_N_INSNS (1) for a SET, and you
> get the cost of the src in addition to that.


Bernd


[-- Attachment #2: 71280-2.diff --]
[-- Type: text/x-patch, Size: 1312 bytes --]

	PR rtl-optimization/78120
	* config/i386/i386.c (ix86_rtx_costs): Fully handle SETs.

Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 242038)
+++ gcc/config/i386/i386.c	(working copy)
@@ -39925,6 +39925,7 @@ ix86_rtx_costs (rtx x, machine_mode mode
   enum rtx_code code = GET_CODE (x);
   enum rtx_code outer_code = (enum rtx_code) outer_code_i;
   const struct processor_costs *cost = speed ? ix86_cost : &ix86_size_cost;
+  int src_cost;
 
   switch (code)
     {
@@ -39935,7 +39936,23 @@ ix86_rtx_costs (rtx x, machine_mode mode
 	  *total = ix86_set_reg_reg_cost (GET_MODE (SET_DEST (x)));
 	  return true;
 	}
-      return false;
+
+      if (register_operand (SET_SRC (x), VOIDmode))
+	/* Avoid potentially incorrect high cost from rtx_costs
+	   for non-tieable SUBREGs.  */
+	src_cost = 0;
+      else
+	{
+	  src_cost = rtx_cost (SET_SRC (x), mode, SET, 1, speed);
+
+	  if (CONSTANT_P (SET_SRC (x)))
+	    /* Constant costs assume a base value of COSTS_N_INSNS (1) and add
+	       a small value, possibly zero for cheap constants.  */
+	    src_cost += COSTS_N_INSNS (1);
+	}
+
+      *total = src_cost + rtx_cost (SET_DEST (x), mode, SET, 0, speed);
+      return true;
 
     case CONST_INT:
     case CONST:

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [3/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt
  2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
  2016-11-23 19:01 ` Bernd Schmidt
@ 2016-11-23 19:03 ` Bernd Schmidt
  2016-11-23 19:38   ` Jeff Law
  2 siblings, 1 reply; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-23 19:03 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 502 bytes --]

On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
> 3. ifcvt computes the sum of costs for the involved blocks, but only
> makes a before/after comparison when optimizing for size. When
> optimizing for speed, it uses max_seq_cost, which is an estimate
> computed from BRANCH_COST, which in turn can be zero for predictable
> branches on x86.

This is the final patch and has the testcase. It also happens to be the 
least risky of the series so it could be applied on its own (without the 
test).


Bernd


[-- Attachment #2: 71280-3.diff --]
[-- Type: text/x-patch, Size: 3388 bytes --]

	PR rtl-optimization/78120
	* ifcvt.c (noce_conversion_profitable_p): Check original cost in all
	cases, and additionally test against max_seq_cost for speed
	optimization.
	(noce_process_if_block): Compute an estimate for the original cost when
	optimizing for speed, using the minimum of then and else block costs.

	PR rtl-optimization/78120
	* gcc.target/i386/pr78120.c: New test.

Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c	(revision 242038)
+++ gcc/ifcvt.c	(working copy)
@@ -812,8 +812,10 @@ struct noce_if_info
      we're optimizing for size.  */
   bool speed_p;
 
-  /* The combined cost of COND, JUMP and the costs for THEN_BB and
-     ELSE_BB.  */
+  /* An estimate of the original costs.  When optimizing for size, this is the
+     combined cost of COND, JUMP and the costs for THEN_BB and ELSE_BB.
+     When optimizing for speed, we use the costs of COND plus the minimum of
+     the costs for THEN_BB and ELSE_BB, as computed in the next field.  */
   unsigned int original_cost;
 
   /* Maximum permissible cost for the unconditional sequence we should
@@ -852,12 +857,12 @@ noce_conversion_profitable_p (rtx_insn *
   /* Cost up the new sequence.  */
   unsigned int cost = seq_cost (seq, speed_p);
 
+  if (cost <= if_info->original_cost)
+    return true;
+
   /* When compiling for size, we can make a reasonably accurately guess
-     at the size growth.  */
-  if (!speed_p)
-    return cost <= if_info->original_cost;
-  else
-    return cost <= if_info->max_seq_cost;
+     at the size growth.  When compiling for speed, use the maximum.  */
+  return speed_p && cost <= if_info->max_seq_cost;
 }
 
 /* Helper function for noce_try_store_flag*.  */
@@ -3441,15 +3446,24 @@ noce_process_if_block (struct noce_if_in
 	}
     }
 
-  if (! bb_valid_for_noce_process_p (then_bb, cond, &if_info->original_cost,
+  bool speed_p = optimize_bb_for_speed_p (test_bb);
+  unsigned int then_cost = 0, else_cost = 0;
+  if (!bb_valid_for_noce_process_p (then_bb, cond, &then_cost,
 				    &if_info->then_simple))
     return false;
 
   if (else_bb
-      && ! bb_valid_for_noce_process_p (else_bb, cond, &if_info->original_cost,
-				      &if_info->else_simple))
+      && !bb_valid_for_noce_process_p (else_bb, cond, &else_cost,
+				       &if_info->else_simple))
     return false;
 
+  if (else_bb == NULL)
+    if_info->original_cost += then_cost;
+  else if (speed_p)
+    if_info->original_cost += MIN (then_cost, else_cost);
+  else
+    if_info->original_cost += then_cost + else_cost;
+
   insn_a = last_active_insn (then_bb, FALSE);
   set_a = single_set (insn_a);
   gcc_assert (set_a);
Index: gcc/testsuite/gcc.target/i386/pr78120.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr78120.c	(nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr78120.c	(working copy)
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=generic" } */
+/* { dg-final { scan-assembler "adc" } } */
+/* { dg-final { scan-assembler-not "jmp" } } */
+
+typedef unsigned long u64;
+
+typedef struct {
+  u64 hi, lo;
+} u128;
+
+static inline u128 add_u128 (u128 a, u128 b)
+{
+  a.lo += b.lo;
+  if (a.lo < b.lo)
+    a.hi++;
+
+  return a;
+}
+
+extern u128 t1, t2, t3;
+
+void foo (void)
+{
+  t1 = add_u128 (t1, t2);
+  t1 = add_u128 (t1, t3);
+}

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
@ 2016-11-23 19:30   ` Jeff Law
  2016-11-23 19:31     ` Bernd Schmidt
  2016-11-24 14:21   ` Segher Boessenkool
  1 sibling, 1 reply; 32+ messages in thread
From: Jeff Law @ 2016-11-23 19:30 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches

On 11/23/2016 12:00 PM, Bernd Schmidt wrote:
> Note that I misspelled the PR number in the 0/3 message :-/
>
> On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
>> 1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to
>> be invalid. There seems to be no good reason that insn_rtx_cost
>> shouldn't use the latter. It also makes the numbers comparable to the
>> ones you get from seq_cost.
>
>
> Bernd
>
> 71280-1.diff
>
>
> 	PR rtl-optimization/78120
> 	* rtlanal.c (insn_rtx_cost): Use set_rtx_cost.
LGTM.   As a principle, if we have the full set, we ought to use 
set_rtx_cost, and only use set_src_cost if we don't have the full set 
expression.

Jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-23 19:30   ` Jeff Law
@ 2016-11-23 19:31     ` Bernd Schmidt
  0 siblings, 0 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-23 19:31 UTC (permalink / raw)
  To: Jeff Law, GCC Patches

On 11/23/2016 08:30 PM, Jeff Law wrote:
> On 11/23/2016 12:00 PM, Bernd Schmidt wrote:
>> Note that I misspelled the PR number in the 0/3 message :-/
>>
>> On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
>>> 1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to
>>> be invalid. There seems to be no good reason that insn_rtx_cost
>>> shouldn't use the latter. It also makes the numbers comparable to the
>>> ones you get from seq_cost.
>>
>>
>> Bernd
>>
>> 71280-1.diff
>>
>>
>>     PR rtl-optimization/78120
>>     * rtlanal.c (insn_rtx_cost): Use set_rtx_cost.
> LGTM.   As a principle, if we have the full set, we ought to use
> set_rtx_cost, and only use set_src_cost if we don't have the full set
> expression.

Note that this cannot really be applied on its own, it needs patch 2/3 
so as to not make the x86 port do strange things. So I'll be holding off 
on this one until we have a consensus on the whole set.


Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [3/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-23 19:03 ` [3/3] " Bernd Schmidt
@ 2016-11-23 19:38   ` Jeff Law
  0 siblings, 0 replies; 32+ messages in thread
From: Jeff Law @ 2016-11-23 19:38 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches

On 11/23/2016 12:02 PM, Bernd Schmidt wrote:
> On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
>> 3. ifcvt computes the sum of costs for the involved blocks, but only
>> makes a before/after comparison when optimizing for size. When
>> optimizing for speed, it uses max_seq_cost, which is an estimate
>> computed from BRANCH_COST, which in turn can be zero for predictable
>> branches on x86.
>
> This is the final patch and has the testcase. It also happens to be the
> least risky of the series so it could be applied on its own (without the
> test).
>
>
> Bernd
>
>
> 71280-3.diff
>
>
> 	PR rtl-optimization/78120
> 	* ifcvt.c (noce_conversion_profitable_p): Check original cost in all
> 	cases, and additionally test against max_seq_cost for speed
> 	optimization.
> 	(noce_process_if_block): Compute an estimate for the original cost when
> 	optimizing for speed, using the minimum of then and else block costs.
>
> 	PR rtl-optimization/78120
> 	* gcc.target/i386/pr78120.c: New test.
Also OK.  Obviously Uros has the call on the x86 target change.  Stage 
the series in as you see fit given the dependencies.

jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-23 19:01 ` Bernd Schmidt
@ 2016-11-23 21:46   ` Uros Bizjak
  0 siblings, 0 replies; 32+ messages in thread
From: Uros Bizjak @ 2016-11-23 21:46 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On Wed, Nov 23, 2016 at 8:01 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
>>
>> 2. The i386 backend mishandles SET rtxs. If you have a fairly plain
>> single-insn SET, you tend to get COSTS_N_INSNS (2) out of set_rtx_cost,
>> because rtx_costs has a default of COSTS_N_INSNS (1) for a SET, and you
>> get the cost of the src in addition to that.

Looks good to me.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
  2016-11-23 19:30   ` Jeff Law
@ 2016-11-24 14:21   ` Segher Boessenkool
  2016-11-24 14:26     ` Bernd Schmidt
  1 sibling, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-24 14:21 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

[ Only your 0/3 and 3/3 messages arrived -- or is this 1/3? ]

On Wed, Nov 23, 2016 at 08:00:30PM +0100, Bernd Schmidt wrote:
> Note that I misspelled the PR number in the 0/3 message :-/
> 
> On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
> >1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to
> >be invalid. There seems to be no good reason that insn_rtx_cost
> >shouldn't use the latter. It also makes the numbers comparable to the
> >ones you get from seq_cost.
> 
> 
> Bernd

> 	PR rtl-optimization/78120
> 	* rtlanal.c (insn_rtx_cost): Use set_rtx_cost.
> 
> Index: gcc/rtlanal.c
> ===================================================================
> --- gcc/rtlanal.c	(revision 242038)
> +++ gcc/rtlanal.c	(working copy)
> @@ -5211,7 +5211,7 @@ insn_rtx_cost (rtx pat, bool speed)
>    else
>      return 0;
>  
> -  cost = set_src_cost (SET_SRC (set), GET_MODE (SET_DEST (set)), speed);
> +  cost = set_rtx_cost (set, speed);
>    return cost > 0 ? cost : COSTS_N_INSNS (1);
>  }
>  

Combine uses insn_rtx_cost extensively.  I have tried to change it to use
the full rtx cost, not just the source cost, a few times before, and it
always only regressed.  Part of it is that most ports' cost calculations
are, erm, not so great -- every target we care about needs fixes.

Let's please not try this in stage 3.


Segher

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 14:21   ` Segher Boessenkool
@ 2016-11-24 14:26     ` Bernd Schmidt
  2016-11-24 14:36       ` Segher Boessenkool
  0 siblings, 1 reply; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-24 14:26 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches

On 11/24/2016 03:21 PM, Segher Boessenkool wrote:

> Combine uses insn_rtx_cost extensively.  I have tried to change it to use
> the full rtx cost, not just the source cost, a few times before, and it
> always only regressed.  Part of it is that most ports' cost calculations
> are, erm, not so great -- every target we care about needs fixes.
>
> Let's please not try this in stage 3.

It got approved and committed. Do you want me to revert it now or wait 
for obvious signs of fallout?


Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 14:26     ` Bernd Schmidt
@ 2016-11-24 14:36       ` Segher Boessenkool
  2016-11-24 14:38         ` Bernd Schmidt
  0 siblings, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-24 14:36 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote:
> On 11/24/2016 03:21 PM, Segher Boessenkool wrote:
> 
> >Combine uses insn_rtx_cost extensively.  I have tried to change it to use
> >the full rtx cost, not just the source cost, a few times before, and it
> >always only regressed.  Part of it is that most ports' cost calculations
> >are, erm, not so great -- every target we care about needs fixes.
> >
> >Let's please not try this in stage 3.
> 
> It got approved and committed. Do you want me to revert it now or wait 
> for obvious signs of fallout?

In my opinion it is an early stage 1 thing, not something suitable for
stage 3.  I can do some simple tests on various targets if you want.


Segher

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 14:36       ` Segher Boessenkool
@ 2016-11-24 14:38         ` Bernd Schmidt
  2016-11-24 14:44           ` Eric Botcazou
  2016-11-24 14:54           ` Segher Boessenkool
  0 siblings, 2 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-24 14:38 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches

On 11/24/2016 03:36 PM, Segher Boessenkool wrote:
> On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote:
>> On 11/24/2016 03:21 PM, Segher Boessenkool wrote:
>>
>>> Combine uses insn_rtx_cost extensively.  I have tried to change it to use
>>> the full rtx cost, not just the source cost, a few times before, and it
>>> always only regressed.  Part of it is that most ports' cost calculations
>>> are, erm, not so great -- every target we care about needs fixes.
>>>
>>> Let's please not try this in stage 3.
>>
>> It got approved and committed. Do you want me to revert it now or wait
>> for obvious signs of fallout?
>
> In my opinion it is an early stage 1 thing, not something suitable for
> stage 3.  I can do some simple tests on various targets if you want.

Sure.

I'll make the argument that stage 3 is when we fix stuff, including 
performance regressions, and this patch is very clearly a fix. When we 
have very obvious distortions like a case where costs from insn_rtx_cost 
and seq_cost aren't comparable, it's impossible to arrive at sane solutions.


Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 14:38         ` Bernd Schmidt
@ 2016-11-24 14:44           ` Eric Botcazou
  2016-11-24 14:54           ` Segher Boessenkool
  1 sibling, 0 replies; 32+ messages in thread
From: Eric Botcazou @ 2016-11-24 14:44 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: gcc-patches, Segher Boessenkool

> I'll make the argument that stage 3 is when we fix stuff, including
> performance regressions, and this patch is very clearly a fix. When we
> have very obvious distortions like a case where costs from insn_rtx_cost
> and seq_cost aren't comparable, it's impossible to arrive at sane solutions.

It would help to make a pass over the main architecture back-ends and evaluate 
the potential fallout and required adjustments, if any.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 14:38         ` Bernd Schmidt
  2016-11-24 14:44           ` Eric Botcazou
@ 2016-11-24 14:54           ` Segher Boessenkool
  2016-11-24 15:16             ` Richard Biener
                               ` (2 more replies)
  1 sibling, 3 replies; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-24 14:54 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On Thu, Nov 24, 2016 at 03:38:55PM +0100, Bernd Schmidt wrote:
> On 11/24/2016 03:36 PM, Segher Boessenkool wrote:
> >On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote:
> >>On 11/24/2016 03:21 PM, Segher Boessenkool wrote:
> >>
> >>>Combine uses insn_rtx_cost extensively.  I have tried to change it to use
> >>>the full rtx cost, not just the source cost, a few times before, and it
> >>>always only regressed.  Part of it is that most ports' cost calculations
> >>>are, erm, not so great -- every target we care about needs fixes.
> >>>
> >>>Let's please not try this in stage 3.
> >>
> >>It got approved and committed. Do you want me to revert it now or wait
> >>for obvious signs of fallout?
> >
> >In my opinion it is an early stage 1 thing, not something suitable for
> >stage 3.  I can do some simple tests on various targets if you want.
> 
> Sure.
> 
> I'll make the argument that stage 3 is when we fix stuff, including 
> performance regressions, and this patch is very clearly a fix. When we 
> have very obvious distortions like a case where costs from insn_rtx_cost 
> and seq_cost aren't comparable, it's impossible to arrive at sane solutions.

Your own 2/3 shows my point: you needed fixes to the i386 port for it
to behave sanely after this 1/3; what makes you think other ports are
not in the same boat?

IMHO switching insn_rtx_cost to be based on not just set_src_cost is
a good idea, but will require re-tuning of all targets, so it is not
stage 3 material.

That we compare different kinds of costs (which really has no meaning at
all, it's a heuristic at best) in various places is a known problem, not
a regression.


Segher

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 14:54           ` Segher Boessenkool
@ 2016-11-24 15:16             ` Richard Biener
  2016-11-24 15:46               ` Jeff Law
  2016-11-24 15:34             ` Bernd Schmidt
  2016-11-24 15:48             ` Jeff Law
  2 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2016-11-24 15:16 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches

On Thu, Nov 24, 2016 at 3:53 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Thu, Nov 24, 2016 at 03:38:55PM +0100, Bernd Schmidt wrote:
>> On 11/24/2016 03:36 PM, Segher Boessenkool wrote:
>> >On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote:
>> >>On 11/24/2016 03:21 PM, Segher Boessenkool wrote:
>> >>
>> >>>Combine uses insn_rtx_cost extensively.  I have tried to change it to use
>> >>>the full rtx cost, not just the source cost, a few times before, and it
>> >>>always only regressed.  Part of it is that most ports' cost calculations
>> >>>are, erm, not so great -- every target we care about needs fixes.
>> >>>
>> >>>Let's please not try this in stage 3.
>> >>
>> >>It got approved and committed. Do you want me to revert it now or wait
>> >>for obvious signs of fallout?
>> >
>> >In my opinion it is an early stage 1 thing, not something suitable for
>> >stage 3.  I can do some simple tests on various targets if you want.
>>
>> Sure.
>>
>> I'll make the argument that stage 3 is when we fix stuff, including
>> performance regressions, and this patch is very clearly a fix. When we
>> have very obvious distortions like a case where costs from insn_rtx_cost
>> and seq_cost aren't comparable, it's impossible to arrive at sane solutions.
>
> Your own 2/3 shows my point: you needed fixes to the i386 port for it
> to behave sanely after this 1/3; what makes you think other ports are
> not in the same boat?
>
> IMHO switching insn_rtx_cost to be based on not just set_src_cost is
> a good idea, but will require re-tuning of all targets, so it is not
> stage 3 material.

Agreed.

> That we compare different kinds of costs (which really has no meaning at
> all, it's a heuristic at best) in various places is a known problem, not
> a regression.

But technically stage 3 is for general bugfixing, not only regression fixing.

I'd say be prepared to revert but wait to see who screams first.

Thanks,
Richard.

>
> Segher

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 14:54           ` Segher Boessenkool
  2016-11-24 15:16             ` Richard Biener
@ 2016-11-24 15:34             ` Bernd Schmidt
  2016-11-24 15:48             ` Jeff Law
  2 siblings, 0 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-24 15:34 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches

On 11/24/2016 03:53 PM, Segher Boessenkool wrote:
>
> Your own 2/3 shows my point: you needed fixes to the i386 port for it
> to behave sanely after this 1/3; what makes you think other ports are
> not in the same boat?

When I realized i386 was broken I had a look at aarch64 and it looked 
sane, and that was the basis of the idea for patch 2/3.

It's possible other targets may need to handle SETs as well. In theory, 
something like the block of code I added for i386 could just work when 
copied to other backends if they have no "case SET" yet. If you do run 
into problems please try at least that very simple fix.

> That we compare different kinds of costs (which really has no meaning at
> all, it's a heuristic at best) in various places is a known problem, not
> a regression.

It leads to observable regressions however as PR78120 shows, when code 
like ifcvt tries to use cost calculations in apparently-sensible ways. 
And, as Richard mentioned, we're not in a regressions-only phase.


Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 15:16             ` Richard Biener
@ 2016-11-24 15:46               ` Jeff Law
  0 siblings, 0 replies; 32+ messages in thread
From: Jeff Law @ 2016-11-24 15:46 UTC (permalink / raw)
  To: Richard Biener, Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches

On 11/24/2016 08:16 AM, Richard Biener wrote:
>>
>> IMHO switching insn_rtx_cost to be based on not just set_src_cost is
>> a good idea, but will require re-tuning of all targets, so it is not
>> stage 3 material.
>
> Agreed.
>
>> That we compare different kinds of costs (which really has no meaning at
>> all, it's a heuristic at best) in various places is a known problem, not
>> a regression.
>
> But technically stage 3 is for general bugfixing, not only regression fixing.
>
> I'd say be prepared to revert but wait to see who screams first.
Right.  And I would claim that we're early enough in stage3 that 
attempting to address this BZ is a good thing.  The BZ also happens to 
be a 6/7 regression.

So I'd say let's go with the patch, but be aware that there may be a 
need to twiddle other ports.  If we find a bunch of ports are 
problematical than we might need to think about reversion.

jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 14:54           ` Segher Boessenkool
  2016-11-24 15:16             ` Richard Biener
  2016-11-24 15:34             ` Bernd Schmidt
@ 2016-11-24 15:48             ` Jeff Law
  2016-11-24 16:14               ` Segher Boessenkool
  2 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2016-11-24 15:48 UTC (permalink / raw)
  To: Segher Boessenkool, Bernd Schmidt; +Cc: GCC Patches

On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
>
> That we compare different kinds of costs (which really has no meaning at
> all, it's a heuristic at best) in various places is a known problem, not
> a regression.
But the problems with the costing system exhibit themselves as a code 
quality regression.  In the end that's what the end-users see -- a 
regression in the quality of the code GCC generates.

jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 15:48             ` Jeff Law
@ 2016-11-24 16:14               ` Segher Boessenkool
  2016-11-24 22:32                 ` Segher Boessenkool
                                   ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-24 16:14 UTC (permalink / raw)
  To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches

On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
> >
> >That we compare different kinds of costs (which really has no meaning at
> >all, it's a heuristic at best) in various places is a known problem, not
> >a regression.
> But the problems with the costing system exhibit themselves as a code 
> quality regression.  In the end that's what the end-users see -- a 
> regression in the quality of the code GCC generates.

Yes, exactly -- and I fear this all-encompassing change will cause just
such a regression for many users.  Tests are running, will know more
later today (or tomorrow).

The PR is about a very specific problem; the patch is not.  The patch
is not a bug fix.  If we allow anything that "makes things better" in
stage 3, what make it different from stage 1 then?


Segher

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 16:14               ` Segher Boessenkool
@ 2016-11-24 22:32                 ` Segher Boessenkool
  2016-11-26 10:44                   ` Jeff Law
  2016-11-25  9:15                 ` Richard Biener
  2016-11-28 18:50                 ` Jeff Law
  2 siblings, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-24 22:32 UTC (permalink / raw)
  To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches

On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote:
> On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
> > On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
> > >
> > >That we compare different kinds of costs (which really has no meaning at
> > >all, it's a heuristic at best) in various places is a known problem, not
> > >a regression.
> > But the problems with the costing system exhibit themselves as a code 
> > quality regression.  In the end that's what the end-users see -- a 
> > regression in the quality of the code GCC generates.
> 
> Yes, exactly -- and I fear this all-encompassing change will cause just
> such a regression for many users.  Tests are running, will know more
> later today (or tomorrow).
> 
> The PR is about a very specific problem; the patch is not.  The patch
> is not a bug fix.  If we allow anything that "makes things better" in
> stage 3, what make it different from stage 1 then?

Here are results of testing with trunk right before the three patches,
compared with with the three patches.  This lists the sizes of the vmlinux
of a Linux kernel build for that arch.


better:
    blackfin   1973931   1973867
         frv   3638192   3637792
       h8300   1060172   1059976
        i386   9742984   9742463
        ia64  15402035  15396171
        mips   4286748   4286692
     mn10300   2360025   2358201
       nios2   3185625   3176693
      x86_64  10360418  10359588

worse:
       alpha   5439003   5455979
         c6x   2107939   2108931
        cris   2189380   2193836
        m32r   3427409   3427453
        m68k   3228408   3230978
      mips64   5564819   5565291
      parisc   8278881   8289573
    parisc64   7234619   7249139
     powerpc   8438949   8440005
   powerpc64  14499969  14508689
        s390  12778748  12779220
     shnommu   1369868   1371020
     sparc64   5921556   5922172
      tilegx  12297581  12307461
     tilepro  11215603  11227339
      xtensa   1776196   1779152

does not build:
         arc         0         0
         arm         0         0
       arm64         0         0
  microblaze         0         0
          sh         0         0
       sparc         0         0


Segher

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 16:14               ` Segher Boessenkool
  2016-11-24 22:32                 ` Segher Boessenkool
@ 2016-11-25  9:15                 ` Richard Biener
  2016-11-25 15:34                   ` Jeff Law
  2016-11-25 15:55                   ` Segher Boessenkool
  2016-11-28 18:50                 ` Jeff Law
  2 siblings, 2 replies; 32+ messages in thread
From: Richard Biener @ 2016-11-25  9:15 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Jeff Law, Bernd Schmidt, GCC Patches

On Thu, Nov 24, 2016 at 5:14 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
>> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
>> >
>> >That we compare different kinds of costs (which really has no meaning at
>> >all, it's a heuristic at best) in various places is a known problem, not
>> >a regression.
>> But the problems with the costing system exhibit themselves as a code
>> quality regression.  In the end that's what the end-users see -- a
>> regression in the quality of the code GCC generates.
>
> Yes, exactly -- and I fear this all-encompassing change will cause just
> such a regression for many users.  Tests are running, will know more
> later today (or tomorrow).
>
> The PR is about a very specific problem; the patch is not.  The patch
> is not a bug fix.  If we allow anything that "makes things better" in
> stage 3, what make it different from stage 1 then?

That's a good question ;)  The stage 3 definition has a loophole via
"go file a bug about feature X, then it's a bugfix!".

I'm all open for a more sensible definition, like constraining the kind
of non-regression fixes that we want to allow, but I fear the most
sensible option would be to simply ditch the notion of different
"stages" and make it "general development" and "regression fixing".
(though if you try hard enough and go back in time you'll find that
almost all non-enhancement bugs are regressions in some sense)

And yes, current stage3 still feels too much like stage1 ;)

Richard.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-25  9:15                 ` Richard Biener
@ 2016-11-25 15:34                   ` Jeff Law
  2016-11-25 15:55                   ` Segher Boessenkool
  1 sibling, 0 replies; 32+ messages in thread
From: Jeff Law @ 2016-11-25 15:34 UTC (permalink / raw)
  To: Richard Biener, Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches

On 11/25/2016 02:15 AM, Richard Biener wrote:
>
> That's a good question ;)  The stage 3 definition has a loophole via
> "go file a bug about feature X, then it's a bugfix!".
Right.  That loophole has existed since we've moved to the current model 
-- we extend a level of trust to our developers not to abuse the 
loophole.  I think that level of trust is warranted and hasn't been 
significantly violated.

> I'm all open for a more sensible definition, like constraining the kind
> of non-regression fixes that we want to allow, but I fear the most
> sensible option would be to simply ditch the notion of different
> "stages" and make it "general development" and "regression fixing".
> (though if you try hard enough and go back in time you'll find that
> almost all non-enhancement bugs are regressions in some sense)
Similarly, I'm always open for improvements.  My worry is if we went to 
development/regression bugfixing cycle, then non-regression bugs would 
largely be ignored.

General bugfixing is, IMHO, a good period -- it gets a larger portion of 
our developers fixing bugs and gives folks with a heavy review load a 
chance to flush out their queues of stuff that came in right at the end 
of stage1.

I'm not really pushing to open a "development cycle" discussion right 
now, but I do sense that our cycles could use some refinement.

>
> And yes, current stage3 still feels too much like stage1 ;)
Hasn't seemed that way to me, but obviously experiences will differ.  My 
biggest worry about this cycle is the higher than typical (compared to 
the last few years) regression bug counts.

That worry is somewhat mitigated by the belief that we're marking 
regressions much much more consistently this year, so we're a lot less 
likely to get a big jump in marked regressions like we saw last year.

*If* that is the case (and my light poking around seems to indicate 
that's true), then we're likely ahead of the gcc-6 cycle, but behind the 
gcc-5 cycle.

jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-25  9:15                 ` Richard Biener
  2016-11-25 15:34                   ` Jeff Law
@ 2016-11-25 15:55                   ` Segher Boessenkool
  2016-11-28  8:59                     ` Bernd Schmidt
  2016-11-28  9:05                     ` Bernd Schmidt
  1 sibling, 2 replies; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-25 15:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, Bernd Schmidt, GCC Patches

On Fri, Nov 25, 2016 at 10:15:25AM +0100, Richard Biener wrote:
> On Thu, Nov 24, 2016 at 5:14 PM, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> > On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
> >> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
> >> >
> >> >That we compare different kinds of costs (which really has no meaning at
> >> >all, it's a heuristic at best) in various places is a known problem, not
> >> >a regression.
> >> But the problems with the costing system exhibit themselves as a code
> >> quality regression.  In the end that's what the end-users see -- a
> >> regression in the quality of the code GCC generates.
> >
> > Yes, exactly -- and I fear this all-encompassing change will cause just
> > such a regression for many users.  Tests are running, will know more
> > later today (or tomorrow).
> >
> > The PR is about a very specific problem; the patch is not.  The patch
> > is not a bug fix.  If we allow anything that "makes things better" in
> > stage 3, what make it different from stage 1 then?
> 
> That's a good question ;)  The stage 3 definition has a loophole via
> "go file a bug about feature X, then it's a bugfix!".
> 
> I'm all open for a more sensible definition, like constraining the kind
> of non-regression fixes that we want to allow, but I fear the most
> sensible option would be to simply ditch the notion of different
> "stages" and make it "general development" and "regression fixing".
> (though if you try hard enough and go back in time you'll find that
> almost all non-enhancement bugs are regressions in some sense)

The scale goes: early stage 1, anything goes; ...; until stage 4, only
very narrow regression fixes are allowed.

Let's try to keep that spirit, and not behave like politicians following
the "rules" (or not).

> And yes, current stage3 still feels too much like stage1 ;)

Yes, very much so.  Well, at least trunk bootstraps on more targets now.

--

So IMNSHO this rtx costing change belongs in early stage 1, and should
be reverted.  If ifcvt should use full rtx cost instead of rtx_src_cost,
fix *that*, that is a much more local change.  And even then, test on
more targets please.


Segher

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 22:32                 ` Segher Boessenkool
@ 2016-11-26 10:44                   ` Jeff Law
  2016-11-26 11:11                     ` Eric Botcazou
  2016-11-26 18:08                     ` Segher Boessenkool
  0 siblings, 2 replies; 32+ messages in thread
From: Jeff Law @ 2016-11-26 10:44 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches

On 11/24/2016 03:32 PM, Segher Boessenkool wrote:
> On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote:
>> On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
>>> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
>>>>
>>>> That we compare different kinds of costs (which really has no meaning at
>>>> all, it's a heuristic at best) in various places is a known problem, not
>>>> a regression.
>>> But the problems with the costing system exhibit themselves as a code
>>> quality regression.  In the end that's what the end-users see -- a
>>> regression in the quality of the code GCC generates.
>>
>> Yes, exactly -- and I fear this all-encompassing change will cause just
>> such a regression for many users.  Tests are running, will know more
>> later today (or tomorrow).
>>
>> The PR is about a very specific problem; the patch is not.  The patch
>> is not a bug fix.  If we allow anything that "makes things better" in
>> stage 3, what make it different from stage 1 then?
>
> Here are results of testing with trunk right before the three patches,
> compared with with the three patches.  This lists the sizes of the vmlinux
> of a Linux kernel build for that arch.
Thanks.  While I question how much emphasis we should put on code sizes 
as a way to measure this change, it can still point out interesting 
effects, positive and negative.

 From my investigations on the m68k, the effects on the IL are minimal 
with a slight bias towards better code (by suppressing if-conversions of 
some now more costly blocks).  *But* the size of the resulting code was 
all over the place -- sometimes it was better, others worse.  From 
looking at the assembly we seemingly are copying blocks that aren't 
strictly necessary.

Enter bb-reorder and the STC algorithm.  It is copying blocks *very* 
aggressively, like absurdly aggressively on the m68k.  Of course it 
doesn't help that the m68k doesn't define a length attribute and as a 
result STC thinks every insn has size 0 and thus block copying is zero cost.

I want to verify the #s, so take this with a slight grain of salt.  The 
net changes to newlib's .o's for Bernd's work -- +30 bytes.  The effect 
of the STC issue above -- +1115586 bytes.  Or to put it another way, 
Bernd's changes, +.0003% change.  STC, +13.8%.


jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-26 10:44                   ` Jeff Law
@ 2016-11-26 11:11                     ` Eric Botcazou
  2016-11-26 16:15                       ` Jeff Law
  2016-11-26 18:08                     ` Segher Boessenkool
  1 sibling, 1 reply; 32+ messages in thread
From: Eric Botcazou @ 2016-11-26 11:11 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches, Segher Boessenkool, Bernd Schmidt

>  From my investigations on the m68k, the effects on the IL are minimal
> with a slight bias towards better code (by suppressing if-conversions of
> some now more costly blocks).  *But* the size of the resulting code was
> all over the place -- sometimes it was better, others worse.  From
> looking at the assembly we seemingly are copying blocks that aren't
> strictly necessary.

I'm seeing essentially the same thing on SPARC, probably because of the ifcvt 
change; the rtlanal change seems to be neutral for the architecture.

-- 
Eric Botcazou

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-26 11:11                     ` Eric Botcazou
@ 2016-11-26 16:15                       ` Jeff Law
  2016-11-26 22:03                         ` Segher Boessenkool
  0 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2016-11-26 16:15 UTC (permalink / raw)
  To: Eric Botcazou; +Cc: gcc-patches, Segher Boessenkool, Bernd Schmidt

On 11/26/2016 04:11 AM, Eric Botcazou wrote:
>>  From my investigations on the m68k, the effects on the IL are minimal
>> with a slight bias towards better code (by suppressing if-conversions of
>> some now more costly blocks).  *But* the size of the resulting code was
>> all over the place -- sometimes it was better, others worse.  From
>> looking at the assembly we seemingly are copying blocks that aren't
>> strictly necessary.
>
> I'm seeing essentially the same thing on SPARC, probably because of the ifcvt
> change; the rtlanal change seems to be neutral for the architecture.
Just to be clear, I was only testing the rtlanal change, not the ifcvt 
change.

I repeated my test on the GCC runtime libraries for m68k-elf.  Bernd's 
rtlanal change +.03%, the goof in STC, +9.4%.  So the STC goof still 
dwarfs the impact to Bernd's change, but not as badly as I saw in the 
newlib codebase.


Jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-26 10:44                   ` Jeff Law
  2016-11-26 11:11                     ` Eric Botcazou
@ 2016-11-26 18:08                     ` Segher Boessenkool
  1 sibling, 0 replies; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-26 18:08 UTC (permalink / raw)
  To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches

On Sat, Nov 26, 2016 at 03:44:22AM -0700, Jeff Law wrote:
> On 11/24/2016 03:32 PM, Segher Boessenkool wrote:
> >On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote:
> >>On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
> >>>On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
> >>>>
> >>>>That we compare different kinds of costs (which really has no meaning at
> >>>>all, it's a heuristic at best) in various places is a known problem, not
> >>>>a regression.
> >>>But the problems with the costing system exhibit themselves as a code
> >>>quality regression.  In the end that's what the end-users see -- a
> >>>regression in the quality of the code GCC generates.
> >>
> >>Yes, exactly -- and I fear this all-encompassing change will cause just
> >>such a regression for many users.  Tests are running, will know more
> >>later today (or tomorrow).
> >>
> >>The PR is about a very specific problem; the patch is not.  The patch
> >>is not a bug fix.  If we allow anything that "makes things better" in
> >>stage 3, what make it different from stage 1 then?
> >
> >Here are results of testing with trunk right before the three patches,
> >compared with with the three patches.  This lists the sizes of the vmlinux
> >of a Linux kernel build for that arch.
> Thanks.  While I question how much emphasis we should put on code sizes 
> as a way to measure this change, it can still point out interesting 
> effects, positive and negative.

Code size I can test "easily" for many archs (it still takes almost a
full day), and it does correlate well with local optimisations on most
archs.  I have looked at the actual differences on some archs (which
takes a lot more time still), and the differences are all over the place.
Which suggests changing the costs is a big change for most of those
archs; and they all have been tuned for the *old* situation, so this
makes things worse in the short run, whether the new costs are better
or not.

Not a change for stage 3, and not something *I* should need to analyse
anyway; this analysis needs to be done *before* the patch goes in.

> From my investigations on the m68k, the effects on the IL are minimal 
> with a slight bias towards better code (by suppressing if-conversions of 
> some now more costly blocks).  *But* the size of the resulting code was 
> all over the place -- sometimes it was better, others worse.  From 
> looking at the assembly we seemingly are copying blocks that aren't 
> strictly necessary.
> 
> Enter bb-reorder and the STC algorithm.  It is copying blocks *very* 
> aggressively, like absurdly aggressively on the m68k.  Of course it 
> doesn't help that the m68k doesn't define a length attribute and as a 
> result STC thinks every insn has size 0 and thus block copying is zero cost.
> 
> I want to verify the #s, so take this with a slight grain of salt.  The 
> net changes to newlib's .o's for Bernd's work -- +30 bytes.  The effect 
> of the STC issue above -- +1115586 bytes.  Or to put it another way, 
> Bernd's changes, +.0003% change.  STC, +13.8%.

STC wasn't changed in the patch.  Maybe interactions with STC is what
causes all the problems, but that is an argument *against* doing this
after stage 1.


Segher

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-26 16:15                       ` Jeff Law
@ 2016-11-26 22:03                         ` Segher Boessenkool
  0 siblings, 0 replies; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-26 22:03 UTC (permalink / raw)
  To: Jeff Law; +Cc: Eric Botcazou, gcc-patches, Bernd Schmidt

On Sat, Nov 26, 2016 at 09:15:48AM -0700, Jeff Law wrote:
> On 11/26/2016 04:11 AM, Eric Botcazou wrote:
> >> From my investigations on the m68k, the effects on the IL are minimal
> >>with a slight bias towards better code (by suppressing if-conversions of
> >>some now more costly blocks).  *But* the size of the resulting code was
> >>all over the place -- sometimes it was better, others worse.  From
> >>looking at the assembly we seemingly are copying blocks that aren't
> >>strictly necessary.
> >
> >I'm seeing essentially the same thing on SPARC, probably because of the 
> >ifcvt
> >change; the rtlanal change seems to be neutral for the architecture.
> Just to be clear, I was only testing the rtlanal change, not the ifcvt 
> change.
> 
> I repeated my test on the GCC runtime libraries for m68k-elf.  Bernd's 
> rtlanal change +.03%, the goof in STC, +9.4%.  So the STC goof still 
> dwarfs the impact to Bernd's change, but not as badly as I saw in the 
> newlib codebase.

orig, i386+rtlanal, i386+rtlanal+ifcvt:

worse:
       alpha   5439003   5455979   5455979
         c6x   2107939   2108931   2108931
        cris   2189380   2193836   2193836
        m32r   3427409   3427541   3427453
        m68k   3228408   3230978   3230978
        mips   4286748   4286964   4286692
      mips64   5564819   5565643   5565291
      parisc   8278881   8289977   8289573
    parisc64   7234619   7249187   7249139
     powerpc   8438949   8440005   8440005
   powerpc64  14499969  14508689  14508689
        s390  12778748  12779228  12779220
     shnommu   1369868   1371020   1371020
     sparc64   5921556   5922172   5922172
      tilegx  12297581  12307461  12307461
     tilepro  11215603  11227339  11227339
      xtensa   1776196   1779152   1779152

better:
    blackfin   1973931   1973867   1973867
         frv   3638192   3637792   3637792
       h8300   1060172   1059976   1059976
        i386   9742984   9742463   9742463
        ia64  15402035  15396171  15396171
     mn10300   2360025   2358201   2358201
       nios2   3185625   3176693   3176693
      x86_64  10360418  10359588  10359588

did not build:
         arc         0         0         0
         arm         0         0         0
       arm64         0         0         0
  microblaze         0         0         0
          sh         0         0         0
       sparc         0         0         0


tl;dr: The ifcvt change doesn't do much, but the cost change does.


Segher

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-25 15:55                   ` Segher Boessenkool
@ 2016-11-28  8:59                     ` Bernd Schmidt
  2016-11-28  9:05                     ` Bernd Schmidt
  1 sibling, 0 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-28  8:59 UTC (permalink / raw)
  To: Segher Boessenkool, Richard Biener; +Cc: Jeff Law, GCC Patches

On 11/25/2016 04:55 PM, Segher Boessenkool wrote:

> So IMNSHO this rtx costing change belongs in early stage 1, and should
> be reverted.

Done.


Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-25 15:55                   ` Segher Boessenkool
  2016-11-28  8:59                     ` Bernd Schmidt
@ 2016-11-28  9:05                     ` Bernd Schmidt
  1 sibling, 0 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-28  9:05 UTC (permalink / raw)
  To: Segher Boessenkool, Richard Biener; +Cc: Jeff Law, Bernd Schmidt, GCC Patches

On 11/25/2016 04:55 PM, Segher Boessenkool wrote:
> So IMNSHO this rtx costing change belongs in early stage 1, and should
> be reverted.

Done.


Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-24 16:14               ` Segher Boessenkool
  2016-11-24 22:32                 ` Segher Boessenkool
  2016-11-25  9:15                 ` Richard Biener
@ 2016-11-28 18:50                 ` Jeff Law
  2016-11-28 18:52                   ` Bernd Schmidt
  2 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2016-11-28 18:50 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches

On 11/24/2016 09:14 AM, Segher Boessenkool wrote:
> On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
>> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
>>>
>>> That we compare different kinds of costs (which really has no meaning at
>>> all, it's a heuristic at best) in various places is a known problem, not
>>> a regression.
>> But the problems with the costing system exhibit themselves as a code
>> quality regression.  In the end that's what the end-users see -- a
>> regression in the quality of the code GCC generates.
>
> Yes, exactly -- and I fear this all-encompassing change will cause just
> such a regression for many users.  Tests are running, will know more
> later today (or tomorrow).
>
> The PR is about a very specific problem; the patch is not.  The patch
> is not a bug fix.  If we allow anything that "makes things better" in
> stage 3, what make it different from stage 1 then?
So how would you suggest this be fixed right now?  I'd really like to 
get the regression addressed.

I would claim that Bernd's patch is right from a design and 
implementation standpoint -- the issues are fallout from backend issues 
and none looked terrible to me.

jeff

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
  2016-11-28 18:50                 ` Jeff Law
@ 2016-11-28 18:52                   ` Bernd Schmidt
  0 siblings, 0 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-28 18:52 UTC (permalink / raw)
  To: Jeff Law, Segher Boessenkool; +Cc: GCC Patches

On 11/28/2016 07:50 PM, Jeff Law wrote:
> So how would you suggest this be fixed right now?  I'd really like to
> get the regression addressed.

The regression is still fixed. That wasn't the case at all stages while 
I was working on it, but the i386 patch seems to suffice now.

> I would claim that Bernd's patch is right from a design and
> implementation standpoint -- the issues are fallout from backend issues
> and none looked terrible to me.

Agree but I reverted it anyway.


Bernd

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2016-11-28 18:52 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt
2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
2016-11-23 19:30   ` Jeff Law
2016-11-23 19:31     ` Bernd Schmidt
2016-11-24 14:21   ` Segher Boessenkool
2016-11-24 14:26     ` Bernd Schmidt
2016-11-24 14:36       ` Segher Boessenkool
2016-11-24 14:38         ` Bernd Schmidt
2016-11-24 14:44           ` Eric Botcazou
2016-11-24 14:54           ` Segher Boessenkool
2016-11-24 15:16             ` Richard Biener
2016-11-24 15:46               ` Jeff Law
2016-11-24 15:34             ` Bernd Schmidt
2016-11-24 15:48             ` Jeff Law
2016-11-24 16:14               ` Segher Boessenkool
2016-11-24 22:32                 ` Segher Boessenkool
2016-11-26 10:44                   ` Jeff Law
2016-11-26 11:11                     ` Eric Botcazou
2016-11-26 16:15                       ` Jeff Law
2016-11-26 22:03                         ` Segher Boessenkool
2016-11-26 18:08                     ` Segher Boessenkool
2016-11-25  9:15                 ` Richard Biener
2016-11-25 15:34                   ` Jeff Law
2016-11-25 15:55                   ` Segher Boessenkool
2016-11-28  8:59                     ` Bernd Schmidt
2016-11-28  9:05                     ` Bernd Schmidt
2016-11-28 18:50                 ` Jeff Law
2016-11-28 18:52                   ` Bernd Schmidt
2016-11-23 19:01 ` Bernd Schmidt
2016-11-23 21:46   ` Uros Bizjak
2016-11-23 19:03 ` [3/3] " Bernd Schmidt
2016-11-23 19:38   ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).