* [0/3] Fix PR71280, in ifcvt/rtlanal/i386.
@ 2016-11-23 18:58 Bernd Schmidt
2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-23 18:58 UTC (permalink / raw)
To: GCC Patches
This is a small series of patches to fix various problems in cost
calculations that together caused PR71280, a missed optimization
opportunity.
A summary of the problems:
1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to
be invalid. There seems to be no good reason that insn_rtx_cost
shouldn't use the latter. It also makes the numbers comparable to the
ones you get from seq_cost.
2. The i386 backend mishandles SET rtxs. If you have a fairly plain
single-insn SET, you tend to get COSTS_N_INSNS (2) out of set_rtx_cost,
because rtx_costs has a default of COSTS_N_INSNS (1) for a SET, and you
get the cost of the src in addition to that.
3. ifcvt computes the sum of costs for the involved blocks, but only
makes a before/after comparison when optimizing for size. When
optimizing for speed, it uses max_seq_cost, which is an estimate
computed from BRANCH_COST, which in turn can be zero for predictable
branches on x86.
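Item 3 can be illustrated with a minimal standalone sketch. It is a toy model, not the GCC source: the struct and the two function names are invented for illustration, and only the comparison logic follows the description above and the ifcvt change in patch 3/3.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the few noce_if_info fields that matter here.
   The struct name is hypothetical.  */
struct if_info_model
{
  bool speed_p;                /* optimizing for speed rather than size */
  unsigned int original_cost;  /* estimated cost of the branchy code */
  unsigned int max_seq_cost;   /* budget derived from BRANCH_COST */
};

/* Behaviour described in item 3: the before/after comparison is only
   made when optimizing for size.  */
static bool
profitable_old (const struct if_info_model *info, unsigned int new_cost)
{
  assert (info);
  if (!info->speed_p)
    return new_cost <= info->original_cost;
  return new_cost <= info->max_seq_cost;
}

/* Behaviour after the fix: a sequence no more expensive than the
   original is always accepted; max_seq_cost is only an extra
   allowance when optimizing for speed.  */
static bool
profitable_new (const struct if_info_model *info, unsigned int new_cost)
{
  assert (info);
  if (new_cost <= info->original_cost)
    return true;
  return info->speed_p && new_cost <= info->max_seq_cost;
}
```

With speed_p set and max_seq_cost equal to zero (the predictable-branch case), profitable_old rejects every replacement sequence of nonzero cost, even one cheaper than the original, while profitable_new accepts it.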
It seems a little risky to tweak costs this late in the process, but all
of these should be improvements so it would put us on a better footing
for fixing performance issues. I'll leave it to the reviewer to decide
whether we want this now or after gcc-7.
The series was bootstrapped and tested on x86_64-linux. There's the
following new guality fail:
-PASS: gcc.dg/guality/pr54693-2.c -Os line 21 x == 10 - i
-PASS: gcc.dg/guality/pr54693-2.c -Os line 21 y == 20 - 2 * i
+FAIL: gcc.dg/guality/pr54693-2.c -Os line 21 x == 10 - i
+FAIL: gcc.dg/guality/pr54693-2.c -Os line 21 y == 20 - 2 * i
which appears to be caused by loss of debuginfo in ivopts:
- # DEBUG x => (int) ((unsigned int) x_9(D) - (unsigned int) i_14)
- # DEBUG y => (int) ((unsigned int) y_10(D) - (unsigned int) i_14 * 2)
+ # DEBUG x => NULL
+ # DEBUG y => NULL
I'd claim this is out of scope for this patch series. So, ok?
Bernd
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt
@ 2016-11-23 19:00 ` Bernd Schmidt
2016-11-23 19:30 ` Jeff Law
2016-11-24 14:21 ` Segher Boessenkool
2016-11-23 19:01 ` Bernd Schmidt
2016-11-23 19:03 ` [3/3] " Bernd Schmidt
2 siblings, 2 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-23 19:00 UTC (permalink / raw)
To: GCC Patches
[-- Attachment #1: Type: text/plain, Size: 357 bytes --]
Note that I misspelled the PR number in the 0/3 message :-/
On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
> 1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to
> be invalid. There seems to be no good reason that insn_rtx_cost
> shouldn't use the latter. It also makes the numbers comparable to the
> ones you get from seq_cost.
Bernd
[-- Attachment #2: 71280-1.diff --]
[-- Type: text/x-patch, Size: 480 bytes --]
PR rtl-optimization/78120
* rtlanal.c (insn_rtx_cost): Use set_rtx_cost.
Index: gcc/rtlanal.c
===================================================================
--- gcc/rtlanal.c (revision 242038)
+++ gcc/rtlanal.c (working copy)
@@ -5211,7 +5211,7 @@ insn_rtx_cost (rtx pat, bool speed)
else
return 0;
- cost = set_src_cost (SET_SRC (set), GET_MODE (SET_DEST (set)), speed);
+ cost = set_rtx_cost (set, speed);
return cost > 0 ? cost : COSTS_N_INSNS (1);
}
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt
2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
@ 2016-11-23 19:01 ` Bernd Schmidt
2016-11-23 21:46 ` Uros Bizjak
2016-11-23 19:03 ` [3/3] " Bernd Schmidt
2 siblings, 1 reply; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-23 19:01 UTC (permalink / raw)
To: GCC Patches, Uros Bizjak
[-- Attachment #1: Type: text/plain, Size: 319 bytes --]
On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
> 2. The i386 backend mishandles SET rtxs. If you have a fairly plain
> single-insn SET, you tend to get COSTS_N_INSNS (2) out of set_rtx_cost,
> because rtx_costs has a default of COSTS_N_INSNS (1) for a SET, and you
> get the cost of the src in addition to that.
Bernd
[-- Attachment #2: 71280-2.diff --]
[-- Type: text/x-patch, Size: 1312 bytes --]
PR rtl-optimization/78120
* config/i386/i386.c (ix86_rtx_costs): Fully handle SETs.
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c (revision 242038)
+++ gcc/config/i386/i386.c (working copy)
@@ -39925,6 +39925,7 @@ ix86_rtx_costs (rtx x, machine_mode mode
enum rtx_code code = GET_CODE (x);
enum rtx_code outer_code = (enum rtx_code) outer_code_i;
const struct processor_costs *cost = speed ? ix86_cost : &ix86_size_cost;
+ int src_cost;
switch (code)
{
@@ -39935,7 +39936,23 @@ ix86_rtx_costs (rtx x, machine_mode mode
*total = ix86_set_reg_reg_cost (GET_MODE (SET_DEST (x)));
return true;
}
- return false;
+
+ if (register_operand (SET_SRC (x), VOIDmode))
+ /* Avoid potentially incorrect high cost from rtx_costs
+ for non-tieable SUBREGs. */
+ src_cost = 0;
+ else
+ {
+ src_cost = rtx_cost (SET_SRC (x), mode, SET, 1, speed);
+
+ if (CONSTANT_P (SET_SRC (x)))
+ /* Constant costs assume a base value of COSTS_N_INSNS (1) and add
+ a small value, possibly zero for cheap constants. */
+ src_cost += COSTS_N_INSNS (1);
+ }
+
+ *total = src_cost + rtx_cost (SET_DEST (x), mode, SET, 0, speed);
+ return true;
case CONST_INT:
case CONST:
* [3/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt
2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
2016-11-23 19:01 ` Bernd Schmidt
@ 2016-11-23 19:03 ` Bernd Schmidt
2016-11-23 19:38 ` Jeff Law
2 siblings, 1 reply; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-23 19:03 UTC (permalink / raw)
To: GCC Patches
[-- Attachment #1: Type: text/plain, Size: 502 bytes --]
On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
> 3. ifcvt computes the sum of costs for the involved blocks, but only
> makes a before/after comparison when optimizing for size. When
> optimizing for speed, it uses max_seq_cost, which is an estimate
> computed from BRANCH_COST, which in turn can be zero for predictable
> branches on x86.
This is the final patch and has the testcase. It also happens to be the
least risky of the series so it could be applied on its own (without the
test).
Bernd
[-- Attachment #2: 71280-3.diff --]
[-- Type: text/x-patch, Size: 3388 bytes --]
PR rtl-optimization/78120
* ifcvt.c (noce_conversion_profitable_p): Check original cost in all
cases, and additionally test against max_seq_cost for speed
optimization.
(noce_process_if_block): Compute an estimate for the original cost when
optimizing for speed, using the minimum of then and else block costs.
PR rtl-optimization/78120
* gcc.target/i386/pr78120.c: New test.
Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c (revision 242038)
+++ gcc/ifcvt.c (working copy)
@@ -812,8 +812,10 @@ struct noce_if_info
we're optimizing for size. */
bool speed_p;
- /* The combined cost of COND, JUMP and the costs for THEN_BB and
- ELSE_BB. */
+ /* An estimate of the original costs. When optimizing for size, this is the
+ combined cost of COND, JUMP and the costs for THEN_BB and ELSE_BB.
+ When optimizing for speed, we use the costs of COND plus the minimum of
+ the costs for THEN_BB and ELSE_BB, as computed in the next field. */
unsigned int original_cost;
/* Maximum permissible cost for the unconditional sequence we should
@@ -852,12 +857,12 @@ noce_conversion_profitable_p (rtx_insn *
/* Cost up the new sequence. */
unsigned int cost = seq_cost (seq, speed_p);
+ if (cost <= if_info->original_cost)
+ return true;
+
/* When compiling for size, we can make a reasonably accurately guess
- at the size growth. */
- if (!speed_p)
- return cost <= if_info->original_cost;
- else
- return cost <= if_info->max_seq_cost;
+ at the size growth. When compiling for speed, use the maximum. */
+ return speed_p && cost <= if_info->max_seq_cost;
}
/* Helper function for noce_try_store_flag*. */
@@ -3441,15 +3446,24 @@ noce_process_if_block (struct noce_if_in
}
}
- if (! bb_valid_for_noce_process_p (then_bb, cond, &if_info->original_cost,
+ bool speed_p = optimize_bb_for_speed_p (test_bb);
+ unsigned int then_cost = 0, else_cost = 0;
+ if (!bb_valid_for_noce_process_p (then_bb, cond, &then_cost,
&if_info->then_simple))
return false;
if (else_bb
- && ! bb_valid_for_noce_process_p (else_bb, cond, &if_info->original_cost,
- &if_info->else_simple))
+ && !bb_valid_for_noce_process_p (else_bb, cond, &else_cost,
+ &if_info->else_simple))
return false;
+ if (else_bb == NULL)
+ if_info->original_cost += then_cost;
+ else if (speed_p)
+ if_info->original_cost += MIN (then_cost, else_cost);
+ else
+ if_info->original_cost += then_cost + else_cost;
+
insn_a = last_active_insn (then_bb, FALSE);
set_a = single_set (insn_a);
gcc_assert (set_a);
Index: gcc/testsuite/gcc.target/i386/pr78120.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr78120.c (nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr78120.c (working copy)
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=generic" } */
+/* { dg-final { scan-assembler "adc" } } */
+/* { dg-final { scan-assembler-not "jmp" } } */
+
+typedef unsigned long u64;
+
+typedef struct {
+ u64 hi, lo;
+} u128;
+
+static inline u128 add_u128 (u128 a, u128 b)
+{
+ a.lo += b.lo;
+ if (a.lo < b.lo)
+ a.hi++;
+
+ return a;
+}
+
+extern u128 t1, t2, t3;
+
+void foo (void)
+{
+ t1 = add_u128 (t1, t2);
+ t1 = add_u128 (t1, t3);
+}
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
@ 2016-11-23 19:30 ` Jeff Law
2016-11-23 19:31 ` Bernd Schmidt
2016-11-24 14:21 ` Segher Boessenkool
1 sibling, 1 reply; 32+ messages in thread
From: Jeff Law @ 2016-11-23 19:30 UTC (permalink / raw)
To: Bernd Schmidt, GCC Patches
On 11/23/2016 12:00 PM, Bernd Schmidt wrote:
> Note that I misspelled the PR number in the 0/3 message :-/
>
> On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
>> 1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to
>> be invalid. There seems to be no good reason that insn_rtx_cost
>> shouldn't use the latter. It also makes the numbers comparable to the
>> ones you get from seq_cost.
>
>
> Bernd
>
> 71280-1.diff
>
>
> PR rtl-optimization/78120
> * rtlanal.c (insn_rtx_cost): Use set_rtx_cost.
LGTM. As a principle, if we have the full set, we ought to use
set_rtx_cost, and only use set_src_cost if we don't have the full set
expression.
Jeff
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-23 19:30 ` Jeff Law
@ 2016-11-23 19:31 ` Bernd Schmidt
0 siblings, 0 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-23 19:31 UTC (permalink / raw)
To: Jeff Law, GCC Patches
On 11/23/2016 08:30 PM, Jeff Law wrote:
> On 11/23/2016 12:00 PM, Bernd Schmidt wrote:
>> Note that I misspelled the PR number in the 0/3 message :-/
>>
>> On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
>>> 1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to
>>> be invalid. There seems to be no good reason that insn_rtx_cost
>>> shouldn't use the latter. It also makes the numbers comparable to the
>>> ones you get from seq_cost.
>>
>>
>> Bernd
>>
>> 71280-1.diff
>>
>>
>> PR rtl-optimization/78120
>> * rtlanal.c (insn_rtx_cost): Use set_rtx_cost.
> LGTM. As a principle, if we have the full set, we ought to use
> set_rtx_cost, and only use set_src_cost if we don't have the full set
> expression.
Note that this cannot really be applied on its own, it needs patch 2/3
so as to not make the x86 port do strange things. So I'll be holding off
on this one until we have a consensus on the whole set.
Bernd
* Re: [3/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-23 19:03 ` [3/3] " Bernd Schmidt
@ 2016-11-23 19:38 ` Jeff Law
0 siblings, 0 replies; 32+ messages in thread
From: Jeff Law @ 2016-11-23 19:38 UTC (permalink / raw)
To: Bernd Schmidt, GCC Patches
On 11/23/2016 12:02 PM, Bernd Schmidt wrote:
> On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
>> 3. ifcvt computes the sum of costs for the involved blocks, but only
>> makes a before/after comparison when optimizing for size. When
>> optimizing for speed, it uses max_seq_cost, which is an estimate
>> computed from BRANCH_COST, which in turn can be zero for predictable
>> branches on x86.
>
> This is the final patch and has the testcase. It also happens to be the
> least risky of the series so it could be applied on its own (without the
> test).
>
>
> Bernd
>
>
> 71280-3.diff
>
>
> PR rtl-optimization/78120
> * ifcvt.c (noce_conversion_profitable_p): Check original cost in all
> cases, and additionally test against max_seq_cost for speed
> optimization.
> (noce_process_if_block): Compute an estimate for the original cost when
> optimizing for speed, using the minimum of then and else block costs.
>
> PR rtl-optimization/78120
> * gcc.target/i386/pr78120.c: New test.
Also OK. Obviously Uros has the call on the x86 target change. Stage
the series in as you see fit given the dependencies.
jeff
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-23 19:01 ` Bernd Schmidt
@ 2016-11-23 21:46 ` Uros Bizjak
0 siblings, 0 replies; 32+ messages in thread
From: Uros Bizjak @ 2016-11-23 21:46 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: GCC Patches
On Wed, Nov 23, 2016 at 8:01 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
>>
>> 2. The i386 backend mishandles SET rtxs. If you have a fairly plain
>> single-insn SET, you tend to get COSTS_N_INSNS (2) out of set_rtx_cost,
>> because rtx_costs has a default of COSTS_N_INSNS (1) for a SET, and you
>> get the cost of the src in addition to that.
Looks good to me.
Thanks,
Uros.
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt
2016-11-23 19:30 ` Jeff Law
@ 2016-11-24 14:21 ` Segher Boessenkool
2016-11-24 14:26 ` Bernd Schmidt
1 sibling, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-24 14:21 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: GCC Patches
[ Only your 0/3 and 3/3 messages arrived -- or is this 1/3? ]
On Wed, Nov 23, 2016 at 08:00:30PM +0100, Bernd Schmidt wrote:
> Note that I misspelled the PR number in the 0/3 message :-/
>
> On 11/23/2016 07:57 PM, Bernd Schmidt wrote:
> >1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to
> >be invalid. There seems to be no good reason that insn_rtx_cost
> >shouldn't use the latter. It also makes the numbers comparable to the
> >ones you get from seq_cost.
>
>
> Bernd
> PR rtl-optimization/78120
> * rtlanal.c (insn_rtx_cost): Use set_rtx_cost.
>
> Index: gcc/rtlanal.c
> ===================================================================
> --- gcc/rtlanal.c (revision 242038)
> +++ gcc/rtlanal.c (working copy)
> @@ -5211,7 +5211,7 @@ insn_rtx_cost (rtx pat, bool speed)
> else
> return 0;
>
> - cost = set_src_cost (SET_SRC (set), GET_MODE (SET_DEST (set)), speed);
> + cost = set_rtx_cost (set, speed);
> return cost > 0 ? cost : COSTS_N_INSNS (1);
> }
>
Combine uses insn_rtx_cost extensively. I have tried to change it to use
the full rtx cost, not just the source cost, a few times before, and it
always only regressed. Part of it is that most ports' cost calculations
are, erm, not so great -- every target we care about needs fixes.
Let's please not try this in stage 3.
Segher
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 14:21 ` Segher Boessenkool
@ 2016-11-24 14:26 ` Bernd Schmidt
2016-11-24 14:36 ` Segher Boessenkool
0 siblings, 1 reply; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-24 14:26 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: GCC Patches
On 11/24/2016 03:21 PM, Segher Boessenkool wrote:
> Combine uses insn_rtx_cost extensively. I have tried to change it to use
> the full rtx cost, not just the source cost, a few times before, and it
> always only regressed. Part of it is that most ports' cost calculations
> are, erm, not so great -- every target we care about needs fixes.
>
> Let's please not try this in stage 3.
It got approved and committed. Do you want me to revert it now or wait
for obvious signs of fallout?
Bernd
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 14:26 ` Bernd Schmidt
@ 2016-11-24 14:36 ` Segher Boessenkool
2016-11-24 14:38 ` Bernd Schmidt
0 siblings, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-24 14:36 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: GCC Patches
On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote:
> On 11/24/2016 03:21 PM, Segher Boessenkool wrote:
>
> >Combine uses insn_rtx_cost extensively. I have tried to change it to use
> >the full rtx cost, not just the source cost, a few times before, and it
> >always only regressed. Part of it is that most ports' cost calculations
> >are, erm, not so great -- every target we care about needs fixes.
> >
> >Let's please not try this in stage 3.
>
> It got approved and committed. Do you want me to revert it now or wait
> for obvious signs of fallout?
In my opinion it is an early stage 1 thing, not something suitable for
stage 3. I can do some simple tests on various targets if you want.
Segher
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 14:36 ` Segher Boessenkool
@ 2016-11-24 14:38 ` Bernd Schmidt
2016-11-24 14:44 ` Eric Botcazou
2016-11-24 14:54 ` Segher Boessenkool
0 siblings, 2 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-24 14:38 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: GCC Patches
On 11/24/2016 03:36 PM, Segher Boessenkool wrote:
> On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote:
>> On 11/24/2016 03:21 PM, Segher Boessenkool wrote:
>>
>>> Combine uses insn_rtx_cost extensively. I have tried to change it to use
>>> the full rtx cost, not just the source cost, a few times before, and it
>>> always only regressed. Part of it is that most ports' cost calculations
>>> are, erm, not so great -- every target we care about needs fixes.
>>>
>>> Let's please not try this in stage 3.
>>
>> It got approved and committed. Do you want me to revert it now or wait
>> for obvious signs of fallout?
>
> In my opinion it is an early stage 1 thing, not something suitable for
> stage 3. I can do some simple tests on various targets if you want.
Sure.
I'll make the argument that stage 3 is when we fix stuff, including
performance regressions, and this patch is very clearly a fix. When we
have very obvious distortions like a case where costs from insn_rtx_cost
and seq_cost aren't comparable, it's impossible to arrive at sane solutions.
Bernd
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 14:38 ` Bernd Schmidt
@ 2016-11-24 14:44 ` Eric Botcazou
2016-11-24 14:54 ` Segher Boessenkool
1 sibling, 0 replies; 32+ messages in thread
From: Eric Botcazou @ 2016-11-24 14:44 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: gcc-patches, Segher Boessenkool
> I'll make the argument that stage 3 is when we fix stuff, including
> performance regressions, and this patch is very clearly a fix. When we
> have very obvious distortions like a case where costs from insn_rtx_cost
> and seq_cost aren't comparable, it's impossible to arrive at sane solutions.
It would help to make a pass over the main architecture back-ends and evaluate
the potential fallout and required adjustments, if any.
--
Eric Botcazou
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 14:38 ` Bernd Schmidt
2016-11-24 14:44 ` Eric Botcazou
@ 2016-11-24 14:54 ` Segher Boessenkool
2016-11-24 15:16 ` Richard Biener
` (2 more replies)
1 sibling, 3 replies; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-24 14:54 UTC (permalink / raw)
To: Bernd Schmidt; +Cc: GCC Patches
On Thu, Nov 24, 2016 at 03:38:55PM +0100, Bernd Schmidt wrote:
> On 11/24/2016 03:36 PM, Segher Boessenkool wrote:
> >On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote:
> >>On 11/24/2016 03:21 PM, Segher Boessenkool wrote:
> >>
> >>>Combine uses insn_rtx_cost extensively. I have tried to change it to use
> >>>the full rtx cost, not just the source cost, a few times before, and it
> >>>always only regressed. Part of it is that most ports' cost calculations
> >>>are, erm, not so great -- every target we care about needs fixes.
> >>>
> >>>Let's please not try this in stage 3.
> >>
> >>It got approved and committed. Do you want me to revert it now or wait
> >>for obvious signs of fallout?
> >
> >In my opinion it is an early stage 1 thing, not something suitable for
> >stage 3. I can do some simple tests on various targets if you want.
>
> Sure.
>
> I'll make the argument that stage 3 is when we fix stuff, including
> performance regressions, and this patch is very clearly a fix. When we
> have very obvious distortions like a case where costs from insn_rtx_cost
> and seq_cost aren't comparable, it's impossible to arrive at sane solutions.
Your own 2/3 shows my point: you needed fixes to the i386 port for it
to behave sanely after this 1/3; what makes you think other ports are
not in the same boat?
IMHO switching insn_rtx_cost to be based on not just set_src_cost is
a good idea, but will require re-tuning of all targets, so it is not
stage 3 material.
That we compare different kinds of costs (which really has no meaning at
all, it's a heuristic at best) in various places is a known problem, not
a regression.
Segher
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 14:54 ` Segher Boessenkool
@ 2016-11-24 15:16 ` Richard Biener
2016-11-24 15:46 ` Jeff Law
2016-11-24 15:34 ` Bernd Schmidt
2016-11-24 15:48 ` Jeff Law
2 siblings, 1 reply; 32+ messages in thread
From: Richard Biener @ 2016-11-24 15:16 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches
On Thu, Nov 24, 2016 at 3:53 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Thu, Nov 24, 2016 at 03:38:55PM +0100, Bernd Schmidt wrote:
>> On 11/24/2016 03:36 PM, Segher Boessenkool wrote:
>> >On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote:
>> >>On 11/24/2016 03:21 PM, Segher Boessenkool wrote:
>> >>
>> >>>Combine uses insn_rtx_cost extensively. I have tried to change it to use
>> >>>the full rtx cost, not just the source cost, a few times before, and it
>> >>>always only regressed. Part of it is that most ports' cost calculations
>> >>>are, erm, not so great -- every target we care about needs fixes.
>> >>>
>> >>>Let's please not try this in stage 3.
>> >>
>> >>It got approved and committed. Do you want me to revert it now or wait
>> >>for obvious signs of fallout?
>> >
>> >In my opinion it is an early stage 1 thing, not something suitable for
>> >stage 3. I can do some simple tests on various targets if you want.
>>
>> Sure.
>>
>> I'll make the argument that stage 3 is when we fix stuff, including
>> performance regressions, and this patch is very clearly a fix. When we
>> have very obvious distortions like a case where costs from insn_rtx_cost
>> and seq_cost aren't comparable, it's impossible to arrive at sane solutions.
>
> Your own 2/3 shows my point: you needed fixes to the i386 port for it
> to behave sanely after this 1/3; what makes you think other ports are
> not in the same boat?
>
> IMHO switching insn_rtx_cost to be based on not just set_src_cost is
> a good idea, but will require re-tuning of all targets, so it is not
> stage 3 material.
Agreed.
> That we compare different kinds of costs (which really has no meaning at
> all, it's a heuristic at best) in various places is a known problem, not
> a regression.
But technically stage 3 is for general bugfixing, not only regression fixing.
I'd say be prepared to revert but wait to see who screams first.
Thanks,
Richard.
>
> Segher
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 14:54 ` Segher Boessenkool
2016-11-24 15:16 ` Richard Biener
@ 2016-11-24 15:34 ` Bernd Schmidt
2016-11-24 15:48 ` Jeff Law
2 siblings, 0 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-24 15:34 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: GCC Patches
On 11/24/2016 03:53 PM, Segher Boessenkool wrote:
>
> Your own 2/3 shows my point: you needed fixes to the i386 port for it
> to behave sanely after this 1/3; what makes you think other ports are
> not in the same boat?
When I realized i386 was broken I had a look at aarch64 and it looked
sane, and that was the basis of the idea for patch 2/3.
It's possible other targets may need to handle SETs as well. In theory,
something like the block of code I added for i386 could just work when
copied to other backends if they have no "case SET" yet. If you do run
into problems please try at least that very simple fix.
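As a rough illustration of what such a "case SET" block does, here is a standalone toy model. The enum, struct and function names are hypothetical and none of this is GCC API; only COSTS_N_INSNS matches GCC's definition, and the default behaviour is modelled from the description in item 2 of the 0/3 message.

```c
#include <assert.h>

/* Same scaling GCC uses: one instruction costs 4 units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical classification of the source operand of a SET.  */
enum src_kind { SRC_REGISTER, SRC_CONSTANT, SRC_OTHER };

/* Hypothetical description of a SET: the costs the generic code would
   report for its two operands.  */
struct set_model
{
  enum src_kind kind;
  int src_cost;
  int dest_cost;
};

/* Default when a backend does not handle SET: COSTS_N_INSNS (1) for
   the SET itself, plus the operand costs on top.  */
static int
set_cost_default (const struct set_model *s)
{
  assert (s);
  return COSTS_N_INSNS (1) + s->src_cost + s->dest_cost;
}

/* Handling modelled on the block added for i386 in patch 2/3.  */
static int
set_cost_handled (const struct set_model *s)
{
  int src_cost;

  assert (s);
  if (s->kind == SRC_REGISTER)
    /* A register source contributes nothing.  */
    src_cost = 0;
  else
    {
      src_cost = s->src_cost;
      if (s->kind == SRC_CONSTANT)
        /* Constant costs are reported relative to one insn.  */
        src_cost += COSTS_N_INSNS (1);
    }
  return src_cost + s->dest_cost;
}
```

For a plain single-insn operation whose source costs COSTS_N_INSNS (1) and whose destination is free, set_cost_default yields COSTS_N_INSNS (2) while set_cost_handled yields COSTS_N_INSNS (1).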
> That we compare different kinds of costs (which really has no meaning at
> all, it's a heuristic at best) in various places is a known problem, not
> a regression.
It leads to observable regressions however as PR78120 shows, when code
like ifcvt tries to use cost calculations in apparently-sensible ways.
And, as Richard mentioned, we're not in a regressions-only phase.
Bernd
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 15:16 ` Richard Biener
@ 2016-11-24 15:46 ` Jeff Law
0 siblings, 0 replies; 32+ messages in thread
From: Jeff Law @ 2016-11-24 15:46 UTC (permalink / raw)
To: Richard Biener, Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches
On 11/24/2016 08:16 AM, Richard Biener wrote:
>>
>> IMHO switching insn_rtx_cost to be based on not just set_src_cost is
>> a good idea, but will require re-tuning of all targets, so it is not
>> stage 3 material.
>
> Agreed.
>
>> That we compare different kinds of costs (which really has no meaning at
>> all, it's a heuristic at best) in various places is a known problem, not
>> a regression.
>
> But technically stage 3 is for general bugfixing, not only regression fixing.
>
> I'd say be prepared to revert but wait to see who screams first.
Right. And I would claim that we're early enough in stage3 that
attempting to address this BZ is a good thing. The BZ also happens to
be a 6/7 regression.
So I'd say let's go with the patch, but be aware that there may be a
need to twiddle other ports. If we find a bunch of ports are
problematic, then we might need to think about reversion.
jeff
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 14:54 ` Segher Boessenkool
2016-11-24 15:16 ` Richard Biener
2016-11-24 15:34 ` Bernd Schmidt
@ 2016-11-24 15:48 ` Jeff Law
2016-11-24 16:14 ` Segher Boessenkool
2 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2016-11-24 15:48 UTC (permalink / raw)
To: Segher Boessenkool, Bernd Schmidt; +Cc: GCC Patches
On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
>
> That we compare different kinds of costs (which really has no meaning at
> all, it's a heuristic at best) in various places is a known problem, not
> a regression.
But the problems with the costing system exhibit themselves as a code
quality regression. In the end that's what the end-users see -- a
regression in the quality of the code GCC generates.
jeff
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 15:48 ` Jeff Law
@ 2016-11-24 16:14 ` Segher Boessenkool
2016-11-24 22:32 ` Segher Boessenkool
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-24 16:14 UTC (permalink / raw)
To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches
On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
> >
> >That we compare different kinds of costs (which really has no meaning at
> >all, it's a heuristic at best) in various places is a known problem, not
> >a regression.
> But the problems with the costing system exhibit themselves as a code
> quality regression. In the end that's what the end-users see -- a
> regression in the quality of the code GCC generates.
Yes, exactly -- and I fear this all-encompassing change will cause just
such a regression for many users. Tests are running, will know more
later today (or tomorrow).
The PR is about a very specific problem; the patch is not. The patch
is not a bug fix. If we allow anything that "makes things better" in
stage 3, what makes it different from stage 1 then?
Segher
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 16:14 ` Segher Boessenkool
@ 2016-11-24 22:32 ` Segher Boessenkool
2016-11-26 10:44 ` Jeff Law
2016-11-25 9:15 ` Richard Biener
2016-11-28 18:50 ` Jeff Law
2 siblings, 1 reply; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-24 22:32 UTC (permalink / raw)
To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches
On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote:
> On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
> > On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
> > >
> > >That we compare different kinds of costs (which really has no meaning at
> > >all, it's a heuristic at best) in various places is a known problem, not
> > >a regression.
> > But the problems with the costing system exhibit themselves as a code
> > quality regression. In the end that's what the end-users see -- a
> > regression in the quality of the code GCC generates.
>
> Yes, exactly -- and I fear this all-encompassing change will cause just
> such a regression for many users. Tests are running, will know more
> later today (or tomorrow).
>
> The PR is about a very specific problem; the patch is not. The patch
> is not a bug fix. If we allow anything that "makes things better" in
> stage 3, what makes it different from stage 1 then?
Here are results of testing with trunk right before the three patches,
compared with trunk with the three patches applied. This lists the sizes
of the vmlinux of a Linux kernel build for each arch.
                before     after
better:
blackfin       1973931   1973867
frv            3638192   3637792
h8300          1060172   1059976
i386           9742984   9742463
ia64          15402035  15396171
mips           4286748   4286692
mn10300        2360025   2358201
nios2          3185625   3176693
x86_64        10360418  10359588
worse:
alpha          5439003   5455979
c6x            2107939   2108931
cris           2189380   2193836
m32r           3427409   3427453
m68k           3228408   3230978
mips64         5564819   5565291
parisc         8278881   8289573
parisc64       7234619   7249139
powerpc        8438949   8440005
powerpc64     14499969  14508689
s390          12778748  12779220
shnommu        1369868   1371020
sparc64        5921556   5922172
tilegx        12297581  12307461
tilepro       11215603  11227339
xtensa         1776196   1779152
does not build:
arc                  0         0
arm                  0         0
arm64                0         0
microblaze           0         0
sh                   0         0
sparc                0         0
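
To put the size differences in proportion, the rows can be turned into absolute and relative deltas. The helper names below are made up for this sketch; the first number in each row is taken as the size before the patches and the second as the size after.

```c
#include <assert.h>

/* Signed change in vmlinux size in bytes; negative means smaller.  */
static long
size_delta (long before, long after)
{
  return after - before;
}

/* Relative change, in percent of the size before the patches.  */
static double
size_change_percent (long before, long after)
{
  /* The "does not build" rows have no baseline to compare against.  */
  assert (before > 0);
  return 100.0 * (double) size_delta (before, after) / (double) before;
}
```

For example, size_delta (5439003, 5455979) for alpha is 16976 bytes, a growth of roughly 0.3%, while nios2 shrinks by 8932 bytes.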
Segher
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 16:14 ` Segher Boessenkool
2016-11-24 22:32 ` Segher Boessenkool
@ 2016-11-25 9:15 ` Richard Biener
2016-11-25 15:34 ` Jeff Law
2016-11-25 15:55 ` Segher Boessenkool
2016-11-28 18:50 ` Jeff Law
2 siblings, 2 replies; 32+ messages in thread
From: Richard Biener @ 2016-11-25 9:15 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: Jeff Law, Bernd Schmidt, GCC Patches
On Thu, Nov 24, 2016 at 5:14 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
>> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
>> >
>> >That we compare different kinds of costs (which really has no meaning at
>> >all, it's a heuristic at best) in various places is a known problem, not
>> >a regression.
>> But the problems with the costing system exhibit themselves as a code
>> quality regression. In the end that's what the end-users see -- a
>> regression in the quality of the code GCC generates.
>
> Yes, exactly -- and I fear this all-encompassing change will cause just
> such a regression for many users. Tests are running, will know more
> later today (or tomorrow).
>
> The PR is about a very specific problem; the patch is not. The patch
> is not a bug fix. If we allow anything that "makes things better" in
> stage 3, what makes it different from stage 1 then?
That's a good question ;) The stage 3 definition has a loophole via
"go file a bug about feature X, then it's a bugfix!".
I'm all open for a more sensible definition, like constraining the kind
of non-regression fixes that we want to allow, but I fear the most
sensible option would be to simply ditch the notion of different
"stages" and make it "general development" and "regression fixing".
(though if you try hard enough and go back in time you'll find that
almost all non-enhancement bugs are regressions in some sense)
And yes, current stage3 still feels too much like stage1 ;)
Richard.
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-25 9:15 ` Richard Biener
@ 2016-11-25 15:34 ` Jeff Law
2016-11-25 15:55 ` Segher Boessenkool
1 sibling, 0 replies; 32+ messages in thread
From: Jeff Law @ 2016-11-25 15:34 UTC (permalink / raw)
To: Richard Biener, Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches
On 11/25/2016 02:15 AM, Richard Biener wrote:
>
> That's a good question ;) The stage 3 definition has a loophole via
> "go file a bug about feature X, then it's a bugfix!".
Right. That loophole has existed since we've moved to the current model
-- we extend a level of trust to our developers not to abuse the
loophole. I think that level of trust is warranted and hasn't been
significantly violated.
> I'm all open for a more sensible definition, like constraining the kind
> of non-regression fixes that we want to allow, but I fear the most
> sensible option would be to simply ditch the notion of different
> "stages" and make it "general development" and "regression fixing".
> (though if you try hard enough and go back in time you'll find that
> almost all non-enhancement bugs are regressions in some sense)
Similarly, I'm always open to improvements. My worry is that if we went to
a development/regression-bugfixing cycle, non-regression bugs would
largely be ignored.
General bugfixing is, IMHO, a good period -- it gets a larger portion of
our developers fixing bugs and gives folks with a heavy review load a
chance to flush out their queues of stuff that came in right at the end
of stage1.
I'm not really pushing to open a "development cycle" discussion right
now, but I do sense that our cycles could use some refinement.
>
> And yes, current stage3 still feels too much like stage1 ;)
Hasn't seemed that way to me, but obviously experiences will differ. My
biggest worry about this cycle is the higher than typical (compared to
the last few years) regression bug counts.
That worry is somewhat mitigated by the belief that we're marking
regressions much much more consistently this year, so we're a lot less
likely to get a big jump in marked regressions like we saw last year.
*If* that is the case (and my light poking around seems to indicate
that's true), then we're likely ahead of the gcc-6 cycle, but behind the
gcc-5 cycle.
jeff
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-25 9:15 ` Richard Biener
2016-11-25 15:34 ` Jeff Law
@ 2016-11-25 15:55 ` Segher Boessenkool
2016-11-28 8:59 ` Bernd Schmidt
2016-11-28 9:05 ` Bernd Schmidt
1 sibling, 2 replies; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-25 15:55 UTC (permalink / raw)
To: Richard Biener; +Cc: Jeff Law, Bernd Schmidt, GCC Patches
On Fri, Nov 25, 2016 at 10:15:25AM +0100, Richard Biener wrote:
> On Thu, Nov 24, 2016 at 5:14 PM, Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> > On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
> >> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
> >> >
> >> >That we compare different kinds of costs (which really has no meaning at
> >> >all, it's a heuristic at best) in various places is a known problem, not
> >> >a regression.
> >> But the problems with the costing system exhibit themselves as a code
> >> quality regression. In the end that's what the end-users see -- a
> >> regression in the quality of the code GCC generates.
> >
> > Yes, exactly -- and I fear this all-encompassing change will cause just
> > such a regression for many users. Tests are running, will know more
> > later today (or tomorrow).
> >
> > The PR is about a very specific problem; the patch is not. The patch
> > is not a bug fix. If we allow anything that "makes things better" in
> > stage 3, what makes it different from stage 1 then?
>
> That's a good question ;) The stage 3 definition has a loophole via
> "go file a bug about feature X, then it's a bugfix!".
>
> I'm all open for a more sensible definition, like constraining the kind
> of non-regression fixes that we want to allow, but I fear the most
> sensible option would be to simply ditch the notion of different
> "stages" and make it "general development" and "regression fixing".
> (though if you try hard enough and go back in time you'll find that
> almost all non-enhancement bugs are regressions in some sense)
The scale goes: early stage 1, anything goes; ...; until stage 4, only
very narrow regression fixes are allowed.
Let's try to keep that spirit, and not behave like politicians following
the "rules" (or not).
> And yes, current stage3 still feels too much like stage1 ;)
Yes, very much so. Well, at least trunk bootstraps on more targets now.
--
So IMNSHO this rtx costing change belongs in early stage 1, and should
be reverted. If ifcvt should use full rtx cost instead of rtx_src_cost,
fix *that*, that is a much more local change. And even then, test on
more targets please.
Segher
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 22:32 ` Segher Boessenkool
@ 2016-11-26 10:44 ` Jeff Law
2016-11-26 11:11 ` Eric Botcazou
2016-11-26 18:08 ` Segher Boessenkool
0 siblings, 2 replies; 32+ messages in thread
From: Jeff Law @ 2016-11-26 10:44 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches
On 11/24/2016 03:32 PM, Segher Boessenkool wrote:
> On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote:
>> On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
>>> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
>>>>
>>>> That we compare different kinds of costs (which really has no meaning at
>>>> all, it's a heuristic at best) in various places is a known problem, not
>>>> a regression.
>>> But the problems with the costing system exhibit themselves as a code
>>> quality regression. In the end that's what the end-users see -- a
>>> regression in the quality of the code GCC generates.
>>
>> Yes, exactly -- and I fear this all-encompassing change will cause just
>> such a regression for many users. Tests are running, will know more
>> later today (or tomorrow).
>>
>> The PR is about a very specific problem; the patch is not. The patch
>> is not a bug fix. If we allow anything that "makes things better" in
>> stage 3, what makes it different from stage 1 then?
>
> Here are results of testing with trunk right before the three patches,
> compared with the three patches. This lists the sizes of the vmlinux
> of a Linux kernel build for that arch.
Thanks. While I question how much emphasis we should put on code sizes
as a way to measure this change, it can still point out interesting
effects, positive and negative.
From my investigations on the m68k, the effects on the IL are minimal
with a slight bias towards better code (by suppressing if-conversions of
some now more costly blocks). *But* the size of the resulting code was
all over the place -- sometimes it was better, others worse. From
looking at the assembly we seemingly are copying blocks that aren't
strictly necessary.
Enter bb-reorder and the STC algorithm. It is copying blocks *very*
aggressively, like absurdly aggressively on the m68k. Of course it
doesn't help that the m68k doesn't define a length attribute and as a
result STC thinks every insn has size 0 and thus block copying is zero cost.
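A toy model of the effect described above, for illustration only: it is not GCC's bb-reorder code, and `block_size`, `may_copy_block`, and the `max_size` limit are hypothetical names. It assumes the copy decision compares the summed insn lengths of a block against a size limit, so a backend that reports length 0 for every insn makes every block look free to duplicate.

```python
def block_size(insn_lengths):
    """Sum the per-insn lengths (in bytes) reported by the backend."""
    return sum(insn_lengths)


def may_copy_block(insn_lengths, max_size):
    """Allow duplication only when the block's size fits within max_size."""
    return block_size(insn_lengths) <= max_size


# A 50-insn block on a target with a real length attribute (assumed 4 bytes/insn).
with_lengths = [4] * 50
# The same block on a target with no length attribute: every insn looks like 0 bytes.
without_lengths = [0] * 50

print(may_copy_block(with_lengths, max_size=8))     # False: 200 bytes exceeds the limit
print(may_copy_block(without_lengths, max_size=8))  # True: size 0 always fits
```

Under this assumption, no size limit can restrain copying on such a target, however large the block really is.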
I want to verify the #s, so take this with a slight grain of salt. The
net changes to newlib's .o's for Bernd's work -- +30 bytes. The effect
of the STC issue above -- +1115586 bytes. Or to put it another way,
Bernd's changes, +.0003% change. STC, +13.8%.
jeff
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-26 10:44 ` Jeff Law
@ 2016-11-26 11:11 ` Eric Botcazou
2016-11-26 16:15 ` Jeff Law
2016-11-26 18:08 ` Segher Boessenkool
1 sibling, 1 reply; 32+ messages in thread
From: Eric Botcazou @ 2016-11-26 11:11 UTC (permalink / raw)
To: Jeff Law; +Cc: gcc-patches, Segher Boessenkool, Bernd Schmidt
> From my investigations on the m68k, the effects on the IL are minimal
> with a slight bias towards better code (by suppressing if-conversions of
> some now more costly blocks). *But* the size of the resulting code was
> all over the place -- sometimes it was better, others worse. From
> looking at the assembly we seemingly are copying blocks that aren't
> strictly necessary.
I'm seeing essentially the same thing on SPARC, probably because of the ifcvt
change; the rtlanal change seems to be neutral for the architecture.
--
Eric Botcazou
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-26 11:11 ` Eric Botcazou
@ 2016-11-26 16:15 ` Jeff Law
2016-11-26 22:03 ` Segher Boessenkool
0 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2016-11-26 16:15 UTC (permalink / raw)
To: Eric Botcazou; +Cc: gcc-patches, Segher Boessenkool, Bernd Schmidt
On 11/26/2016 04:11 AM, Eric Botcazou wrote:
>> From my investigations on the m68k, the effects on the IL are minimal
>> with a slight bias towards better code (by suppressing if-conversions of
>> some now more costly blocks). *But* the size of the resulting code was
>> all over the place -- sometimes it was better, others worse. From
>> looking at the assembly we seemingly are copying blocks that aren't
>> strictly necessary.
>
> I'm seeing essentially the same thing on SPARC, probably because of the ifcvt
> change; the rtlanal change seems to be neutral for the architecture.
Just to be clear, I was only testing the rtlanal change, not the ifcvt
change.
I repeated my test on the GCC runtime libraries for m68k-elf. Bernd's
rtlanal change +.03%, the goof in STC, +9.4%. So the STC goof still
dwarfs the impact of Bernd's change, but not as badly as I saw in the
newlib codebase.
Jeff
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-26 10:44 ` Jeff Law
2016-11-26 11:11 ` Eric Botcazou
@ 2016-11-26 18:08 ` Segher Boessenkool
1 sibling, 0 replies; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-26 18:08 UTC (permalink / raw)
To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches
On Sat, Nov 26, 2016 at 03:44:22AM -0700, Jeff Law wrote:
> On 11/24/2016 03:32 PM, Segher Boessenkool wrote:
> >On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote:
> >>On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
> >>>On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
> >>>>
> >>>>That we compare different kinds of costs (which really has no meaning at
> >>>>all, it's a heuristic at best) in various places is a known problem, not
> >>>>a regression.
> >>>But the problems with the costing system exhibit themselves as a code
> >>>quality regression. In the end that's what the end-users see -- a
> >>>regression in the quality of the code GCC generates.
> >>
> >>Yes, exactly -- and I fear this all-encompassing change will cause just
> >>such a regression for many users. Tests are running, will know more
> >>later today (or tomorrow).
> >>
> >>The PR is about a very specific problem; the patch is not. The patch
> >>is not a bug fix. If we allow anything that "makes things better" in
> >>stage 3, what makes it different from stage 1 then?
> >
> >Here are results of testing with trunk right before the three patches,
> >compared with the three patches. This lists the sizes of the vmlinux
> >of a Linux kernel build for that arch.
> Thanks. While I question how much emphasis we should put on code sizes
> as a way to measure this change, it can still point out interesting
> effects, positive and negative.
Code size I can test "easily" for many archs (it still takes almost a
full day), and it does correlate well with local optimisations on most
archs. I have looked at the actual differences on some archs (which
takes a lot more time still), and the differences are all over the place.
Which suggests changing the costs is a big change for most of those
archs; and they all have been tuned for the *old* situation, so this
makes things worse in the short run, whether the new costs are better
or not.
Not a change for stage 3, and not something *I* should need to analyse
anyway; this analysis needs to be done *before* the patch goes in.
> From my investigations on the m68k, the effects on the IL are minimal
> with a slight bias towards better code (by suppressing if-conversions of
> some now more costly blocks). *But* the size of the resulting code was
> all over the place -- sometimes it was better, others worse. From
> looking at the assembly we seemingly are copying blocks that aren't
> strictly necessary.
>
> Enter bb-reorder and the STC algorithm. It is copying blocks *very*
> aggressively, like absurdly aggressively on the m68k. Of course it
> doesn't help that the m68k doesn't define a length attribute and as a
> result STC thinks every insn has size 0 and thus block copying is zero cost.
>
> I want to verify the #s, so take this with a slight grain of salt. The
> net changes to newlib's .o's for Bernd's work -- +30 bytes. The effect
> of the STC issue above -- +1115586 bytes. Or to put it another way,
> Bernd's changes, +.0003% change. STC, +13.8%.
STC wasn't changed in the patch. Maybe interactions with STC are what
cause all the problems, but that is an argument *against* doing this
after stage 1.
Segher
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-26 16:15 ` Jeff Law
@ 2016-11-26 22:03 ` Segher Boessenkool
0 siblings, 0 replies; 32+ messages in thread
From: Segher Boessenkool @ 2016-11-26 22:03 UTC (permalink / raw)
To: Jeff Law; +Cc: Eric Botcazou, gcc-patches, Bernd Schmidt
On Sat, Nov 26, 2016 at 09:15:48AM -0700, Jeff Law wrote:
> On 11/26/2016 04:11 AM, Eric Botcazou wrote:
> >> From my investigations on the m68k, the effects on the IL are minimal
> >>with a slight bias towards better code (by suppressing if-conversions of
> >>some now more costly blocks). *But* the size of the resulting code was
> >>all over the place -- sometimes it was better, others worse. From
> >>looking at the assembly we seemingly are copying blocks that aren't
> >>strictly necessary.
> >
> >I'm seeing essentially the same thing on SPARC, probably because of the
> >ifcvt
> >change; the rtlanal change seems to be neutral for the architecture.
> Just to be clear, I was only testing the rtlanal change, not the ifcvt
> change.
>
> I repeated my test on the GCC runtime libraries for m68k-elf. Bernd's
> rtlanal change +.03%, the goof in STC, +9.4%. So the STC goof still
> dwarfs the impact of Bernd's change, but not as badly as I saw in the
> newlib codebase.
orig, i386+rtlanal, i386+rtlanal+ifcvt:
worse:
alpha        5439003    5455979    5455979
c6x          2107939    2108931    2108931
cris         2189380    2193836    2193836
m32r         3427409    3427541    3427453
m68k         3228408    3230978    3230978
mips         4286748    4286964    4286692
mips64       5564819    5565643    5565291
parisc       8278881    8289977    8289573
parisc64     7234619    7249187    7249139
powerpc      8438949    8440005    8440005
powerpc64   14499969   14508689   14508689
s390        12778748   12779228   12779220
shnommu      1369868    1371020    1371020
sparc64      5921556    5922172    5922172
tilegx      12297581   12307461   12307461
tilepro     11215603   11227339   11227339
xtensa       1776196    1779152    1779152
better:
blackfin     1973931    1973867    1973867
frv          3638192    3637792    3637792
h8300        1060172    1059976    1059976
i386         9742984    9742463    9742463
ia64        15402035   15396171   15396171
mn10300      2360025    2358201    2358201
nios2        3185625    3176693    3176693
x86_64      10360418   10359588   10359588
did not build:
arc                0          0          0
arm                0          0          0
arm64              0          0          0
microblaze         0          0          0
sh                 0          0          0
sparc              0          0          0
tl;dr: The ifcvt change doesn't do much, but the cost change does.
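To make the per-step effect easier to read off, here is a small sketch that turns a few rows of the table above into relative size changes. The byte counts are copied from the table; the helper names `pct_change` and `summarize` are made up for this illustration.

```python
# vmlinux sizes in bytes, copied from the table above:
# (orig, i386+rtlanal, i386+rtlanal+ifcvt)
SIZES = {
    "alpha": (5439003, 5455979, 5455979),
    "m68k": (3228408, 3230978, 3230978),
    "mips": (4286748, 4286964, 4286692),
    "nios2": (3185625, 3176693, 3176693),
    "x86_64": (10360418, 10359588, 10359588),
}


def pct_change(before, after):
    """Relative size change from before to after, in percent."""
    return 100.0 * (after - before) / before


def summarize(sizes):
    """Map arch -> (% change from the cost patches, % change from adding ifcvt)."""
    rows = {}
    for arch, (orig, rtlanal, full) in sizes.items():
        rows[arch] = (pct_change(orig, rtlanal), pct_change(rtlanal, full))
    return rows


if __name__ == "__main__":
    for arch, (cost_pct, ifcvt_pct) in summarize(SIZES).items():
        print(f"{arch:10s} cost {cost_pct:+.4f}%  ifcvt {ifcvt_pct:+.4f}%")
```

For most of these rows the second figure is exactly zero, since the last two columns are identical.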
Segher
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-25 15:55 ` Segher Boessenkool
@ 2016-11-28 8:59 ` Bernd Schmidt
2016-11-28 9:05 ` Bernd Schmidt
1 sibling, 0 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-28 8:59 UTC (permalink / raw)
To: Segher Boessenkool, Richard Biener; +Cc: Jeff Law, GCC Patches
On 11/25/2016 04:55 PM, Segher Boessenkool wrote:
> So IMNSHO this rtx costing change belongs in early stage 1, and should
> be reverted.
Done.
Bernd
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-25 15:55 ` Segher Boessenkool
2016-11-28 8:59 ` Bernd Schmidt
@ 2016-11-28 9:05 ` Bernd Schmidt
1 sibling, 0 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-28 9:05 UTC (permalink / raw)
To: Segher Boessenkool, Richard Biener; +Cc: Jeff Law, Bernd Schmidt, GCC Patches
On 11/25/2016 04:55 PM, Segher Boessenkool wrote:
> So IMNSHO this rtx costing change belongs in early stage 1, and should
> be reverted.
Done.
Bernd
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-24 16:14 ` Segher Boessenkool
2016-11-24 22:32 ` Segher Boessenkool
2016-11-25 9:15 ` Richard Biener
@ 2016-11-28 18:50 ` Jeff Law
2016-11-28 18:52 ` Bernd Schmidt
2 siblings, 1 reply; 32+ messages in thread
From: Jeff Law @ 2016-11-28 18:50 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches
On 11/24/2016 09:14 AM, Segher Boessenkool wrote:
> On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote:
>> On 11/24/2016 07:53 AM, Segher Boessenkool wrote:
>>>
>>> That we compare different kinds of costs (which really has no meaning at
>>> all, it's a heuristic at best) in various places is a known problem, not
>>> a regression.
>> But the problems with the costing system exhibit themselves as a code
>> quality regression. In the end that's what the end-users see -- a
>> regression in the quality of the code GCC generates.
>
> Yes, exactly -- and I fear this all-encompassing change will cause just
> such a regression for many users. Tests are running, will know more
> later today (or tomorrow).
>
> The PR is about a very specific problem; the patch is not. The patch
> is not a bug fix. If we allow anything that "makes things better" in
> stage 3, what makes it different from stage 1 then?
So how would you suggest this be fixed right now? I'd really like to
get the regression addressed.
I would claim that Bernd's patch is right from a design and
implementation standpoint -- the issues are fallout from backend issues
and none looked terrible to me.
jeff
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386.
2016-11-28 18:50 ` Jeff Law
@ 2016-11-28 18:52 ` Bernd Schmidt
0 siblings, 0 replies; 32+ messages in thread
From: Bernd Schmidt @ 2016-11-28 18:52 UTC (permalink / raw)
To: Jeff Law, Segher Boessenkool; +Cc: GCC Patches
On 11/28/2016 07:50 PM, Jeff Law wrote:
> So how would you suggest this be fixed right now? I'd really like to
> get the regression addressed.
The regression is still fixed. That wasn't the case at all stages while
I was working on it, but the i386 patch seems to suffice now.
> I would claim that Bernd's patch is right from a design and
> implementation standpoint -- the issues are fallout from backend issues
> and none looked terrible to me.
Agreed, but I reverted it anyway.
Bernd