* [0/3] Fix PR71280, in ifcvt/rtlanal/i386. @ 2016-11-23 18:58 Bernd Schmidt 2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt ` (2 more replies) 0 siblings, 3 replies; 32+ messages in thread From: Bernd Schmidt @ 2016-11-23 18:58 UTC (permalink / raw) To: GCC Patches This is a small series of patches to fix various problems in cost calculations that together caused PR71280, a missed optimization opportunity. A summary of the problems: 1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to be invalid. There seems to be no good reason that insn_rtx_cost shouldn't use the latter. It also makes the numbers comparable to the ones you get from seq_cost. 2. The i386 backend mishandles SET rtxs. If you have a fairly plain single-insn SET, you tend to get COSTS_N_INSNS (2) out of set_rtx_cost, because rtx_costs has a default of COSTS_N_INSNS (1) for a SET, and you get the cost of the src in addition to that. 3. ifcvt computes the sum of costs for the involved blocks, but only makes a before/after comparison when optimizing for size. When optimizing for speed, it uses max_seq_cost, which is an estimate computed from BRANCH_COST, which in turn can be zero for predictable branches on x86. It seems a little risky to tweak costs this late in the process, but all of these should be improvements so it would put us on a better footing for fixing performance issues. I'll leave it to the reviewer to decide whether we want this now or after gcc-7. The series was bootstrapped and tested on x86_64-linux. There's the following new guality fail: -PASS: gcc.dg/guality/pr54693-2.c -Os line 21 x == 10 - i -PASS: gcc.dg/guality/pr54693-2.c -Os line 21 y == 20 - 2 * i +FAIL: gcc.dg/guality/pr54693-2.c -Os line 21 x == 10 - i +FAIL: gcc.dg/guality/pr54693-2.c -Os line 21 y == 20 - 2 * i which appears to be caused by loss of debuginfo in ivopts: - # DEBUG x => (int) ((unsigned int) x_9(D) - (unsigned int) i_14) - # DEBUG y => (int) ((unsigned int) y_10(D) - (unsigned int) i_14 * 2) + # DEBUG x => NULL + # DEBUG y => NULL I'd claim this is out of scope for this patch series. So, ok? Bernd ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt @ 2016-11-23 19:00 ` Bernd Schmidt 2016-11-23 19:30 ` Jeff Law 2016-11-24 14:21 ` Segher Boessenkool 2016-11-23 19:01 ` Bernd Schmidt 2016-11-23 19:03 ` [3/3] " Bernd Schmidt 2 siblings, 2 replies; 32+ messages in thread From: Bernd Schmidt @ 2016-11-23 19:00 UTC (permalink / raw) To: GCC Patches [-- Attachment #1: Type: text/plain, Size: 357 bytes --] Note that I misspelled the PR number in the 0/3 message :-/ On 11/23/2016 07:57 PM, Bernd Schmidt wrote: > 1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to > be invalid. There seems to be no good reason that insn_rtx_cost > shouldn't use the latter. It also makes the numbers comparable to the > ones you get from seq_cost. Bernd [-- Attachment #2: 71280-1.diff --] [-- Type: text/x-patch, Size: 480 bytes --] PR rtl-optimization/78120 * rtlanal.c (insn_rtx_cost): Use set_rtx_cost. Index: gcc/rtlanal.c =================================================================== --- gcc/rtlanal.c (revision 242038) +++ gcc/rtlanal.c (working copy) @@ -5211,7 +5211,7 @@ insn_rtx_cost (rtx pat, bool speed) else return 0; - cost = set_src_cost (SET_SRC (set), GET_MODE (SET_DEST (set)), speed); + cost = set_rtx_cost (set, speed); return cost > 0 ? cost : COSTS_N_INSNS (1); } ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt @ 2016-11-23 19:30 ` Jeff Law 2016-11-23 19:31 ` Bernd Schmidt 2016-11-24 14:21 ` Segher Boessenkool 1 sibling, 1 reply; 32+ messages in thread From: Jeff Law @ 2016-11-23 19:30 UTC (permalink / raw) To: Bernd Schmidt, GCC Patches On 11/23/2016 12:00 PM, Bernd Schmidt wrote: > Note that I misspelled the PR number in the 0/3 message :-/ > > On 11/23/2016 07:57 PM, Bernd Schmidt wrote: >> 1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to >> be invalid. There seems to be no good reason that insn_rtx_cost >> shouldn't use the latter. It also makes the numbers comparable to the >> ones you get from seq_cost. > > > Bernd > > 71280-1.diff > > > PR rtl-optimization/78120 > * rtlanal.c (insn_rtx_cost): Use set_rtx_cost. LGTM. As a principle, if we have the full set, we ought to use set_rtx_cost, and only use set_src_cost if we don't have the full set expression. Jeff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-23 19:30 ` Jeff Law @ 2016-11-23 19:31 ` Bernd Schmidt 0 siblings, 0 replies; 32+ messages in thread From: Bernd Schmidt @ 2016-11-23 19:31 UTC (permalink / raw) To: Jeff Law, GCC Patches On 11/23/2016 08:30 PM, Jeff Law wrote: > On 11/23/2016 12:00 PM, Bernd Schmidt wrote: >> Note that I misspelled the PR number in the 0/3 message :-/ >> >> On 11/23/2016 07:57 PM, Bernd Schmidt wrote: >>> 1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to >>> be invalid. There seems to be no good reason that insn_rtx_cost >>> shouldn't use the latter. It also makes the numbers comparable to the >>> ones you get from seq_cost. >> >> >> Bernd >> >> 71280-1.diff >> >> >> PR rtl-optimization/78120 >> * rtlanal.c (insn_rtx_cost): Use set_rtx_cost. > LGTM. As a principle, if we have the full set, we ought to use > set_rtx_cost, and only use set_src_cost if we don't have the full set > expression. Note that this cannot really be applied on its own, it needs patch 2/3 so as to not make the x86 port do strange things. So I'll be holding off on this one until we have a consensus on the whole set. Bernd ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt 2016-11-23 19:30 ` Jeff Law @ 2016-11-24 14:21 ` Segher Boessenkool 2016-11-24 14:26 ` Bernd Schmidt 1 sibling, 1 reply; 32+ messages in thread From: Segher Boessenkool @ 2016-11-24 14:21 UTC (permalink / raw) To: Bernd Schmidt; +Cc: GCC Patches [ Only your 0/3 and 3/3 messages arrived -- or is this 1/3? ] On Wed, Nov 23, 2016 at 08:00:30PM +0100, Bernd Schmidt wrote: > Note that I misspelled the PR number in the 0/3 message :-/ > > On 11/23/2016 07:57 PM, Bernd Schmidt wrote: > >1. I noticed comparisons between set_src_cost and set_rtx_cost seemed to > >be invalid. There seems to be no good reason that insn_rtx_cost > >shouldn't use the latter. It also makes the numbers comparable to the > >ones you get from seq_cost. > > > Bernd > PR rtl-optimization/78120 > * rtlanal.c (insn_rtx_cost): Use set_rtx_cost. > > Index: gcc/rtlanal.c > =================================================================== > --- gcc/rtlanal.c (revision 242038) > +++ gcc/rtlanal.c (working copy) > @@ -5211,7 +5211,7 @@ insn_rtx_cost (rtx pat, bool speed) > else > return 0; > > - cost = set_src_cost (SET_SRC (set), GET_MODE (SET_DEST (set)), speed); > + cost = set_rtx_cost (set, speed); > return cost > 0 ? cost : COSTS_N_INSNS (1); > } > Combine uses insn_rtx_cost extensively. I have tried to change it to use the full rtx cost, not just the source cost, a few times before, and it always only regressed. Part of it is that most ports' cost calculations are, erm, not so great -- every target we care about needs fixes. Let's please not try this in stage 3. Segher ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 14:21 ` Segher Boessenkool @ 2016-11-24 14:26 ` Bernd Schmidt 2016-11-24 14:36 ` Segher Boessenkool 0 siblings, 1 reply; 32+ messages in thread From: Bernd Schmidt @ 2016-11-24 14:26 UTC (permalink / raw) To: Segher Boessenkool; +Cc: GCC Patches On 11/24/2016 03:21 PM, Segher Boessenkool wrote: > Combine uses insn_rtx_cost extensively. I have tried to change it to use > the full rtx cost, not just the source cost, a few times before, and it > always only regressed. Part of it is that most ports' cost calculations > are, erm, not so great -- every target we care about needs fixes. > > Let's please not try this in stage 3. It got approved and committed. Do you want me to revert it now or wait for obvious signs of fallout? Bernd ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 14:26 ` Bernd Schmidt @ 2016-11-24 14:36 ` Segher Boessenkool 2016-11-24 14:38 ` Bernd Schmidt 0 siblings, 1 reply; 32+ messages in thread From: Segher Boessenkool @ 2016-11-24 14:36 UTC (permalink / raw) To: Bernd Schmidt; +Cc: GCC Patches On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote: > On 11/24/2016 03:21 PM, Segher Boessenkool wrote: > > >Combine uses insn_rtx_cost extensively. I have tried to change it to use > >the full rtx cost, not just the source cost, a few times before, and it > >always only regressed. Part of it is that most ports' cost calculations > >are, erm, not so great -- every target we care about needs fixes. > > > >Let's please not try this in stage 3. > > It got approved and committed. Do you want me to revert it now or wait > for obvious signs of fallout? In my opinion it is an early stage 1 thing, not something suitable for stage 3. I can do some simple tests on various targets if you want. Segher ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 14:36 ` Segher Boessenkool @ 2016-11-24 14:38 ` Bernd Schmidt 2016-11-24 14:44 ` Eric Botcazou 2016-11-24 14:54 ` Segher Boessenkool 0 siblings, 2 replies; 32+ messages in thread From: Bernd Schmidt @ 2016-11-24 14:38 UTC (permalink / raw) To: Segher Boessenkool; +Cc: GCC Patches On 11/24/2016 03:36 PM, Segher Boessenkool wrote: > On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote: >> On 11/24/2016 03:21 PM, Segher Boessenkool wrote: >> >>> Combine uses insn_rtx_cost extensively. I have tried to change it to use >>> the full rtx cost, not just the source cost, a few times before, and it >>> always only regressed. Part of it is that most ports' cost calculations >>> are, erm, not so great -- every target we care about needs fixes. >>> >>> Let's please not try this in stage 3. >> >> It got approved and committed. Do you want me to revert it now or wait >> for obvious signs of fallout? > > In my opinion it is an early stage 1 thing, not something suitable for > stage 3. I can do some simple tests on various targets if you want. Sure. I'll make the argument that stage 3 is when we fix stuff, including performance regressions, and this patch is very clearly a fix. When we have very obvious distortions like a case where costs from insn_rtx_cost and seq_cost aren't comparable, it's impossible to arrive at sane solutions. Bernd ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 14:38 ` Bernd Schmidt @ 2016-11-24 14:44 ` Eric Botcazou 2016-11-24 14:54 ` Segher Boessenkool 1 sibling, 0 replies; 32+ messages in thread From: Eric Botcazou @ 2016-11-24 14:44 UTC (permalink / raw) To: Bernd Schmidt; +Cc: gcc-patches, Segher Boessenkool > I'll make the argument that stage 3 is when we fix stuff, including > performance regressions, and this patch is very clearly a fix. When we > have very obvious distortions like a case where costs from insn_rtx_cost > and seq_cost aren't comparable, it's impossible to arrive at sane solutions. It would help to make a pass over the main architecture back-ends and evaluate the potential fallout and required adjustments, if any. -- Eric Botcazou ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 14:38 ` Bernd Schmidt 2016-11-24 14:44 ` Eric Botcazou @ 2016-11-24 14:54 ` Segher Boessenkool 2016-11-24 15:16 ` Richard Biener ` (2 more replies) 1 sibling, 3 replies; 32+ messages in thread From: Segher Boessenkool @ 2016-11-24 14:54 UTC (permalink / raw) To: Bernd Schmidt; +Cc: GCC Patches On Thu, Nov 24, 2016 at 03:38:55PM +0100, Bernd Schmidt wrote: > On 11/24/2016 03:36 PM, Segher Boessenkool wrote: > >On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote: > >>On 11/24/2016 03:21 PM, Segher Boessenkool wrote: > >> > >>>Combine uses insn_rtx_cost extensively. I have tried to change it to use > >>>the full rtx cost, not just the source cost, a few times before, and it > >>>always only regressed. Part of it is that most ports' cost calculations > >>>are, erm, not so great -- every target we care about needs fixes. > >>> > >>>Let's please not try this in stage 3. > >> > >>It got approved and committed. Do you want me to revert it now or wait > >>for obvious signs of fallout? > > > >In my opinion it is an early stage 1 thing, not something suitable for > >stage 3. I can do some simple tests on various targets if you want. > > Sure. > > I'll make the argument that stage 3 is when we fix stuff, including > performance regressions, and this patch is very clearly a fix. When we > have very obvious distortions like a case where costs from insn_rtx_cost > and seq_cost aren't comparable, it's impossible to arrive at sane solutions. Your own 2/3 shows my point: you needed fixes to the i386 port for it to behave sanely after this 1/3; what makes you think other ports are not in the same boat? IMHO switching insn_rtx_cost to be based on not just set_src_cost is a good idea, but will require re-tuning of all targets, so it is not stage 3 material. That we compare different kinds of costs (which really has no meaning at all, it's a heuristic at best) in various places is a known problem, not a regression. Segher ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 14:54 ` Segher Boessenkool @ 2016-11-24 15:16 ` Richard Biener 2016-11-24 15:46 ` Jeff Law 2016-11-24 15:34 ` Bernd Schmidt 2016-11-24 15:48 ` Jeff Law 2 siblings, 1 reply; 32+ messages in thread From: Richard Biener @ 2016-11-24 15:16 UTC (permalink / raw) To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches On Thu, Nov 24, 2016 at 3:53 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote: > On Thu, Nov 24, 2016 at 03:38:55PM +0100, Bernd Schmidt wrote: >> On 11/24/2016 03:36 PM, Segher Boessenkool wrote: >> >On Thu, Nov 24, 2016 at 03:26:45PM +0100, Bernd Schmidt wrote: >> >>On 11/24/2016 03:21 PM, Segher Boessenkool wrote: >> >> >> >>>Combine uses insn_rtx_cost extensively. I have tried to change it to use >> >>>the full rtx cost, not just the source cost, a few times before, and it >> >>>always only regressed. Part of it is that most ports' cost calculations >> >>>are, erm, not so great -- every target we care about needs fixes. >> >>> >> >>>Let's please not try this in stage 3. >> >> >> >>It got approved and committed. Do you want me to revert it now or wait >> >>for obvious signs of fallout? >> > >> >In my opinion it is an early stage 1 thing, not something suitable for >> >stage 3. I can do some simple tests on various targets if you want. >> >> Sure. >> >> I'll make the argument that stage 3 is when we fix stuff, including >> performance regressions, and this patch is very clearly a fix. When we >> have very obvious distortions like a case where costs from insn_rtx_cost >> and seq_cost aren't comparable, it's impossible to arrive at sane solutions. > > Your own 2/3 shows my point: you needed fixes to the i386 port for it > to behave sanely after this 1/3; what makes you think other ports are > not in the same boat? > > IMHO switching insn_rtx_cost to be based on not just set_src_cost is > a good idea, but will require re-tuning of all targets, so it is not > stage 3 material. Agreed. > That we compare different kinds of costs (which really has no meaning at > all, it's a heuristic at best) in various places is a known problem, not > a regression. But technically stage 3 is for general bugfixing, not only regression fixing. I'd say be prepared to revert but wait to see who screams first. Thanks, Richard. > > Segher ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 15:16 ` Richard Biener @ 2016-11-24 15:46 ` Jeff Law 0 siblings, 0 replies; 32+ messages in thread From: Jeff Law @ 2016-11-24 15:46 UTC (permalink / raw) To: Richard Biener, Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches On 11/24/2016 08:16 AM, Richard Biener wrote: >> >> IMHO switching insn_rtx_cost to be based on not just set_src_cost is >> a good idea, but will require re-tuning of all targets, so it is not >> stage 3 material. > > Agreed. > >> That we compare different kinds of costs (which really has no meaning at >> all, it's a heuristic at best) in various places is a known problem, not >> a regression. > > But technically stage 3 is for general bugfixing, not only regression fixing. > > I'd say be prepared to revert but wait to see who screams first. Right. And I would claim that we're early enough in stage3 that attempting to address this BZ is a good thing. The BZ also happens to be a 6/7 regression. So I'd say let's go with the patch, but be aware that there may be a need to twiddle other ports. If we find a bunch of ports are problematical than we might need to think about reversion. jeff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 14:54 ` Segher Boessenkool 2016-11-24 15:16 ` Richard Biener @ 2016-11-24 15:34 ` Bernd Schmidt 2016-11-24 15:48 ` Jeff Law 2 siblings, 0 replies; 32+ messages in thread From: Bernd Schmidt @ 2016-11-24 15:34 UTC (permalink / raw) To: Segher Boessenkool; +Cc: GCC Patches On 11/24/2016 03:53 PM, Segher Boessenkool wrote: > > Your own 2/3 shows my point: you needed fixes to the i386 port for it > to behave sanely after this 1/3; what makes you think other ports are > not in the same boat? When I realized i386 was broken I had a look at aarch64 and it looked sane, and that was the basis of the idea for patch 2/3. It's possible other targets may need to handle SETs as well. In theory, something like the block of code I added for i386 could just work when copied to other backends if they have no "case SET" yet. If you do run into problems please try at least that very simple fix. > That we compare different kinds of costs (which really has no meaning at > all, it's a heuristic at best) in various places is a known problem, not > a regression. It leads to observable regressions however as PR78120 shows, when code like ifcvt tries to use cost calculations in apparently-sensible ways. And, as Richard mentioned, we're not in a regressions-only phase. Bernd ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 14:54 ` Segher Boessenkool 2016-11-24 15:16 ` Richard Biener 2016-11-24 15:34 ` Bernd Schmidt @ 2016-11-24 15:48 ` Jeff Law 2016-11-24 16:14 ` Segher Boessenkool 2 siblings, 1 reply; 32+ messages in thread From: Jeff Law @ 2016-11-24 15:48 UTC (permalink / raw) To: Segher Boessenkool, Bernd Schmidt; +Cc: GCC Patches On 11/24/2016 07:53 AM, Segher Boessenkool wrote: > > That we compare different kinds of costs (which really has no meaning at > all, it's a heuristic at best) in various places is a known problem, not > a regression. But the problems with the costing system exhibit themselves as a code quality regression. In the end that's what the end-users see -- a regression in the quality of the code GCC generates. jeff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 15:48 ` Jeff Law @ 2016-11-24 16:14 ` Segher Boessenkool 2016-11-24 22:32 ` Segher Boessenkool ` (2 more replies) 0 siblings, 3 replies; 32+ messages in thread From: Segher Boessenkool @ 2016-11-24 16:14 UTC (permalink / raw) To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote: > On 11/24/2016 07:53 AM, Segher Boessenkool wrote: > > > >That we compare different kinds of costs (which really has no meaning at > >all, it's a heuristic at best) in various places is a known problem, not > >a regression. > But the problems with the costing system exhibit themselves as a code > quality regression. In the end that's what the end-users see -- a > regression in the quality of the code GCC generates. Yes, exactly -- and I fear this all-encompassing change will cause just such a regression for many users. Tests are running, will know more later today (or tomorrow). The PR is about a very specific problem; the patch is not. The patch is not a bug fix. If we allow anything that "makes things better" in stage 3, what make it different from stage 1 then? Segher ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 16:14 ` Segher Boessenkool @ 2016-11-24 22:32 ` Segher Boessenkool 2016-11-26 10:44 ` Jeff Law 2016-11-25 9:15 ` Richard Biener 2016-11-28 18:50 ` Jeff Law 2 siblings, 1 reply; 32+ messages in thread From: Segher Boessenkool @ 2016-11-24 22:32 UTC (permalink / raw) To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote: > On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote: > > On 11/24/2016 07:53 AM, Segher Boessenkool wrote: > > > > > >That we compare different kinds of costs (which really has no meaning at > > >all, it's a heuristic at best) in various places is a known problem, not > > >a regression. > > But the problems with the costing system exhibit themselves as a code > > quality regression. In the end that's what the end-users see -- a > > regression in the quality of the code GCC generates. > > Yes, exactly -- and I fear this all-encompassing change will cause just > such a regression for many users. Tests are running, will know more > later today (or tomorrow). > > The PR is about a very specific problem; the patch is not. The patch > is not a bug fix. If we allow anything that "makes things better" in > stage 3, what make it different from stage 1 then? Here are results of testing with trunk right before the three patches, compared with with the three patches. This lists the sizes of the vmlinux of a Linux kernel build for that arch. better: blackfin 1973931 1973867 frv 3638192 3637792 h8300 1060172 1059976 i386 9742984 9742463 ia64 15402035 15396171 mips 4286748 4286692 mn10300 2360025 2358201 nios2 3185625 3176693 x86_64 10360418 10359588 worse: alpha 5439003 5455979 c6x 2107939 2108931 cris 2189380 2193836 m32r 3427409 3427453 m68k 3228408 3230978 mips64 5564819 5565291 parisc 8278881 8289573 parisc64 7234619 7249139 powerpc 8438949 8440005 powerpc64 14499969 14508689 s390 12778748 12779220 shnommu 1369868 1371020 sparc64 5921556 5922172 tilegx 12297581 12307461 tilepro 11215603 11227339 xtensa 1776196 1779152 does not build: arc 0 0 arm 0 0 arm64 0 0 microblaze 0 0 sh 0 0 sparc 0 0 Segher ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 22:32 ` Segher Boessenkool @ 2016-11-26 10:44 ` Jeff Law 2016-11-26 11:11 ` Eric Botcazou 2016-11-26 18:08 ` Segher Boessenkool 0 siblings, 2 replies; 32+ messages in thread From: Jeff Law @ 2016-11-26 10:44 UTC (permalink / raw) To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches On 11/24/2016 03:32 PM, Segher Boessenkool wrote: > On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote: >> On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote: >>> On 11/24/2016 07:53 AM, Segher Boessenkool wrote: >>>> >>>> That we compare different kinds of costs (which really has no meaning at >>>> all, it's a heuristic at best) in various places is a known problem, not >>>> a regression. >>> But the problems with the costing system exhibit themselves as a code >>> quality regression. In the end that's what the end-users see -- a >>> regression in the quality of the code GCC generates. >> >> Yes, exactly -- and I fear this all-encompassing change will cause just >> such a regression for many users. Tests are running, will know more >> later today (or tomorrow). >> >> The PR is about a very specific problem; the patch is not. The patch >> is not a bug fix. If we allow anything that "makes things better" in >> stage 3, what make it different from stage 1 then? > > Here are results of testing with trunk right before the three patches, > compared with with the three patches. This lists the sizes of the vmlinux > of a Linux kernel build for that arch. Thanks. While I question how much emphasis we should put on code sizes as a way to measure this change, it can still point out interesting effects, positive and negative. From my investigations on the m68k, the effects on the IL are minimal with a slight bias towards better code (by suppressing if-conversions of some now more costly blocks). *But* the size of the resulting code was all over the place -- sometimes it was better, others worse. From looking at the assembly we seemingly are copying blocks that aren't strictly necessary. Enter bb-reorder and the STC algorithm. It is copying blocks *very* aggressively, like absurdly aggressively on the m68k. Of course it doesn't help that the m68k doesn't define a length attribute and as a result STC thinks every insn has size 0 and thus block copying is zero cost. I want to verify the #s, so take this with a slight grain of salt. The net changes to newlib's .o's for Bernd's work -- +30 bytes. The effect of the STC issue above -- +1115586 bytes. Or to put it another way, Bernd's changes, +.0003% change. STC, +13.8%. jeff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-26 10:44 ` Jeff Law @ 2016-11-26 11:11 ` Eric Botcazou 2016-11-26 16:15 ` Jeff Law 2016-11-26 18:08 ` Segher Boessenkool 1 sibling, 1 reply; 32+ messages in thread From: Eric Botcazou @ 2016-11-26 11:11 UTC (permalink / raw) To: Jeff Law; +Cc: gcc-patches, Segher Boessenkool, Bernd Schmidt > From my investigations on the m68k, the effects on the IL are minimal > with a slight bias towards better code (by suppressing if-conversions of > some now more costly blocks). *But* the size of the resulting code was > all over the place -- sometimes it was better, others worse. From > looking at the assembly we seemingly are copying blocks that aren't > strictly necessary. I'm seeing essentially the same thing on SPARC, probably because of the ifcvt change; the rtlanal change seems to be neutral for the architecture. -- Eric Botcazou ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-26 11:11 ` Eric Botcazou @ 2016-11-26 16:15 ` Jeff Law 2016-11-26 22:03 ` Segher Boessenkool 0 siblings, 1 reply; 32+ messages in thread From: Jeff Law @ 2016-11-26 16:15 UTC (permalink / raw) To: Eric Botcazou; +Cc: gcc-patches, Segher Boessenkool, Bernd Schmidt On 11/26/2016 04:11 AM, Eric Botcazou wrote: >> From my investigations on the m68k, the effects on the IL are minimal >> with a slight bias towards better code (by suppressing if-conversions of >> some now more costly blocks). *But* the size of the resulting code was >> all over the place -- sometimes it was better, others worse. From >> looking at the assembly we seemingly are copying blocks that aren't >> strictly necessary. > > I'm seeing essentially the same thing on SPARC, probably because of the ifcvt > change; the rtlanal change seems to be neutral for the architecture. Just to be clear, I was only testing the rtlanal change, not the ifcvt change. I repeated my test on the GCC runtime libraries for m68k-elf. Bernd's rtlanal change +.03%, the goof in STC, +9.4%. So the STC goof still dwarfs the impact to Bernd's change, but not as badly as I saw in the newlib codebase. Jeff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-26 16:15 ` Jeff Law @ 2016-11-26 22:03 ` Segher Boessenkool 0 siblings, 0 replies; 32+ messages in thread From: Segher Boessenkool @ 2016-11-26 22:03 UTC (permalink / raw) To: Jeff Law; +Cc: Eric Botcazou, gcc-patches, Bernd Schmidt On Sat, Nov 26, 2016 at 09:15:48AM -0700, Jeff Law wrote: > On 11/26/2016 04:11 AM, Eric Botcazou wrote: > >> From my investigations on the m68k, the effects on the IL are minimal > >>with a slight bias towards better code (by suppressing if-conversions of > >>some now more costly blocks). *But* the size of the resulting code was > >>all over the place -- sometimes it was better, others worse. From > >>looking at the assembly we seemingly are copying blocks that aren't > >>strictly necessary. > > > >I'm seeing essentially the same thing on SPARC, probably because of the > >ifcvt > >change; the rtlanal change seems to be neutral for the architecture. > Just to be clear, I was only testing the rtlanal change, not the ifcvt > change. > > I repeated my test on the GCC runtime libraries for m68k-elf. Bernd's > rtlanal change +.03%, the goof in STC, +9.4%. So the STC goof still > dwarfs the impact to Bernd's change, but not as badly as I saw in the > newlib codebase. orig, i386+rtlanal, i386+rtlanal+ifcvt: worse: alpha 5439003 5455979 5455979 c6x 2107939 2108931 2108931 cris 2189380 2193836 2193836 m32r 3427409 3427541 3427453 m68k 3228408 3230978 3230978 mips 4286748 4286964 4286692 mips64 5564819 5565643 5565291 parisc 8278881 8289977 8289573 parisc64 7234619 7249187 7249139 powerpc 8438949 8440005 8440005 powerpc64 14499969 14508689 14508689 s390 12778748 12779228 12779220 shnommu 1369868 1371020 1371020 sparc64 5921556 5922172 5922172 tilegx 12297581 12307461 12307461 tilepro 11215603 11227339 11227339 xtensa 1776196 1779152 1779152 better: blackfin 1973931 1973867 1973867 frv 3638192 3637792 3637792 h8300 1060172 1059976 1059976 i386 9742984 9742463 9742463 ia64 15402035 15396171 15396171 mn10300 2360025 2358201 2358201 nios2 3185625 3176693 3176693 x86_64 10360418 10359588 10359588 did not build: arc 0 0 0 arm 0 0 0 arm64 0 0 0 microblaze 0 0 0 sh 0 0 0 sparc 0 0 0 tl;dr: The ifcvt change doesn't do much, but the cost change does. Segher ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-26 10:44 ` Jeff Law 2016-11-26 11:11 ` Eric Botcazou @ 2016-11-26 18:08 ` Segher Boessenkool 1 sibling, 0 replies; 32+ messages in thread From: Segher Boessenkool @ 2016-11-26 18:08 UTC (permalink / raw) To: Jeff Law; +Cc: Bernd Schmidt, GCC Patches On Sat, Nov 26, 2016 at 03:44:22AM -0700, Jeff Law wrote: > On 11/24/2016 03:32 PM, Segher Boessenkool wrote: > >On Thu, Nov 24, 2016 at 10:14:24AM -0600, Segher Boessenkool wrote: > >>On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote: > >>>On 11/24/2016 07:53 AM, Segher Boessenkool wrote: > >>>> > >>>>That we compare different kinds of costs (which really has no meaning at > >>>>all, it's a heuristic at best) in various places is a known problem, not > >>>>a regression. > >>>But the problems with the costing system exhibit themselves as a code > >>>quality regression. In the end that's what the end-users see -- a > >>>regression in the quality of the code GCC generates. > >> > >>Yes, exactly -- and I fear this all-encompassing change will cause just > >>such a regression for many users. Tests are running, will know more > >>later today (or tomorrow). > >> > >>The PR is about a very specific problem; the patch is not. The patch > >>is not a bug fix. If we allow anything that "makes things better" in > >>stage 3, what make it different from stage 1 then? > > > >Here are results of testing with trunk right before the three patches, > >compared with with the three patches. This lists the sizes of the vmlinux > >of a Linux kernel build for that arch. > Thanks. While I question how much emphasis we should put on code sizes > as a way to measure this change, it can still point out interesting > effects, positive and negative. Code size I can test "easily" for many archs (it still takes almost a full day), and it does correlate well with local optimisations on most archs. I have looked at the actual differences on some archs (which takes a lot more time still), and the differences are all over the place. Which suggests changing the costs is a big change for most of those archs; and they all have been tuned for the *old* situation, so this makes things worse in the short run, whether the new costs are better or not. Not a change for stage 3, and not something *I* should need to analyse anyway; this analysis needs to be done *before* the patch goes in. > From my investigations on the m68k, the effects on the IL are minimal > with a slight bias towards better code (by suppressing if-conversions of > some now more costly blocks). *But* the size of the resulting code was > all over the place -- sometimes it was better, others worse. From > looking at the assembly we seemingly are copying blocks that aren't > strictly necessary. > > Enter bb-reorder and the STC algorithm. It is copying blocks *very* > aggressively, like absurdly aggressively on the m68k. Of course it > doesn't help that the m68k doesn't define a length attribute and as a > result STC thinks every insn has size 0 and thus block copying is zero cost. > > I want to verify the #s, so take this with a slight grain of salt. The > net changes to newlib's .o's for Bernd's work -- +30 bytes. The effect > of the STC issue above -- +1115586 bytes. Or to put it another way, > Bernd's changes, +.0003% change. STC, +13.8%. STC wasn't changed in the patch. Maybe interactions with STC is what causes all the problems, but that is an argument *against* doing this after stage 1. Segher ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 16:14 ` Segher Boessenkool 2016-11-24 22:32 ` Segher Boessenkool @ 2016-11-25 9:15 ` Richard Biener 2016-11-25 15:34 ` Jeff Law 2016-11-25 15:55 ` Segher Boessenkool 2016-11-28 18:50 ` Jeff Law 2 siblings, 2 replies; 32+ messages in thread From: Richard Biener @ 2016-11-25 9:15 UTC (permalink / raw) To: Segher Boessenkool; +Cc: Jeff Law, Bernd Schmidt, GCC Patches On Thu, Nov 24, 2016 at 5:14 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote: > On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote: >> On 11/24/2016 07:53 AM, Segher Boessenkool wrote: >> > >> >That we compare different kinds of costs (which really has no meaning at >> >all, it's a heuristic at best) in various places is a known problem, not >> >a regression. >> But the problems with the costing system exhibit themselves as a code >> quality regression. In the end that's what the end-users see -- a >> regression in the quality of the code GCC generates. > > Yes, exactly -- and I fear this all-encompassing change will cause just > such a regression for many users. Tests are running, will know more > later today (or tomorrow). > > The PR is about a very specific problem; the patch is not. The patch > is not a bug fix. If we allow anything that "makes things better" in > stage 3, what make it different from stage 1 then? That's a good question ;) The stage 3 definition has a loophole via "go file a bug about feature X, then it's a bugfix!". I'm all open for a more sensible definition, like constraining the kind of non-regression fixes that we want to allow, but I fear the most sensible option would be to simply ditch the notion of different "stages" and make it "general development" and "regression fixing". (though if you try hard enough and go back in time you'll find that almost all non-enhancement bugs are regressions in some sense) And yes, current stage3 still feels too much like stage1 ;) Richard. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-25 9:15 ` Richard Biener @ 2016-11-25 15:34 ` Jeff Law 2016-11-25 15:55 ` Segher Boessenkool 1 sibling, 0 replies; 32+ messages in thread From: Jeff Law @ 2016-11-25 15:34 UTC (permalink / raw) To: Richard Biener, Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches On 11/25/2016 02:15 AM, Richard Biener wrote: > > That's a good question ;) The stage 3 definition has a loophole via > "go file a bug about feature X, then it's a bugfix!". Right. That loophole has existed since we've moved to the current model -- we extend a level of trust to our developers not to abuse the loophole. I think that level of trust is warranted and hasn't been significantly violated. > I'm all open for a more sensible definition, like constraining the kind > of non-regression fixes that we want to allow, but I fear the most > sensible option would be to simply ditch the notion of different > "stages" and make it "general development" and "regression fixing". > (though if you try hard enough and go back in time you'll find that > almost all non-enhancement bugs are regressions in some sense) Similarly, I'm always open for improvements. My worry is if we went to development/regression bugfixing cycle, then non-regression bugs would largely be ignored. General bugfixing is, IMHO, a good period -- it gets a larger portion of our developers fixing bugs and gives folks with a heavy review load a chance to flush out their queues of stuff that came in right at the end of stage1. I'm not really pushing to open a "development cycle" discussion right now, but I do sense that our cycles could use some refinement. > > And yes, current stage3 still feels too much like stage1 ;) Hasn't seemed that way to me, but obviously experiences will differ. My biggest worry about this cycle is the higher than typical (compared to the last few years) regression bug counts. That worry is somewhat mitigated by the belief that we're marking regressions much much more consistently this year, so we're a lot less likely to get a big jump in marked regressions like we saw last year. *If* that is the case (and my light poking around seems to indicate that's true), then we're likely ahead of the gcc-6 cycle, but behind the gcc-5 cycle. jeff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-25 9:15 ` Richard Biener 2016-11-25 15:34 ` Jeff Law @ 2016-11-25 15:55 ` Segher Boessenkool 2016-11-28 8:59 ` Bernd Schmidt 2016-11-28 9:05 ` Bernd Schmidt 1 sibling, 2 replies; 32+ messages in thread From: Segher Boessenkool @ 2016-11-25 15:55 UTC (permalink / raw) To: Richard Biener; +Cc: Jeff Law, Bernd Schmidt, GCC Patches On Fri, Nov 25, 2016 at 10:15:25AM +0100, Richard Biener wrote: > On Thu, Nov 24, 2016 at 5:14 PM, Segher Boessenkool > <segher@kernel.crashing.org> wrote: > > On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote: > >> On 11/24/2016 07:53 AM, Segher Boessenkool wrote: > >> > > >> >That we compare different kinds of costs (which really has no meaning at > >> >all, it's a heuristic at best) in various places is a known problem, not > >> >a regression. > >> But the problems with the costing system exhibit themselves as a code > >> quality regression. In the end that's what the end-users see -- a > >> regression in the quality of the code GCC generates. > > > > Yes, exactly -- and I fear this all-encompassing change will cause just > > such a regression for many users. Tests are running, will know more > > later today (or tomorrow). > > > > The PR is about a very specific problem; the patch is not. The patch > > is not a bug fix. If we allow anything that "makes things better" in > > stage 3, what make it different from stage 1 then? > > That's a good question ;) The stage 3 definition has a loophole via > "go file a bug about feature X, then it's a bugfix!". > > I'm all open for a more sensible definition, like constraining the kind > of non-regression fixes that we want to allow, but I fear the most > sensible option would be to simply ditch the notion of different > "stages" and make it "general development" and "regression fixing". > (though if you try hard enough and go back in time you'll find that > almost all non-enhancement bugs are regressions in some sense) The scale goes: early stage 1, anything goes; ...; until stage 4, only very narrow regression fixes are allowed. Let's try to keep that spirit, and not behave like politicians following the "rules" (or not). > And yes, current stage3 still feels too much like stage1 ;) Yes, very much so. Well, at least trunk bootstraps on more targets now. -- So IMNSHO this rtx costing change belongs in early stage 1, and should be reverted. If ifcvt should use full rtx cost instead of rtx_src_cost, fix *that*, that is a much more local change. And even then, test on more targets please. Segher ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-25 15:55 ` Segher Boessenkool @ 2016-11-28 8:59 ` Bernd Schmidt 2016-11-28 9:05 ` Bernd Schmidt 1 sibling, 0 replies; 32+ messages in thread From: Bernd Schmidt @ 2016-11-28 8:59 UTC (permalink / raw) To: Segher Boessenkool, Richard Biener; +Cc: Jeff Law, GCC Patches On 11/25/2016 04:55 PM, Segher Boessenkool wrote: > So IMNSHO this rtx costing change belongs in early stage 1, and should > be reverted. Done. Bernd ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-25 15:55 ` Segher Boessenkool 2016-11-28 8:59 ` Bernd Schmidt @ 2016-11-28 9:05 ` Bernd Schmidt 1 sibling, 0 replies; 32+ messages in thread From: Bernd Schmidt @ 2016-11-28 9:05 UTC (permalink / raw) To: Segher Boessenkool, Richard Biener; +Cc: Jeff Law, Bernd Schmidt, GCC Patches On 11/25/2016 04:55 PM, Segher Boessenkool wrote: > So IMNSHO this rtx costing change belongs in early stage 1, and should > be reverted. Done. Bernd ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-24 16:14 ` Segher Boessenkool 2016-11-24 22:32 ` Segher Boessenkool 2016-11-25 9:15 ` Richard Biener @ 2016-11-28 18:50 ` Jeff Law 2016-11-28 18:52 ` Bernd Schmidt 2 siblings, 1 reply; 32+ messages in thread From: Jeff Law @ 2016-11-28 18:50 UTC (permalink / raw) To: Segher Boessenkool; +Cc: Bernd Schmidt, GCC Patches On 11/24/2016 09:14 AM, Segher Boessenkool wrote: > On Thu, Nov 24, 2016 at 08:48:04AM -0700, Jeff Law wrote: >> On 11/24/2016 07:53 AM, Segher Boessenkool wrote: >>> >>> That we compare different kinds of costs (which really has no meaning at >>> all, it's a heuristic at best) in various places is a known problem, not >>> a regression. >> But the problems with the costing system exhibit themselves as a code >> quality regression. In the end that's what the end-users see -- a >> regression in the quality of the code GCC generates. > > Yes, exactly -- and I fear this all-encompassing change will cause just > such a regression for many users. Tests are running, will know more > later today (or tomorrow). > > The PR is about a very specific problem; the patch is not. The patch > is not a bug fix. If we allow anything that "makes things better" in > stage 3, what make it different from stage 1 then? So how would you suggest this be fixed right now? I'd really like to get the regression addressed. I would claim that Bernd's patch is right from a design and implementation standpoint -- the issues are fallout from backend issues and none looked terrible to me. jeff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-28 18:50 ` Jeff Law @ 2016-11-28 18:52 ` Bernd Schmidt 0 siblings, 0 replies; 32+ messages in thread From: Bernd Schmidt @ 2016-11-28 18:52 UTC (permalink / raw) To: Jeff Law, Segher Boessenkool; +Cc: GCC Patches On 11/28/2016 07:50 PM, Jeff Law wrote: > So how would you suggest this be fixed right now? I'd really like to > get the regression addressed. The regression is still fixed. That wasn't the case at all stages while I was working on it, but the i386 patch seems to suffice now. > I would claim that Bernd's patch is right from a design and > implementation standpoint -- the issues are fallout from backend issues > and none looked terrible to me. Agree but I reverted it anyway. Bernd ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt 2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt @ 2016-11-23 19:01 ` Bernd Schmidt 2016-11-23 21:46 ` Uros Bizjak 2016-11-23 19:03 ` [3/3] " Bernd Schmidt 2 siblings, 1 reply; 32+ messages in thread From: Bernd Schmidt @ 2016-11-23 19:01 UTC (permalink / raw) To: GCC Patches, Uros Bizjak [-- Attachment #1: Type: text/plain, Size: 319 bytes --] On 11/23/2016 07:57 PM, Bernd Schmidt wrote: > 2. The i386 backend mishandles SET rtxs. If you have a fairly plain > single-insn SET, you tend to get COSTS_N_INSNS (2) out of set_rtx_cost, > because rtx_costs has a default of COSTS_N_INSNS (1) for a SET, and you > get the cost of the src in addition to that. Bernd [-- Attachment #2: 71280-2.diff --] [-- Type: text/x-patch, Size: 1312 bytes --] PR rtl-optimization/78120 * config/i386/i386.c (ix86_rtx_costs): Fully handle SETs. Index: gcc/config/i386/i386.c =================================================================== --- gcc/config/i386/i386.c (revision 242038) +++ gcc/config/i386/i386.c (working copy) @@ -39925,6 +39925,7 @@ ix86_rtx_costs (rtx x, machine_mode mode enum rtx_code code = GET_CODE (x); enum rtx_code outer_code = (enum rtx_code) outer_code_i; const struct processor_costs *cost = speed ? ix86_cost : &ix86_size_cost; + int src_cost; switch (code) { @@ -39935,7 +39936,23 @@ ix86_rtx_costs (rtx x, machine_mode mode *total = ix86_set_reg_reg_cost (GET_MODE (SET_DEST (x))); return true; } - return false; + + if (register_operand (SET_SRC (x), VOIDmode)) + /* Avoid potentially incorrect high cost from rtx_costs + for non-tieable SUBREGs. */ + src_cost = 0; + else + { + src_cost = rtx_cost (SET_SRC (x), mode, SET, 1, speed); + + if (CONSTANT_P (SET_SRC (x))) + /* Constant costs assume a base value of COSTS_N_INSNS (1) and add + a small value, possibly zero for cheap constants. */ + src_cost += COSTS_N_INSNS (1); + } + + *total = src_cost + rtx_cost (SET_DEST (x), mode, SET, 0, speed); + return true; case CONST_INT: case CONST: ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [0/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-23 19:01 ` Bernd Schmidt @ 2016-11-23 21:46 ` Uros Bizjak 0 siblings, 0 replies; 32+ messages in thread From: Uros Bizjak @ 2016-11-23 21:46 UTC (permalink / raw) To: Bernd Schmidt; +Cc: GCC Patches On Wed, Nov 23, 2016 at 8:01 PM, Bernd Schmidt <bschmidt@redhat.com> wrote: > On 11/23/2016 07:57 PM, Bernd Schmidt wrote: >> >> 2. The i386 backend mishandles SET rtxs. If you have a fairly plain >> single-insn SET, you tend to get COSTS_N_INSNS (2) out of set_rtx_cost, >> because rtx_costs has a default of COSTS_N_INSNS (1) for a SET, and you >> get the cost of the src in addition to that. Looks good to me. Thanks, Uros. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [3/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt 2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt 2016-11-23 19:01 ` Bernd Schmidt @ 2016-11-23 19:03 ` Bernd Schmidt 2016-11-23 19:38 ` Jeff Law 2 siblings, 1 reply; 32+ messages in thread From: Bernd Schmidt @ 2016-11-23 19:03 UTC (permalink / raw) To: GCC Patches [-- Attachment #1: Type: text/plain, Size: 502 bytes --] On 11/23/2016 07:57 PM, Bernd Schmidt wrote: > 3. ifcvt computes the sum of costs for the involved blocks, but only > makes a before/after comparison when optimizing for size. When > optimizing for speed, it uses max_seq_cost, which is an estimate > computed from BRANCH_COST, which in turn can be zero for predictable > branches on x86. This is the final patch and has the testcase. It also happens to be the least risky of the series so it could be applied on its own (without the test). Bernd [-- Attachment #2: 71280-3.diff --] [-- Type: text/x-patch, Size: 3388 bytes --] PR rtl-optimization/78120 * ifcvt.c (noce_conversion_profitable_p): Check original cost in all cases, and additionally test against max_seq_cost for speed optimization. (noce_process_if_block): Compute an estimate for the original cost when optimizing for speed, using the minimum of then and else block costs. PR rtl-optimization/78120 * gcc.target/i386/pr78120.c: New test. Index: gcc/ifcvt.c =================================================================== --- gcc/ifcvt.c (revision 242038) +++ gcc/ifcvt.c (working copy) @@ -812,8 +812,10 @@ struct noce_if_info we're optimizing for size. */ bool speed_p; - /* The combined cost of COND, JUMP and the costs for THEN_BB and - ELSE_BB. */ + /* An estimate of the original costs. When optimizing for size, this is the + combined cost of COND, JUMP and the costs for THEN_BB and ELSE_BB. + When optimizing for speed, we use the costs of COND plus the minimum of + the costs for THEN_BB and ELSE_BB, as computed in the next field. */ unsigned int original_cost; /* Maximum permissible cost for the unconditional sequence we should @@ -852,12 +857,12 @@ noce_conversion_profitable_p (rtx_insn * /* Cost up the new sequence. */ unsigned int cost = seq_cost (seq, speed_p); + if (cost <= if_info->original_cost) + return true; + /* When compiling for size, we can make a reasonably accurately guess - at the size growth. */ - if (!speed_p) - return cost <= if_info->original_cost; - else - return cost <= if_info->max_seq_cost; + at the size growth. When compiling for speed, use the maximum. */ + return speed_p && cost <= if_info->max_seq_cost; } /* Helper function for noce_try_store_flag*. */ @@ -3441,15 +3446,24 @@ noce_process_if_block (struct noce_if_in } } - if (! bb_valid_for_noce_process_p (then_bb, cond, &if_info->original_cost, + bool speed_p = optimize_bb_for_speed_p (test_bb); + unsigned int then_cost = 0, else_cost = 0; + if (!bb_valid_for_noce_process_p (then_bb, cond, &then_cost, &if_info->then_simple)) return false; if (else_bb - && ! bb_valid_for_noce_process_p (else_bb, cond, &if_info->original_cost, - &if_info->else_simple)) + && !bb_valid_for_noce_process_p (else_bb, cond, &else_cost, + &if_info->else_simple)) return false; + if (else_bb == NULL) + if_info->original_cost += then_cost; + else if (speed_p) + if_info->original_cost += MIN (then_cost, else_cost); + else + if_info->original_cost += then_cost + else_cost; + insn_a = last_active_insn (then_bb, FALSE); set_a = single_set (insn_a); gcc_assert (set_a); Index: gcc/testsuite/gcc.target/i386/pr78120.c =================================================================== --- gcc/testsuite/gcc.target/i386/pr78120.c (nonexistent) +++ gcc/testsuite/gcc.target/i386/pr78120.c (working copy) @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mtune=generic" } */ +/* { dg-final { scan-assembler "adc" } } */ +/* { dg-final { scan-assembler-not "jmp" } } */ + +typedef unsigned long u64; + +typedef struct { + u64 hi, lo; +} u128; + +static inline u128 add_u128 (u128 a, u128 b) +{ + a.lo += b.lo; + if (a.lo < b.lo) + a.hi++; + + return a; +} + +extern u128 t1, t2, t3; + +void foo (void) +{ + t1 = add_u128 (t1, t2); + t1 = add_u128 (t1, t3); +} ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [3/3] Fix PR78120, in ifcvt/rtlanal/i386. 2016-11-23 19:03 ` [3/3] " Bernd Schmidt @ 2016-11-23 19:38 ` Jeff Law 0 siblings, 0 replies; 32+ messages in thread From: Jeff Law @ 2016-11-23 19:38 UTC (permalink / raw) To: Bernd Schmidt, GCC Patches On 11/23/2016 12:02 PM, Bernd Schmidt wrote: > On 11/23/2016 07:57 PM, Bernd Schmidt wrote: >> 3. ifcvt computes the sum of costs for the involved blocks, but only >> makes a before/after comparison when optimizing for size. When >> optimizing for speed, it uses max_seq_cost, which is an estimate >> computed from BRANCH_COST, which in turn can be zero for predictable >> branches on x86. > > This is the final patch and has the testcase. It also happens to be the > least risky of the series so it could be applied on its own (without the > test). > > > Bernd > > > 71280-3.diff > > > PR rtl-optimization/78120 > * ifcvt.c (noce_conversion_profitable_p): Check original cost in all > cases, and additionally test against max_seq_cost for speed > optimization. > (noce_process_if_block): Compute an estimate for the original cost when > optimizing for speed, using the minimum of then and else block costs. > > PR rtl-optimization/78120 > * gcc.target/i386/pr78120.c: New test. Also OK. Obviously Uros has the call on the x86 target change. Stage the series in as you see fit given the dependencies. jeff ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2016-11-28 18:52 UTC | newest] Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-11-23 18:58 [0/3] Fix PR71280, in ifcvt/rtlanal/i386 Bernd Schmidt 2016-11-23 19:00 ` [0/3] Fix PR78120, " Bernd Schmidt 2016-11-23 19:30 ` Jeff Law 2016-11-23 19:31 ` Bernd Schmidt 2016-11-24 14:21 ` Segher Boessenkool 2016-11-24 14:26 ` Bernd Schmidt 2016-11-24 14:36 ` Segher Boessenkool 2016-11-24 14:38 ` Bernd Schmidt 2016-11-24 14:44 ` Eric Botcazou 2016-11-24 14:54 ` Segher Boessenkool 2016-11-24 15:16 ` Richard Biener 2016-11-24 15:46 ` Jeff Law 2016-11-24 15:34 ` Bernd Schmidt 2016-11-24 15:48 ` Jeff Law 2016-11-24 16:14 ` Segher Boessenkool 2016-11-24 22:32 ` Segher Boessenkool 2016-11-26 10:44 ` Jeff Law 2016-11-26 11:11 ` Eric Botcazou 2016-11-26 16:15 ` Jeff Law 2016-11-26 22:03 ` Segher Boessenkool 2016-11-26 18:08 ` Segher Boessenkool 2016-11-25 9:15 ` Richard Biener 2016-11-25 15:34 ` Jeff Law 2016-11-25 15:55 ` Segher Boessenkool 2016-11-28 8:59 ` Bernd Schmidt 2016-11-28 9:05 ` Bernd Schmidt 2016-11-28 18:50 ` Jeff Law 2016-11-28 18:52 ` Bernd Schmidt 2016-11-23 19:01 ` Bernd Schmidt 2016-11-23 21:46 ` Uros Bizjak 2016-11-23 19:03 ` [3/3] " Bernd Schmidt 2016-11-23 19:38 ` Jeff Law
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).