From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 55786 invoked by alias); 9 Oct 2015 09:29:20 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 55773 invoked by uid 89); 9 Oct 2015 09:29:20 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.8 required=5.0 tests=AWL,BAYES_99,BAYES_999,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS autolearn=no version=3.3.2 X-HELO: mail-yk0-f173.google.com Received: from mail-yk0-f173.google.com (HELO mail-yk0-f173.google.com) (209.85.160.173) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Fri, 09 Oct 2015 09:29:18 +0000 Received: by ykft14 with SMTP id t14so73311908ykf.0 for ; Fri, 09 Oct 2015 02:29:16 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.129.76.18 with SMTP id z18mr9251519ywa.263.1444382956600; Fri, 09 Oct 2015 02:29:16 -0700 (PDT) Received: by 10.37.93.136 with HTTP; Fri, 9 Oct 2015 02:29:16 -0700 (PDT) In-Reply-To: References: Date: Fri, 09 Oct 2015 09:29:00 -0000 Message-ID: Subject: Re: [PATCH] bb-reorder: Improve the simple algorithm for -Os (PR67864) From: Richard Biener To: Segher Boessenkool Cc: GCC Patches Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2015-10/txt/msg00938.txt.bz2 On Thu, Oct 8, 2015 at 6:57 PM, Segher Boessenkool wrote: > As the PR points out, the "simple" reorder algorithm makes bigger code > than the STC algorithm did, for -Os, for x86. I now tested it for many > different targets and it turns out to be worse everywhere. > > This simple patch tunes "simple" a bit; this makes it better than STC > almost everywhere. The only exceptions (for the targets where I have > results) are x86 and mn10300. For those targets it may be best to switch > the default algorithm for -Os to STC. > > The raw results. This is text size for vmlinux for 31 targets; as you > can see many did not build, but at least all primary targets did. > "none" is no basic block reordering; "trunk" is current trunk; "stc" is > with the STC algorithm; "nodyn" is with simple, but considering all edges > equally likely; "swaps" is that, but prefering the more likely edge from > conditional branches; and "falls" prefers the edge from conditional > branches that is already the fallthrough edge. This patch is "falls". > > none trunk stc nodyn swaps falls best > 3728663 3721479 3700831 3706407 3717971 3690367 falls arm > 2097684 2095560 2089484 2094024 2094212 2086720 falls blackfin > 2096118 2107356 2081894 2092276 2103732 2077162 falls cris > 3204044 3200972 3187196 3191932 3198156 3177980 falls frv > 9254222 9340258 9208805 9265886 9331362 9247119 stc i386 > 3353034 3355482 3334726 3334466 3349710 3314970 falls m32r > 4545720 4549824 4514256 4541832 4544540 4498416 falls microblaze > 4276743 4266363 4246255 4259923 4261367 4227723 falls mips > 5779199 5770567 5741663 5757663 5764475 5721803 falls mips64 > 2059078 2086000 2051005 2064365 2083923 2055705 stc mn10300 > 3311925 3320113 3293873 3305949 3317865 3284081 falls nios2 > 6738701 6747077 6710253 6742917 6740965 6696757 falls parisc64 > 8312408 8312744 8261016 8294480 8306488 8237188 falls powerpc > 17782722 17788382 17722326 17749526 17780642 17683810 falls powerpc64 > 11016289 11029481 10970081 10980065 11024617 10942409 falls s390 > 1406832 1417224 1400344 1409392 1416172 1399428 falls shnommu > 3771111 3776223 3751623 3768459 3771455 3732967 falls sparc > 6113427 6113527 6074875 6091167 6106015 6049571 falls sparc64 > 10449681 10529883 10398908 10458240 10522149 10440814 stc x86_64 > 1905733 1905733 1905733 1905733 1905733 1905733 ----- xtensa > > > Is this okay for trunk? I think the patch makes sense but it also raises a question for me - how did we decide what edge gets EDGE_FALLTHRU when going out-of-cfglayout? And isn't _that_ mechanism then not part of basic-block reordering that needs to be tweaked for choosing the EDGE_FALLTHRU as better? Thanks, Richard. > > Segher > > > 2015-10-08 Segher Boessenkool > > PR rtl-optimization/67864 > * gcc/bb-reorder (reorder_basic_blocks_simple): Prefer existing > fallthrough edges for conditional jumps. Don't sort candidate > edges if not optimizing for speed. > > --- > gcc/bb-reorder.c | 16 ++++++++++++---- > 1 file changed, 12 insertions(+), 4 deletions(-) > > diff --git a/gcc/bb-reorder.c b/gcc/bb-reorder.c > index cb001e8..3b7098e 100644 > --- a/gcc/bb-reorder.c > +++ b/gcc/bb-reorder.c > @@ -2318,16 +2318,24 @@ reorder_basic_blocks_simple (void) > > if (any_condjump_p (end)) > { > - edges[n++] = EDGE_SUCC (bb, 0); > - edges[n++] = EDGE_SUCC (bb, 1); > + edge e0 = EDGE_SUCC (bb, 0); > + edge e1 = EDGE_SUCC (bb, 1); > + /* When optimizing for size it is best to keep the original > + fallthrough edges. */ > + if (e1->flags & EDGE_FALLTHRU) > + std::swap (e0, e1); > + edges[n++] = e0; > + edges[n++] = e1; > } > else if (single_succ_p (bb)) > edges[n++] = EDGE_SUCC (bb, 0); > } > > - /* Sort the edges, the most desirable first. */ > + /* Sort the edges, the most desirable first. When optimizing for size > + all edges are equally desirable. */ > > - std::stable_sort (edges, edges + n, edge_order); > + if (optimize_function_for_speed_p (cfun)) > + std::stable_sort (edges, edges + n, edge_order); > > /* Now decide which of those edges to make fallthrough edges. We set > BB_VISITED if a block already has a fallthrough successor assigned > -- > 1.8.1.4 >