Date: Thu, 24 Sep 2015 13:32:00 -0000
From: Segher Boessenkool
To: Bernd Schmidt
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 0/4] bb-reorder: Add the "simple" algorithm
Message-ID: <20150924131943.GA20466@gate.crashing.org>
References: <5603C8C6.9060901@redhat.com>
In-Reply-To: <5603C8C6.9060901@redhat.com>

On Thu, Sep 24, 2015 at 11:56:22AM +0200, Bernd Schmidt wrote:
> On 09/24/2015 12:06 AM, Segher Boessenkool wrote:
> >The current basic block reordering always uses the "software trace cache"
> >algorithm.  That has a few problems:
> >
> >1) It increases code size substantially; this makes it not suitable for
> >-O1 or -Os, and not at all for some architectures;
> >2) but it is enabled for -Os and all targets;
> >3) and -O1 gets nothing, resulting in pretty jumpy code.
>
> A general question first, I see code in bb-reorder.c (in copy_bb_p) that
> limits the amount of code growth if not optimizing for speed. Is that
> not working as expected or not sufficient?

It works.  The "simple" algorithm generates slightly smaller code though
(less than a percent).  Defaulting -Os to STC is easy of course; do you
prefer that?

> Your code looks like a nice clean algorithm so I have no objections to
> it (detailed comments to follow), but I want to make sure it is
> necessary to add it.

It's not just for -Os, but also for -O1 (where we currently don't reorder
at all, although various passes leave the config in a pretty sorry state --
like, we run shrink-wrapping at -O1, and it can make quite a mess if some
blocks are copied and others not; but this is just an example, it was the
trigger for me though).

And, when I wrote the original for this, it was for a target where STC
does not help at all (there is no instruction cache); "simple" saves a
lot of space at -O2.  Quite important for embedded targets.

Finally, it lets us easily plug in other algorithms.


Segher