From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 7867 invoked by alias); 16 Dec 2016 12:25:40 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 7826 invoked by uid 89); 16 Dec 2016 12:25:38 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.2 spammy=fed, H*MI:sk:CAFiYyc, consideration, late X-HELO: mail-ua0-f179.google.com Received: from mail-ua0-f179.google.com (HELO mail-ua0-f179.google.com) (209.85.217.179) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 16 Dec 2016 12:25:28 +0000 Received: by mail-ua0-f179.google.com with SMTP id b35so8736952uaa.0 for ; Fri, 16 Dec 2016 04:25:28 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=3DiS82g6TiKqiBaDp48s2v5pFtt3wI7bJexgj/F29zE=; b=Dw5Syh1ca65f3ea9pegy3Tr15Yl7RIlXsvuB1Aau0YL+hVsdOcYO3qZ4Fq/gEryy5G 5j+YO9DWJPz+hRFHx0YrbOrZ0oy97BjjcXzkLevtrZCp1xSIeS5UJorVfold6z4uGg3g sdlygU9wBj4rzSTFow+ix70j3PubvkBaoUidaQECvknvcjpxDM3QzFaI2UsJkvAyJL0I htjypYTLyoBDiZoctsqkbSuMK3jSdNtSYa8G1PuqdLH0ZK677nRdDy0z9qxJvlTUc6EK d43fDG8Vm068ucTyawXJ4YAgEYF0FYpLn2UMCzTGayAom2QY7YIYVGW+86rXv7dzGVyN MiDA== X-Gm-Message-State: AIkVDXIK7hP/bLamZaaAVN5RcVRGhDpvvAtSzit8T9cECr9QSrkC3cof4WQk/1lcatoj5nxNPmtbphzFFQEO2w== X-Received: by 10.159.32.8 with SMTP id 8mr1857648uam.129.1481891126767; Fri, 16 Dec 2016 04:25:26 -0800 (PST) MIME-Version: 1.0 Received: by 10.103.72.129 with HTTP; Fri, 16 Dec 2016 04:25:26 -0800 (PST) In-Reply-To: <1481801639-14286-1-git-send-email-james.greenhalgh@arm.com> References: <20161018155510.GA11109@arm.com> <1481801639-14286-1-git-send-email-james.greenhalgh@arm.com> From: Richard Biener Date: Fri, 16 Dec 2016 12:34:00 -0000 Message-ID: Subject: Re: [Patch] Undermine the jump threading cost model to fix PR77445. To: James Greenhalgh Cc: GCC Patches , nd@arm.com, Jeff Law , Jan Hubicka Content-Type: text/plain; charset=UTF-8 X-IsSubscribed: yes X-SW-Source: 2016-12/txt/msg01451.txt.bz2 On Thu, Dec 15, 2016 at 12:33 PM, James Greenhalgh wrote: > > Hi, > > As mentioned in PR77445, the improvements to the jump threading cost model > this year have caused substantial regressions in the amount of jump threading > we do and the performance of workloads which rely on that threading. > > This patch represents the low-bar in fixing the performance issues reported > in PR77445 - by weakening the cost model enough that we thread in a way much > closer to GCC 6. I don't think this patch is likely to be acceptable for > trunk, but I'm posting it for consideration regardless. > > Under the new cost model, if the edge doesn't pass optimize_edge_for_speed_p, > then we don't thread. The problem in late threading is bad edge profile > data makes the edge look cold, and thus it fails optimize_edge_for_speed_p > and is no longer considered a candidate for threading. As an aside, I think > this is the wrong cost model for jump threading, where you get the most > impact if you can resolve unpredictable switch statements - which by their > nature may have multiple cold edges in need of threading. > > Early threading should avoid these issues, as there is no edge profile > info yet. optimize_edge_for_speed_p is therefore more likely to hold, but > the condition for threading is: > > if (speed_p && optimize_edge_for_speed_p (taken_edge)) > { > if (n_insns >= PARAM_VALUE (PARAM_MAX_FSM_THREAD_PATH_INSNS)) > { > [...reject threading...] > } > } > else if (n_insns > 1) > { > [...reject threading...] > } > > With speed_p is hardwired to false for the early threader > ( pass_early_thread_jumps::execute ): > > find_jump_threads_backwards (bb, false); > > So we always fall to the n_insns > 1 case and thus only rarely get to > thread. > > In this patch I change that call in pass_early_thread_jumps::execute to > instead look at optimize_bb_for_speed_p (bb) . That allows the speed_p > check to pass in the main threading cost model, and then the > optimize_edge_for_speed_p can also pass. That gets the first stage of > jump-threading back working in a proprietary benchmark which is sensitive > to this optimisation. But all early optimizations are supposed to never increase code size -- they exist to get more realistic sizes to the IPA inliner heuristic code. Iff at all then add some --param early-jump-threading-insns and use that in place of the '1' and then eventually bump that to 2 or 3. But using the 100 from the late threading is too much. Note that using optimize_bb_for_speed_p at this point in the pipeline is nonsense and you can as well use ! optimize_size ... because even a guessed profile is only created at the end of the early optimization pipeline. > To get the rest of the required jump threading, I also have to weaken the > cost model - and this is obviously a hack! The easy hack is to special case > when the taken edge has frequency zero, and permit jump threading there. > > I know this patch is likely not the preferred way to fix this. For me that > would be a change to the cost model, which as I mentioned above I think > misses the point about which edges we want to thread. By far the best > fix would be to the junk edge profiling data we create during threading. > > However, this patch does fix the performance issues identified in PR77445, > and does highlight a fundamental issue with the early threader (which > doesn't seem to me like it will be effective while it sets speed_p to > false), so I'd like it to be considered for trunk if no better fix appears > before stage 4. But the PR has nothing to do with early threading but with late threading fed a bogus profile. > Bootstrapped on x86_64 with no issues. The testsuite changes just reshuffle > which passes spot the threading opportunities. > > OK? I don't think so. Richard. > Thanks, > James > > --- > gcc/ > > 2016-12-15 James Greenhalgh > > PR tree-optimization/77445 > * tree-ssa-threadbackward.c (profitable_jump_thread_path) Work > around sometimes corrupt edge frequency data. > (pass_early_thread_jumps::execute): Pass > optimize_bb_for_speed_p as the speed_p parameter to > find_jump_threads_backwards to enable threading in more cases. > > gcc/testsuite/ > > 2016-12-15 James Greenhalgh > > PR tree-optimization/77445 > * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust options and dump passes. > * gcc.dg/tree-ssa/pr66752-3.c: Likewise. >