Subject: Re: backward threading heuristics tweek
To: Jan Hubicka, gcc-patches@gcc.gnu.org
From: Jeff Law
Date: Fri, 05 Aug 2016 21:34:00 -0000
In-Reply-To: <20160606101953.GC12313@kam.mff.cuni.cz>

On 06/06/2016 04:19 AM, Jan Hubicka wrote:
> Hi,
> while looking into profile mismatches introduced by the backward threading
> pass I noticed that the heuristics seem quite simplistic.  First, it should
> be profile sensitive and disallow duplication when optimizing cold paths.
> Second, it should use estimate_num_insns, because the gimple statement
> count is not a very realistic estimate of the final code size effect.  And
> third, there seems to be no reason to disable the pass for functions
> optimized for size.

I've never been happy with the size estimation code -- it's a series of
hacks to address a couple of pathological code-size regressions seen when
comparing gcc-6 to prior versions.  I won't lose any sleep if we move to
estimate_num_insns and rely more on profile data.

> If we block duplication of more than 1 insn on size-optimized paths, the
> pass is still able to make the majority of the threading decisions that
> are for free and improve codegen.  The code size benefit was between 0.5%
> and 2.7% on the testcases I tried (tramp3d, GCC modules, xalancbmk and
> some other stuff around my hd).
>
> Bootstrapped/regtested x86_64-linux, seems sane?
>
> The pass should also avoid calling cleanup_cfg when no threading was done,
> and I do not see why it is guarded by expensive_optimizations.  What are
> the main compile-time complexity limitations?

The pass essentially walks up the use-def links in the CFG.  When it
encounters a PHI, we walk up every PHI argument.  That's where it gets
expensive.  I think there's a better algorithm to do those walks that
doesn't start at the beginning each time, but I haven't had time to
experiment with coding it up.
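To make that cost concrete, here is a rough sketch of the shape of such a
walk.  It is purely illustrative -- the function name and structure below
are not taken from tree-ssa-threadbackward.c -- but it shows why every PHI
multiplies the work:

/* Illustrative sketch only, not the pass's actual code: walk up the
   use-def chain of NAME.  Each PHI argument spawns an independent walk
   that starts over from the beginning, which is where the compile-time
   cost comes from.  */

static void
walk_defs (tree name)
{
  if (TREE_CODE (name) != SSA_NAME)
    return;

  gimple *def = SSA_NAME_DEF_STMT (name);
  if (gimple_code (def) == GIMPLE_PHI)
    {
      /* Restart a full walk for every incoming PHI argument.  */
      for (unsigned i = 0; i < gimple_phi_num_args (def); i++)
	walk_defs (gimple_phi_arg_def (def, i));
    }
  /* Otherwise follow the operands of the single defining statement.  */
}

Caching results per SSA name, or carrying state along the walk instead of
restarting at each PHI, is presumably the "better algorithm" alluded to
above, but that is just a guess.
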
> Honza
>
> 	* tree-ssa-threadbackward.c: Include tree-inline.h
> 	(profitable_jump_thread_path): Use estimate_num_insns to estimate
> 	size of copied block; for cold paths reduce duplication.
> 	(find_jump_threads_backwards): Remove redundant tests.
> 	(pass_thread_jumps::gate): Enable for -Os.
> 	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Update testcase.
> Index: tree-ssa-threadbackward.c
> ===================================================================
> --- tree-ssa-threadbackward.c	(revision 237101)
> +++ tree-ssa-threadbackward.c	(working copy)
> @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-pass.h"
>  #include "gimple-ssa.h"
>  #include "tree-phinodes.h"
> +#include "tree-inline.h"

Probably an indication that we want estimate_num_insns broken out into a
more generic place that could be used by the threader, the inliner, etc.
Please consider that as a follow-up.

>
>  static int max_threaded_paths;
>
> @@ -210,7 +211,7 @@ profitable_jump_thread_path (vec
> 	  && !(gimple_code (stmt) == GIMPLE_ASSIGN
> 	       && gimple_assign_rhs_code (stmt) == ASSERT_EXPR)
> 	  && !is_gimple_debug (stmt))
> -	++n_insns;
> +	n_insns += estimate_num_insns (stmt, &eni_size_weights);
>      }
>
>    /* We do not look at the block with the threaded branch
> @@ -238,13 +239,15 @@ profitable_jump_thread_path (vec
> 	threaded_through_latch = true;
>      }
>
> +  gimple *stmt = get_gimple_control_stmt ((*path)[0]);
> +  gcc_assert (stmt);
> +
>    /* We are going to remove the control statement at the end of the
>       last block in the threading path.  So don't count it against our
>       statement count.  */
> -  n_insns--;
>
> -  gimple *stmt = get_gimple_control_stmt ((*path)[0]);
> -  gcc_assert (stmt);
> +  n_insns-= estimate_num_insns (stmt, &eni_size_weights);

Whitespace nit here before the "-=".

I think this is fine with the whitespace fix.  And again, consider moving
estimate_num_insns to a new location outside the inliner.

jeff
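
For reference, a minimal sketch of the kind of cold-path guard the
ChangeLog describes ("for cold paths reduce duplication").  The helper
name and the path_edge/n_insns parameters are placeholders, not the
patch's actual code; only estimate_num_insns and optimize_edge_for_size_p
are real GCC interfaces:

/* Sketch only: assuming n_insns has been accumulated with
   estimate_num_insns (stmt, &eni_size_weights) as in the hunks above,
   reject duplication on size-optimized (cold) paths unless it is
   essentially free.  */

static bool
duplication_too_costly_p (edge path_edge, int n_insns)
{
  return optimize_edge_for_size_p (path_edge) && n_insns > 1;
}

In other words, on cold paths only single-instruction threads would get
duplicated, matching the "more than 1 insn" threshold mentioned in the
submission.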