Subject: Re: backward threading heuristics tweek
To: Jan Hubicka, gcc-patches@gcc.gnu.org
From: Jeff Law
Date: Fri, 05 Aug 2016 21:34:00 -0000
In-Reply-To: <20160606101953.GC12313@kam.mff.cuni.cz>

On 06/06/2016 04:19 AM, Jan Hubicka wrote:
> Hi,
> while looking into profile mismatches introduced by the backward threading
> pass I noticed that the heuristics seem quite simplistic.  First, it should
> be profile sensitive and disallow duplication when optimizing cold paths.
> Second, it should use estimate_num_insns, because the gimple statement
> count is not a very realistic estimate of the final code size effect.  And
> third, there seems to be no reason to disable the pass for functions
> optimized for size.

I've never been happy with the size estimation code -- it's a series of
hacks to address a couple of pathological code-size regressions seen when
comparing gcc-6 to prior versions.  I won't lose any sleep if we move to
estimate_num_insns and rely more on profile data.

> If we block duplication of more than 1 insn on size-optimized paths, the
> pass is still able to make the majority of the threading decisions that
> are for free and improve codegen.  The code size benefit was between 0.5%
> and 2.7% on the testcases I tried (tramp3d, GCC modules, xalancbmk and
> some other stuff around my hd).
>
> Bootstrapped/regtested x86_64-linux, seems sane?
>
> The pass should also avoid calling cleanup_cfg when no threading was done,
> and I do not see why it is guarded by expensive_optimizations.  What are
> the main compile-time complexity limitations?

The pass essentially walks up the use-def links in the CFG.  When it
encounters a PHI, we walk up every PHI argument.  That's where it gets
expensive.  I think there's a better algorithm to do those walks that
doesn't start at the beginning each time, but I haven't had time to
experiment with coding it up.
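To make that cost concrete, here is a rough sketch of the shape of such a
walk.  It is purely illustrative -- the function name and structure below
are not taken from tree-ssa-threadbackward.c -- but it shows why every PHI
multiplies the work:

/* Illustrative sketch only, not the pass's actual code: walk up the
   use-def chain of NAME.  Each PHI argument spawns an independent walk
   that starts over from the beginning, which is where the compile-time
   cost comes from.  */

static void
walk_defs (tree name)
{
  if (TREE_CODE (name) != SSA_NAME)
    return;

  gimple *def = SSA_NAME_DEF_STMT (name);
  if (gimple_code (def) == GIMPLE_PHI)
    {
      /* Restart a full walk for every incoming PHI argument.  */
      for (unsigned i = 0; i < gimple_phi_num_args (def); i++)
	walk_defs (gimple_phi_arg_def (def, i));
    }
  /* Otherwise follow the operands of the single defining statement.  */
}

Caching results per SSA name, or carrying state along the walk instead of
restarting at each PHI, is presumably the "better algorithm" alluded to
above, but that is just a guess.
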
> Honza
>
> 	* tree-ssa-threadbackward.c: Include tree-inline.h
> 	(profitable_jump_thread_path): Use estimate_num_insns to estimate
> 	size of copied block; for cold paths reduce duplication.
> 	(find_jump_threads_backwards): Remove redundant tests.
> 	(pass_thread_jumps::gate): Enable for -Os.
> 	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Update testcase.
> Index: tree-ssa-threadbackward.c
> ===================================================================
> --- tree-ssa-threadbackward.c	(revision 237101)
> +++ tree-ssa-threadbackward.c	(working copy)
> @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-pass.h"
>  #include "gimple-ssa.h"
>  #include "tree-phinodes.h"
> +#include "tree-inline.h"

Probably an indication that we want estimate_num_insns broken out into a
more generic place that could be used by the threader, the inliner, etc.
Please consider that as a follow-up.

>
>  static int max_threaded_paths;
>
> @@ -210,7 +211,7 @@ profitable_jump_thread_path (vec
> 	  && !(gimple_code (stmt) == GIMPLE_ASSIGN
> 	       && gimple_assign_rhs_code (stmt) == ASSERT_EXPR)
> 	  && !is_gimple_debug (stmt))
> -	++n_insns;
> +	n_insns += estimate_num_insns (stmt, &eni_size_weights);
>      }
>
>    /* We do not look at the block with the threaded branch
> @@ -238,13 +239,15 @@ profitable_jump_thread_path (vec
> 	threaded_through_latch = true;
>      }
>
> +  gimple *stmt = get_gimple_control_stmt ((*path)[0]);
> +  gcc_assert (stmt);
> +
>    /* We are going to remove the control statement at the end of the
>       last block in the threading path.  So don't count it against our
>       statement count.  */
> -  n_insns--;
>
> -  gimple *stmt = get_gimple_control_stmt ((*path)[0]);
> -  gcc_assert (stmt);
> +  n_insns-= estimate_num_insns (stmt, &eni_size_weights);

Whitespace nit here before the "-=".

I think this is fine with the whitespace fix.  And again, consider moving
estimate_num_insns to a new location outside the inliner.

jeff
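
For reference, a minimal sketch of the kind of cold-path guard the
ChangeLog describes ("for cold paths reduce duplication").  The helper
name and the path_edge/n_insns parameters are placeholders, not the
patch's actual code; only estimate_num_insns and optimize_edge_for_size_p
are real GCC interfaces:

/* Sketch only: assuming n_insns has been accumulated with
   estimate_num_insns (stmt, &eni_size_weights) as in the hunks above,
   reject duplication on size-optimized (cold) paths unless it is
   essentially free.  */

static bool
duplication_too_costly_p (edge path_edge, int n_insns)
{
  return optimize_edge_for_size_p (path_edge) && n_insns > 1;
}

In other words, on cold paths only single-instruction threads would get
duplicated, matching the "more than 1 insn" threshold mentioned in the
submission.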