Date: Thu, 11 Aug 2016 11:35:00 -0000
From: Jan Hubicka
To: Andrew Pinski
Cc: Jan Hubicka, Jeff Law, GCC Patches
Subject: Re: backward threading heuristics tweek
Message-ID: <20160811113516.GA67433@kam.mff.cuni.cz>
References: <20160606101953.GC12313@kam.mff.cuni.cz>

> On Mon, Jun 6, 2016 at 3:19 AM, Jan Hubicka wrote:
> > Hi,
> > while looking into profile mismatches introduced by the backward threading pass
> > I noticed that the heuristics seem quite simplistic.  First, it should be
> > profile sensitive and disallow duplication when optimizing cold paths.  Second,
> > it should use estimate_num_insns, because the gimple statement count is not a
> > very realistic estimate of the final code size effect.  Third, there seems to be
> > no reason to disable the pass for functions optimized for size.
> >
> > If we block duplication of more than 1 insn on size-optimized paths, the pass
> > can still make the majority of threading decisions that are free and improve codegen.
> > The code size benefit was between 0.5% and 2.7% on the testcases I tried (tramp3d,
> > GCC modules, xalancbmk and some other stuff around my hd).
> >
> > Bootstrapped/regtested x86_64-linux, seems sane?
> >
> > The pass should also avoid calling cleanup_cfg when no threading was done,
> > and I do not see why it is guarded by expensive_optimizations.  What are the
> > main compile time complexity limitations?
>
> This patch caused a huge regression (~11%) on coremarks on ThunderX.
> I assume other targets too.
> Basically it looks like the path is no longer thread jumped.

Sorry for the late reply.  I checked our periodic testers and the patch seems more
or less performance neutral, with some code size improvements.  Can you point me
to the path that is no longer crossjumped?  I added diagnostic output, so you
should see the reason why the path was considered unprofitable: either it was
cold or it exceeded the maximal size.  The size limit is largely untuned, so
perhaps we can just adjust it.

Honza
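
For readers following the thread, here is a rough standalone model of the profitability check discussed above. It is only a sketch under stated assumptions, not the code from the patch: the type and function names (thread_path, judge_thread_path, verdict_reason) are hypothetical, dup_insns stands in for a sum of estimate_num_insns over the statements that would be duplicated, and the limit for hot paths is passed in as a parameter because the mail says that size is largely untuned. The only number taken from the thread is the limit of 1 insn for cold or size-optimized paths.

```c
#include <stdbool.h>

/* Hypothetical verdicts; they mirror the two rejection reasons named in
   the mail (cold path, maximal size exceeded), not GCC's real dump text.  */
enum thread_verdict {
  THREAD_OK,
  THREAD_REJECT_COLD,
  THREAD_REJECT_TOO_BIG
};

/* Hypothetical summary of one candidate jump-threading path.  */
struct thread_path {
  int dup_insns;  /* assumed: estimated insns that would be duplicated */
  bool hot;       /* assumed: profile says optimize this path for speed */
};

/* Decide whether duplicating the path is considered profitable.
   Cold or size-optimized paths may duplicate at most 1 insn, so only
   (nearly) free threads survive there.  Hot paths are bounded by
   max_hot_insns, a tunable supplied by the caller.  */
enum thread_verdict
judge_thread_path (const struct thread_path *p, int max_hot_insns)
{
  if (!p->hot)
    return p->dup_insns > 1 ? THREAD_REJECT_COLD : THREAD_OK;
  return p->dup_insns > max_hot_insns ? THREAD_REJECT_TOO_BIG : THREAD_OK;
}

/* Human-readable reason, in the spirit of the diagnostic output
   mentioned in the reply.  The wording is made up for this sketch.  */
const char *
verdict_reason (enum thread_verdict v)
{
  switch (v)
    {
    case THREAD_REJECT_COLD:
      return "path is cold and duplicates more than 1 insn";
    case THREAD_REJECT_TOO_BIG:
      return "duplicated insns exceed the maximal size";
    default:
      return "threading considered profitable";
    }
}
```

In this model a regression like the one reported would show up as a hot path whose estimated size lands just above max_hot_insns, which is why raising the limit is the first thing to try.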