Date: Fri, 13 Dec 2013 01:13:00 -0000
From: Jan Hubicka
To: Teresa Johnson
Cc: Martin Liška, Jeff Law, Jan Hubicka, "gcc-patches@gcc.gnu.org"
Subject: Re: [PATCH i386] Enable -freorder-blocks-and-partition
Message-ID: <20131213011309.GA21107@kam.mff.cuni.cz>
References: <528BA299.7040606@redhat.com> <20131128140655.GA20730@kam.mff.cuni.cz> <529CB252.4070806@redhat.com>

> On Wed, Dec 11, 2013 at 1:21 AM, Martin Liška wrote:
> > Hello,
> > I prepared a collection of systemtap graphs for GIMP.
> >
> > 1) just my profile-based function reordering: 550 pages
> > 2) just -freorder-blocks-and-partitions: 646 pages
> > 3) just -fno-reorder-blocks-and-partitions: 638 pages
> >
> > Please see attached data.
>
> Thanks for the data. A few observations/questions:
>
> With both 1) (your (time-based?)
> reordering) and 2)
> (-freorder-blocks-and-partitions) there are a fair amount of accesses
> out of the cold section. I'm not seeing so many accesses out of the
> cold section in the apps I am looking at with splitting enabled. In

I see you have already committed the patch, so perhaps Martin's measurements
assume the pass is off by default?

I rebuilt GCC with profiledbootstrap and with the linker script unmapping
text.unlikely. I get an ICE in:
(gdb) bt
#0  diagnostic_set_caret_max_width(diagnostic_context*, int) () at ../../gcc/diagnostic.c:108
#1  0x0000000000f68457 in diagnostic_initialize (context=0x18ae000 , n_opts=n_opts@entry=1290) at ../../gcc/diagnostic.c:135
#2  0x000000000100050e in general_init (argv0=) at ../../gcc/toplev.c:1110
#3  toplev_main(int, char**) () at ../../gcc/toplev.c:1922
#4  0x00007ffff774cbe5 in __libc_start_main () from /lib64/libc.so.6
#5  0x0000000000f7898d in _start () at ../sysdeps/x86_64/start.S:122

That is relatively early in the startup process. The function seems to be
inlined, and it fails only on the second invocation. I have not had time to
investigate further yet; without -fprofile-use it starts...

On our periodic testers I see an off-noise improvement in crafty (2200->2300)
and a regression on Vortex (2900->2800), plus a code size increase.

Honza