From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 26977 invoked by alias); 19 Nov 2013 17:24:52 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 26968 invoked by uid 89); 19 Nov 2013 17:24:51 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.3 required=5.0 tests=AWL,BAYES_50,RCVD_IN_DNSWL_LOW,RDNS_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no version=3.3.2 X-HELO: mail-qa0-f47.google.com Received: from Unknown (HELO mail-qa0-f47.google.com) (209.85.216.47) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Tue, 19 Nov 2013 17:24:50 +0000 Received: by mail-qa0-f47.google.com with SMTP id w5so1838662qac.6 for ; Tue, 19 Nov 2013 09:24:42 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=6XezjGK86DpQYLT1BWlPIo8XMXFCIKZVU3HUAQbbcYg=; b=Mbxi7qUnMcQVj8JeEfYjPErDJXWUJ0+Z7T1NaLz73/0fGNX7pTs7xQvHf8I4lZMIDU /q74D/v4O/Ip5bZtaS8fZRNFgtogds48ZfnkT5sYKqAI5I1kp5CkYJUXfQg+y6eli7sj rBsg/dR1Nu/DuePciPQ5o+kLFM9Qpp46KPiRD2Xtj162EHKwLsW/tWp/WdLx9GIjSvax y4QyqKAwj0zNvE0lMA5W8jvow7ZoY+/Xqx6gGS4DgHp9wfHSur4tysbvbzoxY1QFXlR2 UiIX+AIPOx9uKJhjzEC7j6BvJIlQqPDKVQcNhFwAI20vI2nek7uy2zLFmG/vPhBClxX7 s5VQ== X-Gm-Message-State: ALoCoQmSgEK4E7Aizb31XNWyCzzAmvkJFaM3WAO827PslfXE1b1Z36YER8EsMyB+kkImhLEizKDwb/9JF3uAqUHXQPtSrMstsgZma5RJJd+gMsi4PTBb+/ojnNFtb5z9KuTJZ/GEgm3cVOMqyLmNPQU3ptkHKlMf7lpPkAp1Zht0Uru2Ul9CHqsZJlBVMy0f8mobYyZ4XjTczqI5ZcD6jpeMbHlmGNQv0w== MIME-Version: 1.0 X-Received: by 10.224.126.193 with SMTP id d1mr32889984qas.41.1384881882239; Tue, 19 Nov 2013 09:24:42 -0800 (PST) Received: by 10.229.117.69 with HTTP; Tue, 19 Nov 2013 09:24:42 -0800 (PST) In-Reply-To: <20131119154407.GA1742@kam.mff.cuni.cz> References: <20131119154407.GA1742@kam.mff.cuni.cz> Date: Tue, 19 Nov 2013 18:23:00 -0000 Message-ID: Subject: Re: [PATCH i386] Enable -freorder-blocks-and-partition From: Teresa Johnson To: Jan Hubicka Cc: =?UTF-8?Q?Martin_Li=C5=A1ka?= , "gcc-patches@gcc.gnu.org" Content-Type: text/plain; charset=ISO-8859-1 X-IsSubscribed: yes X-SW-Source: 2013-11/txt/msg02373.txt.bz2 On Tue, Nov 19, 2013 at 7:44 AM, Jan Hubicka wrote: > Martin, > can you, please, generate the updated systemtap with > -freorder-blocks-and-partition enabled? > > I am in favour of enabling this - it is usefull pass and it is pointless ot > have passes that are not enabled by default. > Is there reason why this would not work on other ELF target? Is it working > with Darwin and Windows? I don't know how to test these (I don't see any machines listed in the gcc compile farm of those types). For Windows, I assume you mean MinGW, which should be enabled as it is under i386. Should I disable it there and for Darwin? > >> This patch enables -freorder-blocks-and-partition by default for x86 >> at -O2 and up. It is showing some modest gains in cpu2006 performance >> with profile feedback and -O2 on an Intel Westmere system. Specifically, >> I am seeing consistent improvements in 401.bzip2 (1.5-3%), 483.xalancbmk >> (1.5-3%), and 453.povray (2.5-3%), and no apparent regressions. > > This actually sounds very good ;) > > Lets see how the systemtap graphs goes. If we will end up with problem > of too many accesses to cold section, I would suggest making cold section > subdivided into .unlikely and .unlikely.part (we could have better name) > with the second consisting only of unlikely parts of hot&normal functions. > > This should reduce the problems we are seeing with mistakely identifying > code to be cold because of roundoff errors (and it probably makes sense > in general, too). > We will however need to update gold and ld for that. Note that I don't think this would help much unless the linker is changed to move the cold split section close to the hot section. There is probably some fine-tuning we could do eventually in the linker under -ffunction-sections without putting the split portions in a separate section. I.e. clump the split parts together within unlikely. But hopefully this can all be done later on as follow-on work to boost the performance further. >> >> Bootstrapped and tested on x86-64-unknown-linux-gnu with a normal >> bootstrap, a profiledbootstrap and an LTO profiledbootstrap. All were >> configured with --enable-languages=all,obj-c++ and tested for both >> 32 and 64-bit with RUNTESTFLAGS="--target_board=unix\{-m32,-m64\}". >> >> It would be good to enable this for additional targets as a follow on, >> but it needs more testing for both correctness and performance on those >> other targets (i.e for correctness because I see a number of places >> in other config/*/*.c files that do some special handling under this >> option for different targets or simply disable it, so I am not sure >> how well-tested it is under different architectural constraints). >> >> Ok for trunk? >> >> Thanks, >> Teresa >> >> 2013-11-19 Teresa Johnson >> >> * common/config/i386/i386-common.c: Enable >> -freorder-blocks-and-partition at -O2 and up for x86. >> * opts.c (finish_options): Only warn if -freorder-blocks-and- >> partition was set on command line. > > You probably mis doc/invoke.texi update. > Thank you for working on this! Yes, thanks. Here is the patch with the invoke.texi update. Teresa 2013-11-19 Teresa Johnson * common/config/i386/i386-common.c: Enable -freorder-blocks-and-partition at -O2 and up for x86. * doc/invoke.texi: Update -freorder-blocks-and-partition default. * opts.c (finish_options): Only warn if -freorder-blocks-and- partition was set on command line. Index: common/config/i386/i386-common.c =================================================================== --- common/config/i386/i386-common.c (revision 205001) +++ common/config/i386/i386-common.c (working copy) @@ -789,6 +789,8 @@ static const struct default_options ix86_option_op { /* Enable redundant extension instructions removal at -O2 and higher. */ { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 }, + /* Enable function splitting at -O2 and higher. */ + { OPT_LEVELS_2_PLUS, OPT_freorder_blocks_and_partition, NULL, 1 }, /* Turn off -fschedule-insns by default. It tends to make the problem with not enough registers even worse. */ { OPT_LEVELS_ALL, OPT_fschedule_insns, NULL, 0 }, Index: doc/invoke.texi =================================================================== --- doc/invoke.texi (revision 205001) +++ doc/invoke.texi (working copy) @@ -8172,6 +8172,8 @@ exception handling, for linkonce sections, for fun section attribute and on any architecture that does not support named sections. +Enabled for x86 at levels @option{-O2}, @option{-O3}. + @item -freorder-functions @opindex freorder-functions Reorder functions in the object file in order to Index: opts.c =================================================================== --- opts.c (revision 205001) +++ opts.c (working copy) @@ -737,9 +737,10 @@ finish_options (struct gcc_options *opts, struct g && opts->x_flag_reorder_blocks_and_partition && (ui_except == UI_SJLJ || ui_except >= UI_TARGET)) { - inform (loc, - "-freorder-blocks-and-partition does not work " - "with exceptions on this architecture"); + if (opts_set->x_flag_reorder_blocks_and_partition) + inform (loc, + "-freorder-blocks-and-partition does not work " + "with exceptions on this architecture"); opts->x_flag_reorder_blocks_and_partition = 0; opts->x_flag_reorder_blocks = 1; } @@ -752,9 +753,10 @@ finish_options (struct gcc_options *opts, struct g && opts->x_flag_reorder_blocks_and_partition && (ui_except == UI_SJLJ || ui_except >= UI_TARGET)) { - inform (loc, - "-freorder-blocks-and-partition does not support " - "unwind info on this architecture"); + if (opts_set->x_flag_reorder_blocks_and_partition) + inform (loc, + "-freorder-blocks-and-partition does not support " + "unwind info on this architecture"); opts->x_flag_reorder_blocks_and_partition = 0; opts->x_flag_reorder_blocks = 1; } @@ -769,9 +771,10 @@ finish_options (struct gcc_options *opts, struct g && targetm_common.unwind_tables_default && (ui_except == UI_SJLJ || ui_except >= UI_TARGET)))) { - inform (loc, - "-freorder-blocks-and-partition does not work " - "on this architecture"); + if (opts_set->x_flag_reorder_blocks_and_partition) + inform (loc, + "-freorder-blocks-and-partition does not work " + "on this architecture"); opts->x_flag_reorder_blocks_and_partition = 0; opts->x_flag_reorder_blocks = 1; } -- Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413