From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 1038 invoked by alias); 19 Jul 2011 11:35:07 -0000 Received: (qmail 932 invoked by uid 22791); 19 Jul 2011 11:35:06 -0000 X-SWARE-Spam-Status: No, hits=-2.2 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,TW_TM X-Spam-Check-By: sourceware.org Received: from mail-gx0-f175.google.com (HELO mail-gx0-f175.google.com) (209.85.161.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 19 Jul 2011 11:34:26 +0000 Received: by gxk3 with SMTP id 3so1909156gxk.20 for ; Tue, 19 Jul 2011 04:34:25 -0700 (PDT) MIME-Version: 1.0 Received: by 10.150.240.15 with SMTP id n15mr6316591ybh.387.1311075265290; Tue, 19 Jul 2011 04:34:25 -0700 (PDT) Received: by 10.150.205.2 with HTTP; Tue, 19 Jul 2011 04:34:25 -0700 (PDT) In-Reply-To: References: <20110713232837.GA31809@hungry-tiger.westford.ibm.com> Date: Tue, 19 Jul 2011 12:10:00 -0000 Message-ID: Subject: Re: [PATCH Atom][PR middle-end/44382] Tree reassociation improvement From: Richard Guenther To: Ilya Enkovich Cc: Michael Meissner , gcc-patches@gcc.gnu.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2011-07/txt/msg01498.txt.bz2 On Thu, Jul 14, 2011 at 5:00 PM, Ilya Enkovich wro= te: >> 2011/7/14 Richard Guenther : >>> >>> But then how comes the option to override it is useful? =A0It isn't dep= endent >>> on the particular case. =A0At least the option should be a --param. >>> >>> Richard. >>> >> >> Option is still useful if you want to try feature on platform with no >> hook implemented and for other performance experiments. I agree >> --param usage should be better here. I'll fix it. >> >> Ilya >> > > Here is a patch with new param instead of new option. Bootstrapped and > checked on x86_64-linux. } +static int +ix86_reassociation_width (const_gimple stmt) all functions need comments describing what they do and what parameters they get. +@deftypefn {Target Hook} int TARGET_SCHED_REASSOCIATION_WIDTH (const_gimple @var{stmt}) +This hook is called by tree reassociator to determine a level of +parallelism required in output calculations chain. I don't think we should get an actual statement here, but an enum tree_code and a machine mode. +/* Find out how many cycles we need to compute whole statements + chain holding ops_num operands if may execute up to cpu_width + statements per cycle. */ I have a hard time parsing this sentence, maybe a native english speaker can help here. +static int +get_reassociation_width (VEC(operand_entry_t, heap) * ops, gimple stmt) +{ + int param_width =3D PARAM_VALUE (PARAM_TREE_REASSOC_WIDTH); + int ops_num =3D VEC_length (operand_entry_t, ops); + int width; + int cycles_best; + + if (param_width > 0) + width =3D param_width; + else + width =3D targetm.sched.reassociation_width (stmt); this is the only place you need stmt, with tree-code and mode args you'd not need it (and it's odd anyway). + while (width > 1 && + get_required_cycles (ops_num, width - 1) =3D=3D cycles_best) &&s go on the next line, similar cases in your patch as well + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, + "Width =3D %d was chosen for reassociation\n", width); + } no {}s around single-stmts +static void +rewrite_expr_tree_parallel (gimple stmt, int width, + VEC(operand_entry_t, heap) * ops) +{ it's not easy to follow the flow of this function, esp. I wonder + else + { + tree var =3D create_tmp_reg (TREE_TYPE (last_rhs1), "reassoc"); + add_referenced_var (var); + stmts[i] =3D build_and_add_sum (var, op1, op2, opcode); + } what you need to create new stmts for, and if you possibly create multiple temporary variables here. You don't seem to handle the special-cases of rewrite_expr_tree for the leafs of your tree, especially the PHI node special-casing. I think you may run into vectorization issues here. - TODO_verify_ssa + TODO_remove_unused_locals + | TODO_verify_ssa why? I think the patch looks reasonable overall, please update it with the above comments and re-post it. Thanks, Richard. > Ilya > -- > gcc/ > > 2011-07-14 =A0Enkovich Ilya =A0 > > =A0 =A0 =A0 =A0PR middle-end/44382 > =A0 =A0 =A0 =A0* target.def (reassociation_width): New hook. > > =A0 =A0 =A0 =A0* doc/tm.texi.in (reassociation_width): New hook documenta= tion. > > =A0 =A0 =A0 =A0* doc/tm.texi (reassociation_width): Likewise. > > =A0 =A0 =A0 =A0* doc/invoke.texi (tree-reassoc-width): New param document= ed. > > =A0 =A0 =A0 =A0* hooks.h (hook_int_const_gimple_1): New default hook. > > =A0 =A0 =A0 =A0* hooks.c (hook_int_const_gimple_1): Likewise. > > =A0 =A0 =A0 =A0* config/i386/i386.h (ix86_tune_indices): Add > =A0 =A0 =A0 =A0X86_TUNE_REASSOC_INT_TO_PARALLEL and > =A0 =A0 =A0 =A0X86_TUNE_REASSOC_FP_TO_PARALLEL. > > =A0 =A0 =A0 =A0(TARGET_REASSOC_INT_TO_PARALLEL): New. > =A0 =A0 =A0 =A0(TARGET_REASSOC_FP_TO_PARALLEL): Likewise. > > =A0 =A0 =A0 =A0* config/i386/i386.c (initial_ix86_tune_features): Add > =A0 =A0 =A0 =A0X86_TUNE_REASSOC_INT_TO_PARALLEL and > =A0 =A0 =A0 =A0X86_TUNE_REASSOC_FP_TO_PARALLEL. > > =A0 =A0 =A0 =A0(ix86_reassociation_width) implementation of > =A0 =A0 =A0 =A0new hook for i386 target. > > =A0 =A0 =A0 =A0* params.def (PARAM_TREE_REASSOC_WIDTH): New param added. > > =A0 =A0 =A0 =A0* tree-ssa-reassoc.c (get_required_cycles): New function. > =A0 =A0 =A0 =A0(get_reassociation_width): Likewise. > =A0 =A0 =A0 =A0(rewrite_expr_tree_parallel): Likewise. > > =A0 =A0 =A0 =A0(reassociate_bb): Now checks reassociation width to be used > =A0 =A0 =A0 =A0and call rewrite_expr_tree_parallel instead of rewrite_exp= r_tree > =A0 =A0 =A0 =A0if needed. > > =A0 =A0 =A0 =A0(pass_reassoc): TODO_remove_unused_locals flag added. > > gcc/testsuite/ > > 2011-07-14 =A0Enkovich Ilya =A0 > > =A0 =A0 =A0 =A0* gcc.dg/tree-ssa/pr38533.c (dg-options): Added option > =A0 =A0 =A0 =A0--param tree-reassoc-width=3D1. > > =A0 =A0 =A0 =A0* gcc.dg/tree-ssa/reassoc-24.c: New test. > =A0 =A0 =A0 =A0* gcc.dg/tree-ssa/reassoc-25.c: Likewise. >