From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 12484 invoked by alias); 24 May 2011 13:19:37 -0000 Received: (qmail 12474 invoked by uid 22791); 24 May 2011 13:19:35 -0000 X-SWARE-Spam-Status: No, hits=-1.1 required=5.0 tests=AWL,BAYES_00,MAY_BE_FORGED,TW_CF,TW_TM,T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from e37.co.us.ibm.com (HELO e37.co.us.ibm.com) (32.97.110.158) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 24 May 2011 13:19:19 +0000 Received: from d03relay01.boulder.ibm.com (d03relay01.boulder.ibm.com [9.17.195.226]) by e37.co.us.ibm.com (8.14.4/8.13.1) with ESMTP id p4ODGJJD030012 for ; Tue, 24 May 2011 07:16:19 -0600 Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169]) by d03relay01.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p4ODJ9n1087702 for ; Tue, 24 May 2011 07:19:09 -0600 Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1]) by d03av03.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p4O7J570018672 for ; Tue, 24 May 2011 01:19:05 -0600 Received: from [9.10.86.209] (tepot-pc.rchland.ibm.com [9.10.86.209] (may be forged)) by d03av03.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id p4O7J4uF018634; Tue, 24 May 2011 01:19:05 -0600 Subject: Re: [PATCH] Add powi-to-multiply expansion to cse_sincos pass From: "William J. Schmidt" To: Richard Guenther Cc: gcc-patches@gcc.gnu.org In-Reply-To: References: <1305813841.29239.3.camel@gnopaine> <1306173991.4821.7.camel@L3G5336.ibm.com> Content-Type: text/plain Date: Tue, 24 May 2011 15:28:00 -0000 Message-Id: <1306243144.4821.29.camel@L3G5336.ibm.com> Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2011-05/txt/msg01725.txt.bz2 On Tue, 2011-05-24 at 14:26 +0200, Richard Guenther wrote: > > +/* Recursive subroutine of powi_as_mults. This function takes the > > + array, CACHE, of already calculated exponents and an exponent N and > > + returns a tree that corresponds to CACHE[1]**N, with type TYPE. */ > > + > > +static tree > > +powi_as_mults_1 (gimple_stmt_iterator *gsi, location_t loc, tree type, > > + HOST_WIDE_INT n, tree *cache, tree target) > > +{ > > + tree op0, op1, ssa_target; > > + unsigned HOST_WIDE_INT digit; > > + gimple mult_stmt; > > + > > + if (n < POWI_TABLE_SIZE) > > + { > > + if (cache[n]) > > + return cache[n]; > > + > > + ssa_target = make_ssa_name (target, NULL); > > You can hoist the ssa_target creation before the if instead of > replicating it ... OK. The time spent in make_ssa_name will be wasted in the common case where the value is found in the cache. But I can restructure the logic to avoid that. > > > + cache[n] = ssa_target; > > + > > + op0 = powi_as_mults_1 (gsi, loc, type, n - powi_table[n], cache, target); > > + op1 = powi_as_mults_1 (gsi, loc, type, powi_table[n], cache, target); > > + } > > + else if (n & 1) > > + { > > + digit = n & ((1 << POWI_WINDOW_SIZE) - 1); > > + op0 = powi_as_mults_1 (gsi, loc, type, n - digit, cache, target); > > + op1 = powi_as_mults_1 (gsi, loc, type, digit, cache, target); > > + ssa_target = make_ssa_name (target, NULL); > > here and > > > + } > > + else > > + { > > + op0 = powi_as_mults_1 (gsi, loc, type, n >> 1, cache, target); > > + op1 = op0; > > + ssa_target = make_ssa_name (target, NULL); > > here. > > > + } > > + > > + mult_stmt = gimple_build_assign_with_ops (MULT_EXPR, ssa_target, op0, op1); > > + gsi_insert_before (gsi, mult_stmt, GSI_SAME_STMT); > > + > > + return ssa_target; > > +} > > + > > +/* Convert ARG0**N to a tree of multiplications of ARG0 with itself. > > + This function needs to be kept in sync with powi_cost above. */ > > + > > +static tree > > +powi_as_mults (gimple_stmt_iterator *gsi, location_t loc, > > + tree arg0, HOST_WIDE_INT n) > > +{ > > + tree cache[POWI_TABLE_SIZE], result, type = TREE_TYPE (arg0), target; > > + gimple div_stmt; > > + > > + if (n == 0) > > + return build_real (type, dconst1); > > + > > + memset (cache, 0, sizeof (cache)); > > + cache[1] = arg0; > > + > > + target = create_tmp_var (type, "powmult"); > > + add_referenced_var (target); > > + > > + result = powi_as_mults_1 (gsi, loc, type, (n < 0) ? -n : n, cache, target); > > + > > + if (n >= 0) > > + return result; > > + > > + /* If the original exponent was negative, reciprocate the result. */ > > + target = create_tmp_var (type, "powmult"); > > + add_referenced_var (target); > > re-use target from above. > Oops, sorry, that was sloppy. > > + target = make_ssa_name (target, NULL); > > + > > + div_stmt = gimple_build_assign_with_ops (RDIV_EXPR, target, > > + build_real (type, dconst1), > > + result); > > + gsi_insert_before (gsi, div_stmt, GSI_SAME_STMT); > > + > > + return target; > > +} > > + > > +/* ARG0 and N are the two arguments to a powi builtin in GSI with > > + location info LOC. If the arguments are appropriate, create an > > + equivalent sequence of statements prior to GSI using an optimal > > + number of multiplications, and return an expession holding the > > + result. */ > > + > > +static tree > > +gimple_expand_builtin_powi (gimple_stmt_iterator *gsi, location_t loc, > > + tree arg0, HOST_WIDE_INT n) > > +{ > > + /* Avoid largest negative number. */ > > + if (n != -n > > + && ((n >= -1 && n <= 2) > > + || (optimize_function_for_speed_p (cfun) > > + && powi_cost (n) <= POWI_MAX_MULTS))) > > + return powi_as_mults (gsi, loc, arg0, n); > > + > > + return NULL_TREE; > > +} > > + > > /* Go through all calls to sin, cos and cexpi and call execute_cse_sincos_1 > > - on the SSA_NAME argument of each of them. */ > > + on the SSA_NAME argument of each of them. Also expand powi(x,n) into > > + an optimal number of multiplies, when n is a constant. */ > > > > static unsigned int > > execute_cse_sincos (void) > > @@ -821,7 +1056,9 @@ execute_cse_sincos (void) > > && (fndecl = gimple_call_fndecl (stmt)) > > && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL) > > { > > - tree arg; > > + tree arg, arg0, arg1, result; > > + HOST_WIDE_INT n, n_hi; > > + location_t loc; > > > > switch (DECL_FUNCTION_CODE (fndecl)) > > { > > @@ -833,6 +1070,29 @@ execute_cse_sincos (void) > > cfg_changed |= execute_cse_sincos_1 (arg); > > break; > > > > + CASE_FLT_FN (BUILT_IN_POWI): > > + arg0 = gimple_call_arg (stmt, 0); > > + arg1 = gimple_call_arg (stmt, 1); > > + if (!host_integerp (arg1, 0)) > > + break; > > + > > + n = TREE_INT_CST_LOW (arg1); > > + n_hi = TREE_INT_CST_HIGH (arg1); > > + if (n_hi != 0 && n_hi != -1) > > + break; > > The n_hi query and check isn't necessary as host_integerp already > ensures this. > Ah, OK. Thanks. > > + > > + loc = gimple_location (stmt); > > + result = gimple_expand_builtin_powi (&gsi, loc, arg0, n); > > + > > + if (result) > > + { > > + tree lhs = gimple_get_lhs (stmt); > > + gimple new_stmt = gimple_build_assign (lhs, result); > > + gimple_set_location (new_stmt, loc); > > + gsi_replace (&gsi, new_stmt, true); > > + } > > + break; > > + > > default:; > > } > > } > > @@ -849,10 +1109,9 @@ execute_cse_sincos (void) > > static bool > > gate_cse_sincos (void) > > { > > - /* Make sure we have either sincos or cexp. */ > > - return (TARGET_HAS_SINCOS > > - || TARGET_C99_FUNCTIONS) > > - && optimize; > > + /* We no longer require either sincos or cexp, since powi expansion > > + piggybacks on this pass. */ > > + return optimize; > > } > > > > struct gimple_opt_pass pass_cse_sincos = > > With the above changes the patch is ok for trunk, possibly after we > resolved the gsi_replace virtual operand case (you probably > can reproduce it with -funsafe-math-optimizations -ferrno-math). OK, just saw your other note. I'll copy the logic from update_call_from-tree (). Thanks, Bill > > Thanks, > Richard.