From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 416 invoked by alias); 26 May 2011 09:22:52 -0000 Received: (qmail 404 invoked by uid 22791); 26 May 2011 09:22:50 -0000 X-SWARE-Spam-Status: No, hits=-0.9 required=5.0 tests=AWL,BAYES_50,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,RFC_ABUSE_POST,TW_HF,TW_TM X-Spam-Check-By: sourceware.org Received: from mail-wy0-f175.google.com (HELO mail-wy0-f175.google.com) (74.125.82.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 26 May 2011 09:22:34 +0000 Received: by wye20 with SMTP id 20so398457wye.20 for ; Thu, 26 May 2011 02:22:32 -0700 (PDT) MIME-Version: 1.0 Received: by 10.227.199.21 with SMTP id eq21mr567214wbb.101.1306401752548; Thu, 26 May 2011 02:22:32 -0700 (PDT) Received: by 10.227.37.152 with HTTP; Thu, 26 May 2011 02:22:32 -0700 (PDT) In-Reply-To: <1306345412.4821.56.camel@L3G5336.ibm.com> References: <1306345412.4821.56.camel@L3G5336.ibm.com> Date: Thu, 26 May 2011 10:33:00 -0000 Message-ID: Subject: Re: [PATCH] More pow(x,c) expansions in cse_sincos pass (PR46728, patch 3) From: Richard Guenther To: "William J. Schmidt" Cc: gcc-patches@gcc.gnu.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2011-05/txt/msg01982.txt.bz2 On Wed, May 25, 2011 at 7:43 PM, William J. Schmidt wrote: > This patch adds logic to gimple_expand_builtin_pow () to optimize > pow(x,y), where y is one of 0.5, 0.25, 0.75, 1./3., or 1./6. =A0I noticed > that there were two missing calls to gimple_set_location () in my > previous patch, so I've corrected those here as well. > > There's one TODO comment in this patch. =A0I don't believe the test for > TREE_SIDE_EFFECTS (arg0) should be necessary; but I'm not convinced it > was necessary in the code whence I copied it, either, so I left it in > for comment in case I'm misunderstanding something. > > > 2011-05-25 =A0Bill Schmidt =A0 > > =A0 =A0 =A0 =A0PR tree-optimization/46728 > =A0 =A0 =A0 =A0* tree-ssa-math-opts.c (powi_as_mults_1): Add gimple_set_l= ocation. > =A0 =A0 =A0 =A0(powi_as_mults): Add gimple_set_location. > =A0 =A0 =A0 =A0(build_and_insert_call): New. > =A0 =A0 =A0 =A0(gimple_expand_builtin_pow): Add handling for pow(x,y) whe= n y is > =A0 =A0 =A0 =A00.5, 0.25, 0.75, 1./3., or 1./6. > > Index: gcc/tree-ssa-math-opts.c > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > --- gcc/tree-ssa-math-opts.c =A0 =A0(revision 174199) > +++ gcc/tree-ssa-math-opts.c =A0 =A0(working copy) > @@ -965,6 +965,7 @@ powi_as_mults_1 (gimple_stmt_iterator *gsi, locati > =A0 =A0 } > > =A0 mult_stmt =3D gimple_build_assign_with_ops (MULT_EXPR, ssa_target, op= 0, op1); > + =A0gimple_set_location (mult_stmt, loc); > =A0 gsi_insert_before (gsi, mult_stmt, GSI_SAME_STMT); > > =A0 return ssa_target; > @@ -999,6 +1000,7 @@ powi_as_mults (gimple_stmt_iterator *gsi, location > =A0 div_stmt =3D gimple_build_assign_with_ops (RDIV_EXPR, target, > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 =A0 build_real (type, dconst1), > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 =A0 result); > + =A0gimple_set_location (div_stmt, loc); > =A0 gsi_insert_before (gsi, div_stmt, GSI_SAME_STMT); > > =A0 return target; > @@ -1024,6 +1026,34 @@ gimple_expand_builtin_powi (gimple_stmt_iterator * > =A0 return NULL_TREE; > =A0} > > +/* Build a gimple call statement that calls FN with argument ARG. > + =A0 Set the lhs of the call statement to a fresh SSA name for > + =A0 variable VAR. =A0If VAR is NULL, first allocate it. =A0Insert the > + =A0 statement prior to GSI's current position, and return the fresh > + =A0 SSA name. =A0*/ > + > +static tree > +build_and_insert_call (gimple_stmt_iterator *gsi, tree fn, tree arg, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0tree *var, location_t loc) > +{ > + =A0gimple call_stmt; > + =A0tree ssa_target; > + > + =A0if (!*var) > + =A0 =A0{ > + =A0 =A0 =A0*var =3D create_tmp_var (TREE_TYPE (arg), "powroot"); > + =A0 =A0 =A0add_referenced_var (*var); > + =A0 =A0} > + > + =A0call_stmt =3D gimple_build_call (fn, 1, arg); > + =A0ssa_target =3D make_ssa_name (*var, NULL); > + =A0gimple_set_lhs (call_stmt, ssa_target); > + =A0gimple_set_location (call_stmt, loc); > + =A0gsi_insert_before (gsi, call_stmt, GSI_SAME_STMT); > + > + =A0return ssa_target; > +} > + > =A0/* ARG0 and ARG1 are the two arguments to a pow builtin call in GSI > =A0 =A0with location info LOC. =A0If possible, create an equivalent and > =A0 =A0less expensive sequence of statements prior to GSI, and return an > @@ -1035,6 +1065,8 @@ gimple_expand_builtin_pow (gimple_stmt_iterator *g > =A0{ > =A0 REAL_VALUE_TYPE c, cint; > =A0 HOST_WIDE_INT n; > + =A0tree type, sqrtfn, target =3D NULL_TREE; > + =A0enum machine_mode mode; > > =A0 /* If the exponent isn't a constant, there's nothing of interest > =A0 =A0 =A0to be done. =A0*/ > @@ -1054,6 +1086,108 @@ gimple_expand_builtin_pow (gimple_stmt_iterator *g > =A0 =A0 =A0 =A0 =A0 =A0 =A0&& powi_cost (n) <=3D POWI_MAX_MULTS))) > =A0 =A0 return gimple_expand_builtin_powi (gsi, loc, arg0, n); > > + =A0/* Attempt various optimizations using sqrt and cbrt. =A0*/ > + =A0type =3D TREE_TYPE (arg0); > + =A0mode =3D TYPE_MODE (type); > + =A0sqrtfn =3D mathfn_built_in (type, BUILT_IN_SQRT); > + > + =A0if (flag_unsafe_math_optimizations && sqrtfn !=3D NULL_TREE) > + =A0 =A0{ > + =A0 =A0 =A0REAL_VALUE_TYPE dconst1_4, dconst3_4; > + =A0 =A0 =A0tree cbrtfn; > + =A0 =A0 =A0bool hw_sqrt_exists; > + > + =A0 =A0 =A0/* Optimize pow(x,0.5) =3D sqrt(x). =A0*/ > + =A0 =A0 =A0if (REAL_VALUES_EQUAL (c, dconsthalf)) > + =A0 =A0 =A0 return build_and_insert_call (gsi, sqrtfn, arg0, &target, l= oc); This should be done if !HONOR_SIGNED_ZEROS instead. > + > + =A0 =A0 =A0/* Optimize pow(x,0.25) =3D sqrt(sqrt(x)). =A0*/ > + =A0 =A0 =A0dconst1_4 =3D dconst1; > + =A0 =A0 =A0SET_REAL_EXP (&dconst1_4, REAL_EXP (&dconst1_4) - 2); > + =A0 =A0 =A0hw_sqrt_exists =3D optab_handler(sqrt_optab, mode) !=3D CODE= _FOR_nothing; > + > + =A0 =A0 =A0if (REAL_VALUES_EQUAL (c, dconst1_4) && hw_sqrt_exists) > + =A0 =A0 =A0 { > + =A0 =A0 =A0 =A0 tree sqrt_arg0; > + > + =A0 =A0 =A0 =A0 /* sqrt(x) =A0*/ > + =A0 =A0 =A0 =A0 sqrt_arg0 =3D build_and_insert_call (gsi, sqrtfn, arg0,= &target, loc); > + > + =A0 =A0 =A0 =A0 /* sqrt(sqrt(x)) =A0*/ > + =A0 =A0 =A0 =A0 return build_and_insert_call (gsi, sqrtfn, sqrt_arg0, &= target, loc); > + =A0 =A0 =A0 } > + > + =A0 =A0 =A0/* Optimize pow(x,0.75) =3D sqrt(x) * sqrt(sqrt(x)). =A0*/ > + =A0 =A0 =A0real_from_integer (&dconst3_4, VOIDmode, 3, 0, 0); > + =A0 =A0 =A0SET_REAL_EXP (&dconst3_4, REAL_EXP (&dconst3_4) - 2); > + > + =A0 =A0 =A0if (optimize_function_for_speed_p (cfun) > + =A0 =A0 =A0 =A0 && !TREE_SIDE_EFFECTS (arg0) /* TODO: is this needed? = =A0*/ Not needed as we are in gimple form. > + =A0 =A0 =A0 =A0 && REAL_VALUES_EQUAL (c, dconst3_4) > + =A0 =A0 =A0 =A0 && hw_sqrt_exists) > + =A0 =A0 =A0 { > + =A0 =A0 =A0 =A0 tree sqrt_arg0, sqrt_sqrt, ssa_target; > + =A0 =A0 =A0 =A0 gimple mult_stmt; > + > + =A0 =A0 =A0 =A0 /* sqrt(x) =A0*/ > + =A0 =A0 =A0 =A0 sqrt_arg0 =3D build_and_insert_call (gsi, sqrtfn, arg0,= &target, loc); > + > + =A0 =A0 =A0 =A0 /* sqrt(sqrt(x)) =A0*/ > + =A0 =A0 =A0 =A0 sqrt_sqrt =3D build_and_insert_call (gsi, sqrtfn, sqrt_= arg0, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0&target, loc); > + =A0 =A0 =A0 =A0 /* sqrt(x) * sqrt(sqrt(x)) =A0*/ > + =A0 =A0 =A0 =A0 ssa_target =3D make_ssa_name (target, NULL); > + =A0 =A0 =A0 =A0 mult_stmt =3D gimple_build_assign_with_ops (MULT_EXPR, = ssa_target, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 =A0 sqrt_arg0, sqrt_sqrt); > + =A0 =A0 =A0 =A0 gimple_set_location (mult_stmt, loc); > + =A0 =A0 =A0 =A0 gsi_insert_before (gsi, mult_stmt, GSI_SAME_STMT); > + > + =A0 =A0 =A0 =A0 return ssa_target; > + =A0 =A0 =A0 } > + > + =A0 =A0 =A0/* Optimize pow(x,1./3.) =3D cbrt(x), and pow(x,1./6.) =3D c= brt(sqrt(x)). =A0*/ > + =A0 =A0 =A0cbrtfn =3D mathfn_built_in (type, BUILT_IN_CBRT); > + > + =A0 =A0 =A0if (cbrtfn !=3D NULL_TREE) > + =A0 =A0 =A0 { > + =A0 =A0 =A0 =A0 /* pow(x,1./3.) =3D> cbrt(x). =A0*/ > + =A0 =A0 =A0 =A0 const REAL_VALUE_TYPE dconst1_3 > + =A0 =A0 =A0 =A0 =A0 =3D real_value_truncate (mode, dconst_third ()); > + > + =A0 =A0 =A0 =A0 if (REAL_VALUES_EQUAL (c, dconst1_3)) > + =A0 =A0 =A0 =A0 =A0 return build_and_insert_call (gsi, cbrtfn, arg0, &t= arget, loc); > + > + =A0 =A0 =A0 =A0 /* pow(x,1./6.) =3D> cbrt(sqrt(x)). =A0*/ > + =A0 =A0 =A0 =A0 if (optimize_function_for_speed_p (cfun) && hw_sqrt_exi= sts) > + =A0 =A0 =A0 =A0 =A0 { > + =A0 =A0 =A0 =A0 =A0 =A0 REAL_VALUE_TYPE dconst1_6 =3D dconst1_3; > + =A0 =A0 =A0 =A0 =A0 =A0 SET_REAL_EXP (&dconst1_6, REAL_EXP (&dconst1_6)= - 1); > + > + =A0 =A0 =A0 =A0 =A0 =A0 if (REAL_VALUES_EQUAL (c, dconst1_6)) > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 { > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 tree sqrt_arg0; > + > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 /* sqrt(x) =A0*/ > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 sqrt_arg0 =3D build_and_insert_call (gs= i, sqrtfn, arg0, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0&target, loc); > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 /* cbrt(sqrt(x)) =A0*/ > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 return build_and_insert_call (gsi, cbrt= fn, sqrt_arg0, > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0= =A0 =A0 =A0 =A0 =A0 &target, loc); > + =A0 =A0 =A0 =A0 =A0 =A0 =A0 } > + =A0 =A0 =A0 =A0 =A0 } > + =A0 =A0 =A0 } > + =A0 =A0} There are several extra pre-conditions in the original code in builtins.c as well as commens for reasonings (yes, there seem to be duplicates of several of the transforms there ...). Please try to preserve them. I noticed especially /* Optimize pow (x, 0.25) into sqrt (sqrt (x)). Assume on mo= st machines that a builtin sqrt instruction is smaller than a call to pow with 0.25, so do this optimization even if -Os. */ and /* Check whether we can do cbrt insstead of pow (x, 1./3.) and cbrt/sqrts instead of pow (x, 1./6.). */ if (cbrtfn && ! op && (tree_expr_nonnegative_p (arg0) || !HONOR_NANS (mode))) where you omit the non-negative check and the check for NANs. Note that -funsafe-math-optimizations does not mean you can ignore nans or signed zero issues. pow(x,1/3) -> cbrt(x) does not need a -funsafe-math-optimization check either. It would be nice if you could re-arrange this function so that before each individual replacement all preconditions are checked (even if that means duplicating some checks and code) - that way reviewing and later maintainance should be much easier. Thanks, Richard. > + =A0/* We allow one transformation when flag_unsafe_math_optimizations is > + =A0 =A0 false. =A0Replacing pow(x,0.5) with sqrt(x) is safe, provided s= igned > + =A0 =A0 zeros must not be maintained. =A0For x =3D -0, the former produ= ces +0, > + =A0 =A0 and the latter produces -0. =A0*/ > + =A0else if (sqrtfn !=3D NULL_TREE > + =A0 =A0 =A0 =A0 =A0&& !HONOR_SIGNED_ZEROS (mode) > + =A0 =A0 =A0 =A0 =A0&& REAL_VALUES_EQUAL (c, dconsthalf)) > + > + =A0 =A0return build_and_insert_call (gsi, sqrtfn, arg0, &target, l