From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 7396 invoked by alias); 3 Oct 2013 23:19:08 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 7383 invoked by uid 89); 3 Oct 2013 23:19:07 -0000 Received: from mail-ie0-f181.google.com (HELO mail-ie0-f181.google.com) (209.85.223.181) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Thu, 03 Oct 2013 23:19:07 +0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL,BAYES_00,NO_RELAYS autolearn=ham version=3.3.2 X-HELO: mail-ie0-f181.google.com Received: by mail-ie0-f181.google.com with SMTP id tp5so6912261ieb.26 for ; Thu, 03 Oct 2013 16:19:03 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:content-type; bh=gcqJeD4N5lSRmNySlGSDJLneaZfrb8Mt2BhbFDcMvQY=; b=KKRgTBEGk/hLCEsa8RtWorK97DiVUlyjVEKScjKTWYHaVZL5HjAni30MKo0dDbRsbb WFdHRTLMuXQCBiYy7Ic4F88DcPl59L2XcluuBYZ4xcBcag37TWJQ6xr8r8XldlIX1Dvw KTxWITS8K6Nmh70yFxpRpbV+z1LpXnziXngGHzd32lZYGKgC+7SwSKCt2NV375c1n2CV SaLOC4aHzH3vv7U4skslkKKY2J9JrdnKQoT8GsZd0LBpRUAUaHSEzhmwkQU8oUBO+4kh xgmjl0vFaLDFX7/I9bHN0ZzB69YJcRE+fiSmLLLewjXss60N4Gmo73V5OyVfcs0WYVE+ uwXQ== X-Gm-Message-State: ALoCoQmkX+S+FeOtHf6ZrEjw16JfJsFVBama9v8LvNmTKBidkbss2Mm8c3OOL2/hjcYgHeQJKEE1TE+/kvF3lwKYys1R5bQYS3w0lhYV4T1O5DCcrTJFgpZZTlC0PFNIBGiQke6Lvac8LVUVTgbGQkN0cdSNB3oAH9bIE2yK0dwQ9VTrCcZZZ6s/+2p0MjR1/GRFfhntiqHgncd49aZ4Z7dk6L3Y9a+YkQ== MIME-Version: 1.0 X-Received: by 10.50.114.168 with SMTP id jh8mr4187226igb.6.1380842343478; Thu, 03 Oct 2013 16:19:03 -0700 (PDT) Received: by 10.64.236.37 with HTTP; Thu, 3 Oct 2013 16:19:03 -0700 (PDT) In-Reply-To: References: Date: Thu, 03 Oct 2013 23:19:00 -0000 Message-ID: Subject: Re: [PATCH] Fixing improper conversion from sin() to sinf() in optimization mode. From: Cong Hou To: GCC Patches , "Joseph S. Myers" Content-Type: text/plain; charset=ISO-8859-1 X-SW-Source: 2013-10/txt/msg00288.txt.bz2 Ping... thanks, Cong On Fri, Sep 20, 2013 at 9:49 AM, Cong Hou wrote: > Any comment or more suggestions on this patch? > > > thanks, > Cong > > On Mon, Sep 9, 2013 at 7:28 PM, Cong Hou wrote: >> On Mon, Sep 9, 2013 at 6:26 PM, Xinliang David Li wrote: >>> On Fri, Sep 6, 2013 at 3:24 PM, Cong Hou wrote: >>>> First, thank you for your detailed comments again! Then I deeply >>>> apologize for not explaining my patch properly and responding to your >>>> previous comment. I didn't understand thoroughly the problem before >>>> submitting the patch. >>>> >>>> Previously I only considered the following three conversions for sqrt(): >>>> >>>> >>>> 1: (float) sqrt ((double) float_val) -> sqrtf (float_val) >>>> 2: (float) sqrtl ((long double) float_val) -> sqrtf (float_val) >>>> 3: (double) sqrtl ((long double) double_val) -> sqrt (double_val) >>>> >>>> >>>> We have four types here: >>>> >>>> TYPE is the type to which the result of the function call is converted. >>>> ITYPE is the type of the math call expression. >>>> TREE_TYPE(arg0) is the type of the function argument (before type conversion). >>>> NEWTYPE is chosen from TYPE and TREE_TYPE(arg0) with higher precision. >>>> It will be the type of the new math call expression after conversion. >>>> >>>> For all three cases above, TYPE is always the same as NEWTYPE. That is >>>> why I only considered TYPE during the precision comparison. ITYPE can >>>> only be double_type_node or long_double_type_node depending on the >>>> type of the math function. That is why I explicitly used those two >>>> types instead of ITYPE (no correctness issue). But you are right, >>>> ITYPE is more elegant and better here. >>>> >>>> After further analysis, I found I missed two more cases. Note that we >>>> have the following conditions according to the code in convert.c: >>>> >>>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TYPE) >>>> TYPE_PRECISION(NEWTYPE) >= TYPE_PRECISION(TREE_TYPE(arg0)) >>>> TYPE_PRECISION (NEWTYPE) < TYPE_PRECISION (ITYPE) >>>> >>>> the last condition comes from the fact that we only consider >>>> converting a math function call into another one with narrower type. >>>> Therefore we have >>>> >>>> TYPE_PRECISION(TYPE) < TYPE_PRECISION (ITYPE) >>>> TYPE_PRECISION(TREE_TYPE(arg0)) < TYPE_PRECISION (ITYPE) >>>> >>>> So for sqrt(), TYPE and TREE_TYPE(arg0) can only be float, and for >>>> sqrtl(), TYPE and TREE_TYPE(arg0) can be either float or double with >>>> four possible combinations. Therefore we have two more conversions to >>>> consider besides the three ones I mentioned above: >>>> >>>> >>>> 4: (float) sqrtl ((long double) double_val) -> (float) sqrt (double_val) >>>> 5: (double) sqrtl ((long double) float_val) -> sqrt ((double) float_val) >>>> >>>> >>>> For the first conversion here, TYPE (float) is different from NEWTYPE >>>> (double), and my previous patch doesn't handle this case.The correct >>>> way is to compare precisions of ITYPE and NEWTYPE now. >>>> >>>> To sum up, we are converting the expression >>>> >>>> (TYPE) sqrtITYPE ((ITYPE) expr) >>>> >>>> to >>>> >>>> (TYPE) sqrtNEWTYPE ((NEWTYPE) expr) >>>> >>>> and we require >>>> >>>> PRECISION (ITYPE) >= PRECISION (NEWTYPE) * 2 + 2 >>>> >>>> to make it a safe conversion. >>>> >>>> >>>> The new patch is pasted below. >>>> >>>> I appreciate your detailed comments and analysis, and next time when I >>>> submit a patch I will be more carefully about the reviewer's comment. >>>> >>>> >>>> Thank you! >>>> >>>> Cong >>>> >>>> >>>> >>>> Index: gcc/convert.c >>>> =================================================================== >>>> --- gcc/convert.c (revision 201891) >>>> +++ gcc/convert.c (working copy) >>>> @@ -135,16 +135,19 @@ convert_to_real (tree type, tree expr) >>>> CASE_MATHFN (COS) >>>> CASE_MATHFN (ERF) >>>> CASE_MATHFN (ERFC) >>>> - CASE_MATHFN (FABS) >>>> CASE_MATHFN (LOG) >>>> CASE_MATHFN (LOG10) >>>> CASE_MATHFN (LOG2) >>>> CASE_MATHFN (LOG1P) >>>> - CASE_MATHFN (LOGB) >>>> CASE_MATHFN (SIN) >>>> - CASE_MATHFN (SQRT) >>>> CASE_MATHFN (TAN) >>>> CASE_MATHFN (TANH) >>>> + /* The above functions are not safe to do this conversion. */ >>>> + if (!flag_unsafe_math_optimizations) >>>> + break; >>>> + CASE_MATHFN (SQRT) >>>> + CASE_MATHFN (FABS) >>>> + CASE_MATHFN (LOGB) >>>> #undef CASE_MATHFN >>>> { >>>> tree arg0 = strip_float_extensions (CALL_EXPR_ARG (expr, 0)); >>>> @@ -155,6 +158,27 @@ convert_to_real (tree type, tree expr) >>>> if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type)) >>>> newtype = TREE_TYPE (arg0); >>>> >>>> + /* We consider to convert >>>> + >>>> + (T1) sqrtT2 ((T2) exprT3) >>>> + to >>>> + (T1) sqrtT4 ((T4) exprT3) >>> >>> Should this be >>> >>> (T4) sqrtT4 ((T4) exprT3) >> >> T4 is not necessarily the same as T1. For the conversion: >> >> (float) sqrtl ((long double) double_val) -> (float) sqrt (double_val) >> >> T4 is double and T1 is float. >> >> >>>> + >>>> + , where T1 is TYPE, T2 is ITYPE, T3 is TREE_TYPE (ARG0), >>>> + and T4 is NEWTYPE. >>> >>> NEWTYPE is also the wider one between T1 and T3. >> >> Right. Actually this is easy to catch from the context just before >> this comment. >> >> tree newtype = type; >> if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (type)) >> newtype = TREE_TYPE (arg0); >> >> >> >> thanks, >> Cong >> >> >>> >>> David >>> >>>> All those types are of floating point types. >>>> + T4 (NEWTYPE) should be narrower than T2 (ITYPE). This conversion >>>> + is safe only if P1 >= P2*2+2, where P1 and P2 are precisions of >>>> + T2 and T4. See the following URL for a reference: >>>> + http://stackoverflow.com/questions/9235456/determining-floating-point-square-root >>>> + */ >>>> + if (fcode == BUILT_IN_SQRT || fcode == BUILT_IN_SQRTL) >>>> + { >>>> + int p1 = REAL_MODE_FORMAT (TYPE_MODE (itype))->p; >>>> + int p2 = REAL_MODE_FORMAT (TYPE_MODE (newtype))->p; >>>> + if (p1 < p2 * 2 + 2 && !flag_unsafe_math_optimizations) >>>> + break; >>>> + } >>>> + >>>> /* Be careful about integer to fp conversions. >>>> These may overflow still. */ >>>> if (FLOAT_TYPE_P (TREE_TYPE (arg0)) >>>> Index: gcc/testsuite/gcc.c-torture/execute/20030125-1.c >>>> =================================================================== >>>> --- gcc/testsuite/gcc.c-torture/execute/20030125-1.c (revision 201891) >>>> +++ gcc/testsuite/gcc.c-torture/execute/20030125-1.c (working copy) >>>> @@ -44,11 +44,11 @@ __attribute__ ((noinline)) >>>> double >>>> sin(double a) >>>> { >>>> - abort (); >>>> + return a; >>>> } >>>> __attribute__ ((noinline)) >>>> float >>>> sinf(float a) >>>> { >>>> - return a; >>>> + abort (); >>>> } >>>> >>>> On Wed, Sep 4, 2013 at 3:26 PM, Joseph S. Myers wrote: >>>>> On Wed, 4 Sep 2013, Xinliang David Li wrote: >>>>> >>>>>> > This patch submission still fails to pay attention to various of my >>>>>> > comments. >>>>>> > >>>>>> >>>>>> If you can provide inlined comments in the patch, that will be more >>>>>> useful and productive. >>>>> >>>>> I have explained things several times in this thread already. I see no >>>>> point in repeating things when what I would say has already been said and >>>>> ignored. As far as I can tell, this latest patch submission has taken one >>>>> line from the message it is in response to, and largely ignored the >>>>> following two paragraphs (including where I explicitly say that said line >>>>> should not appear literally in the source code at all). But, repeating >>>>> what I said before yet again: >>>>> >>>>> (but you should be referring to the relevant types >>>>> >>>>> The patch does not do properly that. It refers explicitly to >>>>> long_double_type_node and double_type_node. >>>>> >>>>> - "type", the type >>>>> being converted to, "itype", the type of the function being called in the >>>>> source code, "TREE_TYPE (arg0)", the type of the argument after extensions >>>>> have been removed, and "newtype", computed from those >>>>> >>>>> The patch only engages with "type". I suspect "newtype" is what it really >>>>> means there when using "type". When it uses long_double_type_node and >>>>> double_type_node, those should be "itype". >>>>> >>>>> - so you should have >>>>> expressions like the above with two or more of those four types, but not >>>>> with long_double_type_node directly). >>>>> >>>>> See above. The patch uses long_double_type_node directly, contrary to >>>>> what I explicitly said. You are free to disagree with something I say in >>>>> a review - but in that case you need to reply specifically to the review >>>>> comment explaining your rationale for disagreeing with it. Just ignoring >>>>> the comment and not mentioning the disagreement wastes the time of >>>>> reviewers. >>>>> >>>>> The patch submission will need to include a proper analysis to justify to >>>>> the reader why the particular inequality with particular types from those >>>>> four is correct in all cases where the relevant code may be executed. >>>>> >>>>> The comments only refer to "T1" and "T2" without explaining how they >>>>> relate to the original expression (three types) or the variables within >>>>> GCC dealing with various types (four types, because newtype gets >>>>> involved). I said back in >>>>> and >>>>> that three types >>>>> are involved - when I say "the patch submission needs to include its own >>>>> analysis for the full generality of three types", again, it's >>>>> inappropriate for a patch to omit such an analysis without justification. >>>>> The patch submission needs to include an analysis that properly explains >>>>> the transformation involved and the conditions under which it is safe. >>>>> Maybe starting along the lines of: >>>>> >>>>> We have an expression of the form (T1) sqrtT2 ((T2) exprT3), where exprT3 >>>>> has type T3 (TREE_TYPE (ARG0)), T2 is the type of the floating-point >>>>> square root function being used (ITYPE), T1 is TYPE and all these types >>>>> are binary floating-point types. We wish to optimize if possible to an >>>>> expression of the form (T1) sqrtT4 ((T4) exprT3), where T4 (NEWTYPE) is >>>>> narrower than T2. (Then explain the choice of T4 and the conditions under >>>>> which the transformation is safe, with appropriate references.) >>>>> >>>>> I suggest that for the next patch submission you (the patch submitter) >>>>> make sure you spend at least an hour properly understanding the issues and >>>>> all the previous messages in the thread and writing up the detailed, >>>>> coherent explanation of the transformation and analysis of the issues that >>>>> I asked for some time ago, as if for a reader who hasn't read any of this >>>>> thread or looked at this transformation in GCC before. I've certainly >>>>> spent longer than that on review in this thread. It's normal for a patch >>>>> involving anything at all tricky to take hours to write up (and also >>>>> normal for one to discover, in the course of writing the detailed coherent >>>>> analysis for people who aren't familiar with the code and the issues >>>>> involved, that there's in fact something wrong with the patch and it needs >>>>> revisiting before submission). >>>>> >>>>> -- >>>>> Joseph S. Myers >>>>> joseph@codesourcery.com