From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 16672 invoked by alias); 2 Sep 2014 22:23:00 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 16662 invoked by uid 89); 2 Sep 2014 22:22:59 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.0 required=5.0 tests=AWL,BAYES_00,RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: e24smtp03.br.ibm.com Received: from e24smtp03.br.ibm.com (HELO e24smtp03.br.ibm.com) (32.104.18.24) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Tue, 02 Sep 2014 22:22:58 +0000 Received: from /spool/local by e24smtp03.br.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 2 Sep 2014 19:22:54 -0300 Received: from d24dlp02.br.ibm.com (9.18.248.206) by e24smtp03.br.ibm.com (10.172.0.139) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Tue, 2 Sep 2014 19:22:53 -0300 Received: from d24relay01.br.ibm.com (d24relay01.br.ibm.com [9.8.31.16]) by d24dlp02.br.ibm.com (Postfix) with ESMTP id 0F8FD1DC006A for ; Tue, 2 Sep 2014 18:22:52 -0400 (EDT) Received: from d24av04.br.ibm.com (d24av04.br.ibm.com [9.8.31.97]) by d24relay01.br.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id s82MNEk64468924 for ; Tue, 2 Sep 2014 19:23:15 -0300 Received: from d24av04.br.ibm.com (localhost [127.0.0.1]) by d24av04.br.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id s82MMqW0007767 for ; Tue, 2 Sep 2014 19:22:52 -0300 Received: from [9.78.132.103] ([9.78.132.103]) by d24av04.br.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id s82MMppj007757 for ; Tue, 2 Sep 2014 19:22:52 -0300 Message-ID: <5406433B.3030703@linux.vnet.ibm.com> Date: Tue, 02 Sep 2014 22:23:00 -0000 From: Adhemerval Zanella User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 MIME-Version: 1.0 To: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] PowerPC: Implement TARGET_ATOMIC_ASSIGN_EXPAND_FENV References: <53E28E53.2000009@linux.vnet.ibm.com> <53F38135.9070604@linux.vnet.ibm.com> In-Reply-To: <53F38135.9070604@linux.vnet.ibm.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-TM-AS-MML: disable X-Content-Scanned: Fidelis XPS MAILER x-cbid: 14090222-0009-0000-0000-000001AC4AE7 X-IsSubscribed: yes X-SW-Source: 2014-09/txt/msg00187.txt.bz2 Ping. On 19-08-2014 13:54, Adhemerval Zanella wrote: > Ping. > > On 06-08-2014 17:21, Adhemerval Zanella wrote: >> On 01-08-2014 12:31, Joseph S. Myers wrote: >>> On Thu, 31 Jul 2014, David Edelsohn wrote: >>> >>>> Thanks for implementing the FENV support. The patch generally looks >>>> good to me. >>>> >>>> My one concern is a detail in the implementation of "update". I do not >>>> have enough experience with GENERIC to verify the details and it seems >>>> like it is missing building an outer COMPOUND_EXPR containing >>>> update_mffs and the CALL_EXPR for update mtfsf. >>> I suppose what's actually odd there is that you have >>> >>> + tree update_mffs = build2 (MODIFY_EXPR, void_type_node, old_fenv, call_mffs); >>> + >>> + tree old_llu = build1 (VIEW_CONVERT_EXPR, uint64_type_node, update_mffs); >>> >>> so you build a MODIFY_EXPR in void_type_node but then convert it with a >>> VIEW_CONVERT_EXPR. If you'd built the MODIFY_EXPR in double_type_node >>> then the VIEW_CONVERT_EXPR would be meaningful (the value of an assignment >>> a = b being the new value of a), but reinterpreting a void value doesn't >>> make sense. Or you could probably just use call_mffs directly in the >>> VIEW_CONVERT_EXPR without explicitly creating the old_fenv variable. >>> >> Thanks for the review Josephm. I have changed to avoid the void reinterpretation >> and use call_mffs directly. I have also removed the the mask generation in 'clear' >> from your previous message, it is now reusing the mas used in feholdexcept. The >> testcase patch is the same as before. >> >> Checked on both linux-powerpc64/powerpc64le and no regressions found. >> >> -- >> >> 2014-08-06 Adhemerval Zanella >> >> gcc: >> * config/rs6000/rs6000.c (rs6000_atomic_assign_expand_fenv): New >> function. >> >> gcc/testsuite: >> * gcc.dg/atomic/c11-atomic-exec-5.c >> (test_main_long_double_add_overflow): Define and run only for >> LDBL_MANT_DIG != 106. >> (test_main_complex_long_double_add_overflow): Likewise. >> (test_main_long_double_sub_overflow): Likewise. >> (test_main_complex_long_double_sub_overflow): Likewise. >> >> --- >> >> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c >> index d088ff6..7d66eb1 100644 >> --- a/gcc/config/rs6000/rs6000.c >> +++ b/gcc/config/rs6000/rs6000.c >> @@ -1631,6 +1631,9 @@ static const struct attribute_spec rs6000_attribute_table[] = >> >> #undef TARGET_CAN_USE_DOLOOP_P >> #define TARGET_CAN_USE_DOLOOP_P can_use_doloop_if_innermost >> + >> +#undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV >> +#define TARGET_ATOMIC_ASSIGN_EXPAND_FENV rs6000_atomic_assign_expand_fenv >> >> >> /* Processor table. */ >> @@ -33337,6 +33340,80 @@ emit_fusion_gpr_load (rtx *operands) >> return ""; >> } >> >> +/* Implement TARGET_ATOMIC_ASSIGN_EXPAND_FENV hook. */ >> + >> +static void >> +rs6000_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update) >> +{ >> + if (!TARGET_HARD_FLOAT || !TARGET_FPRS) >> + return; >> + >> + tree mffs = rs6000_builtin_decls[RS6000_BUILTIN_MFFS]; >> + tree mtfsf = rs6000_builtin_decls[RS6000_BUILTIN_MTFSF]; >> + tree call_mffs = build_call_expr (mffs, 0); >> + >> + /* Generates the equivalent of feholdexcept (&fenv_var) >> + >> + *fenv_var = __builtin_mffs (); >> + double fenv_hold; >> + *(uint64_t*)&fenv_hold = *(uint64_t*)fenv_var & 0xffffffff00000007LL; >> + __builtin_mtfsf (0xff, fenv_hold); */ >> + >> + /* Mask to clear everything except for the rounding modes and non-IEEE >> + arithmetic flag. */ >> + const unsigned HOST_WIDE_INT hold_exception_mask = >> + HOST_WIDE_INT_UC (0xffffffff00000007); >> + >> + tree fenv_var = create_tmp_var (double_type_node, NULL); >> + >> + tree hold_mffs = build2 (MODIFY_EXPR, void_type_node, fenv_var, call_mffs); >> + >> + tree fenv_llu = build1 (VIEW_CONVERT_EXPR, uint64_type_node, fenv_var); >> + tree fenv_llu_and = build2 (BIT_AND_EXPR, uint64_type_node, fenv_llu, >> + build_int_cst (uint64_type_node, hold_exception_mask)); >> + >> + tree fenv_mtfsf = build1 (VIEW_CONVERT_EXPR, double_type_node, fenv_llu_and); >> + >> + tree hold_mtfsf = build_call_expr (mtfsf, 2, >> + build_int_cst (unsigned_type_node, 0xff), fenv_mtfsf); >> + >> + *hold = build2 (COMPOUND_EXPR, void_type_node, hold_mffs, hold_mtfsf); >> + >> + /* Reload the value of fenv_hold to clear the exceptions. */ >> + >> + *clear = build_call_expr (mtfsf, 2, >> + build_int_cst (unsigned_type_node, 0xff), fenv_mtfsf); >> + >> + /* Generates the equivalent of feupdateenv (&fenv_var) >> + >> + double old_fenv = __builtin_mffs (); >> + double fenv_update; >> + *(uint64_t*)&fenv_update = (*(uint64_t*)&old & 0xffffffff1fffff00LL) | >> + (*(uint64_t*)fenv_var 0x1ff80fff); >> + __builtin_mtfsf (0xff, fenv_update); */ >> + >> + const unsigned HOST_WIDE_INT update_exception_mask = >> + HOST_WIDE_INT_UC (0xffffffff1fffff00); >> + const unsigned HOST_WIDE_INT new_exception_mask = >> + HOST_WIDE_INT_UC (0x1ff80fff); >> + >> + tree old_llu = build1 (VIEW_CONVERT_EXPR, uint64_type_node, call_mffs); >> + tree old_llu_and = build2 (BIT_AND_EXPR, uint64_type_node, >> + old_llu, build_int_cst (uint64_type_node, update_exception_mask)); >> + >> + tree new_llu_and = build2 (BIT_AND_EXPR, uint64_type_node, fenv_llu, >> + build_int_cst (uint64_type_node, new_exception_mask)); >> + >> + tree new_llu_mask = build2 (BIT_IOR_EXPR, uint64_type_node, >> + old_llu_and, new_llu_and); >> + >> + tree fenv_mtfsf_update = build1 (VIEW_CONVERT_EXPR, double_type_node, >> + new_llu_mask); >> + >> + *update = build_call_expr (mtfsf, 2, >> + build_int_cst (unsigned_type_node, 0xff), fenv_mtfsf_update); >> +} >> + >> >> struct gcc_target targetm = TARGET_INITIALIZER; >> >> diff --git a/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-5.c b/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-5.c >> index b47a228..6bcb302 100644 >> --- a/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-5.c >> +++ b/gcc/testsuite/gcc.dg/atomic/c11-atomic-exec-5.c >> @@ -331,11 +331,11 @@ TEST_FUNCS (complex_double_div_overflow, _Complex double, , /= DBL_MIN, 0, >> TEST_FUNCS (long_double_add_invalid, long double, , += __builtin_infl (), 0, >> 0, __builtin_isinf, 0, >> -__builtin_infl (), FE_INVALID) >> +#if LDBL_MANT_DIG != 106 >> TEST_FUNCS (long_double_add_overflow, long double, , += LDBL_MAX, 0, >> LDBL_MAX, __builtin_isinf, FE_OVERFLOW | FE_INEXACT, >> 0, 0) >> #define NOT_LDBL_EPSILON_2(X) ((X) != LDBL_EPSILON / 2) >> -#if LDBL_MANT_DIG != 106 >> TEST_FUNCS (long_double_add_inexact, long double, , += LDBL_EPSILON / 2, 0, >> 1.0L, NOT_LDBL_EPSILON_2, FE_INEXACT, >> 0, 0) >> @@ -348,18 +348,18 @@ TEST_FUNCS (long_double_preinc_inexact, long double, ++, , 0, >> TEST_FUNCS (long_double_postinc_inexact, long double, , ++, 0, >> LDBL_EPSILON / 2, NOT_MINUS_1, FE_INEXACT, >> -1, 0) >> -#endif >> TEST_FUNCS (complex_long_double_add_overflow, _Complex long double, , += LDBL_MAX, 0, >> LDBL_MAX, REAL_ISINF, FE_OVERFLOW | FE_INEXACT, >> 0, 0) >> +#endif >> TEST_FUNCS (long_double_sub_invalid, long double, , -= __builtin_infl (), 0, >> 0, __builtin_isinf, 0, >> __builtin_infl (), FE_INVALID) >> +#if LDBL_MANT_DIG != 106 >> TEST_FUNCS (long_double_sub_overflow, long double, , -= LDBL_MAX, 0, >> -LDBL_MAX, __builtin_isinf, FE_OVERFLOW | FE_INEXACT, >> 0, 0) >> #define NOT_MINUS_LDBL_EPSILON_2(X) ((X) != -LDBL_EPSILON / 2) >> -#if LDBL_MANT_DIG != 106 >> TEST_FUNCS (long_double_sub_inexact, long double, , -= LDBL_EPSILON / 2, 0, >> -1.0L, NOT_MINUS_LDBL_EPSILON_2, FE_INEXACT, >> 0, 0) >> @@ -372,10 +372,10 @@ TEST_FUNCS (long_double_predec_inexact, long double, --, , 0, >> TEST_FUNCS (long_double_postdec_inexact, long double, , --, 0, >> -LDBL_EPSILON / 2, NOT_1, FE_INEXACT, >> 1, 0) >> -#endif >> TEST_FUNCS (complex_long_double_sub_overflow, _Complex long double, , -= LDBL_MAX, 0, >> -LDBL_MAX, REAL_ISINF, FE_OVERFLOW | FE_INEXACT, >> 0, 0) >> +#endif >> TEST_FUNCS (long_double_mul_invalid, long double, , *= __builtin_infl (), 0, >> __builtin_infl (), __builtin_isinf, 0, >> 0, FE_INVALID) >> @@ -507,23 +507,23 @@ main (void) >> ret |= test_main_int_div_double_inexact (); >> ret |= test_main_complex_double_div_overflow (); >> ret |= test_main_long_double_add_invalid (); >> - ret |= test_main_long_double_add_overflow (); >> #if LDBL_MANT_DIG != 106 >> + ret |= test_main_long_double_add_overflow (); >> ret |= test_main_long_double_add_inexact (); >> ret |= test_main_long_double_add_inexact_int (); >> ret |= test_main_long_double_preinc_inexact (); >> ret |= test_main_long_double_postinc_inexact (); >> -#endif >> ret |= test_main_complex_long_double_add_overflow (); >> +#endif >> ret |= test_main_long_double_sub_invalid (); >> - ret |= test_main_long_double_sub_overflow (); >> #if LDBL_MANT_DIG != 106 >> + ret |= test_main_long_double_sub_overflow (); >> ret |= test_main_long_double_sub_inexact (); >> ret |= test_main_long_double_sub_inexact_int (); >> ret |= test_main_long_double_predec_inexact (); >> ret |= test_main_long_double_postdec_inexact (); >> -#endif >> ret |= test_main_complex_long_double_sub_overflow (); >> +#endif >> ret |= test_main_long_double_mul_invalid (); >> ret |= test_main_long_double_mul_overflow (); >> ret |= test_main_long_double_mul_overflow_float (); >>