From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 37631 invoked by alias); 28 Sep 2017 22:34:38 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 37323 invoked by uid 89); 28 Sep 2017 22:34:38 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-14.5 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_1,GIT_PATCH_2,GIT_PATCH_3,KAM_ASCII_DIVIDERS,KAM_LAZY_DOMAIN_SECURITY,RCVD_IN_DNSWL_LOW autolearn=ham version=3.3.2 spammy= X-HELO: mx0a-001b2d01.pphosted.com Received: from mx0b-001b2d01.pphosted.com (HELO mx0a-001b2d01.pphosted.com) (148.163.158.5) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 28 Sep 2017 22:34:35 +0000 Received: from pps.filterd (m0098414.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id v8SMYLhp001391 for ; Thu, 28 Sep 2017 18:34:28 -0400 Received: from e36.co.us.ibm.com (e36.co.us.ibm.com [32.97.110.154]) by mx0b-001b2d01.pphosted.com with ESMTP id 2d95ew6e1x-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Thu, 28 Sep 2017 18:34:28 -0400 Received: from localhost by e36.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 28 Sep 2017 16:34:27 -0600 Received: from b03cxnp07029.gho.boulder.ibm.com (9.17.130.16) by e36.co.us.ibm.com (192.168.1.136) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Thu, 28 Sep 2017 16:34:25 -0600 Received: from b03ledav003.gho.boulder.ibm.com (b03ledav003.gho.boulder.ibm.com [9.17.130.234]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id v8SMYOSK10289420; Thu, 28 Sep 2017 15:34:24 -0700 Received: from b03ledav003.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id BCC0D6A041; Thu, 28 Sep 2017 16:34:24 -0600 (MDT) Received: from ibm-tiger.the-meissners.org (unknown [9.32.77.111]) by b03ledav003.gho.boulder.ibm.com (Postfix) with ESMTP id 92FF96A03C; Thu, 28 Sep 2017 16:34:24 -0600 (MDT) Received: by ibm-tiger.the-meissners.org (Postfix, from userid 500) id EA6114743D; Thu, 28 Sep 2017 18:34:23 -0400 (EDT) Date: Thu, 28 Sep 2017 22:34:00 -0000 From: Michael Meissner To: GCC Patches , Segher Boessenkool , David Edelsohn , Bill Schmidt Subject: [PATCH], Add PowerPC ISA 3.0 IEEE 128-bit floating point round to odd built-in functions Mail-Followup-To: Michael Meissner , GCC Patches , Segher Boessenkool , David Edelsohn , Bill Schmidt MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="NzB8fVQJ5HfG6fxh" Content-Disposition: inline User-Agent: Mutt/1.5.20 (2009-12-10) X-TM-AS-GCONF: 00 x-cbid: 17092822-0020-0000-0000-00000CC6F6DD X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007806; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000232; SDB=6.00923777; UDB=6.00464439; IPR=6.00703916; BA=6.00005613; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017312; XFM=3.00000015; UTC=2017-09-28 22:34:26 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17092822-0021-0000-0000-00005E509AA4 Message-Id: <20170928223423.GA3629@ibm-tiger.the-meissners.org> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-09-28_08:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000 definitions=main-1709280332 X-IsSubscribed: yes X-SW-Source: 2017-09/txt/msg01912.txt.bz2 --NzB8fVQJ5HfG6fxh Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-length: 1933 This patch addss built-in functions on PowerPC ISA 3.0 (power9) that allow the user to access the round to odd IEEE 128-bit floating point instructions. I have checked it on a little endian power8 system doing a bootstrap and make check. There were no regressions in the testsuite. I verified that the new test (float128-odd.c) did run sucessfully. Can I check this patch into the trunk? [gcc] 2017-09-28 Michael Meissner * config/rs6000/rs6000-builtin.def (BU_FLOAT128_2_HW): Define new helper macro for IEEE float128 hardware built-in functions. (SQRTF128_ODD): Add built-in functions with the round-to-odd semantics. (TRUNCF128_ODD): Likewise. (ADDF128_ODD): Likewise. (SUBF128_ODD): Likewise. (MULF128_ODD): Likewise. (DIVF128_ODD): Likewise. (FMAF128_ODD): Likewise. * config/rs6000/rs6000.md (truncsf2_hw): Change the truncate with round to odd expansion to use float_truncate:DF inside of the UNSPEC to better document what the insn does. (add3_odd): Add insns for IEEE 128-bit floating point round to odd hardware instructions. (sub3_odd): Likewise. (mul3_odd): Likewise. (div3_odd): Likewise. (sqrt2_odd): Likewise. (fma4_odd): Likewise. (fms4_odd): Likewise. (nfma4_odd): Likewise. (nfms4_odd): Likewise. (truncdf2_odd): Change insn format to make it more readable, and add a generator function. * doc/extend.texi (PowerPC built-in functions): Update documentation for existing IEEE float128-bit built-in functions. Add built-in functions that generate the IEEE 128-bit floating point round to odd instructions. [gcc/testsuite] 2017-09-28 Michael Meissner * gcc.target/powerpc/float128-odd.c: New test. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797 --NzB8fVQJ5HfG6fxh Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="gcc-power9.patch290b" Content-length: 12265 Index: gcc/config/rs6000/rs6000-builtin.def =================================================================== --- gcc/config/rs6000/rs6000-builtin.def (revision 253267) +++ gcc/config/rs6000/rs6000-builtin.def (working copy) @@ -686,6 +686,14 @@ | RS6000_BTC_UNARY), \ CODE_FOR_ ## ICODE) /* ICODE */ +#define BU_FLOAT128_2_HW(ENUM, NAME, ATTR, ICODE) \ + RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM, /* ENUM */ \ + "__builtin_" NAME, /* NAME */ \ + RS6000_BTM_FLOAT128_HW, /* MASK */ \ + (RS6000_BTC_ ## ATTR /* ATTR */ \ + | RS6000_BTC_BINARY), \ + CODE_FOR_ ## ICODE) /* ICODE */ + #define BU_FLOAT128_3_HW(ENUM, NAME, ATTR, ICODE) \ RS6000_BUILTIN_3 (MISC_BUILTIN_ ## ENUM, /* ENUM */ \ "__builtin_" NAME, /* NAME */ \ @@ -2365,11 +2373,19 @@ BU_P9_OVERLOAD_2 (CMPEQB, "byte_in_set") BU_FLOAT128_1 (FABSQ, "fabsq", CONST, abskf2) BU_FLOAT128_2 (COPYSIGNQ, "copysignq", CONST, copysignkf3) -/* 1 and 3 argument IEEE 128-bit floating point functions that require ISA 3.0 - hardware. These functions use the new 'f128' suffix. Eventually these - should be folded into the common built-in function handling. */ -BU_FLOAT128_1_HW (SQRTF128, "sqrtf128", CONST, sqrtkf2) -BU_FLOAT128_3_HW (FMAF128, "fmaf128", CONST, fmakf4_hw) +/* 1, 2, and 3 argument IEEE 128-bit floating point functions that require ISA + 3.0 hardware. These functions use the new 'f128' suffix. Eventually the + standard functions should be folded into the common built-in function + handling. */ +BU_FLOAT128_1_HW (SQRTF128, "sqrtf128", CONST, sqrtkf2) +BU_FLOAT128_1_HW (SQRTF128_ODD, "sqrtf128_round_to_odd", CONST, sqrtkf2_odd) +BU_FLOAT128_1_HW (TRUNCF128_ODD, "truncf128_round_to_odd", CONST, trunckfdf2_odd) +BU_FLOAT128_2_HW (ADDF128_ODD, "addf128_round_to_odd", CONST, addkf3_odd) +BU_FLOAT128_2_HW (SUBF128_ODD, "subf128_round_to_odd", CONST, subkf3_odd) +BU_FLOAT128_2_HW (MULF128_ODD, "mulf128_round_to_odd", CONST, mulkf3_odd) +BU_FLOAT128_2_HW (DIVF128_ODD, "divf128_round_to_odd", CONST, divkf3_odd) +BU_FLOAT128_3_HW (FMAF128, "fmaf128", CONST, fmakf4_hw) +BU_FLOAT128_3_HW (FMAF128_ODD, "fmaf128_round_to_odd", CONST, fmakf4_odd) /* 1 argument crypto functions. */ BU_CRYPTO_1 (VSBOX, "vsbox", CONST, crypto_vsbox) Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 253267) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -14505,7 +14505,9 @@ (define_insn_and_split "truncsf2_h "#" "&& 1" [(set (match_dup 2) - (unspec:DF [(match_dup 1)] UNSPEC_ROUND_TO_ODD)) + (unspec:DF [(float_truncate:DF + (match_dup 1))] + UNSPEC_ROUND_TO_ODD)) (set (match_dup 0) (float_truncate:SF (match_dup 2)))] { @@ -14682,9 +14684,125 @@ (define_insn_and_split "floatunsdf2_odd" +(define_insn "add3_odd" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (unspec:IEEE128 + [(plus:IEEE128 + (match_operand:IEEE128 1 "altivec_register_operand" "v") + (match_operand:IEEE128 2 "altivec_register_operand" "v"))] + UNSPEC_ROUND_TO_ODD))] + "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xsaddqpo %0,%1,%2" + [(set_attr "type" "vecfloat") + (set_attr "size" "128")]) + +(define_insn "sub3_odd" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (unspec:IEEE128 + [(minus:IEEE128 + (match_operand:IEEE128 1 "altivec_register_operand" "v") + (match_operand:IEEE128 2 "altivec_register_operand" "v"))] + UNSPEC_ROUND_TO_ODD))] + "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xssubqpo %0,%1,%2" + [(set_attr "type" "vecfloat") + (set_attr "size" "128")]) + +(define_insn "mul3_odd" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (unspec:IEEE128 + [(mult:IEEE128 + (match_operand:IEEE128 1 "altivec_register_operand" "v") + (match_operand:IEEE128 2 "altivec_register_operand" "v"))] + UNSPEC_ROUND_TO_ODD))] + "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xsmulqpo %0,%1,%2" + [(set_attr "type" "vecfloat") + (set_attr "size" "128")]) + +(define_insn "div3_odd" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (unspec:IEEE128 + [(div:IEEE128 + (match_operand:IEEE128 1 "altivec_register_operand" "v") + (match_operand:IEEE128 2 "altivec_register_operand" "v"))] + UNSPEC_ROUND_TO_ODD))] + "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xsdivqpo %0,%1,%2" + [(set_attr "type" "vecdiv") + (set_attr "size" "128")]) + +(define_insn "sqrt2_odd" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (unspec:IEEE128 + [(sqrt:IEEE128 + (match_operand:IEEE128 1 "altivec_register_operand" "v"))] + UNSPEC_ROUND_TO_ODD))] + "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xssqrtqpo %0,%1" + [(set_attr "type" "vecdiv") + (set_attr "size" "128")]) + +(define_insn "fma4_odd" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (unspec:IEEE128 + [(fma:IEEE128 + (match_operand:IEEE128 1 "altivec_register_operand" "%v") + (match_operand:IEEE128 2 "altivec_register_operand" "v") + (match_operand:IEEE128 3 "altivec_register_operand" "0"))] + UNSPEC_ROUND_TO_ODD))] + "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xsmaddqpo %0,%1,%2" + [(set_attr "type" "vecfloat") + (set_attr "size" "128")]) + +(define_insn "*fms4_odd" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (unspec:IEEE128 + [(fma:IEEE128 + (match_operand:IEEE128 1 "altivec_register_operand" "%v") + (match_operand:IEEE128 2 "altivec_register_operand" "v") + (neg:IEEE128 + (match_operand:IEEE128 3 "altivec_register_operand" "0")))] + UNSPEC_ROUND_TO_ODD))] + "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xsmsubqpo %0,%1,%2" + [(set_attr "type" "vecfloat") + (set_attr "size" "128")]) + +(define_insn "*nfma4_odd" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (neg:IEEE128 + (unspec:IEEE128 + [(fma:IEEE128 + (match_operand:IEEE128 1 "altivec_register_operand" "%v") + (match_operand:IEEE128 2 "altivec_register_operand" "v") + (match_operand:IEEE128 3 "altivec_register_operand" "0"))] + UNSPEC_ROUND_TO_ODD)))] + "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xsnmaddqpo %0,%1,%2" + [(set_attr "type" "vecfloat") + (set_attr "size" "128")]) + +(define_insn "*nfms4_odd" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (neg:IEEE128 + (unspec:IEEE128 + [(fma:IEEE128 + (match_operand:IEEE128 1 "altivec_register_operand" "%v") + (match_operand:IEEE128 2 "altivec_register_operand" "v") + (neg:IEEE128 + (match_operand:IEEE128 3 "altivec_register_operand" "0")))] + UNSPEC_ROUND_TO_ODD)))] + "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xsnmsubqpo %0,%1,%2" + [(set_attr "type" "vecfloat") + (set_attr "size" "128")]) + +(define_insn "truncdf2_odd" [(set (match_operand:DF 0 "vsx_register_operand" "=v") - (unspec:DF [(match_operand:IEEE128 1 "altivec_register_operand" "v")] + (unspec:DF [(float_truncate:DF + (match_operand:IEEE128 1 "altivec_register_operand" "v"))] UNSPEC_ROUND_TO_ODD))] "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" "xscvqpdpo %0,%1" Index: gcc/doc/extend.texi =================================================================== --- gcc/doc/extend.texi (revision 253267) +++ gcc/doc/extend.texi (working copy) @@ -15348,14 +15348,47 @@ that use the ISA 3.0 instruction set. @table @code @item __float128 __builtin_sqrtf128 (__float128) -Similar to @code{__builtin_sqrtf}, except the return and input types -are @code{__float128}. +Perform a 128-bit IEEE floating point square root operation. @findex __builtin_sqrtf128 @item __float128 __builtin_fmaf128 (__float128, __float128, __float128) -Similar to @code{__builtin_fma}, except the return and input types are -@code{__float128}. +Perform a 128-bit IEEE floating point fused multiply and add operation. @findex __builtin_fmaf128 + +@item __float128 __builtin_addf128_round_to_odd (__float128, __float128) +Perform a 128-bit IEEE floating point add using round to odd as the +rounding mode. +@findex __builtin_addf128_round_to_odd + +@item __float128 __builtin_subf128_round_to_odd (__float128, __float128) +Perform a 128-bit IEEE floating point subtract using round to odd as +the rounding mode. +@findex __builtin_subf128_round_to_odd + +@item __float128 __builtin_mulf128_round_to_odd (__float128, __float128) +Perform a 128-bit IEEE floating point multiply using round to odd as +the rounding mode. +@findex __builtin_mulf128_round_to_odd + +@item __float128 __builtin_divf128_round_to_odd (__float128, __float128) +Perform a 128-bit IEEE floating point divide using round to odd as +the rounding mode. +@findex __builtin_divf128_round_to_odd + +@item __float128 __builtin_sqrtf128_round_to_odd (__float128) +Perform a 128-bit IEEE floating point square root using round to odd +as the rounding mode. +@findex __builtin_sqrtf128_round_to_odd + +@item __float128 __builtin_fmaf128 (__float128, __float128, __float128) +Perform a 128-bit IEEE floating point fused multiply and add operation +using round to odd as the rounding mode. +@findex __builtin_fmaf128_round_to_odd + +@item double __builtin_truncf128_round_to_odd (__float128) +Convert a 128-bit IEEE floating point value to @code{double} using +round to odd as the rounding mode. +@findex __builtin_truncf128_round_to_odd @end table The following built-in functions are available for the PowerPC family Index: gcc/testsuite/gcc.target/powerpc/float128-odd.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/float128-odd.c (nonexistent) +++ gcc/testsuite/gcc.target/powerpc/float128-odd.c (working copy) @@ -0,0 +1,75 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-options "-mpower9-vector -O2" } */ + +/* Test the generation of the round to odd instructions. */ +__float128 +f128_add(__float128 a, __float128 b) +{ + return __builtin_addf128_round_to_odd (a, b); +} + +__float128 +f128_sub (__float128 a, __float128 b) +{ + return __builtin_subf128_round_to_odd (a, b); +} + +__float128 +f128_mul (__float128 a, __float128 b) +{ + return __builtin_mulf128_round_to_odd (a, b); +} + +__float128 +f128_div (__float128 a, __float128 b) +{ + return __builtin_divf128_round_to_odd (a, b); +} + +__float128 +f128_sqrt (__float128 a) +{ + return __builtin_sqrtf128_round_to_odd (a); +} + +double +f128_trunc (__float128 a) +{ + return __builtin_truncf128_round_to_odd (a); +} + +__float128 +f128_fma (__float128 a, __float128 b, __float128 c) +{ + return __builtin_fmaf128_round_to_odd (a, b, c); +} + +__float128 +f128_fms (__float128 a, __float128 b, __float128 c) +{ + return __builtin_fmaf128_round_to_odd (a, b, -c); +} + +__float128 +f128_nfma (__float128 a, __float128 b, __float128 c) +{ + return - __builtin_fmaf128_round_to_odd (a, b, c); +} + +__float128 +f128_nfms (__float128 a, __float128 b, __float128 c) +{ + return - __builtin_fmaf128_round_to_odd (a, b, -c); +} + +/* { dg-final { scan-assembler {\mxsaddqpo\M} } } */ +/* { dg-final { scan-assembler {\mxssubqpo\M} } } */ +/* { dg-final { scan-assembler {\mxsmulqpo\M} } } */ +/* { dg-final { scan-assembler {\mxsdivqpo\M} } } */ +/* { dg-final { scan-assembler {\mxssqrtqpo\M} } } */ +/* { dg-final { scan-assembler {\mxscvqpdpo\M} } } */ +/* { dg-final { scan-assembler {\mxsmaddqpo\M} } } */ +/* { dg-final { scan-assembler {\mxsmsubqpo\M} } } */ +/* { dg-final { scan-assembler {\mxsnmaddqpo\M} } } */ +/* { dg-final { scan-assembler {\mxsnmsubqpo\M} } } */ --NzB8fVQJ5HfG6fxh--