From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 94972 invoked by alias); 17 May 2016 14:41:48 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Received: (qmail 94954 invoked by uid 89); 17 May 2016 14:41:47 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY,RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=vadd, vabs, vneg, vdiv
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 17 May 2016 14:41:37 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 77F0428 for ; Tue, 17 May 2016 07:41:53 -0700 (PDT)
Received: from [10.2.206.222] (e108033-lin.cambridge.arm.com [10.2.206.222]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 01A073F21A for ; Tue, 17 May 2016 07:41:35 -0700 (PDT)
Subject: [PATCH 11/17][ARM] Add builtins for VFP FP16 intrinsics.
To: gcc-patches
References: <573B28A3.9030603@foss.arm.com>
From: Matthew Wahab
Message-ID: <573B2D9E.2000202@foss.arm.com>
Date: Tue, 17 May 2016 14:41:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1
MIME-Version: 1.0
In-Reply-To: <573B28A3.9030603@foss.arm.com>
Content-Type: multipart/mixed; boundary="------------080308020505030103010507"
X-IsSubscribed: yes
X-SW-Source: 2016-05/txt/msg01252.txt.bz2

This is a multi-part message in MIME format.
--------------080308020505030103010507
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-length: 1318

The ACLE intrinsics introduced to support the ARMv8.2 FP16 extensions
require that intrinsics for scalar floating-point (VFP) instructions are
available under different conditions from those for the NEON intrinsics.

This patch adds the support code and builtins data for the new VFP
intrinsics. Because of the similarities between the scalar and NEON
builtins, the support code for the scalar builtins follows the code for
the NEON builtins. The declarations for the VFP builtins are also added
in this patch since the support code expects non-empty tables.

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

Ok for trunk?
Matthew

2016-05-17  Matthew Wahab

	* config/arm/arm-builtins.c (hf_UP): New.
	(si_UP): New.
	(vfp_builtin_data): New.  Update comment.
	(enum arm_builtins): Include arm_vfp_builtins.def.
	(ARM_BUILTIN_VFP_PATTERN_START): New.
	(arm_init_vfp_builtins): New.
	(arm_init_builtins): Add arm_init_vfp_builtins.
	(arm_expand_vfp_builtin): New.
	(arm_expand_builtin): Update for arm_expand_vfp_builtin.  Fix
	long line.
	* config/arm/arm_vfp_builtins.def: New file.
	* config/arm/t-arm (arm.o): Add arm_vfp_builtins.def.
	(arm-builtins.o): Likewise.

--------------080308020505030103010507
Content-Type: text/x-patch;
 name="0011-PATCH-11-17-ARM-Add-builtins-for-VFP-FP16-intrinsics.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename*0="0011-PATCH-11-17-ARM-Add-builtins-for-VFP-FP16-intrinsics.pa";
 filename*1="tch"
Content-length: 8656

>From d1f2b10a2e672b1dc886d8d1efb136d970f967f1 Mon Sep 17 00:00:00 2001
From: Matthew Wahab
Date: Thu, 7 Apr 2016 15:33:14 +0100
Subject: [PATCH 11/17] [PATCH 11/17][ARM] Add builtins for VFP FP16 intrinsics.

2016-05-17  Matthew Wahab

	* config/arm/arm-builtins.c (hf_UP): New.
	(si_UP): New.
	(vfp_builtin_data): New.  Update comment.
	(arm_init_vfp_builtins): New.
	(arm_init_builtins): Add arm_init_vfp_builtins.
	(arm_expand_vfp_builtin): New.
	(arm_expand_builtin): Update for arm_expand_vfp_builtin.  Fix
	long line.
	* config/arm/arm_vfp_builtins.def: New file.
	* config/arm/t-arm (arm.o): Add arm_vfp_builtins.def.
	(arm-builtins.o): Likewise.
---
 gcc/config/arm/arm-builtins.c       | 75 +++++++++++++++++++++++++++++++++----
 gcc/config/arm/arm_vfp_builtins.def | 56 +++++++++++++++++++++++++++
 gcc/config/arm/t-arm                |  4 +-
 3 files changed, 126 insertions(+), 9 deletions(-)
 create mode 100644 gcc/config/arm/arm_vfp_builtins.def

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 5a22b91..58c68a6 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -190,6 +190,8 @@ arm_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define ti_UP TImode
 #define ei_UP EImode
 #define oi_UP OImode
+#define hf_UP HFmode
+#define si_UP SImode

 #define UP(X) X##_UP

@@ -239,12 +241,22 @@ typedef struct {
   VAR11 (T, N, A, B, C, D, E, F, G, H, I, J, K) \
   VAR1 (T, N, L)

-/* The NEON builtin data can be found in arm_neon_builtins.def.
-   The mode entries in the following table correspond to the "key" type of the
-   instruction variant, i.e. equivalent to that which would be specified after
-   the assembler mnemonic, which usually refers to the last vector operand.
-   The modes listed per instruction should be the same as those defined for
-   that instruction's pattern in neon.md.  */
+/* The NEON builtin data can be found in arm_neon_builtins.def and
+   arm_vfp_builtins.def.  The entries in arm_neon_builtins.def require
+   TARGET_NEON to be true.  The entries in arm_vfp_builtins.def require
+   TARGET_VFP to be true.  The feature tests are checked when the builtins are
+   expanded.
+
+   The mode entries in the following table correspond to
+   the "key" type of the instruction variant, i.e. equivalent to that which
+   would be specified after the assembler mnemonic, which usually refers to the
+   last vector operand.  The modes listed per instruction should be the same as
+   those defined for that instruction's pattern in neon.md.  */
+
+static neon_builtin_datum vfp_builtin_data[] =
+{
+#include "arm_vfp_builtins.def"
+};

 static neon_builtin_datum neon_builtin_data[] =
 {
@@ -534,6 +546,10 @@ enum arm_builtins
 #undef CRYPTO2
 #undef CRYPTO3

+  ARM_BUILTIN_VFP_BASE,
+
+#include "arm_vfp_builtins.def"
+
   ARM_BUILTIN_NEON_BASE,
   ARM_BUILTIN_NEON_LANE_CHECK = ARM_BUILTIN_NEON_BASE,

@@ -542,6 +558,9 @@ enum arm_builtins
   ARM_BUILTIN_MAX
 };

+#define ARM_BUILTIN_VFP_PATTERN_START \
+  (ARM_BUILTIN_VFP_BASE + 1)
+
 #define ARM_BUILTIN_NEON_PATTERN_START \
   (ARM_BUILTIN_NEON_BASE + 1)

@@ -1033,6 +1052,20 @@ arm_init_neon_builtins (void)
     }
 }

+/* Set up all the scalar floating point builtins.  */
+
+static void
+arm_init_vfp_builtins (void)
+{
+  unsigned int i, fcode = ARM_BUILTIN_VFP_PATTERN_START;
+
+  for (i = 0; i < ARRAY_SIZE (vfp_builtin_data); i++, fcode++)
+    {
+      neon_builtin_datum *d = &vfp_builtin_data[i];
+      arm_init_neon_builtin (fcode, d);
+    }
+}
+
 static void
 arm_init_crypto_builtins (void)
 {
@@ -1777,7 +1810,7 @@ arm_init_builtins (void)
   if (TARGET_HARD_FLOAT)
     {
       arm_init_neon_builtins ();
-
+      arm_init_vfp_builtins ();
       arm_init_crypto_builtins ();
     }

@@ -2324,6 +2357,27 @@ arm_expand_neon_builtin (int fcode, tree exp, rtx target)
   return arm_expand_neon_builtin_1 (fcode, exp, target, d);
 }

+/* Expand a VFP builtin, if TARGET_VFP is true.  These builtins are treated like
+   neon builtins except that the data is looked up in table
+   VFP_BUILTIN_DATA.  */
+
+static rtx
+arm_expand_vfp_builtin (int fcode, tree exp, rtx target)
+{
+  if (fcode >= ARM_BUILTIN_VFP_BASE && ! TARGET_VFP)
+    {
+      fatal_error (input_location,
+                   "You must enable VFP instructions"
+                   " to use these intrinsics.");
+      return const0_rtx;
+    }
+
+  neon_builtin_datum *d
+    = &vfp_builtin_data[fcode - ARM_BUILTIN_VFP_PATTERN_START];
+
+  return arm_expand_neon_builtin_1 (fcode, exp, target, d);
+}
+
 /* Expand an expression EXP that calls a built-in function,
    with result going to TARGET if that's convenient
    (and in mode MODE if that's convenient).
@@ -2361,13 +2415,18 @@ arm_expand_builtin (tree exp,
   if (fcode >= ARM_BUILTIN_NEON_BASE)
     return arm_expand_neon_builtin (fcode, exp, target);

+  if (fcode >= ARM_BUILTIN_VFP_BASE)
+    return arm_expand_vfp_builtin (fcode, exp, target);
+
   /* Check in the context of the function making the call whether the
      builtin is supported.  */
   if (fcode >= ARM_BUILTIN_CRYPTO_BASE
       && (!TARGET_CRYPTO || !TARGET_HARD_FLOAT))
     {
       fatal_error (input_location,
-                   "You must enable crypto intrinsics (e.g. include -mfloat-abi=softfp -mfpu=crypto-neon...) to use these intrinsics.");
+                   "You must enable crypto instructions"
+                   " (e.g. include -mfloat-abi=softfp -mfpu=crypto-neon...)"
+                   " to use these intrinsics.");
       return const0_rtx;
     }

diff --git a/gcc/config/arm/arm_vfp_builtins.def b/gcc/config/arm/arm_vfp_builtins.def
new file mode 100644
index 0000000..35014ce
--- /dev/null
+++ b/gcc/config/arm/arm_vfp_builtins.def
@@ -0,0 +1,56 @@
+/* VFP instruction builtin definitions.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+/* This file lists the builtins that may be available when VFP is enabled but
+   NEON is not.  The entries otherwise have the same requirements and
+   generate the same structures as those in arm_neon_builtins.def.  */
+
+/* FP16 Arithmetic instructions.  */
+VAR1 (UNOP, vabs, hf)
+VAR2 (UNOP, vcvths, hf, si)
+VAR2 (UNOP, vcvthu, hf, si)
+VAR1 (UNOP, vcvtahs, si)
+VAR1 (UNOP, vcvtahu, si)
+VAR1 (UNOP, vcvtmhs, si)
+VAR1 (UNOP, vcvtmhu, si)
+VAR1 (UNOP, vcvtnhs, si)
+VAR1 (UNOP, vcvtnhu, si)
+VAR1 (UNOP, vcvtphs, si)
+VAR1 (UNOP, vcvtphu, si)
+VAR1 (UNOP, vneg, hf)
+VAR1 (UNOP, vrnd, hf)
+VAR1 (UNOP, vrnda, hf)
+VAR1 (UNOP, vrndi, hf)
+VAR1 (UNOP, vrndm, hf)
+VAR1 (UNOP, vrndn, hf)
+VAR1 (UNOP, vrndp, hf)
+VAR1 (UNOP, vrndx, hf)
+VAR1 (UNOP, vsqrt, hf)
+
+VAR1 (BINOP, vadd, hf)
+VAR2 (BINOP, vcvths_n, hf, si)
+VAR2 (BINOP, vcvthu_n, hf, si)
+VAR1 (BINOP, vdiv, hf)
+VAR1 (BINOP, vmaxnm, hf)
+VAR1 (BINOP, vminnm, hf)
+VAR1 (BINOP, vmulf, hf)
+VAR1 (BINOP, vsub, hf)
+
+VAR1 (TERNOP, vfma, hf)
+VAR1 (TERNOP, vfms, hf)
diff --git a/gcc/config/arm/t-arm b/gcc/config/arm/t-arm
index 749a58d..803baa2 100644
--- a/gcc/config/arm/t-arm
+++ b/gcc/config/arm/t-arm
@@ -95,7 +95,8 @@ arm.o: $(srcdir)/config/arm/arm.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
   $(srcdir)/config/arm/arm-cores.def \
   $(srcdir)/config/arm/arm-arches.def $(srcdir)/config/arm/arm-fpus.def \
   $(srcdir)/config/arm/arm-protos.h \
-  $(srcdir)/config/arm/arm_neon_builtins.def
+  $(srcdir)/config/arm/arm_neon_builtins.def \
+  $(srcdir)/config/arm/arm_vfp_builtins.def

 arm-builtins.o: $(srcdir)/config/arm/arm-builtins.c $(CONFIG_H) \
   $(SYSTEM_H) coretypes.h $(TM_H) \
@@ -103,6 +104,7 @@ arm-builtins.o: $(srcdir)/config/arm/arm-builtins.c $(CONFIG_H) \
   $(DIAGNOSTIC_CORE_H) $(OPTABS_H) \
   $(srcdir)/config/arm/arm-protos.h \
   $(srcdir)/config/arm/arm_neon_builtins.def \
+  $(srcdir)/config/arm/arm_vfp_builtins.def \
   $(srcdir)/config/arm/arm-simd-builtin-types.def
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
 		$(srcdir)/config/arm/arm-builtins.c
--
2.1.4

--------------080308020505030103010507--
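
The patch expands one list of entries twice, once into enumerators that
follow a BASE marker and once into a data table, and then finds a table
entry by subtracting the PATTERN_START value from the function code. The
sketch below shows that arrangement in isolation. It is an illustration
only, not code from GCC: every DEMO_* and demo_* name is hypothetical, the
entry list is written inline as a macro where the patch includes a separate
.def file, and the datum holds strings rather than the fields of
neon_builtin_datum.

```c
#include <assert.h>

/* Hypothetical stand-in for the contents of a .def file: each entry gives
   a kind of operation, an instruction name and a mode.  */
#define DEMO_BUILTINS(X) \
  X (UNOP, vabs, hf)     \
  X (UNOP, vneg, hf)     \
  X (BINOP, vadd, hf)    \
  X (TERNOP, vfma, hf)

/* First expansion: one enumerator per entry, placed after a BASE marker.  */
enum demo_builtin
{
  DEMO_BUILTIN_BASE,
#define DEMO_ENUM(KIND, NAME, MODE) DEMO_BUILTIN_##NAME##MODE,
  DEMO_BUILTINS (DEMO_ENUM)
#undef DEMO_ENUM
  DEMO_BUILTIN_MAX
};

/* The first real entry sits one past the BASE marker.  */
#define DEMO_PATTERN_START (DEMO_BUILTIN_BASE + 1)

struct demo_datum
{
  const char *kind;
  const char *name;
  const char *mode;
};

/* Second expansion: one table row per entry, in the same order.  */
static const struct demo_datum demo_data[] =
{
#define DEMO_DATUM(KIND, NAME, MODE) { #KIND, #NAME, #MODE },
  DEMO_BUILTINS (DEMO_DATUM)
#undef DEMO_DATUM
};

/* Both expansions come from the same list, so the sizes must agree.  */
_Static_assert (sizeof demo_data / sizeof demo_data[0]
                == DEMO_BUILTIN_MAX - DEMO_PATTERN_START,
                "enum and table must have the same number of entries");

/* Map a function code to its table row by subtracting the start value.  */
static const struct demo_datum *
demo_lookup (int fcode)
{
  assert (fcode >= DEMO_PATTERN_START && fcode < DEMO_BUILTIN_MAX);
  return &demo_data[fcode - DEMO_PATTERN_START];
}
```

Calling demo_lookup (DEMO_BUILTIN_vaddhf) returns the row for vadd.
Because the enumerators and the rows are generated from the same list, an
entry added to the list keeps the two in step without further changes.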