From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 6530 invoked by alias); 8 Dec 2015 12:53:28 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 6521 invoked by uid 89); 8 Dec 2015 12:53:27 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.3 required=5.0 tests=AWL,BAYES_50,KAM_ASCII_DIVIDERS,KAM_LAZY_DOMAIN_SECURITY,RCVD_IN_DNSWL_LOW autolearn=no version=3.3.2 X-HELO: mx07-00178001.pphosted.com Received: from mx08-00178001.pphosted.com (HELO mx07-00178001.pphosted.com) (91.207.212.93) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Tue, 08 Dec 2015 12:53:26 +0000 Received: from pps.filterd (m0046660.ppops.net [127.0.0.1]) by mx08-00178001.pphosted.com (8.14.5/8.14.5) with SMTP id tB8CrChj015705; Tue, 8 Dec 2015 13:53:19 +0100 Received: from beta.dmz-eu.st.com (beta.dmz-eu.st.com [164.129.1.35]) by mx08-00178001.pphosted.com with ESMTP id 1ynv2992bv-1 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Tue, 08 Dec 2015 13:53:19 +0100 Received: from zeta.dmz-eu.st.com (zeta.dmz-eu.st.com [164.129.230.9]) by beta.dmz-eu.st.com (STMicroelectronics) with ESMTP id DE16531; Tue, 8 Dec 2015 12:52:41 +0000 (GMT) Received: from Webmail-eu.st.com (safex1hubcas5.st.com [10.75.90.71]) by zeta.dmz-eu.st.com (STMicroelectronics) with ESMTP id 44CE72C9F; Tue, 8 Dec 2015 12:53:17 +0000 (GMT) Received: from [164.129.122.197] (164.129.122.197) by webmail-eu.st.com (10.75.90.13) with Microsoft SMTP Server (TLS) id 8.3.389.2; Tue, 8 Dec 2015 13:53:16 +0100 From: Christian Bruel Subject: [PATCH, ARM] PR68674 Fix LTO support for neon builtins and error catching To: , , X-No-Archive: yes Message-ID: <5666D2BC.5030105@st.com> Date: Tue, 08 Dec 2015 12:53:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------------060404000709070507060801" X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.15.21,1.0.33,0.0.0000 definitions=2015-12-08_08:2015-12-08,2015-12-08,1970-01-01 signatures=0 X-IsSubscribed: yes X-SW-Source: 2015-12/txt/msg00860.txt.bz2 --------------060404000709070507060801 Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit Content-length: 1898 Hi, The order of the NEON builtins construction has led to complications since the attribute target support. This was not a problem when driven from the command line, but was causing various issues when the builtins was mixed between fpu configurations or when used with LTO. Firstly the builtin functions was not initialized before the parsing of functions, leading to wrong type initializations. Then error catching code when a builtin was used without the proper fpu flags was incomprehensible for the user, for instance #include "arm_neon.h" int8x8_t a, b; int16x8_t e; void main() { e = (int16x8_t)__builtin_neon_vaddlsv8qi (a, b); } compiled with default options (without -mfpu=neon -mfloat-abi=hard) gave pages of /arm-none-eabi/6.0.0/include/arm_neon.h:39:9: error: unknown type name '__simd64_int8_t' typedef __simd64_int8_t int8x8_t; ... ... arm_neon.h:4724:3: error: can't convert a vector of type 'poly64x2_t {aka __vector(4) int}' to type 'int' which has different size return (poly64x2_t)__builtin_neon_vsli_nv2di ((int64x2_t) __a, (int64x2_t) __b, __c); ^~~~~~ ... ... and one for each arm_neon.h lines.. by postponing the check into arm_expand_builtin, we now emit something more useful: testo.c: In function 'main': testo.c:9:7: error: '__builtin_neon_vaddlsv8qi' neon builtin is not supported in this configuration. e = (int16x8_t)__builtin_neon_vaddlsv8qi (a, b); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ One small side effect to note: The total memory allocated is 370k bigger when neon is not used, so this support will have a follow-up to make their initialization lazy. But I'd like first to stabilize the stuff for stage3 (or get it pre-approved if the memory is an issue) tested without new failures with {,-mfpu=vfp,-mfpu=neon}{,-march=armv7-a\} (a few tests that was fail are now unsupported) OK for trunk ? --------------060404000709070507060801 Content-Type: text/x-patch; name="lto-neon.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="lto-neon.patch" Content-length: 6967 2015-12-07 Christian Bruel * config/arm/arm-builtins.c (ARM_BUILTIN_CRYPTO_BASE): New enum tag. (arm_init_neon_builtins_internal): Rename arm_init_neon_builtins, (arm_init_crypto_builtins_internal): Rename arm_init_crypto_builtins. use add_builtin_function_ext_scope instead of add_builtin_function. (neon_set_p, neon_crypto_set_p): Remove. (arm_init_builtins): Always call arm_init_neon_builtins and arm_init_crypto_builtins. (arm_expand_builtin): Check ARM_BUILTIN_NEON_BASE and ARM_BUILTIN_CRYPTO_BASE. * config/arm/arm-protos.h (arm_init_neon_builtins): Remove proto. * config/arm/arm.c (arm_can_inline_p): Return OK for builtins. (arm_valid_target_attribute_tree) : Remove arm_init_neon_builtins call. 2015-12-07 Christian Bruel PR target/pr68784 PR target/pr65837 * gcc.target/arm/pr68784.c: New test. * gcc.target/arm/lto/pr65837_0_attr.c: New test. * gcc.target/arm/lto/pr65837_0.c: Force float-abi. Index: gcc/config/arm/arm-builtins.c =================================================================== --- gcc/config/arm/arm-builtins.c (revision 231363) +++ gcc/config/arm/arm-builtins.c (working copy) @@ -526,6 +526,8 @@ enum arm_builtins #define CRYPTO3(L, U, M1, M2, M3, M4) \ ARM_BUILTIN_CRYPTO_##U, + ARM_BUILTIN_CRYPTO_BASE, + #include "crypto.def" #undef CRYPTO1 @@ -894,7 +896,7 @@ arm_init_simd_builtin_scalar_types (void } static void -arm_init_neon_builtins_internal (void) +arm_init_neon_builtins (void) { unsigned int i, fcode = ARM_BUILTIN_NEON_PATTERN_START; @@ -1018,7 +1020,7 @@ arm_init_neon_builtins_internal (void) } static void -arm_init_crypto_builtins_internal (void) +arm_init_crypto_builtins (void) { tree V16UQI_type_node = arm_simd_builtin_type (V16QImode, true, false); @@ -1098,25 +1100,6 @@ arm_init_crypto_builtins_internal (void) #undef FT3 } -static bool neon_set_p = false; -static bool neon_crypto_set_p = false; - -void -arm_init_neon_builtins (void) -{ - if (! neon_set_p) - { - neon_set_p = true; - arm_init_neon_builtins_internal (); - } - - if (! neon_crypto_set_p && TARGET_CRYPTO && TARGET_HARD_FLOAT) - { - neon_crypto_set_p = true; - arm_init_crypto_builtins_internal (); - } -} - #undef NUM_DREG_TYPES #undef NUM_QREG_TYPES @@ -1777,8 +1760,9 @@ arm_init_builtins (void) arm_init_neon_builtins which uses it. */ arm_init_fp16_builtins (); - if (TARGET_NEON) - arm_init_neon_builtins (); + arm_init_neon_builtins (); + + arm_init_crypto_builtins (); if (TARGET_CRC32) arm_init_crc32_builtins (); @@ -2332,9 +2316,26 @@ arm_expand_builtin (tree exp, int mask; int imm; + /* Check in the context of the function making the call whether the + builtin is supported. */ + if (fcode >= ARM_BUILTIN_NEON_BASE && !TARGET_NEON) + { + error ("%qE neon builtin is not supported in this configuration.", + fndecl); + return const0_rtx; + } + if (fcode >= ARM_BUILTIN_NEON_BASE) return arm_expand_neon_builtin (fcode, exp, target); + if (fcode >= ARM_BUILTIN_CRYPTO_BASE + && (!TARGET_CRYPTO || !TARGET_HARD_FLOAT)) + { + error ("%qE crypto builtin is not supported in this configuration.", + fndecl); + return const0_rtx; + } + switch (fcode) { case ARM_BUILTIN_GET_FPSCR: Index: gcc/config/arm/arm-protos.h =================================================================== --- gcc/config/arm/arm-protos.h (revision 231363) +++ gcc/config/arm/arm-protos.h (working copy) @@ -213,7 +213,6 @@ extern void arm_mark_dllimport (tree); extern bool arm_change_mode_p (tree); #endif -extern void arm_init_neon_builtins (void); extern tree arm_valid_target_attribute_tree (tree, struct gcc_options *, struct gcc_options *); extern void arm_pr_long_calls (struct cpp_reader *); Index: gcc/config/arm/arm.c =================================================================== --- gcc/config/arm/arm.c (revision 231363) +++ gcc/config/arm/arm.c (working copy) @@ -26542,16 +26542,10 @@ thumb_set_return_address (rtx source, rt bool arm_vector_mode_supported_p (machine_mode mode) { - /* Neon also supports V2SImode, etc. listed in the clause below. */ - if (TARGET_NEON && (mode == V2SFmode || mode == V4SImode || mode == V8HImode + if (mode == V2SFmode || mode == V4SImode || mode == V8HImode || mode == V4HFmode || mode == V16QImode || mode == V4SFmode - || mode == V2DImode || mode == V8HFmode)) - return true; - - if ((TARGET_NEON || TARGET_IWMMXT) - && ((mode == V2SImode) - || (mode == V4HImode) - || (mode == V8QImode))) + || mode == V2DImode || mode == V8HFmode + || mode == V2SImode || mode == V4HImode || mode == V8QImode) return true; if (TARGET_INT_SIMD && (mode == V4UQQmode || mode == V4QQmode @@ -29926,9 +29920,6 @@ arm_valid_target_attribute_tree (tree ar /* Do any overrides, such as global options arch=xxx. */ arm_option_override_internal (opts, opts_set); - if (TARGET_NEON) - arm_init_neon_builtins (); - return build_target_option_node (opts); } Index: gcc/testsuite/gcc.target/arm/lto/pr65837_0.c =================================================================== --- gcc/testsuite/gcc.target/arm/lto/pr65837_0.c (revision 231363) +++ gcc/testsuite/gcc.target/arm/lto/pr65837_0.c (working copy) @@ -1,5 +1,7 @@ /* { dg-lto-do run } */ -/* { dg-lto-options {{-flto -mfpu=neon}} } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-skip-if "need hardfp ABI" { *-*-* } { "-mfloat-abi=soft" } { "" } } */ +/* { dg-lto-options {{-flto -mfpu=neon -mfloat-abi=hard}} } */ /* { dg-suppress-ld-options {-mfpu=neon} } */ #include "arm_neon.h" Index: gcc/testsuite/gcc.target/arm/lto/pr65837_0_attr.c =================================================================== --- gcc/testsuite/gcc.target/arm/lto/pr65837_0_attr.c (revision 0) +++ gcc/testsuite/gcc.target/arm/lto/pr65837_0_attr.c (working copy) @@ -0,0 +1,16 @@ +/* { dg-lto-do run } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-skip-if "need hardfp ABI" { *-*-* } { "-mfloat-abi=soft" } { "" } } */ +/* { dg-lto-options {{-flto -mfloat-abi=hard}} } */ + +#include "arm_neon.h" + +float32x2_t a, b, c, e; + +int __attribute__ ((target("fpu=neon"))) +main() +{ + e = __builtin_neon_vmls_lanev2sf (a, b, c, 0); + return 0; +} + Index: gcc/testsuite/gcc.target/arm/pr68784.c =================================================================== --- gcc/testsuite/gcc.target/arm/pr68784.c (revision 0) +++ gcc/testsuite/gcc.target/arm/pr68784.c (working copy) @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O2 -mfloat-abi=softfp" } */ + +#include "arm_neon.h" + +int8x8_t a, b; +int16x8_t e; + +void +__attribute__ ((target("fpu=neon"))) +foo(void) +{ + e = (int16x8_t)__builtin_neon_vaddlsv8qi (a, b); +} + --------------060404000709070507060801--