From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 117201 invoked by alias); 10 Dec 2015 09:59:58 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 115627 invoked by uid 89); 10 Dec 2015 09:59:56 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00,SPF_PASS autolearn=ham version=3.3.2 X-HELO: eu-smtp-delivery-143.mimecast.com Received: from eu-smtp-delivery-143.mimecast.com (HELO eu-smtp-delivery-143.mimecast.com) (146.101.78.143) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 10 Dec 2015 09:59:54 +0000 Received: from cam-owa2.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id uk-mta-35-sucsg7XbSAmGyEl-qQrf-Q-1; Thu, 10 Dec 2015 09:59:48 +0000 Received: from [10.2.206.200] ([10.1.2.79]) by cam-owa2.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959); Thu, 10 Dec 2015 09:59:47 +0000 Message-ID: <56694D13.1010208@arm.com> Date: Thu, 10 Dec 2015 09:59:00 -0000 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: Christian Bruel , "ramana.radhakrishnan@foss.arm.com" , "gcc-patches@gcc.gnu.org" Subject: Re: [PATCH, ARM] PR68674 Fix LTO support for neon builtins and error catching References: <5666D2BC.5030105@st.com> <566865A6.4020307@arm.com> <5669453F.5010100@st.com> In-Reply-To: <5669453F.5010100@st.com> X-MC-Unique: sucsg7XbSAmGyEl-qQrf-Q-1 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable X-IsSubscribed: yes X-SW-Source: 2015-12/txt/msg01093.txt.bz2 On 10/12/15 09:26, Christian Bruel wrote: > Hi Kyrill, > > On 12/09/2015 06:32 PM, Kyrill Tkachov wrote: >> Hi Christian, >> >> On 08/12/15 12:53, Christian Bruel wrote: >>> Hi, >>> >>> The order of the NEON builtins construction has led to complications si= nce the attribute target support. This was not a problem when driven from t= he command line, but was causing various issues when the builtins was mixed= between fpu >>> configurations or when used with LTO. >>> >>> Firstly the builtin functions was not initialized before the parsing of= functions, leading to wrong type initializations. >>> >>> Then error catching code when a builtin was used without the proper fpu= flags was incomprehensible for the user, for instance >>> >>> #include "arm_neon.h" >>> >>> int8x8_t a, b; >>> int16x8_t e; >>> >>> void >>> main() >>> { >>> e =3D (int16x8_t)__builtin_neon_vaddlsv8qi (a, b); >>> } >>> >>> compiled with default options (without -mfpu=3Dneon -mfloat-abi=3Dhard)= gave pages of >>> >>> /arm-none-eabi/6.0.0/include/arm_neon.h:39:9: error: unknown type name = '__simd64_int8_t' >>> typedef __simd64_int8_t int8x8_t; >>> ... >>> ... >>> arm_neon.h:4724:3: error: can't convert a vector of type 'poly64x2_t {a= ka __vector(4) int}' to type 'int' which has different size >>> return (poly64x2_t)__builtin_neon_vsli_nv2di ((int64x2_t) __a, (int= 64x2_t) __b, __c); >>> ^~~~~~ >>> ... >>> ... and one for each arm_neon.h lines.. >>> >>> by postponing the check into arm_expand_builtin, we now emit something = more useful: >>> >>> testo.c: In function 'main': >>> testo.c:9:7: error: '__builtin_neon_vaddlsv8qi' neon builtin is not sup= ported in this configuration. >>> e =3D (int16x8_t)__builtin_neon_vaddlsv8qi (a, b); >>> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >>> >>> One small side effect to note: The total memory allocated is 370k bigge= r when neon is not used, so this support will have a follow-up to make thei= r initialization lazy. But I'd like first to stabilize the stuff for stage3= (or get it >>> pre-approved if the memory is an issue) >>> >>> tested without new failures with {,-mfpu=3Dvfp,-mfpu=3Dneon}{,-march=3D= armv7-a\} >>> (a few tests that was fail are now unsupported) >>> >> >> I agree, the vector types (re)initialisation is a tricky part. >> I've seen similar issues in the aarch64 work for target attributes >> >> bool >> arm_vector_mode_supported_p (machine_mode mode) >> { >> - /* Neon also supports V2SImode, etc. listed in the clause below. */ >> - if (TARGET_NEON && (mode =3D=3D V2SFmode || mode =3D=3D V4SImode || m= ode =3D=3D V8HImode >> + if (mode =3D=3D V2SFmode || mode =3D=3D V4SImode || mode =3D=3D V8HIm= ode >> || mode =3D=3D V4HFmode || mode =3D=3D V16QImode || mode =3D=3D= V4SFmode >> - || mode =3D=3D V2DImode || mode =3D=3D V8HFmode)) >> - return true; >> - >> - if ((TARGET_NEON || TARGET_IWMMXT) >> - && ((mode =3D=3D V2SImode) >> - || (mode =3D=3D V4HImode) >> - || (mode =3D=3D V8QImode))) >> + || mode =3D=3D V2DImode || mode =3D=3D V8HFmode >> + || mode =3D=3D V2SImode || mode =3D=3D V4HImode || mode =3D=3D V8= QImode) >> return true; >> >> >> So this allows vector modes unconditionally for all targets/fpu configur= ations? >> I was tempted to do that in aarch64 when I was encountering similar issu= es. >> In the end what worked for me was re-laying out the vector types in SET_= CURRENT_FUNCTION >> if necessary (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01084.html) > > yes my assumption was that arm_init_neon_builtins () is now called for al= l targets, since the check is done at expand time and that the builtins nee= d to be known by lto, with the vector type initialization, before they are = expanded.=20 > However at that time, lto streaming-in have not yet processed the attribu= tes and TARGET_NEON is not set for the function. > > I had a look at your re-layout, but I'm not sure. it feels like a hack. I= think this should be solved first place during the builtin construction. A= lso set_current_function is too late, builtin_expand that will explode beca= use of the=20 > unknown modes. > > But raise the point. In fact I was not really happy with this arm_vector_= mode_supported_p neither as I was not sure about other contexts it can be c= alled from and I cannot clearly claim that this change is always correct. > So the main usage of targetm.vector_mode_supported_p is in stor-layout.c an= d vector_type_mode in particular seems to have a relevant comment: /* Vector types need to re-check the target flags each time we report the machine mode. We need to do this because attribute target can change the result of vector_mode_supported_p and have_regs_of_mode on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can change on a per-function basis. */ I think that implies that it expects targetm.vector_mode_supported_p to rej= ect vector modes in contexts that don't support NEON... > I'd like to think about other way to set the vector modes from arm_init_n= eon_builtins before the target flags are known. I'm thinking about the lazy= initialization at expand time, or using a contextual boolean flags. how do= es that sound ? > Laying out the vector types during arm_init_neon_builtins sounds more promi= sing to me. Changing layout of types during expand is risky, from what I remember. In principle, the types and builtins created in arm_init_neon_builtins are = only ever supposed to be used in a NEON context, so I thought that just turning on NEON upon entry into arm_= init_neon_builtins and resetting it back upon exit would work. However, this won't work because we construct= our builtin types by copying existing type nodes (e.g. intQI_type_node) that have been laid out earlier by the mi= dend (frontend?) assuming no NEON. I wonder if we can explicitly layout these global types in the arm_init_neo= n_builtins context... Thanks, Kyrill > many thanks, > > Christian > > >> >> Kyrill >> >