From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 75155 invoked by alias); 19 Jan 2016 15:01:51 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 75132 invoked by uid 89); 19 Jan 2016 15:01:50 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.8 required=5.0 tests=BAYES_00,KAM_ASCII_DIVIDERS,KAM_LAZY_DOMAIN_SECURITY,KHOP_DYNAMIC,RCVD_IN_DNSWL_LOW autolearn=no version=3.3.2 spammy=lightly, H*RU:pps.filterd, Hx-spam-relays-external:pps.filterd, unsignedp X-HELO: mx07-00178001.pphosted.com Received: from mx08-00178001.pphosted.com (HELO mx07-00178001.pphosted.com) (91.207.212.93) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Tue, 19 Jan 2016 15:01:44 +0000 Received: from pps.filterd (m0046660.ppops.net [127.0.0.1]) by mx08-00178001.pphosted.com (8.15.0.59/8.15.0.59) with SMTP id u0JExHn1013705; Tue, 19 Jan 2016 16:01:32 +0100 Received: from beta.dmz-eu.st.com (beta.dmz-eu.st.com [164.129.1.35]) by mx08-00178001.pphosted.com with ESMTP id 20fbv83496-1 (version=TLSv1/SSLv3 cipher=ECDHE-RSA-AES256-SHA bits=256 verify=NOT); Tue, 19 Jan 2016 16:01:32 +0100 Received: from zeta.dmz-eu.st.com (zeta.dmz-eu.st.com [164.129.230.9]) by beta.dmz-eu.st.com (STMicroelectronics) with ESMTP id 3D17131; Tue, 19 Jan 2016 15:00:44 +0000 (GMT) Received: from Webmail-eu.st.com (safex1hubcas5.st.com [10.75.90.71]) by zeta.dmz-eu.st.com (STMicroelectronics) with ESMTP id 5A85627D8; Tue, 19 Jan 2016 15:01:31 +0000 (GMT) Received: from [164.129.122.197] (164.129.122.197) by webmail-eu.st.com (10.75.90.13) with Microsoft SMTP Server (TLS) id 8.3.406.0; Tue, 19 Jan 2016 16:01:30 +0100 Subject: Re: [PATCH][ARM,AARCH64] target/PR68674: relayout vector_types in expand_expr To: Richard Biener References: <568FB9AA.5020706@st.com> CC: "kyrylo.tkachov@foss.arm.com" , Richard Earnshaw , "ramana.radhakrishnan@foss.arm.com" , "bschmidt@redhat.com" , GCC Patches From: Christian Bruel X-No-Archive: yes Message-ID: <569E4FCA.3070704@st.com> Date: Tue, 19 Jan 2016 15:01:00 -0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8"; format=flowed Content-Transfer-Encoding: 7bit X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-01-19_07:,, signatures=0 X-IsSubscribed: yes X-SW-Source: 2016-01/txt/msg01426.txt.bz2 Hi Richard, thanks for your input, On 01/18/2016 12:36 PM, Richard Biener wrote: > On Fri, Jan 8, 2016 at 2:29 PM, Christian Bruel wrote: >> When compiling code with attribute targets on arm or aarch64, >> vector_type_mode returns different results (eg Vmode or BLKmode) depending >> on the current simd flags that are not set between functions. >> >> for example the following code: >> >> #include >> >> extern int8x8_t a; >> extern int8x8_t b; >> >> int16x8_t >> __attribute__ ((target("fpu=neon"))) >> foo(void) >> { >> return vaddl_s8 (a, b); >> } >> >> Triggers gcc_asserts in copy_to_mode_regs while expanding NEON builtins , >> because the mismatch and DECL_MODE current's TYPE_MODE used in >> expand_builtin for global variables. >> >> but the best explanation is in the vector_type_mode: >> /* Vector types need to re-check the target flags each time we report >> the machine mode. We need to do this because attribute target can >> change the result of vector_mode_supported_p and have_regs_of_mode >> on a per-function basis. Thus the TYPE_MODE of a VECTOR_TYPE can >> change on a per-function basis. */ >> >> I first tried to hack the 2 machine descriptions to insert convert_to_mode >> or relayout_decls here and there, but I found this very fragile. Instead a >> more central relayout the of type while expanding gave good results, as >> proposed here. >> >> bootstraped and tested with no regression for arm, aarch64 and i586. >> >> Does this look to be the right approach ? >> >> nb: for testing this patch is complementary with >> >> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00332.html >> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg00248.html >> >> thanks for your comments. > A x86 specific testcase that ICEs as well: > > typedef int v8si __attribute__((vector_size(32))); > v8si a; > v8si __attribute__((target("avx"))) foo() > { > return a; > } > > in your patch not using the shared DECL_RTL of the global var > "fixes" this so I think a conceptually better fix would be to > "adjust" DECL_RTL from globals via a adjust_address (or so). > > Also given that we do > > /* ... fall through ... */ > > case FUNCTION_DECL: > case RESULT_DECL: > decl_rtl = DECL_RTL (exp); > expand_decl_rtl: > gcc_assert (decl_rtl); > decl_rtl = copy_rtx (decl_rtl); > > thus always "unshare" DECL_RTL anyway it might be not so > bad to simply do > > decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0); > > instead of that to avoid one copy. > > Index: expr.c > =================================================================== > --- expr.c (revision 232496) > +++ expr.c (working copy) > @@ -9597,7 +9597,10 @@ expand_expr_real_1 (tree exp, rtx target > decl_rtl = DECL_RTL (exp); > expand_decl_rtl: > gcc_assert (decl_rtl); > - decl_rtl = copy_rtx (decl_rtl); > + if (MEM_P (decl_rtl)) > + decl_rtl = adjust_address (decl_rtl, TYPE_MODE (type), 0); > + else > + decl_rtl = copy_rtx (decl_rtl); > /* Record writes to register variables. */ > if (modifier == EXPAND_WRITE > && REG_P (decl_rtl) > > untested apart from on the x86_64 testcase (which it fixes). One could guard > this further to only apply on vector typed decls with mismatched mode of course. > > I think that re-layouting globals is not very good design. > > Richard. A few other ICEs with this implementation, for instance if the context is not in a function, such as typedef __simd64_int8_t int8x8_t; extern int8x8_t b; int8x8_t *a = &b; So, to avoid a var re-layout and a copy_rtx (implied by adjust_address btw). What about just calling 'change_address' ? like: (very lightly tested) Index: expr.c =================================================================== --- expr.c (revision 232564) +++ expr.c (working copy) @@ -9392,7 +9392,8 @@ enum expand_modifier modifier, rtx *alt_rtl, bool inner_reference_p) { - rtx op0, op1, temp, decl_rtl; + rtx op0, op1, temp; + rtx decl_rtl = NULL_RTX; tree type; int unsignedp; machine_mode mode, dmode; @@ -9590,11 +9591,22 @@ && (TREE_STATIC (exp) || DECL_EXTERNAL (exp))) layout_decl (exp, 0); + decl_rtl = DECL_RTL (exp); + + if (MEM_P (decl_rtl) + && (VECTOR_TYPE_P (type) && DECL_MODE (exp) != mode)) + { + if (current_function_decl + && (! reload_completed && !reload_in_progress)) + decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0); + } + /* ... fall through ... */ case FUNCTION_DECL: case RESULT_DECL: - decl_rtl = DECL_RTL (exp); + if (! decl_rtl) + decl_rtl = DECL_RTL (exp); expand_decl_rtl: gcc_assert (decl_rtl); decl_rtl = copy_rtx (decl_rtl); I'm not sure that moving the code in the 'expand_decl_rtl' label is best, as we'd need to test for exp and the case should only happen for global vars (not functions or results) thanks,