From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 113708 invoked by alias); 15 Aug 2019 16:30:26 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 113564 invoked by uid 89); 15 Aug 2019 16:30:16 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-10.4 required=5.0 tests=AWL,BAYES_00,GIT_PATCH_2,GIT_PATCH_3,KAM_ASCII_DIVIDERS,KAM_SHORT,SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: mx1.suse.de Received: from mx2.suse.de (HELO mx1.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 15 Aug 2019 16:30:00 +0000 Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 2E580AD73; Thu, 15 Aug 2019 16:29:58 +0000 (UTC) Date: Thu, 15 Aug 2019 17:42:00 -0000 User-Agent: K-9 Mail for Android In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [PATCHv4] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544) To: Bernd Edlinger CC: "gcc-patches@gcc.gnu.org" ,Richard Earnshaw ,Ramana Radhakrishnan ,Kyrill Tkachov ,Eric Botcazou ,Jeff Law ,Jakub Jelinek From: Richard Biener Message-ID: X-SW-Source: 2019-08/txt/msg01121.txt.bz2 On August 15, 2019 4:52:24 PM GMT+02:00, Bernd Edlinger wrote: >On 8/15/19 2:54 PM, Richard Biener wrote: >> On Thu, 15 Aug 2019, Bernd Edlinger wrote: >>=20 >>>>>> >>>>>> Hmm. So your patch overrides user-alignment here. Woudln't it >>>>>> be better to do that more conciously by >>>>>> >>>>>> if (! DECL_USER_ALIGN (decl) >>>>>> || (align < GET_MODE_ALIGNMENT (DECL_MODE (decl)) >>>>>> && targetm.slow_unaligned_access (DECL_MODE (decl), >align))) >>>>>> >>> >>> ? I don't know why that would be better? >>> If the value is underaligned no matter why, pretend it was declared >as >>> naturally aligned if that causes wrong code otherwise. >>> That was the idea here. >>=20 >> It would be better because then we ignore it and use what we'd use >> by default rather than inventing sth new. And your patch suggests >> it might be needed to up align even w/o DECL_USER_ALIGN. >>=20 > >Hmmm, you mean the constant 1.0i should not have DECL_USER_ALIGN set? >But it inherits the alignment from the destination variable, >apparently.=20 Yes. I think it shouldn't inherit the alignment unless we are assembling a = static initializer.=20 > >did you mean >if (! DECL_USER_ALIGN (decl) > && align < GET_MODE_ALIGNMENT (DECL_MODE (decl)) > && ... >? > >I can give it a try. No, I meant || thus ignore DECL_USER_ALIGN if it is sth we have to satisfy = with unaligned loads.=20 > >>>>>> IMHO whatever code later fails to properly use unaligned loads >>>>>> should be fixed instead rather than ignoring user requested >alignment. >>>>>> >>>>>> Can you quote a short testcase that explains what exactly goes >wrong? >>>>>> The struct-layout ones are awkward to look at... >>>>>> >>>>> >>>>> Sure, >>>>> >>>>> $ cat test.c >>>>> _Complex float __attribute__((aligned(1))) cf; >>>>> >>>>> void foo (void) >>>>> { >>>>> cf =3D 1.0i; >>>>> } >>>>> >>>>> $ arm-linux-gnueabihf-gcc -S test.c=20 >>>>> during RTL pass: expand >>>>> test.c: In function 'foo': >>>>> test.c:5:6: internal compiler error: in gen_movsf, at >config/arm/arm.md:7003 >>>>> 5 | cf =3D 1.0i; >>>>> | ~~~^~~~~~ >>>>> 0x7ba475 gen_movsf(rtx_def*, rtx_def*) >>>>> ../../gcc-trunk/gcc/config/arm/arm.md:7003 >>>>> 0xa49587 insn_gen_fn::operator()(rtx_def*, rtx_def*) const >>>>> ../../gcc-trunk/gcc/recog.h:318 >>>>> 0xa49587 emit_move_insn_1(rtx_def*, rtx_def*) >>>>> ../../gcc-trunk/gcc/expr.c:3695 >>>>> 0xa49914 emit_move_insn(rtx_def*, rtx_def*) >>>>> ../../gcc-trunk/gcc/expr.c:3791 >>>>> 0xa494f7 emit_move_complex_parts(rtx_def*, rtx_def*) >>>>> ../../gcc-trunk/gcc/expr.c:3490 >>>>> 0xa49914 emit_move_insn(rtx_def*, rtx_def*) >>>>> ../../gcc-trunk/gcc/expr.c:3791 >>>>> 0xa5106f store_expr(tree_node*, rtx_def*, int, bool, bool) >>>>> ../../gcc-trunk/gcc/expr.c:5855 >>>>> 0xa51cc0 expand_assignment(tree_node*, tree_node*, bool) >>>>> ../../gcc-trunk/gcc/expr.c:5441 >>>> >>>> Huh, so why didn't it trigger >>>> >>>> /* Handle misaligned stores. */ >>>> mode =3D TYPE_MODE (TREE_TYPE (to)); >>>> if ((TREE_CODE (to) =3D=3D MEM_REF >>>> || TREE_CODE (to) =3D=3D TARGET_MEM_REF) >>>> && mode !=3D BLKmode >>>> && !mem_ref_refers_to_non_mem_p (to) >>>> && ((align =3D get_object_alignment (to)) >>>> < GET_MODE_ALIGNMENT (mode)) >>>> && (((icode =3D optab_handler (movmisalign_optab, mode)) >>>> !=3D CODE_FOR_nothing) >>>> || targetm.slow_unaligned_access (mode, align))) >>>> { >>>> >>>> ? (_Complex float is 32bit aligned it seems, the DECL_RTL for the >>>> var is (mem/c:SC (symbol_ref:SI ("cf") [flags 0x2] >>> 0x2aaaaaad1240 cf>) [1 cf+0 S8 A8]), SCmode is 32bit aligned. >>>> >>>> Ah, 'to' is a plain DECL here so the above handling is incomplete. >>>> IIRC component refs like __real cf =3D 0.f should be handled fine >>>> again(?). So, does adding || DECL_P (to) fix the case as well? >>>> >>> >>> So I tried this instead of the varasm.c change: >>> >>> Index: expr.c >>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >>> --- expr.c (revision 274487) >>> +++ expr.c (working copy) >>> @@ -5002,9 +5002,10 @@ expand_assignment (tree to, tree from, bool >nontem >>> /* Handle misaligned stores. */ >>> mode =3D TYPE_MODE (TREE_TYPE (to)); >>> if ((TREE_CODE (to) =3D=3D MEM_REF >>> - || TREE_CODE (to) =3D=3D TARGET_MEM_REF) >>> + || TREE_CODE (to) =3D=3D TARGET_MEM_REF >>> + || DECL_P (to)) >>> && mode !=3D BLKmode >>> - && !mem_ref_refers_to_non_mem_p (to) >>> + && (DECL_P (to) || !mem_ref_refers_to_non_mem_p (to)) >>> && ((align =3D get_object_alignment (to)) >>> < GET_MODE_ALIGNMENT (mode)) >>> && (((icode =3D optab_handler (movmisalign_optab, mode)) >>> >>> Result, yes, it fixes this test case >>> but then I run all struct-layout-1.exp there are sill cases. where >we have problems: >>> >>> In file included from >/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.d= g-struct-layout-1//t024_x.c:8:^M >>> >/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.d= g-struct-layout-1//t024_test.h: >In function 'test2112':^M >>> >/home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.dg/compat/struct-layout-1_x1.h:23= :10: >internal compiler error: in gen_movdf, at config/arm/arm.md:7107^M >>> >/home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.dg/compat/struct-layout-1_x1.h:62= :3: >note: in definition of macro 'TX'^M >>> >/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.d= g-struct-layout-1//t024_test.h:113:1: >note: in expansion of macro 'TCI'^M >>> >/home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.d= g-struct-layout-1//t024_test.h:113:294: >note: in expansion of macro 'F'^M >>> 0x7ba377 gen_movdf(rtx_def*, rtx_def*)^M >>> ../../gcc-trunk/gcc/config/arm/arm.md:7107^M >>> 0xa494c7 insn_gen_fn::operator()(rtx_def*, rtx_def*) const^M >>> ../../gcc-trunk/gcc/recog.h:318^M >>> 0xa494c7 emit_move_insn_1(rtx_def*, rtx_def*)^M >>> ../../gcc-trunk/gcc/expr.c:3695^M >>> 0xa49854 emit_move_insn(rtx_def*, rtx_def*)^M >>> ../../gcc-trunk/gcc/expr.c:3791^M >>> 0xa49437 emit_move_complex_parts(rtx_def*, rtx_def*)^M >>> ../../gcc-trunk/gcc/expr.c:3490^M >>> 0xa49854 emit_move_insn(rtx_def*, rtx_def*)^M >>> ../../gcc-trunk/gcc/expr.c:3791^M >>> 0xa50faf store_expr(tree_node*, rtx_def*, int, bool, bool)^M >>> ../../gcc-trunk/gcc/expr.c:5856^M >>> 0xa51f34 expand_assignment(tree_node*, tree_node*, bool)^M >>> ../../gcc-trunk/gcc/expr.c:5302^M >>> 0xa51f34 expand_assignment(tree_node*, tree_node*, bool)^M >>> ../../gcc-trunk/gcc/expr.c:4983^M >>> 0x9338af expand_gimple_stmt_1^M >>> ../../gcc-trunk/gcc/cfgexpand.c:3777^M >>> 0x9338af expand_gimple_stmt^M >>> ../../gcc-trunk/gcc/cfgexpand.c:3875^M >>> 0x939221 expand_gimple_basic_block^M >>> ../../gcc-trunk/gcc/cfgexpand.c:5915^M >>> 0x93af86 execute^M >>> ../../gcc-trunk/gcc/cfgexpand.c:6538^M >>> Please submit a full bug report,^M >>> >>> My personal gut feeling this will be more fragile than over-aligning >the >>> constants. >>=20 >> As said the constant shouldn't end up under-aligned, the user cannot >> specify alignment of literal constants. Not sure what you mean >> with "over"-aligning. >>=20 > > >Hmm wait a moment, I actually wanted _only_ to change the >DECL_ARTIFICIAL >that is built by build_constant_desc. It uses align_variable of >course, >but I totally missed that this also controls the alignment of normal >variables, sorry about the confusion here. > >I mean we should align the constant for the unaligned complex with >the natural alignment of the type-mode.=20 Agreed.=20 That wrong fix made >the variables ignore the alignment, which was of course not intended, >and instead I would need: > >Index: expr.c >=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >--- expr.c (revision 274531) >+++ expr.c (working copy) >@@ -5002,9 +5002,10 @@ expand_assignment (tree to, tree from, bool >nontem > /* Handle misaligned stores. */ > mode =3D TYPE_MODE (TREE_TYPE (to)); > if ((TREE_CODE (to) =3D=3D MEM_REF >- || TREE_CODE (to) =3D=3D TARGET_MEM_REF) >+ || TREE_CODE (to) =3D=3D TARGET_MEM_REF >+ || DECL_P (to)) > && mode !=3D BLKmode >- && !mem_ref_refers_to_non_mem_p (to) >+ && (DECL_P (to) || !mem_ref_refers_to_non_mem_p (to)) > && ((align =3D get_object_alignment (to)) > < GET_MODE_ALIGNMENT (mode)) > && (((icode =3D optab_handler (movmisalign_optab, mode)) > >Index: varasm.c >=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >--- varasm.c (revision 274531) >+++ varasm.c (working copy) >@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3. If not see > #include "stmt.h" > #include "expr.h" > #include "expmed.h" >+#include "optabs.h" > #include "output.h" > #include "langhooks.h" > #include "debug.h" >@@ -3386,7 +3387,15 @@ build_constant_desc (tree exp) > if (TREE_CODE (exp) =3D=3D STRING_CST) >SET_DECL_ALIGN (decl, targetm.constant_alignment (exp, DECL_ALIGN >(decl))); > else >- align_variable (decl, 0); >+ { >+ align_variable (decl, 0); >+ if (DECL_ALIGN (decl) < GET_MODE_ALIGNMENT (DECL_MODE (decl)) >+ && ((optab_handler (movmisalign_optab, DECL_MODE (decl)) >+ !=3D CODE_FOR_nothing) >+ || targetm.slow_unaligned_access (DECL_MODE (decl), >+ DECL_ALIGN (decl)))) >+ SET_DECL_ALIGN (decl, GET_MODE_ALIGNMENT (DECL_MODE (decl))); >+ } >=20 > /* Now construct the SYMBOL_REF and the MEM. */ > if (use_object_blocks_p ()) > >>> >>> >>>>> 0xa51cc0 expand_assignment(tree_node*, tree_node*, bool) >>>>> ../../gcc-trunk/gcc/expr.c:4983 >>>>> 0x93396f expand_gimple_stmt_1 >>>>> ../../gcc-trunk/gcc/cfgexpand.c:3777 >>>>> 0x93396f expand_gimple_stmt >>>>> ../../gcc-trunk/gcc/cfgexpand.c:3875 >>>>> 0x9392e1 expand_gimple_basic_block >>>>> ../../gcc-trunk/gcc/cfgexpand.c:5915 >>>>> 0x93b046 execute >>>>> ../../gcc-trunk/gcc/cfgexpand.c:6538 >>>>> Please submit a full bug report, >>>>> with preprocessed source if appropriate. >>>>> Please include the complete backtrace with any bug report. >>>>> See for instructions. >>>>> >>>>> Without the hunk in varasm.c of course. >>>>> >>>>> What happens is that expand_expr_real_2 returns a unaligned >mem_ref here: >>>>> >>>>> case COMPLEX_CST: >>>>> /* Handle evaluating a complex constant in a CONCAT target.=20 >*/ >>>>> if (original_target && GET_CODE (original_target) =3D=3D CONCAT) >>>>> { >>>>> [... this path not taken ...] >>> >>> BTW: this code block executes when the other ICE happens. >>>=20=20 >>>>> } >>>>> >>>>> /* fall through */ >>>>> >>>>> case STRING_CST: >>>>> temp =3D expand_expr_constant (exp, 1, modifier); >>>>> >>>>> /* temp contains a constant address. >>>>> On RISC machines where a constant address isn't valid, >>>>> make some insns to get that address into a register. */ >>>>> if (modifier !=3D EXPAND_CONST_ADDRESS >>>>> && modifier !=3D EXPAND_INITIALIZER >>>>> && modifier !=3D EXPAND_SUM >>>>> && ! memory_address_addr_space_p (mode, XEXP (temp, 0), >>>>> MEM_ADDR_SPACE >(temp))) >>>>> return replace_equiv_address (temp, >>>>> copy_rtx (XEXP (temp, 0))); >>>>> return temp; >>>>> >>>>> The result of expand_expr_real(..., EXPAND_NORMAL) ought to be >usable >>>>> by emit_move_insn, that is expected just *everywhere* and can't be >changed. >>>>> >>>>> This could probably be fixed in an ugly way in the COMPLEX_CST, >handler >>>>> but OTOH, I don't see any reason why this constant has to be >misaligned >>>>> when it can be easily aligned, which avoids the need for a >misaligned access. >>>> >>>> If the COMPLEX_CST happends to end up in unaligned memory then >that's >>>> of course a bug (unless the target requests that for all >COMPLEX_CSTs). >>>> That is, if the unalignment is triggered because the store is to an >>>> unaligned decl. >>>> >>>> But I think the issue is the above one? >>>> >>> >>> yes initially the constant seems to be unaligned. then it is >expanded, >>> and there is no special handling for unaligned constants in >expand_expr_real, >>> and then probably expand_assignment or store_expr seem not fully >prepared for >>> this either. >>=20 >> With a cross I see the constant has regular aligned _Complex type >> so not sure how it can end up unaligned. >>=20 > >Maybe a target configuration issue. >Not sure, I have configured mine this way: > >../gcc-trunk/configure >--prefix=3D/home/ed/gnu/arm-linux-gnueabihf-linux64 >--target=3Darm-linux-gnueabihf --enable-languages=3Dall --with-arch=3Darmv= 7-a >--with-tune=3Dcortex-a9 --with-fpu=3Dvfpv3-d16 --with-float=3Dhard > >However it appears now there are two different errors, one is in >expand_assignment >which you found (I start to wonder if I should add you to the authors >section >of this patch), and a different one, which I have not yet simplified, >but you can easily try that for yourself: > >make check-gcc-c RUNTESTFLAGS=3D"struct-layout-1.exp=3D*" > >it is okay when the test fails to execute but there should no internal >compiler errors. > > >>>>> >>>>> The problem is that the code that handles this misaligned access >>>>> is skipped because the mem_rtx has initially no MEM_ATTRS and >therefore >>>>> MEM_ALIGN =3D=3D 32, and therefore the code that handles the unaligned >>>>> access is not taken. BUT before the mem_rtx is returned it is >>>>> set to MEM_ALIGN =3D 8 by set_mem_attributes, and we have an >assertion, >>>>> because the result from expand_expr_real(..., EXPAND_NORMAL) ought >to be >>>>> usable with emit_move_insn. >>>> >>>> yes, as said the _access_ determines the address should be aligned >>>> so we shouldn't end up setting MEM_ALIGN to 8 but to 32 according >>>> to the access type/mode. But we can't trust DECL_ALIGN of >>>> FUNCTION_DECLs but we _can_ trust users writing *(int *)fn >>>> (maybe for actual accesses we _can_ trust DECL_ALIGN, it's just >>>> we may not compute nonzero bits for the actual address because >>>> of function pointer mangling) >>>> (for accessing function code I'd say this would be premature >>>> optimization, but ...) >>>> >>> >>> Not a very nice solution, but it is not worth to spend much effort >>> in optimizing undefined behavior, I just want to avoid the ICE >>> at this time and would not trust the DECL_ALIGN either. >>=20 >> So I meant >>=20 >> Index: gcc/builtins.c >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> --- gcc/builtins.c (revision 274534) >> +++ gcc/builtins.c (working copy) >> @@ -255,7 +255,8 @@ get_object_alignment_2 (tree exp, unsign >>=20=20 >> /* Extract alignment information from the innermost object and >> possibly adjust bitpos and offset. */ >> - if (TREE_CODE (exp) =3D=3D FUNCTION_DECL) >> + if (TREE_CODE (exp) =3D=3D FUNCTION_DECL >> + && addr_p) >> { >> /* Function addresses can encode extra information besides >their >> alignment. However, if TARGET_PTRMEMFUNC_VBIT_LOCATION >>=20 >> so we get at DECL_ALIGN of the FUNCTION_DECL (not sure if we >> can trust it). >>=20 >>>> >>>> Still I think you can't simply override STACK_SLOT_ALIGNMENT just >because >>>> of the mode of an entry param, can you? If you can assume a bigger >>>> alignment then STACK_SLOT_ALIGNMENT should return it. >>>> >>> >>> I don't see a real problem here. All target except i386 and gcn >(whatever that is) >>> use the default for STACK_SLOT_ALIGNMENT which simply allows any >(large) align value >>> to rule the effective STACK_SLOT_ALIGNMENT. The user could have >simply declared >>> the local variable with the alignment that results in better code >FWIW. >>> >>> If the stack alignment is too high that is capped in >assign_stack_local: >>> >>> /* Ignore alignment if it exceeds MAX_SUPPORTED_STACK_ALIGNMENT.=20 >*/ >>> if (alignment_in_bits > MAX_SUPPORTED_STACK_ALIGNMENT) >>> { >>> alignment_in_bits =3D MAX_SUPPORTED_STACK_ALIGNMENT; >>> alignment =3D MAX_SUPPORTED_STACK_ALIGNMENT / BITS_PER_UNIT; >>> } >>> >>> I for one, would just assume that MAX_SUPPORTED_STACK_ALIGNMENT >should >>> be sufficient for all modes that need movmisalign_optab and friends. >>> If it is not, an ICE would be just fine. >>=20 >> Hmm. In some way we could better communicate with the user then >> and do not allow under-aligning automatic vars? But the you >> still have packed structs with BLKmode where the actual field >> accesses will carry SImode even when not aligned(?) >>=20 > >Yes, that works also when unaligned. > >>=20 >> Please split it into the parts for the PR and parts making the >> asserts not trigger. >>=20 > >Yes, will do. > >> The PR is already fixed, right? The assign_parm_find_stack_rtl hunk >> is merely an optimization? >>=20 > >Hmmmm... You are right, I should have added that to the commit >message... > >Of course the test cases try to verify the optimization. > > >Thanks >Bernd.