From: Bernd Edlinger <bernd.edlinger@hotmail.de>
To: Richard Biener <rguenther@suse.de>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
Richard Earnshaw <richard.earnshaw@arm.com>,
Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>,
Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>,
Eric Botcazou <ebotcazou@adacore.com>, Jeff Law <law@redhat.com>,
Jakub Jelinek <jakub@redhat.com>
Subject: Re: [PATCHv4] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
Date: Thu, 15 Aug 2019 15:28:00 -0000
Message-ID: <AM6PR10MB2566B60823F40D98F4F2D55DE4AC0@AM6PR10MB2566.EURPRD10.PROD.OUTLOOK.COM>
In-Reply-To: <alpine.LSU.2.20.1908151437430.32458@zhemvz.fhfr.qr>
On 8/15/19 2:54 PM, Richard Biener wrote:
> On Thu, 15 Aug 2019, Bernd Edlinger wrote:
>
>>>>>
>>>>> Hmm. So your patch overrides user-alignment here. Wouldn't it
>>>>> be better to do that more consciously by
>>>>>
>>>>> if (! DECL_USER_ALIGN (decl)
>>>>> || (align < GET_MODE_ALIGNMENT (DECL_MODE (decl))
>>>>> && targetm.slow_unaligned_access (DECL_MODE (decl), align)))
>>>>>
>>
>> ? I don't know why that would be better?
>> If the value is underaligned no matter why, pretend it was declared as
>> naturally aligned if that causes wrong code otherwise.
>> That was the idea here.
>
> It would be better because then we ignore it and use what we'd use
> by default rather than inventing sth new. And your patch suggests
> it might be needed to up align even w/o DECL_USER_ALIGN.
>
Hmmm, you mean the constant 1.0i should not have DECL_USER_ALIGN set?
But it inherits the alignment from the destination variable, apparently.
Did you mean
if (! DECL_USER_ALIGN (decl)
&& align < GET_MODE_ALIGNMENT (DECL_MODE (decl))
&& ...
?
I can give it a try.
>>>>> IMHO whatever code later fails to properly use unaligned loads
>>>>> should be fixed instead rather than ignoring user requested alignment.
>>>>>
>>>>> Can you quote a short testcase that explains what exactly goes wrong?
>>>>> The struct-layout ones are awkward to look at...
>>>>>
>>>>
>>>> Sure,
>>>>
>>>> $ cat test.c
>>>> _Complex float __attribute__((aligned(1))) cf;
>>>>
>>>> void foo (void)
>>>> {
>>>> cf = 1.0i;
>>>> }
>>>>
>>>> $ arm-linux-gnueabihf-gcc -S test.c
>>>> during RTL pass: expand
>>>> test.c: In function 'foo':
>>>> test.c:5:6: internal compiler error: in gen_movsf, at config/arm/arm.md:7003
>>>> 5 | cf = 1.0i;
>>>> | ~~~^~~~~~
>>>> 0x7ba475 gen_movsf(rtx_def*, rtx_def*)
>>>> ../../gcc-trunk/gcc/config/arm/arm.md:7003
>>>> 0xa49587 insn_gen_fn::operator()(rtx_def*, rtx_def*) const
>>>> ../../gcc-trunk/gcc/recog.h:318
>>>> 0xa49587 emit_move_insn_1(rtx_def*, rtx_def*)
>>>> ../../gcc-trunk/gcc/expr.c:3695
>>>> 0xa49914 emit_move_insn(rtx_def*, rtx_def*)
>>>> ../../gcc-trunk/gcc/expr.c:3791
>>>> 0xa494f7 emit_move_complex_parts(rtx_def*, rtx_def*)
>>>> ../../gcc-trunk/gcc/expr.c:3490
>>>> 0xa49914 emit_move_insn(rtx_def*, rtx_def*)
>>>> ../../gcc-trunk/gcc/expr.c:3791
>>>> 0xa5106f store_expr(tree_node*, rtx_def*, int, bool, bool)
>>>> ../../gcc-trunk/gcc/expr.c:5855
>>>> 0xa51cc0 expand_assignment(tree_node*, tree_node*, bool)
>>>> ../../gcc-trunk/gcc/expr.c:5441
>>>
>>> Huh, so why didn't it trigger
>>>
>>> /* Handle misaligned stores. */
>>> mode = TYPE_MODE (TREE_TYPE (to));
>>> if ((TREE_CODE (to) == MEM_REF
>>> || TREE_CODE (to) == TARGET_MEM_REF)
>>> && mode != BLKmode
>>> && !mem_ref_refers_to_non_mem_p (to)
>>> && ((align = get_object_alignment (to))
>>> < GET_MODE_ALIGNMENT (mode))
>>> && (((icode = optab_handler (movmisalign_optab, mode))
>>> != CODE_FOR_nothing)
>>> || targetm.slow_unaligned_access (mode, align)))
>>> {
>>>
>>> ? (_Complex float is 32bit aligned it seems, the DECL_RTL for the
>>> var is (mem/c:SC (symbol_ref:SI ("cf") [flags 0x2] <var_decl
>>> 0x2aaaaaad1240 cf>) [1 cf+0 S8 A8]), SCmode is 32bit aligned.
>>>
>>> Ah, 'to' is a plain DECL here so the above handling is incomplete.
>>> IIRC component refs like __real cf = 0.f should be handled fine
>>> again(?). So, does adding || DECL_P (to) fix the case as well?
>>>
>>
>> So I tried this instead of the varasm.c change:
>>
>> Index: expr.c
>> ===================================================================
>> --- expr.c (revision 274487)
>> +++ expr.c (working copy)
>> @@ -5002,9 +5002,10 @@ expand_assignment (tree to, tree from, bool nontem
>> /* Handle misaligned stores. */
>> mode = TYPE_MODE (TREE_TYPE (to));
>> if ((TREE_CODE (to) == MEM_REF
>> - || TREE_CODE (to) == TARGET_MEM_REF)
>> + || TREE_CODE (to) == TARGET_MEM_REF
>> + || DECL_P (to))
>> && mode != BLKmode
>> - && !mem_ref_refers_to_non_mem_p (to)
>> + && (DECL_P (to) || !mem_ref_refers_to_non_mem_p (to))
>> && ((align = get_object_alignment (to))
>> < GET_MODE_ALIGNMENT (mode))
>> && (((icode = optab_handler (movmisalign_optab, mode))
>>
>> Result: yes, it fixes this test case,
>> but when I run all of struct-layout-1.exp there are still cases where we have problems:
>>
>> In file included from /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_x.c:8:
>> /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_test.h: In function 'test2112':
>> /home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.dg/compat/struct-layout-1_x1.h:23:10: internal compiler error: in gen_movdf, at config/arm/arm.md:7107
>> /home/ed/gnu/gcc-trunk/gcc/testsuite/gcc.dg/compat/struct-layout-1_x1.h:62:3: note: in definition of macro 'TX'
>> /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_test.h:113:1: note: in expansion of macro 'TCI'
>> /home/ed/gnu/gcc-build-arm-linux-gnueabihf-linux64/gcc/testsuite/gcc/gcc.dg-struct-layout-1//t024_test.h:113:294: note: in expansion of macro 'F'
>> 0x7ba377 gen_movdf(rtx_def*, rtx_def*)
>> ../../gcc-trunk/gcc/config/arm/arm.md:7107
>> 0xa494c7 insn_gen_fn::operator()(rtx_def*, rtx_def*) const
>> ../../gcc-trunk/gcc/recog.h:318
>> 0xa494c7 emit_move_insn_1(rtx_def*, rtx_def*)
>> ../../gcc-trunk/gcc/expr.c:3695
>> 0xa49854 emit_move_insn(rtx_def*, rtx_def*)
>> ../../gcc-trunk/gcc/expr.c:3791
>> 0xa49437 emit_move_complex_parts(rtx_def*, rtx_def*)
>> ../../gcc-trunk/gcc/expr.c:3490
>> 0xa49854 emit_move_insn(rtx_def*, rtx_def*)
>> ../../gcc-trunk/gcc/expr.c:3791
>> 0xa50faf store_expr(tree_node*, rtx_def*, int, bool, bool)
>> ../../gcc-trunk/gcc/expr.c:5856
>> 0xa51f34 expand_assignment(tree_node*, tree_node*, bool)
>> ../../gcc-trunk/gcc/expr.c:5302
>> 0xa51f34 expand_assignment(tree_node*, tree_node*, bool)
>> ../../gcc-trunk/gcc/expr.c:4983
>> 0x9338af expand_gimple_stmt_1
>> ../../gcc-trunk/gcc/cfgexpand.c:3777
>> 0x9338af expand_gimple_stmt
>> ../../gcc-trunk/gcc/cfgexpand.c:3875
>> 0x939221 expand_gimple_basic_block
>> ../../gcc-trunk/gcc/cfgexpand.c:5915
>> 0x93af86 execute
>> ../../gcc-trunk/gcc/cfgexpand.c:6538
>> Please submit a full bug report,
>>
>> My personal gut feeling is that this will be more fragile than over-aligning the
>> constants.
>
> As said the constant shouldn't end up under-aligned, the user cannot
> specify alignment of literal constants. Not sure what you mean
> with "over"-aligning.
>
Hmm wait a moment, I actually wanted _only_ to change the DECL_ARTIFICIAL
that is built by build_constant_desc. It uses align_variable of course,
but I totally missed that this also controls the alignment of normal
variables, sorry about the confusion here.
I mean we should align the constant for the unaligned complex with
the natural alignment of the type-mode. That wrong fix made
the variables ignore the alignment, which was of course not intended,
and instead I would need:
Index: expr.c
===================================================================
--- expr.c (revision 274531)
+++ expr.c (working copy)
@@ -5002,9 +5002,10 @@ expand_assignment (tree to, tree from, bool nontem
/* Handle misaligned stores. */
mode = TYPE_MODE (TREE_TYPE (to));
if ((TREE_CODE (to) == MEM_REF
- || TREE_CODE (to) == TARGET_MEM_REF)
+ || TREE_CODE (to) == TARGET_MEM_REF
+ || DECL_P (to))
&& mode != BLKmode
- && !mem_ref_refers_to_non_mem_p (to)
+ && (DECL_P (to) || !mem_ref_refers_to_non_mem_p (to))
&& ((align = get_object_alignment (to))
< GET_MODE_ALIGNMENT (mode))
&& (((icode = optab_handler (movmisalign_optab, mode))
Index: varasm.c
===================================================================
--- varasm.c (revision 274531)
+++ varasm.c (working copy)
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3. If not see
#include "stmt.h"
#include "expr.h"
#include "expmed.h"
+#include "optabs.h"
#include "output.h"
#include "langhooks.h"
#include "debug.h"
@@ -3386,7 +3387,15 @@ build_constant_desc (tree exp)
if (TREE_CODE (exp) == STRING_CST)
SET_DECL_ALIGN (decl, targetm.constant_alignment (exp, DECL_ALIGN (decl)));
else
- align_variable (decl, 0);
+ {
+ align_variable (decl, 0);
+ if (DECL_ALIGN (decl) < GET_MODE_ALIGNMENT (DECL_MODE (decl))
+ && ((optab_handler (movmisalign_optab, DECL_MODE (decl))
+ != CODE_FOR_nothing)
+ || targetm.slow_unaligned_access (DECL_MODE (decl),
+ DECL_ALIGN (decl))))
+ SET_DECL_ALIGN (decl, GET_MODE_ALIGNMENT (DECL_MODE (decl)));
+ }
/* Now construct the SYMBOL_REF and the MEM. */
if (use_object_blocks_p ())
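
To make the intent of the varasm.c hunk easier to see, here is a small
standalone model of the decision it makes. This is only a sketch, not GCC
code: struct target_model and constant_pool_align are hypothetical names,
and the two flags stand in for the optab_handler (movmisalign_optab, ...)
and targetm.slow_unaligned_access (...) queries. Alignments are in bits,
as with DECL_ALIGN.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two target queries used in the hunk:
   has_movmisalign models
     optab_handler (movmisalign_optab, mode) != CODE_FOR_nothing
   slow_unaligned models
     targetm.slow_unaligned_access (mode, align).  */
struct target_model
{
  bool has_movmisalign;
  bool slow_unaligned;
};

/* Return the alignment (in bits) the constant-pool entry ends up with.
   DECL_ALIGN is what align_variable produced; MODE_ALIGN is the natural
   alignment of the constant's mode.  */
unsigned int
constant_pool_align (unsigned int decl_align, unsigned int mode_align,
                     const struct target_model *target)
{
  /* Alignments are bit counts and must be nonzero.  */
  assert (decl_align != 0 && mode_align != 0);

  /* Only raise the alignment when the entry is under-aligned AND the
     target would need special handling for a misaligned access.  */
  if (decl_align < mode_align
      && (target->has_movmisalign || target->slow_unaligned))
    return mode_align;

  /* Otherwise keep whatever align_variable decided.  */
  return decl_align;
}
```

With the values from the test case (an A8 entry in a mode with 32-bit
natural alignment) a target with slow unaligned access gets 32, while a
target that needs no special handling keeps 8. An entry that is already
aligned to at least the mode alignment is never changed.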
>>
>>
>>>> 0xa51cc0 expand_assignment(tree_node*, tree_node*, bool)
>>>> ../../gcc-trunk/gcc/expr.c:4983
>>>> 0x93396f expand_gimple_stmt_1
>>>> ../../gcc-trunk/gcc/cfgexpand.c:3777
>>>> 0x93396f expand_gimple_stmt
>>>> ../../gcc-trunk/gcc/cfgexpand.c:3875
>>>> 0x9392e1 expand_gimple_basic_block
>>>> ../../gcc-trunk/gcc/cfgexpand.c:5915
>>>> 0x93b046 execute
>>>> ../../gcc-trunk/gcc/cfgexpand.c:6538
>>>> Please submit a full bug report,
>>>> with preprocessed source if appropriate.
>>>> Please include the complete backtrace with any bug report.
>>>> See <https://gcc.gnu.org/bugs/> for instructions.
>>>>
>>>> Without the hunk in varasm.c of course.
>>>>
>>>> What happens is that expand_expr_real_2 returns an unaligned mem_ref here:
>>>>
>>>> case COMPLEX_CST:
>>>> /* Handle evaluating a complex constant in a CONCAT target. */
>>>> if (original_target && GET_CODE (original_target) == CONCAT)
>>>> {
>>>> [... this path not taken ...]
>>
>> BTW: this code block executes when the other ICE happens.
>>
>>>> }
>>>>
>>>> /* fall through */
>>>>
>>>> case STRING_CST:
>>>> temp = expand_expr_constant (exp, 1, modifier);
>>>>
>>>> /* temp contains a constant address.
>>>> On RISC machines where a constant address isn't valid,
>>>> make some insns to get that address into a register. */
>>>> if (modifier != EXPAND_CONST_ADDRESS
>>>> && modifier != EXPAND_INITIALIZER
>>>> && modifier != EXPAND_SUM
>>>> && ! memory_address_addr_space_p (mode, XEXP (temp, 0),
>>>> MEM_ADDR_SPACE (temp)))
>>>> return replace_equiv_address (temp,
>>>> copy_rtx (XEXP (temp, 0)));
>>>> return temp;
>>>>
>>>> The result of expand_expr_real(..., EXPAND_NORMAL) ought to be usable
>>>> by emit_move_insn, that is expected just *everywhere* and can't be changed.
>>>>
>>>> This could probably be fixed in an ugly way in the COMPLEX_CST handler,
>>>> but OTOH, I don't see any reason why this constant has to be misaligned
>>>> when it can be easily aligned, which avoids the need for a misaligned access.
>>>
>>> If the COMPLEX_CST happens to end up in unaligned memory then that's
>>> of course a bug (unless the target requests that for all COMPLEX_CSTs).
>>> That is, if the unalignment is triggered because the store is to an
>>> unaligned decl.
>>>
>>> But I think the issue is the above one?
>>>
>>
>> Yes, initially the constant seems to be unaligned. Then it is expanded,
>> and there is no special handling for unaligned constants in expand_expr_real,
>> and then probably expand_assignment or store_expr seem not fully prepared for
>> this either.
>
> With a cross I see the constant has regular aligned _Complex type
> so not sure how it can end up unaligned.
>
Maybe a target configuration issue.
Not sure, I have configured mine this way:
../gcc-trunk/configure --prefix=/home/ed/gnu/arm-linux-gnueabihf-linux64 --target=arm-linux-gnueabihf --enable-languages=all --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16 --with-float=hard
However it appears now there are two different errors, one is in expand_assignment
which you found (I start to wonder if I should add you to the authors section
of this patch), and a different one, which I have not yet simplified,
but you can easily try that for yourself:
make check-gcc-c RUNTESTFLAGS="struct-layout-1.exp=*"
It is okay when the test fails to execute, but there should be no internal compiler errors.
>>>>
>>>> The problem is that the code that handles this misaligned access
>>>> is skipped because the mem_rtx has initially no MEM_ATTRS and therefore
>>>> MEM_ALIGN == 32, and therefore the code that handles the unaligned
>>>> access is not taken. BUT before the mem_rtx is returned it is
>>>> set to MEM_ALIGN = 8 by set_mem_attributes, and we have an assertion,
>>>> because the result from expand_expr_real(..., EXPAND_NORMAL) ought to be
>>>> usable with emit_move_insn.
>>>
>>> yes, as said the _access_ determines the address should be aligned
>>> so we shouldn't end up setting MEM_ALIGN to 8 but to 32 according
>>> to the access type/mode. But we can't trust DECL_ALIGN of
>>> FUNCTION_DECLs but we _can_ trust users writing *(int *)fn
>>> (maybe for actual accesses we _can_ trust DECL_ALIGN, it's just
>>> we may not compute nonzero bits for the actual address because
>>> of function pointer mangling)
>>> (for accessing function code I'd say this would be premature
>>> optimization, but ...)
>>>
>>
>> Not a very nice solution, but it is not worth spending much effort
>> on optimizing undefined behavior. I just want to avoid the ICE
>> at this time and would not trust the DECL_ALIGN either.
>
> So I meant
>
> Index: gcc/builtins.c
> ===================================================================
> --- gcc/builtins.c (revision 274534)
> +++ gcc/builtins.c (working copy)
> @@ -255,7 +255,8 @@ get_object_alignment_2 (tree exp, unsign
>
> /* Extract alignment information from the innermost object and
> possibly adjust bitpos and offset. */
> - if (TREE_CODE (exp) == FUNCTION_DECL)
> + if (TREE_CODE (exp) == FUNCTION_DECL
> + && addr_p)
> {
> /* Function addresses can encode extra information besides their
> alignment. However, if TARGET_PTRMEMFUNC_VBIT_LOCATION
>
> so we get at DECL_ALIGN of the FUNCTION_DECL (not sure if we
> can trust it).
>
>>>
>>> Still I think you can't simply override STACK_SLOT_ALIGNMENT just because
>>> of the mode of an entry param, can you? If you can assume a bigger
>>> alignment then STACK_SLOT_ALIGNMENT should return it.
>>>
>>
>> I don't see a real problem here. All targets except i386 and gcn (whatever that is)
>> use the default for STACK_SLOT_ALIGNMENT which simply allows any (large) align value
>> to rule the effective STACK_SLOT_ALIGNMENT. The user could have simply declared
>> the local variable with the alignment that results in better code FWIW.
>>
>> If the stack alignment is too high that is capped in assign_stack_local:
>>
>> /* Ignore alignment if it exceeds MAX_SUPPORTED_STACK_ALIGNMENT. */
>> if (alignment_in_bits > MAX_SUPPORTED_STACK_ALIGNMENT)
>> {
>> alignment_in_bits = MAX_SUPPORTED_STACK_ALIGNMENT;
>> alignment = MAX_SUPPORTED_STACK_ALIGNMENT / BITS_PER_UNIT;
>> }
>>
>> I for one, would just assume that MAX_SUPPORTED_STACK_ALIGNMENT should
>> be sufficient for all modes that need movmisalign_optab and friends.
>> If it is not, an ICE would be just fine.
>
> Hmm. In some way we could better communicate with the user then
> and do not allow under-aligning automatic vars? But then you
> still have packed structs with BLKmode where the actual field
> accesses will carry SImode even when not aligned(?)
>
Yes, that works also when unaligned.
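
As a user-level illustration of that case (a sketch only, nothing from the
patch; struct packed_rec and the two helper names are made up for this
example), a field of a GNU C packed struct can be read directly even though
it sits at an odd offset, because the compiler knows about the reduced
alignment from the struct type:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GNU C extension: the packed attribute removes padding, so 'value'
   lives at offset 1 and is not naturally aligned.  */
struct __attribute__((packed)) packed_rec
{
  char tag;
  int value;
};

static_assert (offsetof (struct packed_rec, value) == 1,
               "value is expected to be under-aligned");

/* Access through the struct type: the compiler sees the packed layout
   and has to emit code that is safe for the reduced alignment.  */
int
read_value (const struct packed_rec *rec)
{
  return rec->value;
}

/* Alternative that does not rely on the field type at all: copy the
   bytes into a properly aligned temporary.  */
int
read_value_memcpy (const unsigned char *bytes)
{
  int v;
  memcpy (&v, bytes + offsetof (struct packed_rec, value), sizeof v);
  return v;
}
```

Both helpers return the same value for the same object; what is not safe
is taking the address of the field as a plain int * and dereferencing that
on a strict-alignment target.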
>
> Please split it into the parts for the PR and parts making the
> asserts not trigger.
>
Yes, will do.
> The PR is already fixed, right? The assign_parm_find_stack_rtl hunk
> is merely an optimization?
>
Hmmmm... You are right, I should have added that to the commit message...
Of course the test cases try to verify the optimization.
Thanks
Bernd.