From: Richard Biener <rguenther@suse.de>
To: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>
Cc: Richard Biener <richard.guenther@gmail.com>,
Alan Lawrence <Alan.Lawrence@arm.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
Marcus Shawcroft <Marcus.Shawcroft@arm.com>
Subject: Re: New regression on ARM Linux
Date: Tue, 31 Mar 2015 10:00:00 -0000 [thread overview]
Message-ID: <alpine.LSU.2.11.1503311200090.31545@zhemvz.fhfr.qr> (raw)
In-Reply-To: <551A6C46.7010305@foss.arm.com>
On Tue, 31 Mar 2015, Richard Earnshaw wrote:
> On 31/03/15 08:50, Richard Biener wrote:
> > On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener <rguenther@suse.de> wrote:
> >> On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence <alan.lawrence@arm.com> wrote:
> >>> -O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes
> >>> it.
> >>>
> >>> The problem appears to be in laying out arguments, specifically
> >>> varargs. From
> >>> the "good" -fdump-rtl-expand:
> >>>
> >>> (insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) [0
> >>> S4 A32])
> >>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
> >>> (nil))
> >>> (insn 19 18 20 2 (set (reg:DF 2 r2)
> >>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
> >>> (nil))
> >>> (insn 20 19 21 2 (set (reg:SI 1 r1)
> >>> (reg:SI 113 [ b1 ])) reduced.c:14 -1
> >>> (nil))
> >>> (insn 21 20 22 2 (set (reg:SI 0 r0)
> >>> (reg:SI 118)) reduced.c:14 -1
> >>> (nil))
> >>> (call_insn 22 21 23 2 (parallel [
> >>> (set (reg:SI 0 r0)
> >>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
> >>> <function_decl 0x2ab50e856d00 __builtin_printf>) [0 __builtin_printf S4
> >>> A32])
> >>>
> >>> The struct members are
> >>> reg:SI 113 => int a;
> >>> reg:DF 112 => double b;
> >>> reg:SI 111 => int c;
> >>>
> >>> r0 gets the format string; r1 gets int a; r2+r3 get double b; int c is
> >>> pushed
> >>> into virtual-outgoing-args. In contrast, post-change to
> >>> build_ref_of_offset, we get:
> >>>
> >>> (insn 17 16 18 2 (set (reg:SI 118)
> >>> (symbol_ref/v/f:SI ("*.LC1") [flags 0x82] <var_decl 0x2ba57fa8d750
> >>> *.LC1>)) reduced.c:14 -1
> >>> (nil))
> >>> (insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107
> >>> virtual-outgoing-args)
> >>> (const_int 8 [0x8])) [0 S4 A64])
> >>> (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
> >>> (nil))
> >>> (insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) [0
> >>> S8 A64])
> >>> (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
> >>> (nil))
> >>> (insn 20 19 21 2 (set (reg:SI 2 r2)
> >>> (reg:SI 113 [ b1 ])) reduced.c:14 -1
> >>> (nil))
> >>> (insn 21 20 22 2 (set (reg:SI 0 r0)
> >>> (reg:SI 118)) reduced.c:14 -1
> >>> (nil))
> >>> (call_insn 22 21 23 2 (parallel [
> >>> (set (reg:SI 0 r0)
> >>> (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
> >>> <function_decl 0x2ba57f843d00 __builtin_printf>) [0 __builtin_printf S4
> >>> A32])
> >>>
> >>> r0 still gets the format string, but 'int b1.a' now goes in r2, and the
> >>>
> >>> double+following int are all pushed into virtual-outgoing-args. This is
> >>> because
> >>> arm_function_arg is fed a 64-bit-aligned int as type of the second
> >>> argument (the
> >>> type constructed by build_ref_for_offset); it then executes
> >>> (aapcs_layout_arg,
> >>> arm.c line ~~5914)
> >>>
> >>> /* C3 - For double-word aligned arguments, round the NCRN up to the
> >>> next even number. */
> >>> ncrn = pcum->aapcs_ncrn;
> >>> if ((ncrn & 1) && arm_needs_doubleword_align (mode, type))
> >>> ncrn++;
> >>>
> >>> Which changes r1 to r2. Passing -fno-tree-sra, or removing from the
> >>> testcase
> >>> "*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be fed a
> >>> 32-bit-aligned int instead, which works as previously.
> >>>
> >>> Passing the same members of that struct in a non-vargs call, works ok -
> >>> I think
> >>> because these use the type of the declared parameters, rather than the
> >>> provided
> >>> arguments, and the former do not have the increased alignment from
> >>> build_ref_for_offset.
> >>
> >> It doesn't make sense to use the alignment of passed values. That looks like bs.
> >>
> >> This means that
> >>
> >> Int I __aligned__(8);
> >>
> >> Is passed differently than int.
> >>
> >> Arm_function_arg needs to be fixed.
> >
> > That is,
> >
> > typedef int myint __attribute__((aligned(8)));
> >
> > int main()
> > {
> > myint i = 1;
> > int j = 2;
> > __builtin_printf("%d %d\n", i, j);
> > }
> >
> > or
> >
> > myint i;
> > int j;
> > myint *p = &i;
> > int *q = &j;
> >
> > int main()
> > {
> > __builtin_printf("%d %d", *p, *q);
> > }
> >
> > should behave the same. There isn't a printf modifier for an "aligned int"
> > because that sort of thing doesn't make sense. Special-casing aligned vs.
> > non-aligned values only makes sense for things passed by value on the stack.
> > And then obviously only dependent on the functuion type signature, not
> > on the type of the passed value.
> >
>
> I think the testcase is ill-formed. Just because printf doesn't have
> such a modifier, doesn't mean that another variadic function might not
> have a means to detect when an object in the variadic list needs to be
> over-aligned. As such, the test should really be written as:
A value doesn't have "alignment". A function may have alignment
requirements on its arguments, clearly printf doesn't.
> typedef int myint __attribute__((aligned(8)));
>
> int main()
> {
> myint i = 1;
> int j = 2;
> __builtin_printf("%d %d\n", (int) i, j);
> }
>
> Variadic functions take the types of their arguments from the types of
> the actuals passed. The compiler should either be applying appropriate
> promotion rules to make the types conformant by the language
> specification or respecting the types exactly. However, that should be
> done in the mid-end not the back-end. If incorrect alignment
> information is passed to the back-end it can't help but make the wrong
> choice. Examining the mode here wouldn't help either. Consider:
>
> int foo (int a, int b, int c, int d, int e, int f
> __attribute__((aligned(8))));
>
> On ARM that should pass f in a 64-bit aligned location (since the
> parameter will be on the stack).
>
> R.
>
>
> > Richard.
> >
> >> Richard.
> >>
> >>>
> >>> FWIW, I also tried:
> >>>
> >>> __attribute__((__aligned__((16)))) int x;
> >>> int main (void)
> >>> {
> >>> __builtin_printf("%d\n", x);
> >>> }
> >>>
> >>> but in that case, the arm_function_arg is still fed a type with
> >>> alignment 32
> >>> (bits), i.e. distinct from the type of the field 'x' in memory, which
> >>> has
> >>> alignment 128.
> >>>
> >>> --Alan
> >>>
> >>> Richard Biener wrote:
> >>>> On Mon, 30 Mar 2015, Richard Biener wrote:
> >>>>
> >>>>> On Mon, 30 Mar 2015, Alan Lawrence wrote:
> >>>>>
> >>>>>> ...actually attach the testcase...
> >>>>> What compile options?
> >>>>
> >>>> Just tried -O2. The GIMPLE IL assumes 64bit alignment of .LC0 but
> >>>> I can't see anything not guaranteeing that:
> >>>>
> >>>> .section .rodata
> >>>> .align 3
> >>>> .LANCHOR0 = . + 0
> >>>> .LC1:
> >>>> .ascii "%d %g %d\012\000"
> >>>> .space 6
> >>>> .LC0:
> >>>> .word 7
> >>>> .space 4
> >>>> .word 0
> >>>> .word 1075838976
> >>>> .word 9
> >>>> .space 4
> >>>>
> >>>> maybe there is some more generic code-gen bug for aligned aggregate
> >>>> copy? That is, the patch tells the backend that the loads and
> >>>> stores to the 'int' vars (which have padding followed) is aligned
> >>>> to 8 bytes.
> >>>>
> >>>> I don't see what is wrong in the final assembler, but maybe
> >>>> some endian issue exists? The code looks quite ugly though ;)
> >>>>
> >>>> Richard.
> >>>>
> >>>>>> Alan Lawrence wrote:
> >>>>>>> We've been seeing a bunch of new failures in the *libffi*
> >>> testsuite on ARM
> >>>>>>> Linux (arm-none-linux-gnueabi, arm-none-linux-gnueabihf),
> >>> following this
> >>>>>>> one-liner fix. I've reduced the testcase down to the attached
> >>> (including
> >>>>>>> removing any dependency on libffi); with gcc r221347, this prints
> >>> the
> >>>>>>> expected
> >>>>>>> 7 8 9
> >>>>>>> whereas with gcc r221348, instead it prints
> >>>>>>> 0 8 0
> >>>>>>>
> >>>>>>> The action of r221348 is to change the alignment of a mem_ref, and
> >>> a
> >>>>>>> var_decl of b1, from 32 to 64; both have type
> >>>>>>> type <record_type 0x2b9b8d428d20 cls_struct_16byte
> >>> sizes-gimplified type_0
> >>>>>>> BLK
> >>>>>>> size <integer_cst 0x2b9b8d3720a8 constant 192>
> >>>>>>> unit size <integer_cst 0x2b9b8d372078 constant 24>
> >>>>>>> align 64 symtab 0 alias set 1 canonical type
> >>> 0x2b9b8d428d20
> >>>>>>> fields <field_decl 0x2b9b8d42b098 a type <integer_type
> >>>>>>> 0x2b9b8d092690 int>
> >>>>>>> SI file reduced.c line 12 col 7
> >>>>>>> size <integer_cst 0x2b9b8d08eeb8 constant 32>
> >>>>>>> unit size <integer_cst 0x2b9b8d08eed0 constant 4>
> >>>>>>> align 32 offset_align 64
> >>>>>>> offset <integer_cst 0x2b9b8d08eee8 constant 0>
> >>>>>>> bit offset <integer_cst 0x2b9b8d08ef48 constant 0>
> >>> context
> >>>>>>> <record_type 0x2b9b8d428d20 cls_struct_16byte> chain <field_decl
> >>>>>>> 0x2b9b8d42b130 b>> context <translation_unit_decl 0x2b9b8d4232d0
> >>> D.6070>
> >>>>>>> pointer_to_this <pointer_type 0x2b9b8d42d0a8> chain
> >>> <type_decl
> >>>>>>> 0x2b9b8d42b000 D.6044>>
> >>>>>>>
> >>>>>>> The tree-optimized output is the same with both compilers (as this
> >>> does not
> >>>>>>> mention alignment); the expand output differs.
> >>>>>>>
> >>>>>>> Still investigating...
> >>>>>>>
> >>>>>>> --Alan
> >>>>>>>
> >>>>>>>
> >>>>>>> Richard Biener wrote:
> >>>>>>>> This fixes a vectorizer testcase regression on powerpc where SRA
> >>>>>>>> drops alignment info unnecessarily.
> >>>>>>>>
> >>>>>>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
> >>>>>>>>
> >>>>>>>> Richard.
> >>>>>>>>
> >>>>>>>> 2015-03-11 Richard Biener <rguenther@suse.de>
> >>>>>>>>
> >>>>>>>> PR tree-optimization/65310
> >>>>>>>> * tree-sra.c (build_ref_for_offset): Also preserve larger
> >>>>>>>> alignment.
> >>>>>>>>
> >>>>>>>> Index: gcc/tree-sra.c
> >>>>>>>>
> >>> ===================================================================
> >>>>>>>> --- gcc/tree-sra.c (revision 221324)
> >>>>>>>> +++ gcc/tree-sra.c (working copy)
> >>>>>>>> @@ -1597,7 +1597,7 @@ build_ref_for_offset (location_t loc, tr
> >>>>>>>> misalign = (misalign + offset) & (align - 1);
> >>>>>>>> if (misalign != 0)
> >>>>>>>> align = (misalign & -misalign);
> >>>>>>>> - if (align < TYPE_ALIGN (exp_type))
> >>>>>>>> + if (align != TYPE_ALIGN (exp_type))
> >>>>>>>> exp_type = build_aligned_type (exp_type, align);
> >>>>>>>> mem_ref = fold_build2_loc (loc, MEM_REF, exp_type, base,
> >>> off);
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
> >>
> >
>
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild,
Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
next prev parent reply other threads:[~2015-03-31 10:00 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-11 15:09 [PATCH] Fix regression caused by PR65310 fix Richard Biener
2015-03-30 12:15 ` New regression on ARM Linux (was: Re: [PATCH] Fix regression caused by PR65310 fix) Alan Lawrence
2015-03-30 12:18 ` New regression on ARM Linux Alan Lawrence
2015-03-30 13:01 ` Richard Biener
2015-03-30 13:16 ` Richard Biener
2015-03-30 16:45 ` Alan Lawrence
2015-03-30 20:13 ` Richard Biener
2015-03-31 7:50 ` Richard Biener
2015-03-31 9:43 ` Richard Earnshaw
2015-03-31 10:00 ` Richard Biener [this message]
2015-03-31 10:10 ` Richard Earnshaw
2015-03-31 10:32 ` Jakub Jelinek
2015-03-31 10:36 ` Richard Biener
2015-03-31 10:40 ` Richard Earnshaw
2015-03-31 10:45 ` Richard Biener
2015-03-31 10:51 ` Richard Earnshaw
2015-03-31 11:09 ` Richard Biener
2015-03-31 12:15 ` Richard Earnshaw
2015-03-31 12:11 ` Alan Lawrence
2015-03-31 10:47 ` Alan Lawrence
2015-03-31 11:05 ` Richard Biener
2015-03-31 11:07 ` Jakub Jelinek
2015-03-31 11:11 ` Richard Biener
2015-03-31 13:11 ` Alan Lawrence
2015-03-31 13:25 ` Richard Biener
2015-04-02 14:59 ` Alan Lawrence
2015-03-31 10:20 ` Richard Biener
2015-03-31 10:31 ` Richard Earnshaw
2015-03-31 10:45 ` Richard Biener
2015-03-31 10:53 ` Richard Earnshaw
2015-03-31 9:50 ` Alan Lawrence
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.11.1503311200090.31545@zhemvz.fhfr.qr \
--to=rguenther@suse.de \
--cc=Alan.Lawrence@arm.com \
--cc=Marcus.Shawcroft@arm.com \
--cc=Richard.Earnshaw@foss.arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=richard.guenther@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).