From: Richard Sandiford <richard.sandiford@linaro.org>
To: James Greenhalgh <james.greenhalgh@arm.com>
Cc: <gcc-patches@gcc.gnu.org>, <nd@arm.com>,
<richard.earnshaw@arm.com>, <marcus.shawcroft@arm.com>
Subject: Re: [AArch64] Improve HFA code generation
Date: Mon, 26 Jun 2017 19:03:00 -0000 [thread overview]
Message-ID: <87k23y6075.fsf@linaro.org> (raw)
In-Reply-To: <1497968507-24709-1-git-send-email-james.greenhalgh@arm.com> (James Greenhalgh's message of "Tue, 20 Jun 2017 15:21:47 +0100")
James Greenhalgh <james.greenhalgh@arm.com> writes:
> Hi,
>
> For this code:
>
> struct y {
> float x[4];
> };
>
> float
> bar3 (struct y x)
> {
> return x.x[3];
> }
>
> GCC generates:
>
> bar3:
> fmov x1, d2
> mov x0, 0
> bfi x0, x1, 0, 32
> fmov x1, d3
> bfi x0, x1, 32, 32
> sbfx x0, x0, 32, 32
> fmov s0, w0
> ret
>
> If you can wrap your head around that, you'll spot that it could be
> simplified to:
>
> bar3:
> fmov s0, s3
> ret
I get the second version with current trunk, at -O and above. Does this
only happen with some other modifications, or for certain subtargets?
Or maybe I'm testing the wrong thing.
> Looking at it, I think the issue is the mode that we assign to the
> PARALLEL we build for an HFA in registers. When we get in to
> aarch64_layout_arg with a composite, MODE is set to the smallest integer
> mode that would contain the size of the composite type. That is to say, in
> the example above, MODE will be TImode.
>
> Looking at the expansion path through assign_parms, we're going to go:
>
> assign_parms
> assign_parm_setup_reg
> assign_parm_remove_parallels
> emit_group_store
>
> assign_parm_remove_parallels is going to try to create a REG in MODE,
> then construct that REG using the values in the HFA PARALLEL we created. So,
> for the example above, we're going to try to create a packed TImode value
> built up from each of the four "S" registers we've assigned for the
> arguments. Using one of the struct elements is then a 32-bit extract from
> the TImode value (then a move back to FP/SIMD registers). This explains
> the code-gen in the example. Note that an extract from the TImode value
> makes the whole TImode value live, so we can't optimize away the
> construction in registers.
>
> If instead we make the PARALLEL that we create in aarch64_layout_arg BLKmode
> then our expansion path is through:
>
> assign_parms
> assign_parm_setup_block
>
> Which handles creating a stack slot of the right size for our HFA, and
> copying it to there. We could then trust the usual optimisers to deal with
> the object construction and eliminate it where possible.
>
> However, we can't just return a BLKmode Parallel, as the mid-end was explictly
> asking us to return in MODE, and will eventually ICE given the inconsistency.
> One other way we can force these structures to be given BLKmode is through
> TARGET_MEMBER_TYPE_FORCES_BLK. Which is what we do in this patch.
>
> We're going to tell the mid-end that any structure of more than one element
> which contains either floating-point or vector data should be set out in
> BLKmode rather than a large-enough integer mode. In doing so, we implicitly
> fix the issue with HFA layout above. But at what cost!
The patch only seems to handle scalar floats, not vectors.
Wasn't replying to this to nitpick though :-). I just wondered whether
there would be any benefit to having an array_mode hook that returns
V4SF for float[4]. (We needed that hook to handle load/store lanes
for SVE, so see git branch linaro-dev/sve if you want to try it.)
Maybe alignment would be a problem though?
Thanks,
Richard
prev parent reply other threads:[~2017-06-26 19:03 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-20 14:22 James Greenhalgh
2017-06-26 19:03 ` Richard Sandiford [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87k23y6075.fsf@linaro.org \
--to=richard.sandiford@linaro.org \
--cc=gcc-patches@gcc.gnu.org \
--cc=james.greenhalgh@arm.com \
--cc=marcus.shawcroft@arm.com \
--cc=nd@arm.com \
--cc=richard.earnshaw@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).