From: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
To: gcc-patches@gcc.gnu.org, richard.earnshaw@arm.com,
james.greenhalgh@arm.com, marcus.shawcroft@arm.com,
richard.sandiford@linaro.org
Subject: Re: [AArch64] Fix sve/extract_[12].c for big-endian SVE
Date: Fri, 26 Jan 2018 14:22:00 -0000 [thread overview]
Message-ID: <5A6B345C.9040105@foss.arm.com> (raw)
In-Reply-To: <87d11wpx0r.fsf@linaro.org>
Hi Richard,
On 26/01/18 13:31, Richard Sandiford wrote:
> sve/extract_[12].c were relying on the target-independent optimisation
> that removes a redundant vec_select, so that we don't end up with
> things like:
>
> dup v0.4s, v0.4s[0]
> ...use s0...
>
> But that optimisation rightly doesn't trigger for big-endian targets,
> because GCC expects lane 0 to be in the high part of the register
> rather than the low part.
>
> SVE breaks this assumption -- see the comment at the head of
> aarch64-sve.md for details -- so the optimisation is valid for
> both endiannesses. Long term, we probably need some kind of target
> hook to make GCC aware of this.
>
> But there's another problem with the current extract pattern: it doesn't
> tell the register allocator how cheap an extraction of lane 0 is with
> tied registers. It seems better to split the lane 0 case out into
> its own pattern and use tied operands for the FPR<-SIMD case,
> so that using different registers has the cost of an extra reload.
> I think we want this for both endiannesses, regardless of the hook
> described above.
>
> Also, the gen_lowpart in this pattern fails for aarch64_be due to
> TARGET_CAN_CHANGE_MODE_CLASS restrictions, so the patch uses gen_rtx_REG
> instead. We're only creating this rtl in order to print it, so there's
> no need for anything fancier.
>
> Tested on aarch64_be-elf and aarch64-linux-gnu. OK to install?
>
> Richard
>
>
> 2018-01-26 Richard Sandiford <richard.sandiford@linaro.org>
>
> gcc/
> * config/aarch64/aarch64-sve.md (*vec_extract<mode><Vel>_0): New
> pattern.
> (*vec_extract<mode><Vel>_v128): Require a nonzero lane number.
> Use gen_rtx_REG rather than gen_lowpart.
>
> Index: gcc/config/aarch64/aarch64-sve.md
> ===================================================================
> --- gcc/config/aarch64/aarch64-sve.md 2018-01-13 18:01:51.232735405 +0000
> +++ gcc/config/aarch64/aarch64-sve.md 2018-01-26 13:26:50.176756711 +0000
> @@ -484,18 +484,52 @@ (define_expand "vec_extract<mode><Vel>"
> }
> )
>
> +;; Extract element zero. This is a special case because we want to force
> +;; the registers to be the same for the second alternative, and then
> +;; split the instruction into nothing after RA.
> +(define_insn_and_split "*vec_extract<mode><Vel>_0"
> + [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=r, w, Utv")
> + (vec_select:<VEL>
> + (match_operand:SVE_ALL 1 "register_operand" "w, 0, w")
> + (parallel [(const_int 0)])))]
> + "TARGET_SVE"
> + {
> + operands[1] = gen_rtx_REG (<V128>mode, REGNO (operands[1]));
> + switch (which_alternative)
> + {
> + case 0:
> + return "umov\\t%<vwcore>0, %1.<Vetype>[0]";
> + case 1:
> + return "#";
> + case 2:
> + return "st1\\t{%1.<Vetype>}[0], %0";
> + default:
> + gcc_unreachable ();
> + }
> + }
> + "&& reload_completed
> + && REG_P (operands[1])
> + && REGNO (operands[0]) == REGNO (operands[1])"
Is it guaranteed that operands[0] will be a REG rtx here?
The aarch64_simd_nonimmediate_operand predicate and Utv constraint might allow
a MEM, which would cause an error in an RTL-checking build.
If I'm not mistaken GCC may attempt the split even when matching alternative 2.
Kyrill
> + [(const_int 0)]
> + {
> + emit_note (NOTE_INSN_DELETED);
> + DONE;
> + }
> + [(set_attr "type" "neon_to_gp_q, untyped, neon_store1_one_lane_q")]
> +)
> +
> ;; Extract an element from the Advanced SIMD portion of the register.
> ;; We don't just reuse the aarch64-simd.md pattern because we don't
> -;; want any chnage in lane number on big-endian targets.
> +;; want any change in lane number on big-endian targets.
> (define_insn "*vec_extract<mode><Vel>_v128"
> [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=r, w, Utv")
> (vec_select:<VEL>
> (match_operand:SVE_ALL 1 "register_operand" "w, w, w")
> (parallel [(match_operand:SI 2 "const_int_operand")])))]
> "TARGET_SVE
> - && IN_RANGE (INTVAL (operands[2]) * GET_MODE_SIZE (<VEL>mode), 0, 15)"
> + && IN_RANGE (INTVAL (operands[2]) * GET_MODE_SIZE (<VEL>mode), 1, 15)"
> {
> - operands[1] = gen_lowpart (<V128>mode, operands[1]);
> + operands[1] = gen_rtx_REG (<V128>mode, REGNO (operands[1]));
> switch (which_alternative)
> {
> case 0:
next prev parent reply other threads:[~2018-01-26 14:00 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-01-26 13:46 Richard Sandiford
2018-01-26 14:22 ` Kyrill Tkachov [this message]
2018-01-26 15:27 ` Richard Sandiford
2018-01-29 12:37 ` James Greenhalgh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5A6B345C.9040105@foss.arm.com \
--to=kyrylo.tkachov@foss.arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=james.greenhalgh@arm.com \
--cc=marcus.shawcroft@arm.com \
--cc=richard.earnshaw@arm.com \
--cc=richard.sandiford@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).