* [AArch64] [BE] Fix vector load/stores to not use ld1/st1
@ 2014-10-10 15:23 Alan Hayward
2014-11-04 11:16 ` Marcus Shawcroft
0 siblings, 1 reply; 6+ messages in thread
From: Alan Hayward @ 2014-10-10 15:23 UTC (permalink / raw)
To: gcc-patches
[-- Attachment #1: Type: text/plain, Size: 1980 bytes --]
This patch is dependant on "[AArch64] [BE] [1/2] Make large opaque integer
modes endianness-safe.”
It fixes up movoi/ci/xi for Big Endian, so that we end up with the lsb of
a big-endian integer to be in the low byte of the highest-numbered
register.
movoi uses stp/ldp
movxi needs a split version (which is shared with register->register
movxi), which just splits to two new movs
movci can then split to three movs. A future patch will instead split to
an movoi and a movti.
There are no changes for LE.
Ran whole of check with both parts of "Make large opaque integer modes
endianness-safe”. No regressions.
ChangeLog:
gcc/:
2014-10-10 Alan Hayward <alan.hayward@arm.com>
* config/aarch64/aarch64.c
(aarch64_classify_address): Allow extra addressing modes for BE.
(aarch64_print_operand): new operand for printing a q register+1.
(aarch64_simd_emit_reg_reg_move): replacement for
aarch64_simd_disambiguate_copy that plants the required mov.
* config/aarch64/aarch64-protos.h
(aarch64_simd_emit_reg_reg_move): replacement for
aarch64_simd_disambiguate_copy.
* config/aarch64/aarch64-simd.md
(define_split) Use new aarch64_simd_emit_reg_reg_move.
(define_expand "mov<mode>") less restrictive predicates.
(define_insn "*aarch64_mov<mode>") Simplify and only allow for LE.
(define_insn "*aarch64_be_movoi") New. BE only. Plant ldp or stp.
(define_insn "*aarch64_be_movci") New. BE only. No instructions.
(define_insn "*aarch64_be_movxi") New. BE only. No instructions.
(define_split) OI mov. Use new aarch64_simd_emit_reg_reg_move.
(define_split) CI mov. Use new aarch64_simd_emit_reg_reg_move.
On BE
plant movs for reg to/from mem case.
(define_split) XI mov. Use new aarch64_simd_emit_reg_reg_move.
On BE
plant movs for reg to/from mem case.
[-- Attachment #2: 0001-be-mov.patch --]
[-- Type: application/octet-stream, Size: 14964 bytes --]
From bc034ae258440b366de586f587106bd6ecd90170 Mon Sep 17 00:00:00 2001
From: Alan Hayward <alan.hayward@arm.com>
Date: Fri, 10 Oct 2014 15:05:22 +0100
Subject: [PATCH] alanh BE patch clean
---
gcc/config/aarch64/aarch64-protos.h | 3 +-
gcc/config/aarch64/aarch64-simd.md | 171 ++++++++++++++++++++----------------
gcc/config/aarch64/aarch64.c | 99 ++++++++++++++-------
3 files changed, 161 insertions(+), 112 deletions(-)
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index b5f53d2..2e97a00 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -256,7 +256,8 @@ void aarch64_emit_call_insn (rtx);
/* Initialize builtins for SIMD intrinsics. */
void init_aarch64_simd_builtins (void);
-void aarch64_simd_disambiguate_copy (rtx *, rtx *, rtx *, unsigned int);
+void aarch64_simd_emit_reg_reg_move (rtx *operands, enum machine_mode mode,
+ unsigned int count);
/* Emit code to place a AdvSIMD pair result in memory locations (with equal
registers). */
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index cab26a3..64a74b0 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -157,19 +157,10 @@
"TARGET_SIMD && reload_completed
&& GP_REGNUM_P (REGNO (operands[0]))
&& GP_REGNUM_P (REGNO (operands[1]))"
- [(set (match_dup 0) (match_dup 1))
- (set (match_dup 2) (match_dup 3))]
+ [(const_int 0)]
{
- int rdest = REGNO (operands[0]);
- int rsrc = REGNO (operands[1]);
- rtx dest[2], src[2];
-
- dest[0] = gen_rtx_REG (DImode, rdest);
- src[0] = gen_rtx_REG (DImode, rsrc);
- dest[1] = gen_rtx_REG (DImode, rdest + 1);
- src[1] = gen_rtx_REG (DImode, rsrc + 1);
-
- aarch64_simd_disambiguate_copy (operands, dest, src, 2);
+ aarch64_simd_emit_reg_reg_move (operands, DImode, 2);
+ DONE;
})
(define_split
@@ -4077,8 +4068,8 @@
;; Reload patterns for AdvSIMD register list operands.
(define_expand "mov<mode>"
- [(set (match_operand:VSTRUCT 0 "aarch64_simd_nonimmediate_operand" "")
- (match_operand:VSTRUCT 1 "aarch64_simd_general_operand" ""))]
+ [(set (match_operand:VSTRUCT 0 "nonimmediate_operand" "")
+ (match_operand:VSTRUCT 1 "general_operand" ""))]
"TARGET_SIMD"
{
if (can_create_pseudo_p ())
@@ -4090,21 +4081,15 @@
(define_insn "*aarch64_mov<mode>"
[(set (match_operand:VSTRUCT 0 "aarch64_simd_nonimmediate_operand" "=w,Utv,w")
- (match_operand:VSTRUCT 1 "aarch64_simd_general_operand" " w,w,Utv"))]
- "TARGET_SIMD
+ (match_operand:VSTRUCT 1 "aarch64_simd_general_operand" " w,w,Utv"))]
+ "TARGET_SIMD && !BYTES_BIG_ENDIAN
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[1], <MODE>mode))"
-
-{
- switch (which_alternative)
- {
- case 0: return "#";
- case 1: return "st1\\t{%S1.16b - %<Vendreg>1.16b}, %0";
- case 2: return "ld1\\t{%S0.16b - %<Vendreg>0.16b}, %1";
- default: gcc_unreachable ();
- }
-}
- [(set_attr "type" "neon_move,neon_store<nregs>_<nregs>reg_q,\
+ "@
+ #
+ st1\\t{%S1.16b - %<Vendreg>1.16b}, %0
+ ld1\\t{%S0.16b - %<Vendreg>0.16b}, %1"
+ [(set_attr "type" "multiple,neon_store<nregs>_<nregs>reg_q,\
neon_load<nregs>_<nregs>reg_q")
(set (attr "length") (symbol_ref "aarch64_simd_attr_length_move (insn)"))]
)
@@ -4127,70 +4112,100 @@
[(set_attr "type" "neon_store1_1reg<q>")]
)
+(define_insn "*aarch64_be_movoi"
+ [(set (match_operand:OI 0 "nonimmediate_operand" "=w,m,w")
+ (match_operand:OI 1 "general_operand" " w,w,m"))]
+ "TARGET_SIMD && BYTES_BIG_ENDIAN
+ && (register_operand (operands[0], OImode)
+ || register_operand (operands[1], OImode))"
+ "@
+ #
+ stp\\t%q1, %R1, %0
+ ldp\\t%q0, %R0, %1"
+ [(set_attr "type" "multiple,neon_store2_2reg_q,neon_load2_2reg_q")
+ (set (attr "length") (symbol_ref "aarch64_simd_attr_length_move (insn)"))]
+)
+
+(define_insn "*aarch64_be_movci"
+ [(set (match_operand:CI 0 "nonimmediate_operand" "=w,o,w")
+ (match_operand:CI 1 "general_operand" " w,w,o"))]
+ "TARGET_SIMD && BYTES_BIG_ENDIAN
+ && (register_operand (operands[0], CImode)
+ || register_operand (operands[1], CImode))"
+ "#"
+ [(set_attr "type" "multiple")
+ (set (attr "length") (symbol_ref "aarch64_simd_attr_length_move (insn)"))]
+)
+
+(define_insn "*aarch64_be_movxi"
+ [(set (match_operand:XI 0 "nonimmediate_operand" "=w,o,w")
+ (match_operand:XI 1 "general_operand" " w,w,o"))]
+ "TARGET_SIMD && BYTES_BIG_ENDIAN
+ && (register_operand (operands[0], XImode)
+ || register_operand (operands[1], XImode))"
+ "#"
+ [(set_attr "type" "multiple")
+ (set (attr "length") (symbol_ref "aarch64_simd_attr_length_move (insn)"))]
+)
+
(define_split
- [(set (match_operand:OI 0 "register_operand" "")
- (match_operand:OI 1 "register_operand" ""))]
+ [(set (match_operand:OI 0 "register_operand")
+ (match_operand:OI 1 "register_operand"))]
"TARGET_SIMD && reload_completed"
- [(set (match_dup 0) (match_dup 1))
- (set (match_dup 2) (match_dup 3))]
+ [(const_int 0)]
{
- int rdest = REGNO (operands[0]);
- int rsrc = REGNO (operands[1]);
- rtx dest[2], src[2];
-
- dest[0] = gen_rtx_REG (TFmode, rdest);
- src[0] = gen_rtx_REG (TFmode, rsrc);
- dest[1] = gen_rtx_REG (TFmode, rdest + 1);
- src[1] = gen_rtx_REG (TFmode, rsrc + 1);
-
- aarch64_simd_disambiguate_copy (operands, dest, src, 2);
+ aarch64_simd_emit_reg_reg_move (operands, TImode, 2);
+ DONE;
})
(define_split
- [(set (match_operand:CI 0 "register_operand" "")
- (match_operand:CI 1 "register_operand" ""))]
+ [(set (match_operand:CI 0 "nonimmediate_operand")
+ (match_operand:CI 1 "general_operand"))]
"TARGET_SIMD && reload_completed"
- [(set (match_dup 0) (match_dup 1))
- (set (match_dup 2) (match_dup 3))
- (set (match_dup 4) (match_dup 5))]
+ [(const_int 0)]
{
- int rdest = REGNO (operands[0]);
- int rsrc = REGNO (operands[1]);
- rtx dest[3], src[3];
-
- dest[0] = gen_rtx_REG (TFmode, rdest);
- src[0] = gen_rtx_REG (TFmode, rsrc);
- dest[1] = gen_rtx_REG (TFmode, rdest + 1);
- src[1] = gen_rtx_REG (TFmode, rsrc + 1);
- dest[2] = gen_rtx_REG (TFmode, rdest + 2);
- src[2] = gen_rtx_REG (TFmode, rsrc + 2);
-
- aarch64_simd_disambiguate_copy (operands, dest, src, 3);
+ if (register_operand (operands[0], CImode)
+ && register_operand (operands[1], CImode))
+ {
+ aarch64_simd_emit_reg_reg_move (operands, TImode, 3);
+ DONE;
+ }
+ else if (BYTES_BIG_ENDIAN)
+ {
+ emit_move_insn (simplify_gen_subreg (TImode, operands[0], CImode, 0),
+ simplify_gen_subreg (TImode, operands[1], CImode, 0));
+ emit_move_insn (simplify_gen_subreg (TImode, operands[0], CImode, 16),
+ simplify_gen_subreg (TImode, operands[1], CImode, 16));
+ emit_move_insn (simplify_gen_subreg (TImode, operands[0], CImode, 32),
+ simplify_gen_subreg (TImode, operands[1], CImode, 32));
+ DONE;
+ }
+ else
+ FAIL;
})
(define_split
- [(set (match_operand:XI 0 "register_operand" "")
- (match_operand:XI 1 "register_operand" ""))]
+ [(set (match_operand:XI 0 "nonimmediate_operand")
+ (match_operand:XI 1 "general_operand"))]
"TARGET_SIMD && reload_completed"
- [(set (match_dup 0) (match_dup 1))
- (set (match_dup 2) (match_dup 3))
- (set (match_dup 4) (match_dup 5))
- (set (match_dup 6) (match_dup 7))]
+ [(const_int 0)]
{
- int rdest = REGNO (operands[0]);
- int rsrc = REGNO (operands[1]);
- rtx dest[4], src[4];
-
- dest[0] = gen_rtx_REG (TFmode, rdest);
- src[0] = gen_rtx_REG (TFmode, rsrc);
- dest[1] = gen_rtx_REG (TFmode, rdest + 1);
- src[1] = gen_rtx_REG (TFmode, rsrc + 1);
- dest[2] = gen_rtx_REG (TFmode, rdest + 2);
- src[2] = gen_rtx_REG (TFmode, rsrc + 2);
- dest[3] = gen_rtx_REG (TFmode, rdest + 3);
- src[3] = gen_rtx_REG (TFmode, rsrc + 3);
-
- aarch64_simd_disambiguate_copy (operands, dest, src, 4);
+ if (register_operand (operands[0], XImode)
+ && register_operand (operands[1], XImode))
+ {
+ aarch64_simd_emit_reg_reg_move (operands, TImode, 4);
+ DONE;
+ }
+ else if (BYTES_BIG_ENDIAN)
+ {
+ emit_move_insn (simplify_gen_subreg (OImode, operands[0], XImode, 0),
+ simplify_gen_subreg (OImode, operands[1], XImode, 0));
+ emit_move_insn (simplify_gen_subreg (OImode, operands[0], XImode, 32),
+ simplify_gen_subreg (OImode, operands[1], XImode, 32));
+ DONE;
+ }
+ else
+ FAIL;
})
(define_insn "aarch64_ld2<mode>_dreg"
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5144c35..c0306ad 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3176,12 +3176,20 @@ aarch64_classify_address (struct aarch64_address_info *info,
{
enum rtx_code code = GET_CODE (x);
rtx op0, op1;
+
+ /* On BE, we use load/store pair for all large int mode load/stores. */
+ bool load_store_pair_p = (outer_code == PARALLEL
+ || (BYTES_BIG_ENDIAN
+ && aarch64_vect_struct_mode_p (mode)));
+
bool allow_reg_index_p =
- outer_code != PARALLEL && (GET_MODE_SIZE (mode) != 16
- || aarch64_vector_mode_supported_p (mode));
- /* Don't support anything other than POST_INC or REG addressing for
- AdvSIMD. */
- if (aarch64_vect_struct_mode_p (mode)
+ !load_store_pair_p
+ && (GET_MODE_SIZE (mode) != 16 || aarch64_vector_mode_supported_p (mode))
+ && !aarch64_vect_struct_mode_p (mode);
+
+ /* On LE, for AdvSIMD, don't support anything other than POST_INC or
+ REG addressing. */
+ if (aarch64_vect_struct_mode_p (mode) && !BYTES_BIG_ENDIAN
&& (code != POST_INC && code != REG))
return false;
@@ -3229,11 +3237,32 @@ aarch64_classify_address (struct aarch64_address_info *info,
Q: 9-bit signed offset
We conservatively require an offset representable in either mode.
*/
- if (mode == TImode || mode == TFmode)
- return (aarch64_offset_7bit_signed_scaled_p (mode, offset)
- && offset_9bit_signed_unscaled_p (mode, offset));
-
- if (outer_code == PARALLEL)
+ if (mode == TImode || mode == TFmode)
+ return (aarch64_offset_7bit_signed_scaled_p (mode, offset)
+ && offset_9bit_signed_unscaled_p (mode, offset));
+
+ /* A 7bit offset check because OImode will emit a ldp/stp
+ instruction (only big endian will get here). */
+ if (mode == OImode)
+ return (aarch64_offset_7bit_signed_scaled_p (TImode, offset));
+
+ /* Three 9/12 bit offsets checks because CImode will emit three
+ ldr/str instructions (only big endian will get here). */
+ if (mode == CImode)
+ return ((offset_9bit_signed_unscaled_p (TImode, offset)
+ || offset_12bit_unsigned_scaled_p (TImode, offset))
+ && (offset_9bit_signed_unscaled_p (TImode, offset + 16)
+ || offset_12bit_unsigned_scaled_p (TImode, offset + 16))
+ && (offset_9bit_signed_unscaled_p (TImode, offset + 32)
+ || offset_12bit_unsigned_scaled_p (TImode, offset + 32)));
+
+ /* Two 7bit offsets checks because XImode will emit two ldp/stp
+ instructions (only big endian will get here). */
+ if (mode == XImode)
+ return (aarch64_offset_7bit_signed_scaled_p (TImode, offset)
+ && aarch64_offset_7bit_signed_scaled_p (TImode, offset + 32));
+
+ if (load_store_pair_p)
return ((GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)
&& aarch64_offset_7bit_signed_scaled_p (mode, offset));
else
@@ -3293,7 +3322,7 @@ aarch64_classify_address (struct aarch64_address_info *info,
return (aarch64_offset_7bit_signed_scaled_p (mode, offset)
&& offset_9bit_signed_unscaled_p (mode, offset));
- if (outer_code == PARALLEL)
+ if (load_store_pair_p)
return ((GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)
&& aarch64_offset_7bit_signed_scaled_p (mode, offset));
else
@@ -3307,7 +3336,8 @@ aarch64_classify_address (struct aarch64_address_info *info,
/* load literal: pc-relative constant pool entry. Only supported
for SI mode or larger. */
info->type = ADDRESS_SYMBOLIC;
- if (outer_code != PARALLEL && GET_MODE_SIZE (mode) >= 4)
+
+ if (!load_store_pair_p && GET_MODE_SIZE (mode) >= 4)
{
rtx sym, addend;
@@ -3824,6 +3854,16 @@ aarch64_print_operand (FILE *f, rtx x, char code)
asm_fprintf (f, "v%d", REGNO (x) - V0_REGNUM + (code - 'S'));
break;
+ case 'R':
+ /* Print a scalar FP/SIMD register name + 1. */
+ if (!REG_P (x) || !FP_REGNUM_P (REGNO (x)))
+ {
+ output_operand_lossage ("incompatible floating point / vector register operand for '%%%c'", code);
+ return;
+ }
+ asm_fprintf (f, "q%d", REGNO (x) - V0_REGNUM + 1);
+ break;
+
case 'X':
/* Print bottom 16 bits of integer constant in hex. */
if (!CONST_INT_P (x))
@@ -8014,35 +8054,28 @@ aarch64_simd_mem_operand_p (rtx op)
|| REG_P (XEXP (op, 0)));
}
-/* Set up OPERANDS for a register copy from SRC to DEST, taking care
+/* Emit a register copy from SRC to DEST, taking care
not to early-clobber SRC registers in the process.
- We assume that the operands described by SRC and DEST represent a
- decomposed copy of OPERANDS[1] into OPERANDS[0]. COUNT is the
- number of components into which the copy has been decomposed. */
+ COUNT is the number of components into which the copy needs to be
+ decomposed. */
void
-aarch64_simd_disambiguate_copy (rtx *operands, rtx *dest,
- rtx *src, unsigned int count)
+aarch64_simd_emit_reg_reg_move (rtx *operands, enum machine_mode mode,
+ unsigned int count)
{
unsigned int i;
+ int rdest = REGNO (operands[0]);
+ int rsrc = REGNO (operands[1]);
if (!reg_overlap_mentioned_p (operands[0], operands[1])
- || REGNO (operands[0]) < REGNO (operands[1]))
- {
- for (i = 0; i < count; i++)
- {
- operands[2 * i] = dest[i];
- operands[2 * i + 1] = src[i];
- }
- }
+ || rdest < rsrc)
+ for (i = 0; i < count; i++)
+ emit_move_insn (gen_rtx_REG (mode, rdest + i),
+ gen_rtx_REG (mode, rsrc + i));
else
- {
- for (i = 0; i < count; i++)
- {
- operands[2 * i] = dest[count - i - 1];
- operands[2 * i + 1] = src[count - i - 1];
- }
- }
+ for (i = 0; i < count; i++)
+ emit_move_insn (gen_rtx_REG (mode, rdest + count - i - 1),
+ gen_rtx_REG (mode, rsrc + count - i - 1));
}
/* Compute and return the length of aarch64_simd_mov<mode>, where <mode> is
--
1.8.3
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [AArch64] [BE] Fix vector load/stores to not use ld1/st1
2014-10-10 15:23 [AArch64] [BE] Fix vector load/stores to not use ld1/st1 Alan Hayward
@ 2014-11-04 11:16 ` Marcus Shawcroft
2014-11-06 9:20 ` Yangfei (Felix)
0 siblings, 1 reply; 6+ messages in thread
From: Marcus Shawcroft @ 2014-11-04 11:16 UTC (permalink / raw)
To: Alan Hayward; +Cc: gcc-patches
On 10 October 2014 16:19, Alan Hayward <alan.hayward@arm.com> wrote:
> This patch is dependant on "[AArch64] [BE] [1/2] Make large opaque integer
> modes endianness-safe.”
>
> It fixes up movoi/ci/xi for Big Endian, so that we end up with the lsb of
> a big-endian integer to be in the low byte of the highest-numbered
> register.
>
> movoi uses stp/ldp
> movxi needs a split version (which is shared with register->register
> movxi), which just splits to two new movs
> movci can then split to three movs. A future patch will instead split to
> an movoi and a movti.
>
>
> There are no changes for LE.
>
> Ran whole of check with both parts of "Make large opaque integer modes
> endianness-safe”. No regressions.
>
>
> ChangeLog:
>
> gcc/:
> 2014-10-10 Alan Hayward <alan.hayward@arm.com>
> * config/aarch64/aarch64.c
> (aarch64_classify_address): Allow extra addressing modes for BE.
> (aarch64_print_operand): new operand for printing a q register+1.
> (aarch64_simd_emit_reg_reg_move): replacement for
> aarch64_simd_disambiguate_copy that plants the required mov.
> * config/aarch64/aarch64-protos.h
> (aarch64_simd_emit_reg_reg_move): replacement for
> aarch64_simd_disambiguate_copy.
> * config/aarch64/aarch64-simd.md
> (define_split) Use new aarch64_simd_emit_reg_reg_move.
> (define_expand "mov<mode>") less restrictive predicates.
> (define_insn "*aarch64_mov<mode>") Simplify and only allow for LE.
> (define_insn "*aarch64_be_movoi") New. BE only. Plant ldp or stp.
> (define_insn "*aarch64_be_movci") New. BE only. No instructions.
> (define_insn "*aarch64_be_movxi") New. BE only. No instructions.
> (define_split) OI mov. Use new aarch64_simd_emit_reg_reg_move.
> (define_split) CI mov. Use new aarch64_simd_emit_reg_reg_move.
> On BE
> plant movs for reg to/from mem case.
> (define_split) XI mov. Use new aarch64_simd_emit_reg_reg_move.
> On BE
> plant movs for reg to/from mem case.
>
OK /Marcus
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [AArch64] [BE] Fix vector load/stores to not use ld1/st1
2014-11-04 11:16 ` Marcus Shawcroft
@ 2014-11-06 9:20 ` Yangfei (Felix)
2014-11-06 9:28 ` Alan Hayward
0 siblings, 1 reply; 6+ messages in thread
From: Yangfei (Felix) @ 2014-11-06 9:20 UTC (permalink / raw)
To: Marcus Shawcroft, Alan Hayward, gcc-patches
> On 10 October 2014 16:19, Alan Hayward <alan.hayward@arm.com> wrote:
> > This patch is dependant on "[AArch64] [BE] [1/2] Make large opaque
> > integer modes endianness-safe.”
> >
> > It fixes up movoi/ci/xi for Big Endian, so that we end up with the lsb
> > of a big-endian integer to be in the low byte of the highest-numbered
> > register.
> >
> > movoi uses stp/ldp
> > movxi needs a split version (which is shared with register->register
> > movxi), which just splits to two new movs movci can then split to
> > three movs. A future patch will instead split to an movoi and a movti.
> >
> >
> > There are no changes for LE.
> >
> > Ran whole of check with both parts of "Make large opaque integer modes
> > endianness-safe”. No regressions.
> >
> >
> > ChangeLog:
> >
> > gcc/:
> > 2014-10-10 Alan Hayward <alan.hayward@arm.com>
> > * config/aarch64/aarch64.c
> > (aarch64_classify_address): Allow extra addressing modes for BE.
> > (aarch64_print_operand): new operand for printing a q register+1.
> > (aarch64_simd_emit_reg_reg_move): replacement for
> > aarch64_simd_disambiguate_copy that plants the required mov.
> > * config/aarch64/aarch64-protos.h
> > (aarch64_simd_emit_reg_reg_move): replacement for
> > aarch64_simd_disambiguate_copy.
> > * config/aarch64/aarch64-simd.md
> > (define_split) Use new aarch64_simd_emit_reg_reg_move.
> > (define_expand "mov<mode>") less restrictive predicates.
> > (define_insn "*aarch64_mov<mode>") Simplify and only allow for LE.
> > (define_insn "*aarch64_be_movoi") New. BE only. Plant ldp or
> stp.
> > (define_insn "*aarch64_be_movci") New. BE only. No
> instructions.
> > (define_insn "*aarch64_be_movxi") New. BE only. No
> instructions.
> > (define_split) OI mov. Use new
> aarch64_simd_emit_reg_reg_move.
> > (define_split) CI mov. Use new
> aarch64_simd_emit_reg_reg_move.
> > On BE
> > plant movs for reg to/from mem case.
> > (define_split) XI mov. Use new
> aarch64_simd_emit_reg_reg_move.
> > On BE
> > plant movs for reg to/from mem case.
> >
>
> OK /Marcus
Hello,
It seems that this patch breaks the compile of some testcases under big-endian.
On example: testsuite/gcc.target/aarch64/advsimd-intrinsics/ vldX.c
Any thing I missed? Please confirm. Thanks.
$ aarch64_be-linux-gnu-gcc vldX.c
vldX.c: In function 'exec_vldX':
vldX.c:686:1: internal compiler error: in change_address_1, at emit-rtl.c:2096
}
^
0x73b18c change_address_1
/home/jjj/p660/p660_build_dir_be/src/trunk/gcc/emit-rtl.c:2096
0xdbd278 gen_split_2991(rtx_insn*, rtx_def**)
/home/jjj/p660/p660_build_dir_be/src/trunk/gcc/config/aarch64/aarch64-simd.md:4436
0x743aa1 try_split(rtx_def*, rtx_def*, int)
/home/jjj/p660/p660_build_dir_be/src/trunk/gcc/emit-rtl.c:3639
0x9b8c93 split_insn
/home/jjj/p660/p660_build_dir_be/src/trunk/gcc/recog.c:2901
0x9b8ee7 split_all_insns_noflow()
/home/jjj/p660/p660_build_dir_be/src/trunk/gcc/recog.c:3042
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [AArch64] [BE] Fix vector load/stores to not use ld1/st1
2014-11-06 9:20 ` Yangfei (Felix)
@ 2014-11-06 9:28 ` Alan Hayward
2014-11-06 9:38 ` Yangfei (Felix)
2014-11-06 10:15 ` Christophe Lyon
0 siblings, 2 replies; 6+ messages in thread
From: Alan Hayward @ 2014-11-06 9:28 UTC (permalink / raw)
To: Yangfei (Felix), Marcus Shawcroft, gcc-patches
On 06/11/2014 09:09, "Yangfei (Felix)" <felix.yang@huawei.com> wrote:
>> On 10 October 2014 16:19, Alan Hayward <alan.hayward@arm.com> wrote:
>> > This patch is dependant on "[AArch64] [BE] [1/2] Make large opaque
>> > integer modes endianness-safe.”
>> >
>> > It fixes up movoi/ci/xi for Big Endian, so that we end up with the lsb
>> > of a big-endian integer to be in the low byte of the highest-numbered
>> > register.
>> >
>> > movoi uses stp/ldp
>> > movxi needs a split version (which is shared with register->register
>> > movxi), which just splits to two new movs movci can then split to
>> > three movs. A future patch will instead split to an movoi and a movti.
>> >
>> >
>> > There are no changes for LE.
>> >
>> > Ran whole of check with both parts of "Make large opaque integer modes
>> > endianness-safe”. No regressions.
>> >
>> >
>> > ChangeLog:
>> >
>> > gcc/:
>> > 2014-10-10 Alan Hayward <alan.hayward@arm.com>
>> > * config/aarch64/aarch64.c
>> > (aarch64_classify_address): Allow extra addressing modes for
>>BE.
>> > (aarch64_print_operand): new operand for printing a q
>>register+1.
>> > (aarch64_simd_emit_reg_reg_move): replacement for
>> > aarch64_simd_disambiguate_copy that plants the required mov.
>> > * config/aarch64/aarch64-protos.h
>> > (aarch64_simd_emit_reg_reg_move): replacement for
>> > aarch64_simd_disambiguate_copy.
>> > * config/aarch64/aarch64-simd.md
>> > (define_split) Use new aarch64_simd_emit_reg_reg_move.
>> > (define_expand "mov<mode>") less restrictive predicates.
>> > (define_insn "*aarch64_mov<mode>") Simplify and only allow
>>for LE.
>> > (define_insn "*aarch64_be_movoi") New. BE only. Plant ldp or
>> stp.
>> > (define_insn "*aarch64_be_movci") New. BE only. No
>> instructions.
>> > (define_insn "*aarch64_be_movxi") New. BE only. No
>> instructions.
>> > (define_split) OI mov. Use new
>> aarch64_simd_emit_reg_reg_move.
>> > (define_split) CI mov. Use new
>> aarch64_simd_emit_reg_reg_move.
>> > On BE
>> > plant movs for reg to/from mem case.
>> > (define_split) XI mov. Use new
>> aarch64_simd_emit_reg_reg_move.
>> > On BE
>> > plant movs for reg to/from mem case.
>> >
>>
>> OK /Marcus
>
>
>Hello,
>
> It seems that this patch breaks the compile of some testcases under
>big-endian.
> On example: testsuite/gcc.target/aarch64/advsimd-intrinsics/ vldX.c
> Any thing I missed? Please confirm. Thanks.
>
>$ aarch64_be-linux-gnu-gcc vldX.c
>
>vldX.c: In function 'exec_vldX':
>vldX.c:686:1: internal compiler error: in change_address_1, at
>emit-rtl.c:2096
> }
> ^
>0x73b18c change_address_1
> /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/emit-rtl.c:2096
>0xdbd278 gen_split_2991(rtx_insn*, rtx_def**)
>
>/home/jjj/p660/p660_build_dir_be/src/trunk/gcc/config/aarch64/aarch64-simd
>.md:4436
>0x743aa1 try_split(rtx_def*, rtx_def*, int)
> /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/emit-rtl.c:3639
>0x9b8c93 split_insn
> /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/recog.c:2901
>0x9b8ee7 split_all_insns_noflow()
> /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/recog.c:3042
>Please submit a full bug report,
>with preprocessed source if appropriate.
>Please include the complete backtrace with any bug report.
>See <http://gcc.gnu.org/bugs.html> for instructions.
>
Did you try my patch with or without it’s dependency ?
"[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.”
(David Sherwood)
Without that above patch then I would expect some tests to fail.
In addition, it looks like David has had some problems merging that patch
to the latest head - I’d suggest not using this patch until David has
sorted his.
Alan.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [AArch64] [BE] Fix vector load/stores to not use ld1/st1
2014-11-06 9:28 ` Alan Hayward
@ 2014-11-06 9:38 ` Yangfei (Felix)
2014-11-06 10:15 ` Christophe Lyon
1 sibling, 0 replies; 6+ messages in thread
From: Yangfei (Felix) @ 2014-11-06 9:38 UTC (permalink / raw)
To: Alan Hayward, Marcus Shawcroft, gcc-patches
> >
> >
> >Hello,
> >
> > It seems that this patch breaks the compile of some testcases under
> >big-endian.
> > On example: testsuite/gcc.target/aarch64/advsimd-intrinsics/ vldX.c
> > Any thing I missed? Please confirm. Thanks.
> >
> >$ aarch64_be-linux-gnu-gcc vldX.c
> >
> >vldX.c: In function 'exec_vldX':
> >vldX.c:686:1: internal compiler error: in change_address_1, at
> >emit-rtl.c:2096
> > }
> > ^
> >0x73b18c change_address_1
> > /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/emit-rtl.c:2096
> >0xdbd278 gen_split_2991(rtx_insn*, rtx_def**)
> >
> >/home/jjj/p660/p660_build_dir_be/src/trunk/gcc/config/aarch64/aarch64-s
> >imd
> >.md:4436
> >0x743aa1 try_split(rtx_def*, rtx_def*, int)
> > /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/emit-rtl.c:3639
> >0x9b8c93 split_insn
> > /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/recog.c:2901
> >0x9b8ee7 split_all_insns_noflow()
> > /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/recog.c:3042
> >Please submit a full bug report,
> >with preprocessed source if appropriate.
> >Please include the complete backtrace with any bug report.
> >See <http://gcc.gnu.org/bugs.html> for instructions.
> >
>
>
> Did you try my patch with or without it’s dependency ?
> "[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.”
> (David Sherwood)
>
> Without that above patch then I would expect some tests to fail.
>
>
> In addition, it looks like David has had some problems merging that patch to the
> latest head - I’d suggest not using this patch until David has sorted his.
>
> Alan.
Yeah, I also merged with it's dependency.
I tried the patch with this snapshot: gcc version 5.0.0 20141029 (experimental).
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [AArch64] [BE] Fix vector load/stores to not use ld1/st1
2014-11-06 9:28 ` Alan Hayward
2014-11-06 9:38 ` Yangfei (Felix)
@ 2014-11-06 10:15 ` Christophe Lyon
1 sibling, 0 replies; 6+ messages in thread
From: Christophe Lyon @ 2014-11-06 10:15 UTC (permalink / raw)
To: Alan Hayward; +Cc: Yangfei (Felix), Marcus Shawcroft, gcc-patches
On 6 November 2014 10:28, Alan Hayward <alan.hayward@arm.com> wrote:
>
>
> On 06/11/2014 09:09, "Yangfei (Felix)" <felix.yang@huawei.com> wrote:
>
>>> On 10 October 2014 16:19, Alan Hayward <alan.hayward@arm.com> wrote:
>>> > This patch is dependant on "[AArch64] [BE] [1/2] Make large opaque
>>> > integer modes endianness-safe.”
>>> >
>>> > It fixes up movoi/ci/xi for Big Endian, so that we end up with the lsb
>>> > of a big-endian integer to be in the low byte of the highest-numbered
>>> > register.
>>> >
>>> > movoi uses stp/ldp
>>> > movxi needs a split version (which is shared with register->register
>>> > movxi), which just splits to two new movs movci can then split to
>>> > three movs. A future patch will instead split to an movoi and a movti.
>>> >
>>> >
>>> > There are no changes for LE.
>>> >
>>> > Ran whole of check with both parts of "Make large opaque integer modes
>>> > endianness-safe”. No regressions.
>>> >
>>> >
>>> > ChangeLog:
>>> >
>>> > gcc/:
>>> > 2014-10-10 Alan Hayward <alan.hayward@arm.com>
>>> > * config/aarch64/aarch64.c
>>> > (aarch64_classify_address): Allow extra addressing modes for
>>>BE.
>>> > (aarch64_print_operand): new operand for printing a q
>>>register+1.
>>> > (aarch64_simd_emit_reg_reg_move): replacement for
>>> > aarch64_simd_disambiguate_copy that plants the required mov.
>>> > * config/aarch64/aarch64-protos.h
>>> > (aarch64_simd_emit_reg_reg_move): replacement for
>>> > aarch64_simd_disambiguate_copy.
>>> > * config/aarch64/aarch64-simd.md
>>> > (define_split) Use new aarch64_simd_emit_reg_reg_move.
>>> > (define_expand "mov<mode>") less restrictive predicates.
>>> > (define_insn "*aarch64_mov<mode>") Simplify and only allow
>>>for LE.
>>> > (define_insn "*aarch64_be_movoi") New. BE only. Plant ldp or
>>> stp.
>>> > (define_insn "*aarch64_be_movci") New. BE only. No
>>> instructions.
>>> > (define_insn "*aarch64_be_movxi") New. BE only. No
>>> instructions.
>>> > (define_split) OI mov. Use new
>>> aarch64_simd_emit_reg_reg_move.
>>> > (define_split) CI mov. Use new
>>> aarch64_simd_emit_reg_reg_move.
>>> > On BE
>>> > plant movs for reg to/from mem case.
>>> > (define_split) XI mov. Use new
>>> aarch64_simd_emit_reg_reg_move.
>>> > On BE
>>> > plant movs for reg to/from mem case.
>>> >
>>>
>>> OK /Marcus
>>
>>
>>Hello,
>>
>> It seems that this patch breaks the compile of some testcases under
>>big-endian.
>> On example: testsuite/gcc.target/aarch64/advsimd-intrinsics/ vldX.c
>> Any thing I missed? Please confirm. Thanks.
>>
>>$ aarch64_be-linux-gnu-gcc vldX.c
>>
>>vldX.c: In function 'exec_vldX':
>>vldX.c:686:1: internal compiler error: in change_address_1, at
>>emit-rtl.c:2096
>> }
>> ^
>>0x73b18c change_address_1
>> /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/emit-rtl.c:2096
>>0xdbd278 gen_split_2991(rtx_insn*, rtx_def**)
>>
>>/home/jjj/p660/p660_build_dir_be/src/trunk/gcc/config/aarch64/aarch64-simd
>>.md:4436
>>0x743aa1 try_split(rtx_def*, rtx_def*, int)
>> /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/emit-rtl.c:3639
>>0x9b8c93 split_insn
>> /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/recog.c:2901
>>0x9b8ee7 split_all_insns_noflow()
>> /home/jjj/p660/p660_build_dir_be/src/trunk/gcc/recog.c:3042
>>Please submit a full bug report,
>>with preprocessed source if appropriate.
>>Please include the complete backtrace with any bug report.
>>See <http://gcc.gnu.org/bugs.html> for instructions.
>>
>
>
> Did you try my patch with or without it’s dependency ?
> "[AArch64] [BE] [1/2] Make large opaque integer modes endianness-safe.”
> (David Sherwood)
>
> Without that above patch then I would expect some tests to fail.
>
Yes, we already have vldX and vuzp/vzip tests failing in aarch64_be.
They are PR63652 and 636523.
> In addition, it looks like David has had some problems merging that patch
> to the latest head - I’d suggest not using this patch until David has
> sorted his.
I've been looking at these PRs, but I should probably give a try to
David's patch once it's updated.
>
> Alan.
>
>
>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2014-11-06 10:15 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-10 15:23 [AArch64] [BE] Fix vector load/stores to not use ld1/st1 Alan Hayward
2014-11-04 11:16 ` Marcus Shawcroft
2014-11-06 9:20 ` Yangfei (Felix)
2014-11-06 9:28 ` Alan Hayward
2014-11-06 9:38 ` Yangfei (Felix)
2014-11-06 10:15 ` Christophe Lyon
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).