public inbox for gcc-cvs@sourceware.org
* [gcc(refs/vendors/ARM/heads/morello)] aarch64: Rework valid offset code
@ 2022-05-05 12:06 Matthew Malcomson
From: Matthew Malcomson @ 2022-05-05 12:06 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:c0c8dd0be875910509bc09be50c8141f47c3eebc
commit c0c8dd0be875910509bc09be50c8141f47c3eebc
Author: Richard Sandiford <richard.sandiford@arm.com>
Date: Wed Mar 30 04:45:09 2022 +0100
aarch64: Rework valid offset code
aarch64_classify_address has 5 styles of check for a valid offset:
(1) the range of a single LDR/STR (including LDUR/STUR through aliases)
(2) the range of a single LDP/STP
(3) a combination of (1) and (2), applied to different modes; both
    checks must pass individually
(4) the range of an LDP/STP followed by an LDR/STR (for big-endian CI)
(5) the range of two LDPs/STPs (for big-endian XI)
(1) and (2) are handled generally, based on whether load_store_pair_p
is false or true respectively. The others are handled as mode-specific
special cases.
This patch tries to generalise things so that all 5 cases are handled as
instances of a general template, described by the comments in the patch.
(4) and (5) involve a sequence of load and store instructions.
The question in that case is what to do about pre/post-modify
addresses. In principle, the first instruction in the sequence
should carry a pre-modify, with the others using offsets from
the modified address. Conversely, the last instruction in the
sequence should carry a post-modify, with the others using offsets
from the unmodified address. It would also be possible to use a
different scheme by emitting the individual loads or stores in a
different order (dependencies permitting).
However, in practice, the existing move patterns for (4) and (5)
require an offsetable address, which means that they reject all
forms of pre/post-modify:
(define_insn "*aarch64_be_movci"
[(set (match_operand:CI 0 "nonimmediate_operand" "=w,o,w")
(match_operand:CI 1 "general_operand" " w,w,o"))]
...
(define_insn "*aarch64_be_movxi"
[(set (match_operand:XI 0 "nonimmediate_operand" "=w,o,w")
(match_operand:XI 1 "general_operand" " w,w,o"))]
If all we cared about for CI and XI was the move patterns, we could
simply reject pre/post-modify for those modes. However, we do want
to allow POST_INC in "m" so that ivopts will use it for LD[34] and
ST[34] (which are the main uses of CI and XI).
The patch therefore tightens the check to reject other forms
of pre/post-modify for CI and XI, which means we can assert that
the PRE_MODIFY and POST_MODIFY code only has to handle (1)-(3).
Diff:
---
gcc/config/aarch64/aarch64.c | 197 +++++++++++++++++++++++++------------------
1 file changed, 114 insertions(+), 83 deletions(-)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e06a35bb702..72ba5f4507f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9747,23 +9747,71 @@ aarch64_classify_address (struct aarch64_address_info *info,
bool alt_base_p = (TARGET_CAPABILITY_HYBRID
&& CAPABILITY_MODE_P (GET_MODE (x)));
- /* On BE, we use load/store pair for all large int mode load/stores.
- TI/TFmode may also use a load/store pair. */
bool advsimd_struct_p = (vec_flags == (VEC_ADVSIMD | VEC_STRUCT));
- bool load_store_pair_p = (type == ADDR_QUERY_LDP_STP
- || type == ADDR_QUERY_LDP_STP_N
- || mode == TImode
- || mode == TFmode
- || (BYTES_BIG_ENDIAN && advsimd_struct_p));
-
- /* If we are dealing with ADDR_QUERY_LDP_STP_N that means the incoming mode
- corresponds to the actual size of the memory being loaded/stored and the
- mode of the corresponding addressing mode is half of that. */
- if (type == ADDR_QUERY_LDP_STP_N
- && known_eq (GET_MODE_SIZE (mode), 16))
- mode = DFmode;
-
- bool allow_reg_index_p = (!load_store_pair_p
+
+ /* Classify the access as up to two of the following:
+
+ - a sequence of LDPs or STPs
+ - a single LDR or STR
+
+ The LDR/STR can overlap the LDPs/STPs or come after them.
+
+ If LDP_STP_MODE is not VOIDmode, require a sequence of NUM_LDP_STP
+ pairs, with each loaded or stored register having mode LDP_STP_MODE.
+
+ If LDR_STR_MODE is not VOIDmode, require a valid LDR/STR of that
+ mode at offset LDR_STR_OFFSET from the start of MODE. */
+ machine_mode ldp_stp_mode = VOIDmode;
+ machine_mode ldr_str_mode = VOIDmode;
+ unsigned int num_ldp_stp = 1;
+ poly_int64 ldr_str_offset = 0;
+ if (type == ADDR_QUERY_LDP_STP)
+ {
+ if (known_eq (GET_MODE_SIZE (mode), 4)
+ || known_eq (GET_MODE_SIZE (mode), 8)
+ || known_eq (GET_MODE_SIZE (mode), 16))
+ ldp_stp_mode = mode;
+ else
+ return false;
+ }
+ /* If we are dealing with ADDR_QUERY_LDP_STP_N that means the
+ incoming mode corresponds to the actual size of the memory
+ being loaded/stored and the mode of the corresponding
+ addressing mode is half of that. */
+ else if (type == ADDR_QUERY_LDP_STP_N)
+ {
+ if (known_eq (GET_MODE_SIZE (mode), 16))
+ ldp_stp_mode = DFmode;
+ else
+ return false;
+ }
+ /* TImode and TFmode values are allowed in both pairs of X
+ registers and individual Q registers. The available
+ address modes are:
+ X,X: 7-bit signed scaled offset
+ Q: 9-bit signed offset
+ We conservatively require an offset representable in either mode. */
+ else if (mode == TImode || mode == TFmode)
+ {
+ ldp_stp_mode = DImode;
+ ldr_str_mode = mode;
+ }
+ /* On BE, we use load/store pair for multi-vector load/stores. */
+ else if (BYTES_BIG_ENDIAN && advsimd_struct_p)
+ {
+ ldp_stp_mode = V16QImode;
+ if (known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (V16QImode) * 3))
+ {
+ ldr_str_mode = V16QImode;
+ ldr_str_offset = GET_MODE_SIZE (V16QImode) * 2;
+ }
+ else if (known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (V16QImode) * 4))
+ num_ldp_stp = 2;
+ }
+ else
+ ldr_str_mode = mode;
+
+ bool allow_reg_index_p = (ldp_stp_mode == VOIDmode
&& (known_lt (GET_MODE_SIZE (mode), 16)
|| mode == CADImode
|| vec_flags == VEC_ADVSIMD
@@ -9778,12 +9826,23 @@ aarch64_classify_address (struct aarch64_address_info *info,
&& (code != REG && code != PLUS))
return false;
- /* On LE, for AdvSIMD, don't support anything other than POST_INC or
- REG addressing. */
- if (advsimd_struct_p
- && !BYTES_BIG_ENDIAN
- && (code != POST_INC && code != REG))
- return false;
+ if (advsimd_struct_p)
+ {
+ if (GET_RTX_CLASS (code) == RTX_AUTOINC)
+ {
+ /* LD[234] and ST[234] only support post-increment addressing.
+ The big-endian movci and movxi patterns do not support *any*
+ pre/post-modify addressing, but we want to allow them in
+ "m" so that ivopts can use them to optimize gimple
+ LD[34]/ST[34] operations. */
+ if (code != POST_INC)
+ return false;
+ }
+ /* On LE, for AdvSIMD, don't support anything other than POST_INC or
+ REG addressing. */
+ else if (!BYTES_BIG_ENDIAN && code != REG)
+ return false;
+ }
/* For Morello: Exit early if the address is not in Pmode. This blocks all
CONST_INTs and other non-capability SCALAR_ADDR_MODE_P types. */
@@ -9832,50 +9891,24 @@ aarch64_classify_address (struct aarch64_address_info *info,
info->offset = op1;
info->const_offset = offset;
- /* TImode and TFmode values are allowed in both pairs of X
- registers and individual Q registers. The available
- address modes are:
- X,X: 7-bit signed scaled offset
- Q: 9-bit signed offset
- We conservatively require an offset representable in either mode.
- When performing the check for pairs of X registers i.e. LDP/STP
- pass down DImode since that is the natural size of the LDP/STP
- instruction memory accesses. */
- if ((mode == TImode || mode == TFmode)
- && !aarch64_offset_7bit_signed_scaled_p (DImode, offset))
+ if (ldp_stp_mode != VOIDmode)
+ /* Test that each LDP/STP pair fits a signed 7-bit offset
+ range, scaled by the size of the individual registers. */
+ for (unsigned int i = 0; i < num_ldp_stp; ++i)
+ {
+ auto suboffset = offset + i * GET_MODE_SIZE (ldp_stp_mode) * 2;
+ if (!aarch64_offset_7bit_signed_scaled_p (ldp_stp_mode,
+ suboffset))
+ return false;
+ }
+
+ if (ldr_str_mode != VOIDmode
+ && !aarch64_valid_ldr_str_offset_p (ldr_str_mode, alt_base_p,
+ offset + ldr_str_offset,
+ type))
return false;
- /* A 7bit offset check because OImode will emit a ldp/stp
- instruction (only big endian will get here).
- For ldp/stp instructions, the offset is scaled for the size of a
- single element of the pair. */
- if (mode == OImode)
- return aarch64_offset_7bit_signed_scaled_p (TImode, offset);
-
- /* Three 9/12 bit offsets checks because CImode will emit three
- ldr/str instructions (only big endian will get here). */
- if (mode == CImode)
- return (aarch64_offset_7bit_signed_scaled_p (TImode, offset)
- && (aarch64_offset_9bit_signed_unscaled_p (V16QImode,
- offset + 32)
- || offset_12bit_unsigned_scaled_p (V16QImode,
- offset + 32)));
-
- /* Two 7bit offsets checks because XImode will emit two ldp/stp
- instructions (only big endian will get here). */
- if (mode == XImode)
- return (aarch64_offset_7bit_signed_scaled_p (TImode, offset)
- && aarch64_offset_7bit_signed_scaled_p (TImode,
- offset + 32));
-
- if (load_store_pair_p)
- return ((known_eq (GET_MODE_SIZE (mode), 4)
- || known_eq (GET_MODE_SIZE (mode), 8)
- || known_eq (GET_MODE_SIZE (mode), 16))
- && aarch64_offset_7bit_signed_scaled_p (mode, offset));
-
- return aarch64_valid_ldr_str_offset_p (mode, alt_base_p, offset,
- type);
+ return true;
}
if (allow_reg_index_p)
@@ -9918,24 +9951,22 @@ aarch64_classify_address (struct aarch64_address_info *info,
info->offset = XEXP (XEXP (x, 1), 1);
info->const_offset = offset;
- /* TImode and TFmode values are allowed in both pairs of X
- registers and individual Q registers. The available
- address modes are:
- X,X: 7-bit signed scaled offset
- Q: 9-bit signed offset
- We conservatively require an offset representable in either mode.
- */
- if (mode == TImode || mode == TFmode)
- return (aarch64_offset_7bit_signed_scaled_p (mode, offset)
- && aarch64_offset_9bit_signed_unscaled_p (mode, offset));
-
- if (load_store_pair_p)
- return ((known_eq (GET_MODE_SIZE (mode), 4)
- || known_eq (GET_MODE_SIZE (mode), 8)
- || known_eq (GET_MODE_SIZE (mode), 16))
- && aarch64_offset_7bit_signed_scaled_p (mode, offset));
- else
- return aarch64_offset_9bit_signed_unscaled_p (mode, offset);
+ if (ldp_stp_mode != VOIDmode)
+ {
+ gcc_assert (num_ldp_stp == 1);
+ if (!aarch64_offset_7bit_signed_scaled_p (ldp_stp_mode, offset))
+ return false;
+ }
+
+ if (ldr_str_mode != VOIDmode)
+ {
+ gcc_assert (known_eq (ldr_str_offset, 0));
+ if (!aarch64_offset_9bit_signed_unscaled_p (ldr_str_mode,
+ offset))
+ return false;
+ }
+
+ return true;
}
return false;
@@ -9946,7 +9977,7 @@ aarch64_classify_address (struct aarch64_address_info *info,
for SI mode or larger. */
info->type = ADDRESS_SYMBOLIC;
- if (!load_store_pair_p
+ if (ldp_stp_mode == VOIDmode
&& GET_MODE_SIZE (mode).is_constant (&const_size)
&& const_size >= 4)
{