public inbox for gcc-cvs@sourceware.org
From: Matthew Malcomson <matmal01@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/vendors/ARM/heads/morello)] aarch64: Rework valid offset code
Date: Thu, 5 May 2022 12:06:11 +0000 (GMT)
Message-ID: <20220505120611.05CAF3856243@sourceware.org>

https://gcc.gnu.org/g:c0c8dd0be875910509bc09be50c8141f47c3eebc

commit c0c8dd0be875910509bc09be50c8141f47c3eebc
Author: Richard Sandiford <richard.sandiford@arm.com>
Date:   Wed Mar 30 04:45:09 2022 +0100

    aarch64: Rework valid offset code

aarch64_classify_address has 5 styles of check for a valid offset:

(1) the range of a single LDR/STR (including LDUR/STUR through aliases)
(2) the range of a single LDP/STP
(3) (1) && (2), but for different modes; both must pass individually
(4) the range of an LDP/STP followed by an LDR/STR (for big-endian CI)
(5) the range of two LDPs/STPs (for big-endian XI)

(1) and (2) are handled generally, based on whether load_store_pair_p
is false or true respectively.  The others are handled as
mode-specific special cases.

This patch tries to generalise things so that all 5 cases are handled
as instances of a general template, described by the comments in the
patch.

(4) and (5) involve a sequence of load and store instructions.  The
question in that case is what to do about pre/post-modify addresses.
In principle, the first instruction in the sequence should carry a
pre-modify, with the others using offsets from the modified address.
Conversely, the last instruction in the sequence should carry a
post-modify, with the others using offsets from the unmodified
address.  It would also be possible to use a different scheme by
emitting the individual loads or stores in a different order
(dependencies permitting).
However, in practice, the existing move patterns for (4) and (5)
require an offsetable address, which means that they reject all
forms of pre/post-modify:

(define_insn "*aarch64_be_movci"
  [(set (match_operand:CI 0 "nonimmediate_operand" "=w,o,w")
        (match_operand:CI 1 "general_operand"      " w,w,o"))]
  ...

(define_insn "*aarch64_be_movxi"
  [(set (match_operand:XI 0 "nonimmediate_operand" "=w,o,w")
        (match_operand:XI 1 "general_operand"      " w,w,o"))]

If all we cared about for CI and XI was the move patterns, we could
simply reject pre/post-modify for those modes.  However, we do want
to allow POST_INC in "m" so that ivopts will use it for LD[34] and
ST[34] (which are the main uses of CI and XI).  The patch therefore
tightens the check to reject other forms of pre/post-modify for CI
and XI, which means we can assert that the PRE_MODIFY and POST_MODIFY
code only has to handle (1)-(3).

Diff:
---
 gcc/config/aarch64/aarch64.c | 197 +++++++++++++++++++++++++------------------
 1 file changed, 114 insertions(+), 83 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e06a35bb702..72ba5f4507f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9747,23 +9747,71 @@ aarch64_classify_address (struct aarch64_address_info *info,
   bool alt_base_p = (TARGET_CAPABILITY_HYBRID
                      && CAPABILITY_MODE_P (GET_MODE (x)));
 
-  /* On BE, we use load/store pair for all large int mode load/stores.
-     TI/TFmode may also use a load/store pair.  */
   bool advsimd_struct_p = (vec_flags == (VEC_ADVSIMD | VEC_STRUCT));
-  bool load_store_pair_p = (type == ADDR_QUERY_LDP_STP
-			    || type == ADDR_QUERY_LDP_STP_N
-			    || mode == TImode
-			    || mode == TFmode
-			    || (BYTES_BIG_ENDIAN && advsimd_struct_p));
-
-  /* If we are dealing with ADDR_QUERY_LDP_STP_N that means the incoming mode
-     corresponds to the actual size of the memory being loaded/stored and the
-     mode of the corresponding addressing mode is half of that.  */
-  if (type == ADDR_QUERY_LDP_STP_N
-      && known_eq (GET_MODE_SIZE (mode), 16))
-    mode = DFmode;
-
-  bool allow_reg_index_p = (!load_store_pair_p
+
+  /* Classify the access as up to two of the following:
+
+     - a sequence of LDPs or STPs
+     - a single LDR or STR
+
+     The LDR/STR can overlap the LDPs/STPs or come after them.
+
+     If LDP_STP_MODE is not VOIDmode, require a sequence of NUM_LDP_STP
+     pairs, with each loaded or stored register having mode LDP_STP_MODE.
+
+     If LDR_STR_MODE is not VOIDmode, require a valid LDR/STR of that
+     mode at offset LDR_STR_OFFSET from the start of MODE.  */
+  machine_mode ldp_stp_mode = VOIDmode;
+  machine_mode ldr_str_mode = VOIDmode;
+  unsigned int num_ldp_stp = 1;
+  poly_int64 ldr_str_offset = 0;
+  if (type == ADDR_QUERY_LDP_STP)
+    {
+      if (known_eq (GET_MODE_SIZE (mode), 4)
+	  || known_eq (GET_MODE_SIZE (mode), 8)
+	  || known_eq (GET_MODE_SIZE (mode), 16))
+	ldp_stp_mode = mode;
+      else
+	return false;
+    }
+  /* If we are dealing with ADDR_QUERY_LDP_STP_N that means the
+     incoming mode corresponds to the actual size of the memory
+     being loaded/stored and the mode of the corresponding
+     addressing mode is half of that.  */
+  else if (type == ADDR_QUERY_LDP_STP_N)
+    {
+      if (known_eq (GET_MODE_SIZE (mode), 16))
+	ldp_stp_mode = DFmode;
+      else
+	return false;
+    }
+  /* TImode and TFmode values are allowed in both pairs of X
+     registers and individual Q registers.  The available
+     address modes are:
+     X,X: 7-bit signed scaled offset
+     Q:   9-bit signed offset
+     We conservatively require an offset representable in either mode.  */
+  else if (mode == TImode || mode == TFmode)
+    {
+      ldp_stp_mode = DImode;
+      ldr_str_mode = mode;
+    }
+  /* On BE, we use load/store pair for multi-vector load/stores.  */
+  else if (BYTES_BIG_ENDIAN && advsimd_struct_p)
+    {
+      ldp_stp_mode = V16QImode;
+      if (known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (V16QImode) * 3))
+	{
+	  ldr_str_mode = V16QImode;
+	  ldr_str_offset = GET_MODE_SIZE (V16QImode) * 2;
+	}
+      else if (known_eq (GET_MODE_SIZE (mode), GET_MODE_SIZE (V16QImode) * 4))
+	num_ldp_stp = 2;
+    }
+  else
+    ldr_str_mode = mode;
+
+  bool allow_reg_index_p = (ldp_stp_mode == VOIDmode
                             && (known_lt (GET_MODE_SIZE (mode), 16)
                                 || mode == CADImode
                                 || vec_flags == VEC_ADVSIMD
@@ -9778,12 +9826,23 @@ aarch64_classify_address (struct aarch64_address_info *info,
       && (code != REG && code != PLUS))
     return false;
 
-  /* On LE, for AdvSIMD, don't support anything other than POST_INC or
-     REG addressing.  */
-  if (advsimd_struct_p
-      && !BYTES_BIG_ENDIAN
-      && (code != POST_INC && code != REG))
-    return false;
+  if (advsimd_struct_p)
+    {
+      if (GET_RTX_CLASS (code) == RTX_AUTOINC)
+	{
+	  /* LD[234] and ST[234] only support post-increment addressing.
+	     The big-endian movci and movxi patterns do not support *any*
+	     pre/post-modify addressing, but we want to allow them in
+	     "m" so that ivopts can use them to optimize gimple
+	     LD[34]/ST[34] operations.  */
+	  if (code != POST_INC)
+	    return false;
+	}
+      /* On LE, for AdvSIMD, don't support anything other than POST_INC or
+	 REG addressing.  */
+      else if (!BYTES_BIG_ENDIAN && code != REG)
+	return false;
+    }
 
   /* For Morello: Exit early if the address is not in Pmode.  This blocks
      all CONST_INTs and other non-capability SCALAR_ADDR_MODE_P types.  */
@@ -9832,50 +9891,24 @@ aarch64_classify_address (struct aarch64_address_info *info,
 	  info->offset = op1;
 	  info->const_offset = offset;
 
-	  /* TImode and TFmode values are allowed in both pairs of X
-	     registers and individual Q registers.  The available
-	     address modes are:
-	     X,X: 7-bit signed scaled offset
-	     Q:   9-bit signed offset
-	     We conservatively require an offset representable in either mode.
-	     When performing the check for pairs of X registers i.e.  LDP/STP
-	     pass down DImode since that is the natural size of the LDP/STP
-	     instruction memory accesses.  */
-	  if ((mode == TImode || mode == TFmode)
-	      && !aarch64_offset_7bit_signed_scaled_p (DImode, offset))
+	  if (ldp_stp_mode != VOIDmode)
+	    /* Test that each LDP/STP pair fits a signed 7-bit offset
+	       range, scaled by the size of the individual registers.  */
+	    for (unsigned int i = 0; i < num_ldp_stp; ++i)
+	      {
+		auto suboffset = offset + i * GET_MODE_SIZE (ldp_stp_mode) * 2;
+		if (!aarch64_offset_7bit_signed_scaled_p (ldp_stp_mode,
+							  suboffset))
+		  return false;
+	      }
+
+	  if (ldr_str_mode != VOIDmode
+	      && !aarch64_valid_ldr_str_offset_p (ldr_str_mode, alt_base_p,
+						  offset + ldr_str_offset,
+						  type))
 	    return false;
 
-	  /* A 7bit offset check because OImode will emit a ldp/stp
-	     instruction (only big endian will get here).
-	     For ldp/stp instructions, the offset is scaled for the size of a
-	     single element of the pair.  */
-	  if (mode == OImode)
-	    return aarch64_offset_7bit_signed_scaled_p (TImode, offset);
-
-	  /* Three 9/12 bit offsets checks because CImode will emit three
-	     ldr/str instructions (only big endian will get here).  */
-	  if (mode == CImode)
-	    return (aarch64_offset_7bit_signed_scaled_p (TImode, offset)
-		    && (aarch64_offset_9bit_signed_unscaled_p (V16QImode,
-							       offset + 32)
-			|| offset_12bit_unsigned_scaled_p (V16QImode,
-							   offset + 32)));
-
-	  /* Two 7bit offsets checks because XImode will emit two ldp/stp
-	     instructions (only big endian will get here).  */
-	  if (mode == XImode)
-	    return (aarch64_offset_7bit_signed_scaled_p (TImode, offset)
-		    && aarch64_offset_7bit_signed_scaled_p (TImode,
-							    offset + 32));
-
-	  if (load_store_pair_p)
-	    return ((known_eq (GET_MODE_SIZE (mode), 4)
-		     || known_eq (GET_MODE_SIZE (mode), 8)
-		     || known_eq (GET_MODE_SIZE (mode), 16))
-		    && aarch64_offset_7bit_signed_scaled_p (mode, offset));
-
-	  return aarch64_valid_ldr_str_offset_p (mode, alt_base_p, offset,
-						 type);
+	  return true;
 	}
 
       if (allow_reg_index_p)
@@ -9918,24 +9951,22 @@ aarch64_classify_address (struct aarch64_address_info *info,
 	  info->offset = XEXP (XEXP (x, 1), 1);
 	  info->const_offset = offset;
 
-	  /* TImode and TFmode values are allowed in both pairs of X
-	     registers and individual Q registers.  The available
-	     address modes are:
-	     X,X: 7-bit signed scaled offset
-	     Q:   9-bit signed offset
-	     We conservatively require an offset representable in either mode.
-	   */
-	  if (mode == TImode || mode == TFmode)
-	    return (aarch64_offset_7bit_signed_scaled_p (mode, offset)
-		    && aarch64_offset_9bit_signed_unscaled_p (mode, offset));
-
-	  if (load_store_pair_p)
-	    return ((known_eq (GET_MODE_SIZE (mode), 4)
-		     || known_eq (GET_MODE_SIZE (mode), 8)
-		     || known_eq (GET_MODE_SIZE (mode), 16))
-		    && aarch64_offset_7bit_signed_scaled_p (mode, offset));
-	  else
-	    return aarch64_offset_9bit_signed_unscaled_p (mode, offset);
+	  if (ldp_stp_mode != VOIDmode)
+	    {
+	      gcc_assert (num_ldp_stp == 1);
+	      if (!aarch64_offset_7bit_signed_scaled_p (ldp_stp_mode, offset))
+		return false;
+	    }
+
+	  if (ldr_str_mode != VOIDmode)
+	    {
+	      gcc_assert (known_eq (ldr_str_offset, 0));
+	      if (!aarch64_offset_9bit_signed_unscaled_p (ldr_str_mode,
+							  offset))
+		return false;
+	    }
+
+	  return true;
 	}
 
       return false;
@@ -9946,7 +9977,7 @@ aarch64_classify_address (struct aarch64_address_info *info,
 	 for SI mode or larger.  */
       info->type = ADDRESS_SYMBOLIC;
 
-      if (!load_store_pair_p
+      if (ldp_stp_mode == VOIDmode
 	  && GET_MODE_SIZE (mode).is_constant (&const_size)
 	  && const_size >= 4)
 	{