* [Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info
2024-04-10 7:41 [Bug rtl-optimization/114674] New: [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info dizhao at os dot amperecomputing.com
@ 2024-04-10 7:43 ` dizhao at os dot amperecomputing.com
2024-04-10 8:34 ` acoplan at gcc dot gnu.org
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: dizhao at os dot amperecomputing.com @ 2024-04-10 7:43 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674
--- Comment #1 from Di Zhao <dizhao at os dot amperecomputing.com> ---
Here's a quick fix I tried, that works on the small test case above:
diff --git a/gcc/config/aarch64/aarch64-ldp-fusion.cc
b/gcc/config/aarch64/aarch64-ldp-fusion.cc
index 365dcf48b22..43478ede72b 100644
--- a/gcc/config/aarch64/aarch64-ldp-fusion.cc
+++ b/gcc/config/aarch64/aarch64-ldp-fusion.cc
@@ -1735,6 +1735,9 @@ ldp_bb_info::fuse_pair (bool load_p,
rtx new_mem = adjust_address_nv (effective_base,
mode_for_mem,
adjust_amt);
+ // Keep original alignment info for ldp/stp policy.
+ set_mem_align (new_mem, MEM_ALIGN (change_mem));
+
rtx new_set = load_p
? gen_rtx_SET (change_reg, new_mem)
: gen_rtx_SET (new_mem, change_reg);
Is is OK?
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info
2024-04-10 7:41 [Bug rtl-optimization/114674] New: [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info dizhao at os dot amperecomputing.com
2024-04-10 7:43 ` [Bug rtl-optimization/114674] " dizhao at os dot amperecomputing.com
@ 2024-04-10 8:34 ` acoplan at gcc dot gnu.org
2024-04-10 13:27 ` acoplan at gcc dot gnu.org
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: acoplan at gcc dot gnu.org @ 2024-04-10 8:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674
Alex Coplan <acoplan at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |acoplan at gcc dot gnu.org
Keywords| |missed-optimization
--- Comment #2 from Alex Coplan <acoplan at gcc dot gnu.org> ---
Thanks for the report (and patch), I'll look into this.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info
2024-04-10 7:41 [Bug rtl-optimization/114674] New: [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info dizhao at os dot amperecomputing.com
2024-04-10 7:43 ` [Bug rtl-optimization/114674] " dizhao at os dot amperecomputing.com
2024-04-10 8:34 ` acoplan at gcc dot gnu.org
@ 2024-04-10 13:27 ` acoplan at gcc dot gnu.org
2024-04-10 14:00 ` acoplan at gcc dot gnu.org
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: acoplan at gcc dot gnu.org @ 2024-04-10 13:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674
Alex Coplan <acoplan at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
Last reconfirmed| |2024-04-10
--- Comment #3 from Alex Coplan <acoplan at gcc dot gnu.org> ---
Confirmed.
I think it might be best to take the maximum MEM_ALIGN between the adjusted mem
(new_mem) and the original mem (change_mem). In this case it happens that the
original mem (change_mem) has a stronger alignment guarantee, but in general it
could be the case that the adjusted mem gives a better alignment guarantee.
Since both locations are known to point to the same address, it seems best to
me to take the largest alignment of the two.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info
2024-04-10 7:41 [Bug rtl-optimization/114674] New: [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info dizhao at os dot amperecomputing.com
` (2 preceding siblings ...)
2024-04-10 13:27 ` acoplan at gcc dot gnu.org
@ 2024-04-10 14:00 ` acoplan at gcc dot gnu.org
2024-04-11 6:40 ` dizhao at os dot amperecomputing.com
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: acoplan at gcc dot gnu.org @ 2024-04-10 14:00 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674
Alex Coplan <acoplan at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Assignee|unassigned at gcc dot gnu.org |acoplan at gcc dot gnu.org
Status|NEW |ASSIGNED
--- Comment #4 from Alex Coplan <acoplan at gcc dot gnu.org> ---
Discussing offline with Richard S an alternative approach would be to use
replace_equiv_address[_nv] instead of adjust_address[_nv]; that way we preserve
most properties of the original mem and just take the address from the other.
In fact this is what aarch64_check_consecutive_mems already does so I think we
should follow that.
I'll try a patch along those lines for stage 1.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info
2024-04-10 7:41 [Bug rtl-optimization/114674] New: [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info dizhao at os dot amperecomputing.com
` (3 preceding siblings ...)
2024-04-10 14:00 ` acoplan at gcc dot gnu.org
@ 2024-04-11 6:40 ` dizhao at os dot amperecomputing.com
2024-05-07 13:44 ` cvs-commit at gcc dot gnu.org
2024-05-07 13:46 ` acoplan at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: dizhao at os dot amperecomputing.com @ 2024-04-11 6:40 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674
--- Comment #5 from Di Zhao <dizhao at os dot amperecomputing.com> ---
Yes it's better to take a maximum MEM_ALIGN. I've missed that. Thanks.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info
2024-04-10 7:41 [Bug rtl-optimization/114674] New: [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info dizhao at os dot amperecomputing.com
` (4 preceding siblings ...)
2024-04-11 6:40 ` dizhao at os dot amperecomputing.com
@ 2024-05-07 13:44 ` cvs-commit at gcc dot gnu.org
2024-05-07 13:46 ` acoplan at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-05-07 13:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674
--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Alex Coplan <acoplan@gcc.gnu.org>:
https://gcc.gnu.org/g:74690ff96b263b8609639b7fbc5d6afd3f19cb98
commit r15-282-g74690ff96b263b8609639b7fbc5d6afd3f19cb98
Author: Alex Coplan <alex.coplan@arm.com>
Date: Wed Apr 10 16:30:36 2024 +0100
aarch64: Preserve mem info on change of base for ldp/stp [PR114674]
The ldp/stp fusion pass can change the base of an access so that the two
accesses end up using a common base register. So far we have been using
adjust_address_nv to do this, but this means that we don't preserve
other properties of the mem we're replacing. It seems better to use
replace_equiv_address_nv, as this will preserve e.g. the MEM_ALIGN of the
mem whose address we're changing.
The PR shows that by adjusting the other mem we lose alignment
information about the original access and therefore end up rejecting an
otherwise viable pair when --param=aarch64-stp-policy=aligned is passed.
This patch fixes that by using replace_equiv_address_nv instead.
Notably this is the same approach as taken by
aarch64_check_consecutive_mems when a change of base is required, so
this at least makes things more consistent between the ldp fusion pass
and the peepholes.
gcc/ChangeLog:
PR target/114674
* config/aarch64/aarch64-ldp-fusion.cc (ldp_bb_info::fuse_pair):
Use replace_equiv_address_nv on a change of base instead of
adjust_address_nv on the other access.
gcc/testsuite/ChangeLog:
PR target/114674
* gcc.target/aarch64/pr114674.c: New test.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug rtl-optimization/114674] [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info
2024-04-10 7:41 [Bug rtl-optimization/114674] New: [aarch64] ldp_fusion fails to merge 2 strs due to imprecise alignment info dizhao at os dot amperecomputing.com
` (5 preceding siblings ...)
2024-05-07 13:44 ` cvs-commit at gcc dot gnu.org
@ 2024-05-07 13:46 ` acoplan at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: acoplan at gcc dot gnu.org @ 2024-05-07 13:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114674
Alex Coplan <acoplan at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |RESOLVED
Resolution|--- |FIXED
--- Comment #7 from Alex Coplan <acoplan at gcc dot gnu.org> ---
Fixed for GCC 15, thanks for the report.
^ permalink raw reply [flat|nested] 8+ messages in thread