* [PATCH] Fix target/101934: aarch64 memset code creates unaligned stores for -mstrict-align
From: apinski @ 2021-08-31 23:33 UTC
To: gcc-patches; +Cc: Andrew Pinski
From: Andrew Pinski <apinski@marvell.com>
The problem here is that aarch64_expand_setmem did not check
STRICT_ALIGNMENT before creating an overlapping store for the
trailing bytes, so it could emit unaligned stores under -mstrict-align.
This patch adds that check, and the new testcase now passes.
gcc/ChangeLog:
PR target/101934
* config/aarch64/aarch64.c (aarch64_expand_setmem):
Check STRICT_ALIGNMENT before creating an overlapping
store.
gcc/testsuite/ChangeLog:
PR target/101934
* gcc.target/aarch64/memset-strict-align-1.c: New test.
---
gcc/config/aarch64/aarch64.c | 4 +--
.../aarch64/memset-strict-align-1.c | 28 +++++++++++++++++++
2 files changed, 30 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/aarch64/memset-strict-align-1.c
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 3213585a588..26d59ba1e13 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -23566,8 +23566,8 @@ aarch64_expand_setmem (rtx *operands)
/* Do certain trailing copies as overlapping if it's going to be
cheaper. i.e. less instructions to do so. For instance doing a 15
byte copy it's more efficient to do two overlapping 8 byte copies than
- 8 + 4 + 2 + 1. */
- if (n > 0 && n < copy_limit / 2)
+ 8 + 4 + 2 + 1. Only do this when -mstrict-align is not supplied. */
+ if (n > 0 && n < copy_limit / 2 && !STRICT_ALIGNMENT)
{
next_mode = smallest_mode_for_size (n, MODE_INT);
int n_bits = GET_MODE_BITSIZE (next_mode).to_constant ();
diff --git a/gcc/testsuite/gcc.target/aarch64/memset-strict-align-1.c b/gcc/testsuite/gcc.target/aarch64/memset-strict-align-1.c
new file mode 100644
index 00000000000..5cdc8a44968
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/memset-strict-align-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -mstrict-align" } */
+
+struct s { char x[95]; };
+void foo (struct s *);
+void bar (void) { struct s s1 = {}; foo (&s1); }
+
+/* memset (s1 = {}, sizeof = 95) should be expanded
+ such that there are no overlapping stores when -mstrict-align
+ is in use, i.e.:
+ 2 paired 16-byte stores (64 bytes)
+ 1 16-byte store
+ 1 8-byte store
+ 1 4-byte store
+ 1 2-byte store
+ 1 1-byte store
+ */
+
+/* { dg-final { scan-assembler-times "stp\tq" 2 } } */
+/* { dg-final { scan-assembler-times "str\tq" 1 } } */
+/* { dg-final { scan-assembler-times "str\txzr" 1 } } */
+/* { dg-final { scan-assembler-times "str\twzr" 1 } } */
+/* { dg-final { scan-assembler-times "strh\twzr" 1 } } */
+/* { dg-final { scan-assembler-times "strb\twzr" 1 } } */
+
+/* Also one store pair for the frame-pointer and the LR. */
+/* { dg-final { scan-assembler-times "stp\tx" 1 } } */
+
--
2.17.1
* Re: [PATCH] Fix target/101934: aarch64 memset code creates unaligned stores for -mstrict-align
From: Richard Sandiford @ 2021-09-01 8:51 UTC
To: apinski--- via Gcc-patches; +Cc: apinski
apinski--- via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> From: Andrew Pinski <apinski@marvell.com>
>
> The problem here is the aarch64_expand_setmem code did not check
> STRICT_ALIGNMENT if it is creating an overlapping store.
> This patch adds that check and the testcase works.
>
> gcc/ChangeLog:
>
> PR target/101934
> * config/aarch64/aarch64.c (aarch64_expand_setmem):
> Check STRICT_ALIGNMENT before creating an overlapping
> store.
>
> gcc/testsuite/ChangeLog:
>
> PR target/101934
> * gcc.target/aarch64/memset-strict-align-1.c: New test.
OK, thanks.
Richard
* Re: [PATCH] Fix target/101934: aarch64 memset code creates unaligned stores for -mstrict-align
From: Andrew Pinski @ 2021-09-16 7:28 UTC
To: Richard Sandiford, apinski--- via Gcc-patches, Andrew Pinski
On Wed, Sep 1, 2021 at 1:52 AM Richard Sandiford via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> apinski--- via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > From: Andrew Pinski <apinski@marvell.com>
> >
> > The problem here is the aarch64_expand_setmem code did not check
> > STRICT_ALIGNMENT if it is creating an overlapping store.
> > This patch adds that check and the testcase works.
> >
> > gcc/ChangeLog:
> >
> > PR target/101934
> > * config/aarch64/aarch64.c (aarch64_expand_setmem):
> > Check STRICT_ALIGNMENT before creating an overlapping
> > store.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/101934
> > * gcc.target/aarch64/memset-strict-align-1.c: New test.
>
> OK, thanks.
Applied now also on the GCC 11 branch.
Thanks,
Andrew