* [Bug target/102066] New: aarch64: Suboptimal addressing modes for SVE LD1W, ST1W
@ 2021-08-25 14:46 ktkachov at gcc dot gnu.org
  2021-08-25 15:01 ` [Bug target/102066] " rsandifo at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2021-08-25 14:46 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102066

            Bug ID: 102066
           Summary: aarch64: Suboptimal addressing modes for SVE LD1W,
                    ST1W
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ktkachov at gcc dot gnu.org
                CC: rsandifo at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

For the code:
#include <arm_sve.h>

void foo(int n, float *x, float *y) {
    for (unsigned i=0; i<n; i+=svcntw()) {
        svfloat32_t val = svld1_f32(svptrue_b8(), &x[i]);
        svst1_f32(svptrue_b8(), &y[i], val);
    }
}

at -O3 -march=armv8.2-a+sve GCC generates:
foo:
        cbz     w0, .L1
        mov     w4, 0
        cntw    x6
        ptrue   p0.b, all
.L3:
        ubfiz   x3, x4, 2, 32
        add     w4, w4, w6
        add     x5, x1, x3
        add     x3, x2, x3
        ld1w    z0.s, p0/z, [x5]
        st1w    z0.s, p0, [x3]
        cmp     w4, w0
        bcc     .L3
.L1:
        ret

but it could be making better use of the addressing modes:
foo:                                    // @foo
        cbz     w0, .LBB0_3
        mov     w8, wzr
        ptrue   p0.b
        cntw    x9
.LBB0_2:                                // %for.body
        mov     w10, w8
        ld1w    { z0.s }, p0/z, [x1, x10, lsl #2]
        add     w8, w8, w9
        cmp     w8, w0
        st1w    { z0.s }, p0, [x2, x10, lsl #2]
        b.lo    .LBB0_2
.LBB0_3:                                // %for.cond.cleanup
        ret

I guess the predicates and constraints in @aarch64_pred_mov<mode> in
aarch64-sve.md should allow for the scaled address modes.


* [Bug target/102066] aarch64: Suboptimal addressing modes for SVE LD1W, ST1W
From: rsandifo at gcc dot gnu.org @ 2021-08-25 15:01 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102066

--- Comment #1 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
> I guess the predicates and constraints in @aarch64_pred_mov<mode> in aarch64-sve.md should allow for the scaled address modes
They already allow them.  I'm guessing this is an ivopts problem,
in that it doesn't realise it can promote the unsigned iterator
to uint64_t for a svcntw() step.
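
To make the point about the unsigned iterator concrete, here is a plain C sketch (no SVE needed) of the index arithmetic involved. The function names are made up for illustration and are not GCC internals. The sketch only shows that a 32-bit unsigned index has to be zero-extended before it is scaled, and that it agrees with a 64-bit index only while the 32-bit counter does not wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset from a 32-bit unsigned index: zero-extend, then scale by
   sizeof(float) == 4.  Compare "ubfiz x3, x4, 2, 32" in the report. */
uint64_t offset_from_u32(uint32_t i)
{
    return (uint64_t)i << 2;
}

/* Byte offset from an index that is already 64 bits wide. */
uint64_t offset_from_u64(uint64_t i)
{
    return i << 2;
}

/* Advance a 32-bit unsigned iterator; the sum wraps modulo 2^32. */
uint32_t step_u32(uint32_t i, uint32_t step)
{
    return i + step;
}

/* Illustrative self-check (hypothetical helper, not from the report). */
void check_offsets(void)
{
    /* No wrap: both index widths give the same byte offset. */
    assert(offset_from_u32(step_u32(8, 4)) == offset_from_u64(8 + 4));

    /* Wrap: the 32-bit iterator restarts near zero, the 64-bit one does not. */
    assert(step_u32(UINT32_C(0xFFFFFFFC), 8) == 4);
    assert(offset_from_u32(step_u32(UINT32_C(0xFFFFFFFC), 8))
           != offset_from_u64(UINT64_C(0xFFFFFFFC) + 8));
}
```

Calling check_offsets() exercises both cases. Widening the counter is therefore only a valid transformation when the compiler can show the wrap case never happens inside the loop.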


* [Bug target/102066] aarch64: Suboptimal addressing modes for SVE LD1W, ST1W
From: ktkachov at gcc dot gnu.org @ 2021-08-25 15:19 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102066

--- Comment #2 from ktkachov at gcc dot gnu.org ---
(In reply to rsandifo@gcc.gnu.org from comment #1)
> > I guess the predicates and constraints in @aarch64_pred_mov<mode> in aarch64-sve.md should allow for the scaled address modes
> They already allow them.  I'm guessing this is an ivopts problem,
> in that it doesn't realise it can promote the unsigned iterator
> to uint64_t for a svcntw() step.

Ah, indeed:
#include <arm_sve.h>

void foo(int n, float *x, float *y) {
    for (uint64_t i=0; i<n; i+=svcntw()) {
        svfloat32_t val = svld1_f32(svptrue_b8(), &x[i]);
        svst1_f32(svptrue_b8(), &y[i], val);
    }
}

generates good code.


* [Bug tree-optimization/102066] aarch64: Suboptimal addressing modes for SVE LD1W, ST1W
From: pinskia at gcc dot gnu.org @ 2024-01-26  0:18 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102066

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org
          Component|target                      |tree-optimization
           Severity|normal                      |enhancement

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed.

GCC does not have a promotion pass, especially when dealing with induction
variables.  There are other bug reports which have a similar issue too.
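
As a rough scalar illustration of what promoting the induction variable means, the sketch below writes the same strided copy twice: once with a 32-bit unsigned counter and once with the counter widened by hand to 64 bits. This is not GCC's ivopts logic, and the function names are hypothetical. `step` stands in for svcntw() and is assumed to be nonzero, and the two loops only match when the 32-bit counter cannot wrap before reaching the bound.

```c
#include <assert.h>
#include <stdint.h>

/* Original shape: 32-bit unsigned counter, as in the report's first loop. */
void copy_u32_iv(int n, const float *x, float *y, unsigned step)
{
    for (unsigned i = 0; i < (unsigned)n; i += step)
        y[i] = x[i];
}

/* Hand-promoted shape: the counter is 64 bits wide, so indexing needs no
   per-iteration zero-extension.  The bound keeps the unsigned conversion
   of n that the 32-bit comparison performed. */
void copy_u64_iv(int n, const float *x, float *y, unsigned step)
{
    uint64_t bound = (unsigned)n;
    for (uint64_t i = 0; i < bound; i += step)
        y[i] = x[i];
}

/* Illustrative self-check on a small input where no wrap can occur. */
void check_promotion(void)
{
    float x[16], a[16] = {0}, b[16] = {0};
    for (int k = 0; k < 16; k++)
        x[k] = (float)k;

    copy_u32_iv(16, x, a, 4);
    copy_u64_iv(16, x, b, 4);

    for (int k = 0; k < 16; k++)
        assert(a[k] == b[k]);
    assert(a[12] == 12.0f && a[13] == 0.0f);
}
```

A promotion pass would have to establish the no-wrap condition itself before rewriting the first form into the second; the sketch simply assumes it.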


* [Bug tree-optimization/102066] aarch64: Suboptimal addressing modes for SVE LD1W, ST1W
From: pinskia at gcc dot gnu.org @ 2024-01-26  0:19 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102066

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-01-26
     Ever confirmed|0                           |1

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
.
