* [Bug target/111005] New: SVE produced code for different type sizes (smaller than int) with comparison in a loop can be improved
From: pinskia at gcc dot gnu.org @ 2023-08-12 20:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111005

            Bug ID: 111005
           Summary: SVE produced code for different type sizes (smaller
                    than int) with comparison in a loop can be improved
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pinskia at gcc dot gnu.org
  Target Milestone: ---
            Target: aarch64

Take:
```
/* f0: store -1 or 0 as an int, depending on whether the short is nonzero.  */
void __attribute__ ((noipa))
f0 (int *__restrict r,
    int *__restrict a,
    short *__restrict pred)
{
  for (int i = 0; i < 1024; ++i)
  {
    int p = pred[i] ? -1 : 0;
    r[i] = p;
  }
}

/* f1: plain sign-extending copy from short to int, for comparison.  */
void __attribute__ ((noipa))
f1 (int *__restrict r,
    int *__restrict a,
    short *__restrict pred)
{
  for (int i = 0; i < 1024; ++i)
  {
    int p = pred[i];
    r[i] = p;
  }
}
```

f1 produces:
```
.L6:
        ld1sh   z31.s, p7/z, [x2, x1, lsl 1]
        st1w    z31.s, p7, [x0, x1, lsl 2]
        incw    x1
        whilelo p7.s, w1, w3
        b.any   .L6
```

f0, however, produces:
```
.L2:
        ld1h    z0.h, p0/z, [x2, x1, lsl 1]
        punpklo p2.h, p0.b
        cmpne   p3.h, p1/z, z0.h, #0
        punpkhi p0.h, p0.b
        mov     z0.h, p3/z, #1
        neg     z0.h, p1/m, z0.h
        sunpklo z1.s, z0.h
        sunpkhi z0.s, z0.h
        st1w    z1.s, p2, [x0, x1, lsl 2]
        st1w    z0.s, p0, [x4, x1, lsl 2]
        inch    x1
        whilelo p0.h, w1, w3
        b.any   .L2
```

Ideally, f0 would produce:
```
.L6:
        ld1sh   z31.s, p7/z, [x2, x1, lsl 1]
        cmpne   p1.s, p7/z, z31.s, #0
        mov     z31.s, p1/z, #-1                 // =0xffffffffffffffff
        st1w    z31.s, p7, [x0, x1, lsl 2]
        incw    x1
        whilelo p7.s, w1, w3
        b.any   .L6
```

That is:
1. Sign-extending load (`ld1sh`) into z31.
2. Compare not-equal to 0 (`cmpne`), setting p1.
3. Set z31 to -1 or 0 based on p1 (`mov` with `/z` predication).
4. Store z31 (`st1w`).

Instead, we end up unpacking from VN2HI to VHI ...
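
To illustrate why the shorter sequence would be a valid replacement, here is a minimal scalar sketch of the two orderings, one element at a time. It is only a model: the function names (`mask_then_widen`, `widen_then_mask`, `count_mismatches`, `check_models_agree`) are made up for this example and are not part of GCC or the testcase, and the instruction names in the comments are an approximate mapping to the listings above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the sequence currently emitted for f0:
   compare at halfword width, build -1/0 as a halfword,
   then sign-extend the result to 32 bits.  */
int32_t mask_then_widen (int16_t x)
{
  int16_t m = (x != 0) ? 1 : 0;   /* roughly cmpne + mov #1 */
  m = (int16_t) -m;               /* roughly neg */
  return (int32_t) m;             /* roughly sunpklo / sunpkhi */
}

/* Hypothetical model of the proposed sequence:
   sign-extending load, compare at word width, select -1 or 0.  */
int32_t widen_then_mask (int16_t x)
{
  int32_t w = (int32_t) x;        /* roughly ld1sh */
  return (w != 0) ? -1 : 0;       /* roughly cmpne + mov #-1 */
}

/* Count the 16-bit inputs on which the two models disagree.  */
int count_mismatches (void)
{
  int mismatches = 0;
  for (int32_t v = INT16_MIN; v <= INT16_MAX; ++v)
    if (mask_then_widen ((int16_t) v) != widen_then_mask ((int16_t) v))
      ++mismatches;
  return mismatches;
}

/* Abort if the two models ever disagree.  */
void check_models_agree (void)
{
  assert (count_mismatches () == 0);
}
```

Calling `check_models_agree ()` from a small driver walks all 65536 `short` values. Sign extension never turns a nonzero halfword into zero or the reverse, so comparing after the widening load gives the same mask as comparing first and unpacking afterwards.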


* [Bug target/111005] SVE produced code for different type sizes (smaller than int) with comparison in a loop can be improved
From: pinskia at gcc dot gnu.org @ 2023-08-12 20:12 UTC (permalink / raw)
  To: gcc-bugs

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I forgot to say: compile with `-march=armv8.5+sve2 -O3`.


* [Bug target/111005] SVE produced code for different type sizes (smaller than int) with comparison in a loop can be improved
From: pinskia at gcc dot gnu.org @ 2023-10-22 22:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111005

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement

