public inbox for gcc-bugs@sourceware.org
* [Bug c/106146] New: [instcombine] a redundant movprfx insn compare to llvm
@ 2022-06-30 12:29 zhongyunde at huawei dot com
  2022-07-11 22:15 ` [Bug target/106146] " pinskia at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: zhongyunde at huawei dot com @ 2022-06-30 12:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106146

            Bug ID: 106146
           Summary: [instcombine] a redundant movprfx insn compare to llvm
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhongyunde at huawei dot com
  Target Milestone: ---

* Test case: gcc emits a redundant movprfx insn in the kernel loop body; see
https://gcc.godbolt.org/z/8vG4PzM18 for details.

```
#include <arm_sve.h>

#define ARRAY_ALIGNMENT 64
#define LEN_2D 128ll
#define LEN_1D 8000ll
#define iterations 10000
typedef float real_t;

__attribute__((aligned(ARRAY_ALIGNMENT))) real_t a[LEN_1D],b[LEN_1D];

void s113_tuned(void) {
    for (int nl = 0; nl < 4*iterations; nl++) {
        int64_t i = 1;
        svbool_t pg = svwhilelt_b32(i, LEN_1D);
        svfloat32_t a0v = svdup_f32(a[0]);
        do {
            svfloat32_t bv = svld1_f32(pg, &b[i]);
            svfloat32_t res = svadd_z(pg, bv, a0v);
            svst1(pg, &a[i], res);
            i += svcntw();
            pg = svwhilelt_b32(i, LEN_1D);
        } while (svptest_any(svptrue_b32(), pg));
    }
    return;
}
```

* gcc's kernel loop:
```
.L2:
        ld1w    z0.s, p0/z, [x3, x0, lsl 2]
        movprfx z0.s, p0/z, z0.s
        fadd    z0.s, p0/m, z0.s, z1.s
        st1w    z0.s, p0, [x1, x0, lsl 2]
        incw    x0
        whilelt p0.s, x0, x2
        b.any   .L2
```

* llvm's kernel loop:
```
.LBB0_2:                                //   Parent Loop BB0_1 Depth=1
        ld1w    { z1.s }, p2/z, [x13, x14, lsl #2]
        fadd    z1.s, p2/m, z1.s, z0.s
        st1w    { z1.s }, p2, [x12, x14, lsl #2]
        add     x14, x10, x14
        whilelt p2.s, x14, x9
        b.ne    .LBB0_2
```
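
The two loops can be compared with a small scalar model, sketched below in portable C. It is only an illustration of why the movprfx is redundant here: the zeroing load already leaves zeros in the inactive lanes, so a merging fadd on its result yields the same vector as the movprfx + fadd pair. The fixed lane count and all model_* names are hypothetical helpers made up for this sketch, not SVE intrinsics or GCC internals.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LANES 4 /* hypothetical fixed vector length, for illustration only */

typedef struct { float lane[LANES]; } vec_t;
typedef struct { bool lane[LANES]; } pred_t;

/* Models "ld1w zd.s, pg/z, [src]": inactive lanes are set to zero. */
static vec_t model_ld1_z(pred_t pg, const float *src)
{
    vec_t r;
    for (size_t i = 0; i < LANES; i++)
        r.lane[i] = pg.lane[i] ? src[i] : 0.0f;
    return r;
}

/* Models "movprfx zd.s, pg/z, zn.s": copy active lanes, zero the rest. */
static vec_t model_movprfx_z(pred_t pg, vec_t zn)
{
    vec_t r;
    for (size_t i = 0; i < LANES; i++)
        r.lane[i] = pg.lane[i] ? zn.lane[i] : 0.0f;
    return r;
}

/* Models "fadd zdn.s, pg/m, zdn.s, zm.s": inactive lanes keep zdn. */
static vec_t model_fadd_m(pred_t pg, vec_t zdn, vec_t zm)
{
    vec_t r;
    for (size_t i = 0; i < LANES; i++)
        r.lane[i] = pg.lane[i] ? zdn.lane[i] + zm.lane[i] : zdn.lane[i];
    return r;
}

/* gcc's loop body: ld1w /z, movprfx /z, fadd /m. */
static vec_t gcc_sequence(pred_t pg, const float *src, vec_t z1)
{
    vec_t z0 = model_ld1_z(pg, src);
    z0 = model_movprfx_z(pg, z0);
    return model_fadd_m(pg, z0, z1);
}

/* llvm's loop body: ld1w /z, fadd /m. */
static vec_t llvm_sequence(pred_t pg, const float *src, vec_t z1)
{
    vec_t z0 = model_ld1_z(pg, src);
    return model_fadd_m(pg, z0, z1);
}

/* Returns 1 when both sequences produce bit-identical lanes. */
static int sequences_match(pred_t pg, const float *src, vec_t z1)
{
    vec_t g = gcc_sequence(pg, src, z1);
    vec_t l = llvm_sequence(pg, src, z1);
    return memcmp(g.lane, l.lane, sizeof g.lane) == 0;
}
```

In this model the movprfx step is an identity on the output of the zeroing load, which is the property llvm's code appears to exploit.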


* [Bug target/106146] a redundant movprfx insn compare to llvm
From: pinskia at gcc dot gnu.org @ 2022-07-11 22:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106146

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Compile options: -march=armv8-a+sve2 -O3 -ffast-math


* [Bug target/106146] a redundant movprfx insn compare to llvm
From: pinskia at gcc dot gnu.org @ 2022-07-11 22:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106146

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-07-11
     Ever confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Looks like:
(insn 26 25 31 (set (reg/v:VNx4SF 32 v0 [orig:100 res ] [100])
        (unspec:VNx4SF [
                (reg:VNx4BI 68 p0 [orig:95 pg ] [95])
                (unspec:VNx4SF [
                        (reg:VNx4BI 68 p0 [orig:95 pg ] [95])
                        (const_int 1 [0x1])
                        (reg/v:VNx4SF 32 v0 [orig:100 res ] [100])
                        (reg/v:VNx4SF 33 v1 [orig:97 a0v ] [97])
                    ] UNSPEC_COND_FADD)
                (const_vector:VNx4SF repeat [
                        (const_double:SF 0.0 [0x0.0p+0])
                    ])
            ] UNSPEC_SEL)) "/app/example.cpp":19:38 7127 {*cond_addvnx4sf_any_strict}
     (nil))

This single insn prints out two asm instructions (the movprfx and the fadd).


* [Bug target/106146] a redundant movprfx insn compare to llvm
From: pinskia at gcc dot gnu.org @ 2024-02-27 19:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106146

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So I see that svadd_z directly emits the instruction, leaving no way to
optimize away the _z part beforehand.
I am not sure how to fix this, though.
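
As a rough illustration of why the _z part is unobservable in this test case, the scalar sketch below models the fact that the result is only ever stored through svst1 with the same predicate. Under that assumption, zeroing and merging forms of the add write identical data to memory. The fixed lane count and the model_* helpers are hypothetical names made up for this sketch; it is not a proposed GCC fix, and whether using svadd_m or svadd_x in the source actually removes the movprfx has not been verified here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define LANES 4 /* hypothetical fixed vector length, for illustration only */

/* Models svadd_z: inactive lanes of the result are zero. */
static void model_add_z(const bool *pg, const float *x, const float *y,
                        float *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = pg[i] ? x[i] + y[i] : 0.0f;
}

/* Models svadd_m: inactive lanes of the result take the first operand. */
static void model_add_m(const bool *pg, const float *x, const float *y,
                        float *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = pg[i] ? x[i] + y[i] : x[i];
}

/* Models svst1: only active lanes are written to memory. */
static void model_st1(const bool *pg, float *dst, const float *val)
{
    for (size_t i = 0; i < LANES; i++)
        if (pg[i])
            dst[i] = val[i];
}

/* Returns 1 when storing the _z and _m results under pg leaves the same
   bytes in memory, starting from the same initial contents. */
static int stores_match(const bool *pg, const float *x, const float *y,
                        const float *initial)
{
    float z[LANES], m[LANES], dst_z[LANES], dst_m[LANES];

    memcpy(dst_z, initial, sizeof dst_z);
    memcpy(dst_m, initial, sizeof dst_m);
    model_add_z(pg, x, y, z);
    model_add_m(pg, x, y, m);
    model_st1(pg, dst_z, z);
    model_st1(pg, dst_m, m);
    return memcmp(dst_z, dst_m, sizeof dst_z) == 0;
}
```

The inactive lanes differ between the two add results, but the predicated store never reads them, so the memory contents agree.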
