public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/108177] New: MVE predicated stores to same address get optimized away
@ 2022-12-19 14:29 avieira at gcc dot gnu.org
  2022-12-19 14:30 ` [Bug target/108177] " avieira at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: avieira at gcc dot gnu.org @ 2022-12-19 14:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108177

            Bug ID: 108177
           Summary: MVE predicated stores to same address get optimized
                    away
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: avieira at gcc dot gnu.org
  Target Milestone: ---

GCC currently generates wrong code for predicated MVE stores to the same
address. Like:

#include <arm_mve.h>

uint8x16_t foo (uint8x16_t a, uint8_t *pa, mve_pred16_t p1, mve_pred16_t p2)
{
    vstrbq_p_u8 (pa, a, p1);
    vstrbq_p_u8 (pa, a, p2);
}

with 'gcc -mcpu=cortex-m55 -mfloat-abi=hard -O3' it will only generate the
second MVE store. Though if (p2 | p1) != p2 then the second store will not
fully overwrite the first.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/108177] MVE predicated stores to same address get optimized away
  2022-12-19 14:29 [Bug target/108177] New: MVE predicated stores to same address get optimized away avieira at gcc dot gnu.org
@ 2022-12-19 14:30 ` avieira at gcc dot gnu.org
  2022-12-19 15:47 ` rguenth at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: avieira at gcc dot gnu.org @ 2022-12-19 14:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108177

--- Comment #1 from avieira at gcc dot gnu.org ---
I noticed that for SVE stores seem to be marked as volatile memory accesses, I
suspect it's because they are represented using masked stores which probably
are by definition volatile (for this reason?).

A fix for this for now, before MVE starts using maskedstore patterns, would be
to use unspec_volatile for such stores.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/108177] MVE predicated stores to same address get optimized away
  2022-12-19 14:29 [Bug target/108177] New: MVE predicated stores to same address get optimized away avieira at gcc dot gnu.org
  2022-12-19 14:30 ` [Bug target/108177] " avieira at gcc dot gnu.org
@ 2022-12-19 15:47 ` rguenth at gcc dot gnu.org
  2022-12-19 17:20 ` avieira at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-12-19 15:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108177

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
I suspect

(insn 10 9 11 2 (set (mem:V16QI (reg/v/f:SI 117 [ pa ]) [0 MEM[(signed char[16]
*)pa_2(D)]+0 S16 A8]) 
        (unspec:V16QI [
                (reg/v:V16QI 116 [ a ])
                (reg:V16BI 120)
            ] VSTRBQ_U)) "include/arm_mve.h":12501:3 4673 {mve_vstrbq_p_uv16qi}
     (nil)) 

isn't a very accurate description of a masked store.  On x86 we have

(insn 17 16 0 (set (mem:V8DI (reg/v/f:DI 87 [ p ]) [0  S64 A8])
        (vec_merge:V8DI (reg:V8DI 92)
            (mem:V8DI (reg/v/f:DI 87 [ p ]) [0  S64 A8])
            (subreg:QI (reg:SI 95) 0)))
"/home/rguenther/obj-trunk-g/gcc/include/avx512fintrin.h":6484:3 -1

For AVX we use:

(define_expand "maskstore<mode><sseintvecmodelower>"
  [(set (match_operand:V48_AVX2 0 "memory_operand")
    (unspec:V48_AVX2
      [(match_operand:<sseintvecmode> 2 "register_operand")
       (match_operand:V48_AVX2 1 "register_operand")
       (match_dup 0)]
      UNSPEC_MASKMOV))]
  "TARGET_AVX")

note how the memory destination is also an input to the UNSPEC here.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/108177] MVE predicated stores to same address get optimized away
  2022-12-19 14:29 [Bug target/108177] New: MVE predicated stores to same address get optimized away avieira at gcc dot gnu.org
  2022-12-19 14:30 ` [Bug target/108177] " avieira at gcc dot gnu.org
  2022-12-19 15:47 ` rguenth at gcc dot gnu.org
@ 2022-12-19 17:20 ` avieira at gcc dot gnu.org
  2023-01-24 17:02 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: avieira at gcc dot gnu.org @ 2022-12-19 17:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108177

--- Comment #3 from avieira at gcc dot gnu.org ---
The architecture describes it as only writing the true-predicated bytes and
leaving the others untouched. I guess reading and writting to the same memory
is the best we can do to 'mimic' that in RTL. SVE does the same as x86, so I'll
try that approach over unspec_volatile.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/108177] MVE predicated stores to same address get optimized away
  2022-12-19 14:29 [Bug target/108177] New: MVE predicated stores to same address get optimized away avieira at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2022-12-19 17:20 ` avieira at gcc dot gnu.org
@ 2023-01-24 17:02 ` cvs-commit at gcc dot gnu.org
  2023-04-06 17:32 ` stammark at gcc dot gnu.org
  2023-05-18 10:43 ` cvs-commit at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-01-24 17:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108177

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Andre Simoes Dias Vieira
<avieira@gcc.gnu.org>:

https://gcc.gnu.org/g:c1093923733a1072a237f112e3239b5ebd88eadd

commit r13-5327-gc1093923733a1072a237f112e3239b5ebd88eadd
Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
Date:   Tue Jan 24 16:59:23 2023 +0000

    arm: Make MVE masked stores read memory operand [PR 108177]

    This patch adds the memory operand of MVE masked stores as input operands
to
    mimic the 'partial' writes, to prevent erroneous write-after-write
    optimizations as described in the PR.

    gcc/ChangeLog:

            PR target/108177
            * config/arm/mve.md (mve_vstrbq_p_<supf><mode>, mve_vstrhq_p_fv8hf,
            mve_vstrhq_p_<supf><mode>, mve_vstrwq_p_<supf>v4si): Add memory
operand
            as input operand.

    gcc/testsuite/ChangeLog:

            * gcc.target/arm/mve/pr108177-1-run.c: New test.
            * gcc.target/arm/mve/pr108177-1.c: New test.
            * gcc.target/arm/mve/pr108177-10-run.c: New test.
            * gcc.target/arm/mve/pr108177-10.c: New test.
            * gcc.target/arm/mve/pr108177-11-run.c: New test.
            * gcc.target/arm/mve/pr108177-11.c: New test.
            * gcc.target/arm/mve/pr108177-12-run.c: New test.
            * gcc.target/arm/mve/pr108177-12.c: New test.
            * gcc.target/arm/mve/pr108177-13-run.c: New test.
            * gcc.target/arm/mve/pr108177-13.c: New test.
            * gcc.target/arm/mve/pr108177-14-run.c: New test.
            * gcc.target/arm/mve/pr108177-14.c: New test.
            * gcc.target/arm/mve/pr108177-2-run.c: New test.
            * gcc.target/arm/mve/pr108177-2.c: New test.
            * gcc.target/arm/mve/pr108177-3-run.c: New test.
            * gcc.target/arm/mve/pr108177-3.c: New test.
            * gcc.target/arm/mve/pr108177-4-run.c: New test.
            * gcc.target/arm/mve/pr108177-4.c: New test.
            * gcc.target/arm/mve/pr108177-5-run.c: New test.
            * gcc.target/arm/mve/pr108177-5.c: New test.
            * gcc.target/arm/mve/pr108177-6-run.c: New test.
            * gcc.target/arm/mve/pr108177-6.c: New test.
            * gcc.target/arm/mve/pr108177-7-run.c: New test.
            * gcc.target/arm/mve/pr108177-7.c: New test.
            * gcc.target/arm/mve/pr108177-8-run.c: New test.
            * gcc.target/arm/mve/pr108177-8.c: New test.
            * gcc.target/arm/mve/pr108177-9-run.c: New test.
            * gcc.target/arm/mve/pr108177-9.c: New test.
            * gcc.target/arm/mve/pr108177-main.x: New test include.
            * gcc.target/arm/mve/pr108177.x: New test include.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/108177] MVE predicated stores to same address get optimized away
  2022-12-19 14:29 [Bug target/108177] New: MVE predicated stores to same address get optimized away avieira at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-01-24 17:02 ` cvs-commit at gcc dot gnu.org
@ 2023-04-06 17:32 ` stammark at gcc dot gnu.org
  2023-05-18 10:43 ` cvs-commit at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: stammark at gcc dot gnu.org @ 2023-04-06 17:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108177

--- Comment #5 from Stam Markianos-Wright <stammark at gcc dot gnu.org> ---
With the fix to MVE auto_inc having gone in as
ddc9b5ee13cd686c8674f92d46045563c06a23ea I have found that this fix keeps the
auto-inc on these predicated stores broken.

It seems to fail in auto_inc_dec at this condition:

```
  /* Make sure this reg appears only once in this insn.  */
  if (count_occurrences (PATTERN (mem_insn.insn), mem_insn.reg0, 1) != 1)
    {
      if (dump_file)
        fprintf (dump_file, "mem count failure\n");
      return false;
    }
```
(which makes sense with the pattern now having the MEM appear twice)


I guess this is not urgent since this is only a performance impact on one
instruction. Also if the change needs to be in the auto-inc pass instead of the
backend, then likely something for GCC14, but I thought this would be a good
place to record this ;)

Does anyone have any ideas on this? Or I wonder what the AVX case does for this

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/108177] MVE predicated stores to same address get optimized away
  2022-12-19 14:29 [Bug target/108177] New: MVE predicated stores to same address get optimized away avieira at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-04-06 17:32 ` stammark at gcc dot gnu.org
@ 2023-05-18 10:43 ` cvs-commit at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-18 10:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108177

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Stam Markianos-Wright
<stammark@gcc.gnu.org>:

https://gcc.gnu.org/g:bb113a56e31159f4fbe26cc06ee25e4972ce74b5

commit r12-9610-gbb113a56e31159f4fbe26cc06ee25e4972ce74b5
Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
Date:   Tue Jan 24 16:59:23 2023 +0000

    arm: Make MVE masked stores read memory operand [PR 108177]

    This patch adds the memory operand of MVE masked stores as input operands
to
    mimic the 'partial' writes, to prevent erroneous write-after-write
    optimizations as described in the PR.

    gcc/ChangeLog:

            PR target/108177
            * config/arm/mve.md (mve_vstrbq_p_<supf><mode>, mve_vstrhq_p_fv8hf,
            mve_vstrhq_p_<supf><mode>, mve_vstrwq_p_<supf>v4si): Add memory
operand
            as input operand.

    gcc/testsuite/ChangeLog:

            * gcc.target/arm/mve/pr108177-1-run.c: New test.
            * gcc.target/arm/mve/pr108177-1.c: New test.
            * gcc.target/arm/mve/pr108177-10-run.c: New test.
            * gcc.target/arm/mve/pr108177-10.c: New test.
            * gcc.target/arm/mve/pr108177-11-run.c: New test.
            * gcc.target/arm/mve/pr108177-11.c: New test.
            * gcc.target/arm/mve/pr108177-12-run.c: New test.
            * gcc.target/arm/mve/pr108177-12.c: New test.
            * gcc.target/arm/mve/pr108177-13-run.c: New test.
            * gcc.target/arm/mve/pr108177-13.c: New test.
            * gcc.target/arm/mve/pr108177-14-run.c: New test.
            * gcc.target/arm/mve/pr108177-14.c: New test.
            * gcc.target/arm/mve/pr108177-2-run.c: New test.
            * gcc.target/arm/mve/pr108177-2.c: New test.
            * gcc.target/arm/mve/pr108177-3-run.c: New test.
            * gcc.target/arm/mve/pr108177-3.c: New test.
            * gcc.target/arm/mve/pr108177-4-run.c: New test.
            * gcc.target/arm/mve/pr108177-4.c: New test.
            * gcc.target/arm/mve/pr108177-5-run.c: New test.
            * gcc.target/arm/mve/pr108177-5.c: New test.
            * gcc.target/arm/mve/pr108177-6-run.c: New test.
            * gcc.target/arm/mve/pr108177-6.c: New test.
            * gcc.target/arm/mve/pr108177-7-run.c: New test.
            * gcc.target/arm/mve/pr108177-7.c: New test.
            * gcc.target/arm/mve/pr108177-8-run.c: New test.
            * gcc.target/arm/mve/pr108177-8.c: New test.
            * gcc.target/arm/mve/pr108177-9-run.c: New test.
            * gcc.target/arm/mve/pr108177-9.c: New test.
            * gcc.target/arm/mve/pr108177-main.x: New test include.
            * gcc.target/arm/mve/pr108177.x: New test include.

    (cherry picked from commit c1093923733a1072a237f112e3239b5ebd88eadd)

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-05-18 10:43 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-19 14:29 [Bug target/108177] New: MVE predicated stores to same address get optimized away avieira at gcc dot gnu.org
2022-12-19 14:30 ` [Bug target/108177] " avieira at gcc dot gnu.org
2022-12-19 15:47 ` rguenth at gcc dot gnu.org
2022-12-19 17:20 ` avieira at gcc dot gnu.org
2023-01-24 17:02 ` cvs-commit at gcc dot gnu.org
2023-04-06 17:32 ` stammark at gcc dot gnu.org
2023-05-18 10:43 ` cvs-commit at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).