public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region
@ 2020-10-05 11:51 vries at gcc dot gnu.org
  2020-10-05 14:19 ` [Bug libgomp/97291] " amonakov at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: vries at gcc dot gnu.org @ 2020-10-05 11:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291

            Bug ID: 97291
           Summary: [SIMT] Move SIMT_XCHG_* out of non-uniform execution
                    region
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgomp
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
                CC: jakub at gcc dot gnu.org
  Target Milestone: ---

We have:
...
/* Allocate per-lane storage and begin non-uniform execution region.  */

static void
expand_GOMP_SIMT_ENTER_ALLOC (internal_fn, gcall *stmt)
...
and:
...
/* Deallocate per-lane storage and leave non-uniform execution region.  */

static void
expand_GOMP_SIMT_EXIT (internal_fn, gcall *stmt)
...

So, if the SIMT_ENTER_ALLOC and the SIMT_EXIT mark the start and end of a
region of non-uniform execution, it's strange that such a region can contain
SIMT_XCHG_*, which on nvptx requires uniform execution.

Moving SIMT_VOTE_ANY/SIMT_LAST_LANE/SIMT_XCHG_* as a whole after SIMT_EXIT is
not possible given that VOTE_ANY may have data dependencies to storage that is
deallocated by SIMT_EXIT (as Alexander mentioned here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555475.html).

A possible solution would be to split the SIMT_EXIT into separate bits for
exiting non-uniform execution and deallocation, and have:
- SIMT_ENTER_ALLOC
- SIMT_EXIT_UNI
- SIMT_VOTE_ANY/SIMT_LAST_LANE/SIMT_XCHG_*
- SIMT_EXIT_DEALLOC

Also I've wondered if we could do:
- SIMT_ENTER_ALLOC
- SIMT_VOTE_ANY
- SIMT_EXIT
- SIMT_LAST_LANE/SIMT_XCHG_*
but perhaps there are again data dependency problems.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug libgomp/97291] [SIMT] Move SIMT_XCHG_* out of non-uniform execution region
  2020-10-05 11:51 [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region vries at gcc dot gnu.org
@ 2020-10-05 14:19 ` amonakov at gcc dot gnu.org
  2020-10-16 10:41 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: amonakov at gcc dot gnu.org @ 2020-10-05 14:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291

--- Comment #1 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Reshuffling statements and piling up extra abstraction doesn't help solve the
core issue that GIMPLE passes can duplicate any basic block, but basic blocks
of SIMT loop epilogue should be protected from that. More generally, arbitrary
duplication demonstrably causes miscompilation even without SIMT: PR 80053.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug libgomp/97291] [SIMT] Move SIMT_XCHG_* out of non-uniform execution region
  2020-10-05 11:51 [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region vries at gcc dot gnu.org
  2020-10-05 14:19 ` [Bug libgomp/97291] " amonakov at gcc dot gnu.org
@ 2020-10-16 10:41 ` cvs-commit at gcc dot gnu.org
  2020-10-16 10:47 ` cvs-commit at gcc dot gnu.org
  2020-10-16 10:55 ` sripar01 at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-10-16 10:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by SRINATH PARVATHANENI
<sripar01@gcc.gnu.org>:

https://gcc.gnu.org/g:377535881166969dba43794f298170978d797ef6

commit r11-4004-g377535881166969dba43794f298170978d797ef6
Author: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
Date:   Fri Oct 16 11:40:25 2020 +0100

    arm: Fix wrong code generated for mve scatter store with writeback
intrinsics with -O2 (PR97271).

    This patch fixes (PR97271) the wrong code-gen for mve scatter store with
writeback intrinsics with -O2.

    $cat bug.c
    void
    foo (uint32x4_t * addr, const int offset, int32x4_t value)
    {
      vstrwq_scatter_base_wb_s32 (addr, 8, value);
    }

    $ arm-none-eabi-gcc  bug.c -S -O2 -march=armv8.1-m.main+mve
-mfloat-abi=hard -o -
    Without this patch:
    ...
    foo:
            vldrw.32        q3, [r0]
            vstrw.u32       q0, [q3, #8]!  ---> (A)
            vldr.64 d4, .L3
            vldr.64 d5, .L3+8
            vldrw.32        q3, [r0]
            vstrw.u32       q2, [q3, #8]!  ---> (B)
            bx      lr
    ...

    With this patch:
    ...
    foo:
            vldrw.32        q3, [r0]
            vstrw.u32       q0, [q3, #8]!  --> (C)
            vstrw.32        q3, [r0]
            bx      lr
    ...

    Without this patch 2 vstrw assembly instructions (A and B) are generated
for vstrwq_scatter_base_wb_s32
    intrinsic where as fix generates only one vstrw assembly instruction (C).

    gcc/ChangeLog:

    2020-10-06  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

            PR target/97291
            * config/arm/arm-builtins.c (arm_strsbwbs_qualifiers): Modify
array.
            (arm_strsbwbu_qualifiers): Likewise.
            (arm_strsbwbs_p_qualifiers): Likewise.
            (arm_strsbwbu_p_qualifiers): Likewise.
            * config/arm/arm_mve.h (__arm_vstrdq_scatter_base_wb_s64): Modify
            function definition.
            (__arm_vstrdq_scatter_base_wb_u64): Likewise.
            (__arm_vstrdq_scatter_base_wb_p_s64): Likewise.
            (__arm_vstrdq_scatter_base_wb_p_u64): Likewise.
            (__arm_vstrwq_scatter_base_wb_p_s32): Likewise.
            (__arm_vstrwq_scatter_base_wb_p_u32): Likewise.
            (__arm_vstrwq_scatter_base_wb_s32): Likewise.
            (__arm_vstrwq_scatter_base_wb_u32): Likewise.
            (__arm_vstrwq_scatter_base_wb_f32): Likewise.
            (__arm_vstrwq_scatter_base_wb_p_f32): Likewise.
            * config/arm/arm_mve_builtins.def (vstrwq_scatter_base_wb_add_u):
Remove
            expansion for the builtin.
            (vstrwq_scatter_base_wb_add_s): Likewise.
            (vstrwq_scatter_base_wb_add_f): Likewise.
            (vstrdq_scatter_base_wb_add_u): Likewise.
            (vstrdq_scatter_base_wb_add_s): Likewise.
            (vstrwq_scatter_base_wb_p_add_u): Likewise.
            (vstrwq_scatter_base_wb_p_add_s): Likewise.
            (vstrwq_scatter_base_wb_p_add_f): Likewise.
            (vstrdq_scatter_base_wb_p_add_u): Likewise.
            (vstrdq_scatter_base_wb_p_add_s): Likewise.
            * config/arm/mve.md (mve_vstrwq_scatter_base_wb_<supf>v4si): Remove
            expand.
            (mve_vstrwq_scatter_base_wb_add_<supf>v4si): Likewise.
            (mve_vstrwq_scatter_base_wb_<supf>v4si_insn): Rename pattern to ...
            (mve_vstrwq_scatter_base_wb_<supf>v4si): This.
            (mve_vstrwq_scatter_base_wb_p_<supf>v4si): Remove expand.
            (mve_vstrwq_scatter_base_wb_p_add_<supf>v4si): Likewise.
            (mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn): Rename pattern to
...
            (mve_vstrwq_scatter_base_wb_p_<supf>v4si): This.
            (mve_vstrwq_scatter_base_wb_fv4sf): Remove expand.
            (mve_vstrwq_scatter_base_wb_add_fv4sf): Likewise.
            (mve_vstrwq_scatter_base_wb_fv4sf_insn): Rename pattern to ...
            (mve_vstrwq_scatter_base_wb_fv4sf): This.
            (mve_vstrwq_scatter_base_wb_p_fv4sf): Remove expand.
            (mve_vstrwq_scatter_base_wb_p_add_fv4sf): Likewise.
            (mve_vstrwq_scatter_base_wb_p_fv4sf_insn): Rename pattern to ...
            (mve_vstrwq_scatter_base_wb_p_fv4sf): This.
            (mve_vstrdq_scatter_base_wb_<supf>v2di): Remove expand.
            (mve_vstrdq_scatter_base_wb_add_<supf>v2di): Likewise.
            (mve_vstrdq_scatter_base_wb_<supf>v2di_insn): Rename pattern to ...
            (mve_vstrdq_scatter_base_wb_<supf>v2di): This.
            (mve_vstrdq_scatter_base_wb_p_<supf>v2di): Remove expand.
            (mve_vstrdq_scatter_base_wb_p_add_<supf>v2di): Likewise.
            (mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn): Rename pattern to
...
            (mve_vstrdq_scatter_base_wb_p_<supf>v2di): This.

    gcc/testsuite/ChangeLog:

            PR target/97291
            * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s64.c:
Modify.
            * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u64.c:
            Likewise.
            * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.c:
Likewise.
            * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64.c:
Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32.c:
Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f32.c:
            Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s32.c:
            Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_u32.c:
            Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32.c:
Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u32.c:
Likewise.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug libgomp/97291] [SIMT] Move SIMT_XCHG_* out of non-uniform execution region
  2020-10-05 11:51 [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region vries at gcc dot gnu.org
  2020-10-05 14:19 ` [Bug libgomp/97291] " amonakov at gcc dot gnu.org
  2020-10-16 10:41 ` cvs-commit at gcc dot gnu.org
@ 2020-10-16 10:47 ` cvs-commit at gcc dot gnu.org
  2020-10-16 10:55 ` sripar01 at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-10-16 10:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by SRINATH PARVATHANENI
<sripar01@gcc.gnu.org>:

https://gcc.gnu.org/g:4199cfa3d18eb99893456bd461061daa75115711

commit r10-8899-g4199cfa3d18eb99893456bd461061daa75115711
Author: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
Date:   Fri Oct 16 11:40:25 2020 +0100

    arm: Fix wrong code generated for mve scatter store with writeback
intrinsics with -O2 (PR97271).

    This patch fixes (PR97271) the wrong code-gen for mve scatter store with
writeback intrinsics with -O2.

    $cat bug.c
    void
    foo (uint32x4_t * addr, const int offset, int32x4_t value)
    {
      vstrwq_scatter_base_wb_s32 (addr, 8, value);
    }

    $ arm-none-eabi-gcc  bug.c -S -O2 -march=armv8.1-m.main+mve
-mfloat-abi=hard -o -
    Without this patch:
    ...
    foo:
            vldrw.32        q3, [r0]
            vstrw.u32       q0, [q3, #8]!  ---> (A)
            vldr.64 d4, .L3
            vldr.64 d5, .L3+8
            vldrw.32        q3, [r0]
            vstrw.u32       q2, [q3, #8]!  ---> (B)
            bx      lr
    ...

    With this patch:
    ...
    foo:
            vldrw.32        q3, [r0]
            vstrw.u32       q0, [q3, #8]!  --> (C)
            vstrw.32        q3, [r0]
            bx      lr
    ...

    Without this patch 2 vstrw assembly instructions (A and B) are generated
for vstrwq_scatter_base_wb_s32
    intrinsic where as fix generates only one vstrw assembly instruction (C).

    gcc/ChangeLog:

    2020-10-06  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

            PR target/97291
            * config/arm/arm-builtins.c (arm_strsbwbs_qualifiers): Modify
array.
            (arm_strsbwbu_qualifiers): Likewise.
            (arm_strsbwbs_p_qualifiers): Likewise.
            (arm_strsbwbu_p_qualifiers): Likewise.
            * config/arm/arm_mve.h (__arm_vstrdq_scatter_base_wb_s64): Modify
            function definition.
            (__arm_vstrdq_scatter_base_wb_u64): Likewise.
            (__arm_vstrdq_scatter_base_wb_p_s64): Likewise.
            (__arm_vstrdq_scatter_base_wb_p_u64): Likewise.
            (__arm_vstrwq_scatter_base_wb_p_s32): Likewise.
            (__arm_vstrwq_scatter_base_wb_p_u32): Likewise.
            (__arm_vstrwq_scatter_base_wb_s32): Likewise.
            (__arm_vstrwq_scatter_base_wb_u32): Likewise.
            (__arm_vstrwq_scatter_base_wb_f32): Likewise.
            (__arm_vstrwq_scatter_base_wb_p_f32): Likewise.
            * config/arm/arm_mve_builtins.def (vstrwq_scatter_base_wb_add_u):
Remove
            expansion for the builtin.
            (vstrwq_scatter_base_wb_add_s): Likewise.
            (vstrwq_scatter_base_wb_add_f): Likewise.
            (vstrdq_scatter_base_wb_add_u): Likewise.
            (vstrdq_scatter_base_wb_add_s): Likewise.
            (vstrwq_scatter_base_wb_p_add_u): Likewise.
            (vstrwq_scatter_base_wb_p_add_s): Likewise.
            (vstrwq_scatter_base_wb_p_add_f): Likewise.
            (vstrdq_scatter_base_wb_p_add_u): Likewise.
            (vstrdq_scatter_base_wb_p_add_s): Likewise.
            * config/arm/mve.md (mve_vstrwq_scatter_base_wb_<supf>v4si): Remove
            expand.
            (mve_vstrwq_scatter_base_wb_add_<supf>v4si): Likewise.
            (mve_vstrwq_scatter_base_wb_<supf>v4si_insn): Rename pattern to ...
            (mve_vstrwq_scatter_base_wb_<supf>v4si): This.
            (mve_vstrwq_scatter_base_wb_p_<supf>v4si): Remove expand.
            (mve_vstrwq_scatter_base_wb_p_add_<supf>v4si): Likewise.
            (mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn): Rename pattern to
...
            (mve_vstrwq_scatter_base_wb_p_<supf>v4si): This.
            (mve_vstrwq_scatter_base_wb_fv4sf): Remove expand.
            (mve_vstrwq_scatter_base_wb_add_fv4sf): Likewise.
            (mve_vstrwq_scatter_base_wb_fv4sf_insn): Rename pattern to ...
            (mve_vstrwq_scatter_base_wb_fv4sf): This.
            (mve_vstrwq_scatter_base_wb_p_fv4sf): Remove expand.
            (mve_vstrwq_scatter_base_wb_p_add_fv4sf): Likewise.
            (mve_vstrwq_scatter_base_wb_p_fv4sf_insn): Rename pattern to ...
            (mve_vstrwq_scatter_base_wb_p_fv4sf): This.
            (mve_vstrdq_scatter_base_wb_<supf>v2di): Remove expand.
            (mve_vstrdq_scatter_base_wb_add_<supf>v2di): Likewise.
            (mve_vstrdq_scatter_base_wb_<supf>v2di_insn): Rename pattern to ...
            (mve_vstrdq_scatter_base_wb_<supf>v2di): This.
            (mve_vstrdq_scatter_base_wb_p_<supf>v2di): Remove expand.
            (mve_vstrdq_scatter_base_wb_p_add_<supf>v2di): Likewise.
            (mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn): Rename pattern to
...
            (mve_vstrdq_scatter_base_wb_p_<supf>v2di): This.

    gcc/testsuite/ChangeLog:

            PR target/97291
            * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s64.c:
Modify.
            * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u64.c:
            Likewise.
            * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.c:
Likewise.
            * gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64.c:
Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32.c:
Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f32.c:
            Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s32.c:
            Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_u32.c:
            Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32.c:
Likewise.
            * gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u32.c:
Likewise.

    (cherry picked from commit 377535881166969dba43794f298170978d797ef6)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug libgomp/97291] [SIMT] Move SIMT_XCHG_* out of non-uniform execution region
  2020-10-05 11:51 [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region vries at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2020-10-16 10:47 ` cvs-commit at gcc dot gnu.org
@ 2020-10-16 10:55 ` sripar01 at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: sripar01 at gcc dot gnu.org @ 2020-10-16 10:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291

SRINATH PARVATHANENI <sripar01 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sripar01 at gcc dot gnu.org

--- Comment #4 from SRINATH PARVATHANENI <sripar01 at gcc dot gnu.org> ---
Sorry those are wrong commits, their are meant for PR97271.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-10-16 10:55 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-05 11:51 [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region vries at gcc dot gnu.org
2020-10-05 14:19 ` [Bug libgomp/97291] " amonakov at gcc dot gnu.org
2020-10-16 10:41 ` cvs-commit at gcc dot gnu.org
2020-10-16 10:47 ` cvs-commit at gcc dot gnu.org
2020-10-16 10:55 ` sripar01 at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).