public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region
@ 2020-10-05 11:51 vries at gcc dot gnu.org
2020-10-05 14:19 ` [Bug libgomp/97291] " amonakov at gcc dot gnu.org
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: vries at gcc dot gnu.org @ 2020-10-05 11:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291
Bug ID: 97291
Summary: [SIMT] Move SIMT_XCHG_* out of non-uniform execution
region
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libgomp
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
CC: jakub at gcc dot gnu.org
Target Milestone: ---
We have:
...
/* Allocate per-lane storage and begin non-uniform execution region. */
static void
expand_GOMP_SIMT_ENTER_ALLOC (internal_fn, gcall *stmt)
...
and:
...
/* Deallocate per-lane storage and leave non-uniform execution region. */
static void
expand_GOMP_SIMT_EXIT (internal_fn, gcall *stmt)
...
So, if the SIMT_ENTER_ALLOC and the SIMT_EXIT mark the start and end of a
region of non-uniform execution, it's strange that such a region can contain
SIMT_XCHG_*, which on nvptx requires uniform execution.
Moving SIMT_VOTE_ANY/SIMT_LAST_LANE/SIMT_XCHG_* as a whole after SIMT_EXIT is
not possible given that VOTE_ANY may have data dependencies to storage that is
deallocated by SIMT_EXIT (as Alexander mentioned here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555475.html).
A possible solution would be to split the SIMT_EXIT into separate bits for
exiting non-uniform execution and deallocation, and have:
- SIMT_ENTER_ALLOC
- SIMT_EXIT_UNI
- SIMT_VOTE_ANY/SIMT_LAST_LANE/SIMT_XCHG_*
- SIMT_EXIT_DEALLOC
Also I've wondered if we could do:
- SIMT_ENTER_ALLOC
- SIMT_VOTE_ANY
- SIMT_EXIT
- SIMT_LAST_LANE/SIMT_XCHG_*
but perhaps there are again data dependency problems.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug libgomp/97291] [SIMT] Move SIMT_XCHG_* out of non-uniform execution region
2020-10-05 11:51 [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region vries at gcc dot gnu.org
@ 2020-10-05 14:19 ` amonakov at gcc dot gnu.org
2020-10-16 10:41 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: amonakov at gcc dot gnu.org @ 2020-10-05 14:19 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291
--- Comment #1 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Reshuffling statements and piling up extra abstraction doesn't help solve the
core issue that GIMPLE passes can duplicate any basic block, but basic blocks
of SIMT loop epilogue should be protected from that. More generally, arbitrary
duplication demonstrably causes miscompilation even without SIMT: PR 80053.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug libgomp/97291] [SIMT] Move SIMT_XCHG_* out of non-uniform execution region
2020-10-05 11:51 [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region vries at gcc dot gnu.org
2020-10-05 14:19 ` [Bug libgomp/97291] " amonakov at gcc dot gnu.org
@ 2020-10-16 10:41 ` cvs-commit at gcc dot gnu.org
2020-10-16 10:47 ` cvs-commit at gcc dot gnu.org
2020-10-16 10:55 ` sripar01 at gcc dot gnu.org
3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-10-16 10:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291
--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by SRINATH PARVATHANENI
<sripar01@gcc.gnu.org>:
https://gcc.gnu.org/g:377535881166969dba43794f298170978d797ef6
commit r11-4004-g377535881166969dba43794f298170978d797ef6
Author: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
Date: Fri Oct 16 11:40:25 2020 +0100
arm: Fix wrong code generated for mve scatter store with writeback
intrinsics with -O2 (PR97271).
This patch fixes (PR97271) the wrong code-gen for mve scatter store with
writeback intrinsics with -O2.
$cat bug.c
void
foo (uint32x4_t * addr, const int offset, int32x4_t value)
{
vstrwq_scatter_base_wb_s32 (addr, 8, value);
}
$ arm-none-eabi-gcc bug.c -S -O2 -march=armv8.1-m.main+mve
-mfloat-abi=hard -o -
Without this patch:
...
foo:
vldrw.32 q3, [r0]
vstrw.u32 q0, [q3, #8]! ---> (A)
vldr.64 d4, .L3
vldr.64 d5, .L3+8
vldrw.32 q3, [r0]
vstrw.u32 q2, [q3, #8]! ---> (B)
bx lr
...
With this patch:
...
foo:
vldrw.32 q3, [r0]
vstrw.u32 q0, [q3, #8]! --> (C)
vstrw.32 q3, [r0]
bx lr
...
Without this patch 2 vstrw assembly instructions (A and B) are generated
for vstrwq_scatter_base_wb_s32
intrinsic where as fix generates only one vstrw assembly instruction (C).
gcc/ChangeLog:
2020-10-06 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
PR target/97291
* config/arm/arm-builtins.c (arm_strsbwbs_qualifiers): Modify
array.
(arm_strsbwbu_qualifiers): Likewise.
(arm_strsbwbs_p_qualifiers): Likewise.
(arm_strsbwbu_p_qualifiers): Likewise.
* config/arm/arm_mve.h (__arm_vstrdq_scatter_base_wb_s64): Modify
function definition.
(__arm_vstrdq_scatter_base_wb_u64): Likewise.
(__arm_vstrdq_scatter_base_wb_p_s64): Likewise.
(__arm_vstrdq_scatter_base_wb_p_u64): Likewise.
(__arm_vstrwq_scatter_base_wb_p_s32): Likewise.
(__arm_vstrwq_scatter_base_wb_p_u32): Likewise.
(__arm_vstrwq_scatter_base_wb_s32): Likewise.
(__arm_vstrwq_scatter_base_wb_u32): Likewise.
(__arm_vstrwq_scatter_base_wb_f32): Likewise.
(__arm_vstrwq_scatter_base_wb_p_f32): Likewise.
* config/arm/arm_mve_builtins.def (vstrwq_scatter_base_wb_add_u):
Remove
expansion for the builtin.
(vstrwq_scatter_base_wb_add_s): Likewise.
(vstrwq_scatter_base_wb_add_f): Likewise.
(vstrdq_scatter_base_wb_add_u): Likewise.
(vstrdq_scatter_base_wb_add_s): Likewise.
(vstrwq_scatter_base_wb_p_add_u): Likewise.
(vstrwq_scatter_base_wb_p_add_s): Likewise.
(vstrwq_scatter_base_wb_p_add_f): Likewise.
(vstrdq_scatter_base_wb_p_add_u): Likewise.
(vstrdq_scatter_base_wb_p_add_s): Likewise.
* config/arm/mve.md (mve_vstrwq_scatter_base_wb_<supf>v4si): Remove
expand.
(mve_vstrwq_scatter_base_wb_add_<supf>v4si): Likewise.
(mve_vstrwq_scatter_base_wb_<supf>v4si_insn): Rename pattern to ...
(mve_vstrwq_scatter_base_wb_<supf>v4si): This.
(mve_vstrwq_scatter_base_wb_p_<supf>v4si): Remove expand.
(mve_vstrwq_scatter_base_wb_p_add_<supf>v4si): Likewise.
(mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn): Rename pattern to
...
(mve_vstrwq_scatter_base_wb_p_<supf>v4si): This.
(mve_vstrwq_scatter_base_wb_fv4sf): Remove expand.
(mve_vstrwq_scatter_base_wb_add_fv4sf): Likewise.
(mve_vstrwq_scatter_base_wb_fv4sf_insn): Rename pattern to ...
(mve_vstrwq_scatter_base_wb_fv4sf): This.
(mve_vstrwq_scatter_base_wb_p_fv4sf): Remove expand.
(mve_vstrwq_scatter_base_wb_p_add_fv4sf): Likewise.
(mve_vstrwq_scatter_base_wb_p_fv4sf_insn): Rename pattern to ...
(mve_vstrwq_scatter_base_wb_p_fv4sf): This.
(mve_vstrdq_scatter_base_wb_<supf>v2di): Remove expand.
(mve_vstrdq_scatter_base_wb_add_<supf>v2di): Likewise.
(mve_vstrdq_scatter_base_wb_<supf>v2di_insn): Rename pattern to ...
(mve_vstrdq_scatter_base_wb_<supf>v2di): This.
(mve_vstrdq_scatter_base_wb_p_<supf>v2di): Remove expand.
(mve_vstrdq_scatter_base_wb_p_add_<supf>v2di): Likewise.
(mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn): Rename pattern to
...
(mve_vstrdq_scatter_base_wb_p_<supf>v2di): This.
gcc/testsuite/ChangeLog:
PR target/97291
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s64.c:
Modify.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u32.c:
Likewise.
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug libgomp/97291] [SIMT] Move SIMT_XCHG_* out of non-uniform execution region
2020-10-05 11:51 [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region vries at gcc dot gnu.org
2020-10-05 14:19 ` [Bug libgomp/97291] " amonakov at gcc dot gnu.org
2020-10-16 10:41 ` cvs-commit at gcc dot gnu.org
@ 2020-10-16 10:47 ` cvs-commit at gcc dot gnu.org
2020-10-16 10:55 ` sripar01 at gcc dot gnu.org
3 siblings, 0 replies; 5+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-10-16 10:47 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-10 branch has been updated by SRINATH PARVATHANENI
<sripar01@gcc.gnu.org>:
https://gcc.gnu.org/g:4199cfa3d18eb99893456bd461061daa75115711
commit r10-8899-g4199cfa3d18eb99893456bd461061daa75115711
Author: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
Date: Fri Oct 16 11:40:25 2020 +0100
arm: Fix wrong code generated for mve scatter store with writeback
intrinsics with -O2 (PR97271).
This patch fixes (PR97271) the wrong code-gen for mve scatter store with
writeback intrinsics with -O2.
$cat bug.c
void
foo (uint32x4_t * addr, const int offset, int32x4_t value)
{
vstrwq_scatter_base_wb_s32 (addr, 8, value);
}
$ arm-none-eabi-gcc bug.c -S -O2 -march=armv8.1-m.main+mve
-mfloat-abi=hard -o -
Without this patch:
...
foo:
vldrw.32 q3, [r0]
vstrw.u32 q0, [q3, #8]! ---> (A)
vldr.64 d4, .L3
vldr.64 d5, .L3+8
vldrw.32 q3, [r0]
vstrw.u32 q2, [q3, #8]! ---> (B)
bx lr
...
With this patch:
...
foo:
vldrw.32 q3, [r0]
vstrw.u32 q0, [q3, #8]! --> (C)
vstrw.32 q3, [r0]
bx lr
...
Without this patch 2 vstrw assembly instructions (A and B) are generated
for vstrwq_scatter_base_wb_s32
intrinsic where as fix generates only one vstrw assembly instruction (C).
gcc/ChangeLog:
2020-10-06 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
PR target/97291
* config/arm/arm-builtins.c (arm_strsbwbs_qualifiers): Modify
array.
(arm_strsbwbu_qualifiers): Likewise.
(arm_strsbwbs_p_qualifiers): Likewise.
(arm_strsbwbu_p_qualifiers): Likewise.
* config/arm/arm_mve.h (__arm_vstrdq_scatter_base_wb_s64): Modify
function definition.
(__arm_vstrdq_scatter_base_wb_u64): Likewise.
(__arm_vstrdq_scatter_base_wb_p_s64): Likewise.
(__arm_vstrdq_scatter_base_wb_p_u64): Likewise.
(__arm_vstrwq_scatter_base_wb_p_s32): Likewise.
(__arm_vstrwq_scatter_base_wb_p_u32): Likewise.
(__arm_vstrwq_scatter_base_wb_s32): Likewise.
(__arm_vstrwq_scatter_base_wb_u32): Likewise.
(__arm_vstrwq_scatter_base_wb_f32): Likewise.
(__arm_vstrwq_scatter_base_wb_p_f32): Likewise.
* config/arm/arm_mve_builtins.def (vstrwq_scatter_base_wb_add_u):
Remove
expansion for the builtin.
(vstrwq_scatter_base_wb_add_s): Likewise.
(vstrwq_scatter_base_wb_add_f): Likewise.
(vstrdq_scatter_base_wb_add_u): Likewise.
(vstrdq_scatter_base_wb_add_s): Likewise.
(vstrwq_scatter_base_wb_p_add_u): Likewise.
(vstrwq_scatter_base_wb_p_add_s): Likewise.
(vstrwq_scatter_base_wb_p_add_f): Likewise.
(vstrdq_scatter_base_wb_p_add_u): Likewise.
(vstrdq_scatter_base_wb_p_add_s): Likewise.
* config/arm/mve.md (mve_vstrwq_scatter_base_wb_<supf>v4si): Remove
expand.
(mve_vstrwq_scatter_base_wb_add_<supf>v4si): Likewise.
(mve_vstrwq_scatter_base_wb_<supf>v4si_insn): Rename pattern to ...
(mve_vstrwq_scatter_base_wb_<supf>v4si): This.
(mve_vstrwq_scatter_base_wb_p_<supf>v4si): Remove expand.
(mve_vstrwq_scatter_base_wb_p_add_<supf>v4si): Likewise.
(mve_vstrwq_scatter_base_wb_p_<supf>v4si_insn): Rename pattern to
...
(mve_vstrwq_scatter_base_wb_p_<supf>v4si): This.
(mve_vstrwq_scatter_base_wb_fv4sf): Remove expand.
(mve_vstrwq_scatter_base_wb_add_fv4sf): Likewise.
(mve_vstrwq_scatter_base_wb_fv4sf_insn): Rename pattern to ...
(mve_vstrwq_scatter_base_wb_fv4sf): This.
(mve_vstrwq_scatter_base_wb_p_fv4sf): Remove expand.
(mve_vstrwq_scatter_base_wb_p_add_fv4sf): Likewise.
(mve_vstrwq_scatter_base_wb_p_fv4sf_insn): Rename pattern to ...
(mve_vstrwq_scatter_base_wb_p_fv4sf): This.
(mve_vstrdq_scatter_base_wb_<supf>v2di): Remove expand.
(mve_vstrdq_scatter_base_wb_add_<supf>v2di): Likewise.
(mve_vstrdq_scatter_base_wb_<supf>v2di_insn): Rename pattern to ...
(mve_vstrdq_scatter_base_wb_<supf>v2di): This.
(mve_vstrdq_scatter_base_wb_p_<supf>v2di): Remove expand.
(mve_vstrdq_scatter_base_wb_p_add_<supf>v2di): Likewise.
(mve_vstrdq_scatter_base_wb_p_<supf>v2di_insn): Rename pattern to
...
(mve_vstrdq_scatter_base_wb_p_<supf>v2di): This.
gcc/testsuite/ChangeLog:
PR target/97291
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_s64.c:
Modify.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_p_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_s64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrdq_scatter_base_wb_u64.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_u32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32.c:
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u32.c:
Likewise.
(cherry picked from commit 377535881166969dba43794f298170978d797ef6)
^ permalink raw reply [flat|nested] 5+ messages in thread
* [Bug libgomp/97291] [SIMT] Move SIMT_XCHG_* out of non-uniform execution region
2020-10-05 11:51 [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region vries at gcc dot gnu.org
` (2 preceding siblings ...)
2020-10-16 10:47 ` cvs-commit at gcc dot gnu.org
@ 2020-10-16 10:55 ` sripar01 at gcc dot gnu.org
3 siblings, 0 replies; 5+ messages in thread
From: sripar01 at gcc dot gnu.org @ 2020-10-16 10:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97291
SRINATH PARVATHANENI <sripar01 at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |sripar01 at gcc dot gnu.org
--- Comment #4 from SRINATH PARVATHANENI <sripar01 at gcc dot gnu.org> ---
Sorry those are wrong commits, their are meant for PR97271.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-10-16 10:55 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-05 11:51 [Bug libgomp/97291] New: [SIMT] Move SIMT_XCHG_* out of non-uniform execution region vries at gcc dot gnu.org
2020-10-05 14:19 ` [Bug libgomp/97291] " amonakov at gcc dot gnu.org
2020-10-16 10:41 ` cvs-commit at gcc dot gnu.org
2020-10-16 10:47 ` cvs-commit at gcc dot gnu.org
2020-10-16 10:55 ` sripar01 at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).