public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/101809] New: emulated gather capability doesn't support 32-bit target
From: hjl.tools at gmail dot com @ 2021-08-06 23:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809
Bug ID: 101809
Summary: emulated gather capability doesn't support 32-bit
target
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, rguenth at gcc dot gnu.org
Target Milestone: ---
On Linux/x86-64, I get
[hjl@gnu-cfl-2 xxx]$ cat x.c
#include <stdint.h>
#define loop_t uint32_t
#define idx_t uint32_t
void loop(double * const __restrict__ dst,
          double const * const __restrict__ src,
          idx_t const * const __restrict__ idx,
          loop_t const begin,
          loop_t const end)
{
  for (loop_t i = begin; i < end; ++i)
    dst[i] = 42.0 * src[idx[i]];
}
[hjl@gnu-cfl-2 xxx]$ make x.s
/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/ -O3
-m32 -march=x86-64 -mfpmath=sse -S x.c
[hjl@gnu-cfl-2 xxx]$ cat x.s
.file "x.c"
.text
.p2align 4
.globl loop
.type loop, @function
loop:
.LFB0:
.cfi_startproc
pushl %edi
.cfi_def_cfa_offset 8
.cfi_offset 7, -8
pushl %esi
.cfi_def_cfa_offset 12
.cfi_offset 6, -12
pushl %ebx
.cfi_def_cfa_offset 16
.cfi_offset 3, -16
movl 28(%esp), %eax
movl 32(%esp), %ecx
movl 16(%esp), %ebx
movl 20(%esp), %esi
movl 24(%esp), %edi
cmpl %ecx, %eax
jnb .L1
movsd .LC0, %xmm1
.p2align 4,,10
.p2align 3
.L3:
movl (%edi,%eax,4), %edx
movsd (%esi,%edx,8), %xmm0
mulsd %xmm1, %xmm0
movsd %xmm0, (%ebx,%eax,8)
addl $1, %eax
cmpl %eax, %ecx
jne .L3
.L1:
popl %ebx
.cfi_restore 3
.cfi_def_cfa_offset 12
popl %esi
.cfi_restore 6
.cfi_def_cfa_offset 8
popl %edi
.cfi_restore 7
.cfi_def_cfa_offset 4
ret
.cfi_endproc
.LFE0:
.size loop, .-loop
.section .rodata.cst8,"aM",@progbits,8
.align 8
.LC0:
.long 0
.long 1078263808
.ident "GCC: (GNU) 12.0.0 20210806 (experimental)"
.section .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 xxx]$
The emulated gather capability isn't enabled: the loop at .L3 stays scalar
(movsd/mulsd) instead of being vectorized.
* [Bug middle-end/101809] emulated gather capability doesn't support 32-bit target
From: hjl.tools at gmail dot com @ 2021-08-07 14:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809
--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
It fails in get_load_store_type:
  else if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant ()
           || !known_eq (TYPE_VECTOR_SUBPARTS (vectype),
                         TYPE_VECTOR_SUBPARTS
                           (gs_info->offset_vectype)))
    {
      if (dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
                         "unsupported vector types for emulated "
                         "gather.\n");
      return false;
    }
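For a V2DF gather, we need a V2DI index vector. But with -m32 the index
vector is V4SI, so the element counts differ (2 vs. 4) and the known_eq
check above rejects the emulated gather.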
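
To make the lane mismatch concrete, here is a small C sketch. It is not GCC
code: the helper names lanes and same_nunits are made up for illustration,
and it assumes 16-byte (SSE) vectors. It compares element counts in the same
spirit as the known_eq check in get_load_store_type.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of elements of a given size that fit in one vector register.
   Hypothetical helper, not part of GCC.  */
size_t lanes(size_t vector_bytes, size_t elem_bytes)
{
    return vector_bytes / elem_bytes;
}

/* Nonzero when the data vector and the offset (index) vector have the
   same number of elements, which is what the rejected check requires.  */
int same_nunits(size_t vector_bytes, size_t data_bytes, size_t index_bytes)
{
    return lanes(vector_bytes, data_bytes) == lanes(vector_bytes, index_bytes);
}
```

With 16-byte vectors, double data gives 2 lanes and uint32_t indexes give 4,
so same_nunits(16, sizeof(double), sizeof(uint32_t)) is 0. With 64-bit
indexes both sides have 2 lanes and the check passes.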
* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: rguenth at gcc dot gnu.org @ 2021-08-09 7:31 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|0 |1
Component|middle-end |tree-optimization
Target| |i?86-*-*
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed| |2021-08-09
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Yes, I was lazy here - the complication is in the shared gather setup code,
which will eventually end up with a mismatching number of index/data vectors.
I will improve this eventually.
* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: rguenth at gcc dot gnu.org @ 2021-08-09 8:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is also the cause of
FAIL: gcc.target/i386/vect-gather-1.c scan-tree-dump vect "loop vectorized"
with -m32 testing. Note the intent was to have the testcase work independent
of the presence of HW gather (for you folks testing with -march=cascadelake).
The XFAIL condition is going to be tricky for this so I'll leave this test
FAILing at least until I can decide whether fixing the missing feature is
possible for GCC 12.
* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: rguenth at gcc dot gnu.org @ 2021-08-10 9:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809
--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
I am testing a patch.
* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: cvs-commit at gcc dot gnu.org @ 2021-08-10 10:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809
--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:08aa0e3d4f781fd6a6e293bb06d280365a0bdc1d
commit r12-2836-g08aa0e3d4f781fd6a6e293bb06d280365a0bdc1d
Author: Richard Biener <rguenther@suse.de>
Date: Tue Aug 10 10:54:58 2021 +0200
tree-optimization/101809 - support emulated gather for double[int]
This adds emulated gather support for index vectors with more
elements than the data vector. The internal function gather
vectorization code doesn't currently handle this (but the builtin
decl code does). This allows vectorization of double data gather
with int indexes on 32bit platforms where there isn't an implicit
widening to 64bit present.
2021-08-10 Richard Biener <rguenther@suse.de>
PR tree-optimization/101809
* tree-vect-stmts.c (get_load_store_type): Allow emulated
gathers with offset vector nunits being a constant multiple
of the data vector nunits.
(vect_get_gather_scatter_ops): Use the appropriate nunits
for the offset vector defs.
(vectorizable_store): Adjust call to
vect_get_gather_scatter_ops.
(vectorizable_load): Likewise. Handle the case of less
offset vectors than data vectors.
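
As a rough illustration of the case this commit enables, the following scalar
C sketch models one index vector of four 32-bit offsets feeding two data
vectors of two doubles each. It is not the vectorizer's actual output; the
function name gather_scale_block and the lane constants are assumptions
chosen to match the V4SI/V2DF shapes discussed in this PR.

```c
#include <stdint.h>

/* Lane counts assumed for 16-byte vectors: V2DF data, V4SI offsets.  */
enum { DATA_LANES = 2, INDEX_LANES = 4 };

/* Process one block of INDEX_LANES elements: dst[k] = scale * src[idx[k]].
   Hypothetical helper written for illustration only.  */
void gather_scale_block(double *dst, const double *src,
                        const uint32_t *idx, double scale)
{
    uint32_t offsets[INDEX_LANES];

    /* Stands in for a single load of the offset (index) vector.  */
    for (int k = 0; k < INDEX_LANES; ++k)
        offsets[k] = idx[k];

    /* One offset vector covers INDEX_LANES / DATA_LANES data vectors.  */
    for (int half = 0; half < INDEX_LANES / DATA_LANES; ++half) {
        double data[DATA_LANES];

        /* Emulated gather: one scalar load per lane of the data vector.  */
        for (int k = 0; k < DATA_LANES; ++k)
            data[k] = src[offsets[half * DATA_LANES + k]];

        /* Stands in for the vector multiply and store.  */
        for (int k = 0; k < DATA_LANES; ++k)
            dst[half * DATA_LANES + k] = scale * data[k];
    }
}
```

The point of the sketch is the inner structure: each half of the offset vector
supplies the addresses for one data vector, which is the "less offset vectors
than data vectors" situation named in the ChangeLog entry.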
* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: rguenth at gcc dot gnu.org @ 2021-08-10 10:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed. The reverse remains, so we currently cannot vectorize int[long] or
char[int] with emulated gather (but the more elements we get the less efficient
it will be).
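
For reference, loops with the reverse shape (index elements wider than the
data elements) would look roughly like the following. The function names are
hypothetical, and int64_t stands in for a 64-bit long. Whether a given GCC
version vectorizes them depends on the target and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* int[long] shape: 32-bit data gathered through 64-bit indexes.
   With 16-byte vectors that is 4 data lanes against 2 index lanes.  */
void gather_int_by_long(int32_t *restrict dst, const int32_t *restrict src,
                        const int64_t *restrict idx, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[idx[i]];
}

/* char[int] shape: 8-bit data gathered through 32-bit indexes.
   With 16-byte vectors that is 16 data lanes against 4 index lanes.  */
void gather_char_by_int(char *restrict dst, const char *restrict src,
                        const int32_t *restrict idx, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[idx[i]];
}
```

In both loops one data vector would need several offset vectors, the opposite
of the double[int] case handled by the commit.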
It would be nice to transition x86 over to internal function gathers
(thus [mask_]gather_load and [mask_]scatter_store optabs, away from the
target hook returning a builtin decl). If somebody can cover the .md
part (write define_expands) I'll take over the vectorizer part which
unfortunately isn't a no-op.
* [Bug tree-optimization/101809] emulated gather capability doesn't support 32-bit target
From: cvs-commit at gcc dot gnu.org @ 2021-08-10 12:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101809
--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:557d06f8b3ddb54bca134695e117c40c6e2267ab
commit r12-2838-g557d06f8b3ddb54bca134695e117c40c6e2267ab
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Tue Aug 10 05:30:44 2021 -0700
Enable gcc.target/i386/pr88531-1a.c for all targets
PR tree-optimization/101809
* gcc.target/i386/pr88531-1a.c: Enable for all targets.