public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack
@ 2021-05-20 13:58 hjl.tools at gmail dot com
2021-05-20 14:11 ` [Bug target/100704] " rguenth at gcc dot gnu.org
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: hjl.tools at gmail dot com @ 2021-05-20 13:58 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704
Bug ID: 100704
Summary: Vector register isn't used to push BLKmode argument
onto stack
Product: gcc
Version: 11.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: hjl.tools at gmail dot com
CC: crazylht at gmail dot com, ubizjak at gmail dot com
Target Milestone: ---
Target: i386,x86-64
[hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
struct S
{
long long s1 __attribute__ ((aligned (8)));
unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s13, s14;
};
extern struct S array[];
void bar (struct S x);
void
foo (void)
{
bar (array[0]);
}
[hjl@gnu-cfl-2 gcc]$ gcc -S -O2 /tmp/x.c
[hjl@gnu-cfl-2 gcc]$ cat x.s
.file "x.c"
.text
.p2align 4
.globl foo
.type foo, @function
foo:
.LFB0:
.cfi_startproc
subq $8, %rsp
.cfi_def_cfa_offset 16
pushq array+56(%rip)
.cfi_def_cfa_offset 24
pushq array+48(%rip)
.cfi_def_cfa_offset 32
pushq array+40(%rip)
.cfi_def_cfa_offset 40
pushq array+32(%rip)
.cfi_def_cfa_offset 48
pushq array+24(%rip)
.cfi_def_cfa_offset 56
pushq array+16(%rip)
.cfi_def_cfa_offset 64
pushq array+8(%rip)
.cfi_def_cfa_offset 72
pushq array(%rip)
.cfi_def_cfa_offset 80
call bar
addq $72, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE0:
.size foo, .-foo
.ident "GCC: (GNU) 11.1.1 20210428 (Red Hat 11.1.1-1)"
.section .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 gcc]$
[hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S -O2 /tmp/x.c -mno-push-args
[hjl@gnu-cfl-2 gcc]$ cat x.s
.file "x.c"
.text
.p2align 4
.globl foo
.type foo, @function
foo:
.LFB0:
.cfi_startproc
subq $72, %rsp
.cfi_def_cfa_offset 80
movdqu array(%rip), %xmm0
movdqu array+16(%rip), %xmm1
movdqu array+32(%rip), %xmm2
movdqu array+48(%rip), %xmm3
movups %xmm0, (%rsp)
movups %xmm1, 16(%rsp)
movups %xmm2, 32(%rsp)
movups %xmm3, 48(%rsp)
call bar
addq $72, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE0:
.size foo, .-foo
.ident "GCC: (GNU) 12.0.0 20210519 (experimental)"
.section .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 gcc]$
There is
#define PUSH_ARGS (TARGET_PUSH_ARGS && !ACCUMULATE_OUTGOING_ARGS)
Can we update PUSH_ARGS to also check function arguments?
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug target/100704] Vector register isn't used to push BLKmode argument onto stack
2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
@ 2021-05-20 14:11 ` rguenth at gcc dot gnu.org
2021-05-20 15:33 ` hjl.tools at gmail dot com
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-05-20 14:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Hmm, I didn't realize we can't push %xmm regs... With loads+stores the pushes
do not look less efficient for this particular example? I suppose a nice
architectural enhancement would be a push-multiple - rep pushq anyone? ;)
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug target/100704] Vector register isn't used to push BLKmode argument onto stack
2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
2021-05-20 14:11 ` [Bug target/100704] " rguenth at gcc dot gnu.org
@ 2021-05-20 15:33 ` hjl.tools at gmail dot com
2021-05-23 12:48 ` hjl.tools at gmail dot com
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: hjl.tools at gmail dot com @ 2021-05-20 15:33 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704
--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> Hmm, I didn't realize we can't push %xmm regs... With loads+stores the
> pushes do not look less efficient for this particular example? I suppose a
> nice
> architectural enhancement would be a push-multiple - rep pushq anyone? ;)
YMM and ZMM load/store is much more efficient.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug target/100704] Vector register isn't used to push BLKmode argument onto stack
2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
2021-05-20 14:11 ` [Bug target/100704] " rguenth at gcc dot gnu.org
2021-05-20 15:33 ` hjl.tools at gmail dot com
@ 2021-05-23 12:48 ` hjl.tools at gmail dot com
2021-06-17 13:34 ` cvs-commit at gcc dot gnu.org
2021-06-17 16:01 ` hjl.tools at gmail dot com
4 siblings, 0 replies; 6+ messages in thread
From: hjl.tools at gmail dot com @ 2021-05-23 12:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |NEW
Keywords| |patch
Target Milestone|--- |12.0
Ever confirmed|0 |1
Last reconfirmed| |2021-05-23
URL| |https://gcc.gnu.org/piperma
| |il/gcc-patches/2021-May/571
| |027.html
--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
Here is a micro benchmark of push vs vector load/store:
https://gitlab.com/x86-benchmarks/microbenchmark/-/tree/push
[hjl@gnu-cfl-2 microbenchmark]$ make
gcc -g -I. -O2 -c -o test.o test.c
gcc -g -I. -O2 -c -o push.o push.c
/usr/gcc-12.0.0-x32-push/bin/gcc -g -I. -O2 -Dmove=move128 -mno-push-args -c -o
move128.o move.c
/usr/gcc-12.0.0-x32-push/bin/gcc -g -I. -O2 -Dmove=move256 -mavx2
-mno-push-args -c -o move256.o move.c
gcc -o test test.o push.o move128.o move256.o
./test
push : 213630
move128: 174842
move256: 155407
[hjl@gnu-cfl-2 microbenchmark]$
push is slower than vector load/store.
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571027.html
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug target/100704] Vector register isn't used to push BLKmode argument onto stack
2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
` (2 preceding siblings ...)
2021-05-23 12:48 ` hjl.tools at gmail dot com
@ 2021-06-17 13:34 ` cvs-commit at gcc dot gnu.org
2021-06-17 16:01 ` hjl.tools at gmail dot com
4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-06-17 13:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:
https://gcc.gnu.org/g:967b46530234b4e6ad3983057705aea6c20a03c4
commit r12-1564-g967b46530234b4e6ad3983057705aea6c20a03c4
Author: H.J. Lu <hjl.tools@gmail.com>
Date: Fri May 21 11:56:55 2021 -0700
Add a target calls hook: TARGET_PUSH_ARGUMENT
1. Replace PUSH_ARGS with a target calls hook, TARGET_PUSH_ARGUMENT, which
takes an integer argument. When it returns true, push instructions will
be used to pass outgoing arguments. If the argument is nonzero, it is
the number of bytes to push and indicates the PUSH instruction usage is
optional so that the backend can decide if PUSH instructions should be
generated. Otherwise, the argument is zero.
2. Implement x86 target hook which returns false when the number of bytes
to push is no less than 16 (8 for 32-bit targets) if vector load and store
can be used.
3. Remove target PUSH_ARGS definitions which return 0 as it is the same
as the default.
4. Define TARGET_PUSH_ARGUMENT of cr16 and m32c to always return true.
gcc/
PR target/100704
* calls.c (expand_call): Replace PUSH_ARGS with
targetm.calls.push_argument (0).
(emit_library_call_value_1): Likewise.
* defaults.h (PUSH_ARGS): Removed.
(PUSH_ARGS_REVERSED): Replace PUSH_ARGS with
targetm.calls.push_argument (0).
* expr.c (block_move_libcall_safe_for_call_parm): Likewise.
(emit_push_insn): Pass the number bytes to push to
targetm.calls.push_argument and pass 0 if ARGS_ADDR is 0.
* hooks.c (hook_bool_uint_true): New.
* hooks.h (hook_bool_uint_true): Likewise.
* rtlanal.c (nonzero_bits1): Replace PUSH_ARGS with
targetm.calls.push_argument (0).
* target.def (push_argument): Add a targetm.calls hook.
* targhooks.c (default_push_argument): New.
* targhooks.h (default_push_argument): Likewise.
* config/bpf/bpf.h (PUSH_ARGS): Removed.
* config/cr16/cr16.c (TARGET_PUSH_ARGUMENT): New.
* config/cr16/cr16.h (PUSH_ARGS): Removed.
* config/i386/i386.c (ix86_push_argument): New.
(TARGET_PUSH_ARGUMENT): Likewise.
* config/i386/i386.h (PUSH_ARGS): Removed.
* config/m32c/m32c.c (TARGET_PUSH_ARGUMENT): New.
* config/m32c/m32c.h (PUSH_ARGS): Removed.
* config/nios2/nios2.h (PUSH_ARGS): Likewise.
* config/pru/pru.h (PUSH_ARGS): Likewise.
* doc/tm.texi.in: Remove PUSH_ARGS documentation. Add
TARGET_PUSH_ARGUMENT hook.
* doc/tm.texi: Regenerated.
gcc/testsuite/
PR target/100704
* gcc.target/i386/pr100704-1.c: New test.
* gcc.target/i386/pr100704-2.c: Likewise.
* gcc.target/i386/pr100704-3.c: Likewise.
^ permalink raw reply [flat|nested] 6+ messages in thread
* [Bug target/100704] Vector register isn't used to push BLKmode argument onto stack
2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
` (3 preceding siblings ...)
2021-06-17 13:34 ` cvs-commit at gcc dot gnu.org
@ 2021-06-17 16:01 ` hjl.tools at gmail dot com
4 siblings, 0 replies; 6+ messages in thread
From: hjl.tools at gmail dot com @ 2021-06-17 16:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
Fixed for GCC 12.
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-06-17 16:01 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
2021-05-20 14:11 ` [Bug target/100704] " rguenth at gcc dot gnu.org
2021-05-20 15:33 ` hjl.tools at gmail dot com
2021-05-23 12:48 ` hjl.tools at gmail dot com
2021-06-17 13:34 ` cvs-commit at gcc dot gnu.org
2021-06-17 16:01 ` hjl.tools at gmail dot com
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).