public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack
@ 2021-05-20 13:58 hjl.tools at gmail dot com
  2021-05-20 14:11 ` [Bug target/100704] " rguenth at gcc dot gnu.org
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: hjl.tools at gmail dot com @ 2021-05-20 13:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704

            Bug ID: 100704
           Summary: Vector register isn't used to push BLKmode argument
                    onto stack
           Product: gcc
           Version: 11.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com, ubizjak at gmail dot com
  Target Milestone: ---
            Target: i386,x86-64

[hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
struct S
{
  long long s1 __attribute__ ((aligned (8)));
  unsigned s2, s3, s4, s5, s6, s7, s8, s9, s10, s11, s12, s13, s14;
};

extern struct S array[];

void bar (struct S x);

void
foo (void)
{
  bar (array[0]);
}
[hjl@gnu-cfl-2 gcc]$ gcc -S -O2 /tmp/x.c
[hjl@gnu-cfl-2 gcc]$ cat x.s
        .file   "x.c"
        .text
        .p2align 4
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        pushq   array+56(%rip)
        .cfi_def_cfa_offset 24
        pushq   array+48(%rip)
        .cfi_def_cfa_offset 32
        pushq   array+40(%rip)
        .cfi_def_cfa_offset 40
        pushq   array+32(%rip)
        .cfi_def_cfa_offset 48
        pushq   array+24(%rip)
        .cfi_def_cfa_offset 56
        pushq   array+16(%rip)
        .cfi_def_cfa_offset 64
        pushq   array+8(%rip)
        .cfi_def_cfa_offset 72
        pushq   array(%rip)
        .cfi_def_cfa_offset 80
        call    bar
        addq    $72, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .ident  "GCC: (GNU) 11.1.1 20210428 (Red Hat 11.1.1-1)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 gcc]$ 

[hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S -O2 /tmp/x.c -mno-push-args
[hjl@gnu-cfl-2 gcc]$ cat x.s
        .file   "x.c"
        .text
        .p2align 4
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        subq    $72, %rsp
        .cfi_def_cfa_offset 80
        movdqu  array(%rip), %xmm0
        movdqu  array+16(%rip), %xmm1
        movdqu  array+32(%rip), %xmm2
        movdqu  array+48(%rip), %xmm3
        movups  %xmm0, (%rsp)
        movups  %xmm1, 16(%rsp)
        movups  %xmm2, 32(%rsp)
        movups  %xmm3, 48(%rsp)
        call    bar
        addq    $72, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .ident  "GCC: (GNU) 12.0.0 20210519 (experimental)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 gcc]$ 

There is

#define PUSH_ARGS (TARGET_PUSH_ARGS && !ACCUMULATE_OUTGOING_ARGS)

Can we update PUSH_ARGS to also check function arguments?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/100704] Vector register isn't used to push BLKmode argument onto stack
  2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
@ 2021-05-20 14:11 ` rguenth at gcc dot gnu.org
  2021-05-20 15:33 ` hjl.tools at gmail dot com
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-05-20 14:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Hmm, I didn't realize we can't push %xmm regs...  With loads+stores the pushes
do not look less efficient for this particular example?  I suppose a nice
architectural enhancement would be a push-multiple - rep pushq anyone? ;)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/100704] Vector register isn't used to push BLKmode argument onto stack
  2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
  2021-05-20 14:11 ` [Bug target/100704] " rguenth at gcc dot gnu.org
@ 2021-05-20 15:33 ` hjl.tools at gmail dot com
  2021-05-23 12:48 ` hjl.tools at gmail dot com
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: hjl.tools at gmail dot com @ 2021-05-20 15:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> Hmm, I didn't realize we can't push %xmm regs...  With loads+stores the
> pushes do not look less efficient for this particular example?  I suppose a
> nice
> architectural enhancement would be a push-multiple - rep pushq anyone? ;)

YMM and ZMM load/store is much more efficient.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/100704] Vector register isn't used to push BLKmode argument onto stack
  2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
  2021-05-20 14:11 ` [Bug target/100704] " rguenth at gcc dot gnu.org
  2021-05-20 15:33 ` hjl.tools at gmail dot com
@ 2021-05-23 12:48 ` hjl.tools at gmail dot com
  2021-06-17 13:34 ` cvs-commit at gcc dot gnu.org
  2021-06-17 16:01 ` hjl.tools at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: hjl.tools at gmail dot com @ 2021-05-23 12:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |patch
   Target Milestone|---                         |12.0
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-05-23
                URL|                            |https://gcc.gnu.org/piperma
                   |                            |il/gcc-patches/2021-May/571
                   |                            |027.html

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
Here is a micro benchmark of push vs vector load/store:

https://gitlab.com/x86-benchmarks/microbenchmark/-/tree/push

[hjl@gnu-cfl-2 microbenchmark]$ make
gcc -g -I. -O2   -c -o test.o test.c
gcc -g -I. -O2   -c -o push.o push.c
/usr/gcc-12.0.0-x32-push/bin/gcc -g -I. -O2 -Dmove=move128 -mno-push-args -c -o
move128.o move.c
/usr/gcc-12.0.0-x32-push/bin/gcc -g -I. -O2 -Dmove=move256 -mavx2
-mno-push-args -c -o move256.o move.c
gcc -o test test.o push.o move128.o move256.o
./test
push   : 213630
move128: 174842
move256: 155407
[hjl@gnu-cfl-2 microbenchmark]$ 

push is slower than vector load/store.

A patch is posted at

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571027.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/100704] Vector register isn't used to push BLKmode argument onto stack
  2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
                   ` (2 preceding siblings ...)
  2021-05-23 12:48 ` hjl.tools at gmail dot com
@ 2021-06-17 13:34 ` cvs-commit at gcc dot gnu.org
  2021-06-17 16:01 ` hjl.tools at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-06-17 13:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:

https://gcc.gnu.org/g:967b46530234b4e6ad3983057705aea6c20a03c4

commit r12-1564-g967b46530234b4e6ad3983057705aea6c20a03c4
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Fri May 21 11:56:55 2021 -0700

    Add a target calls hook: TARGET_PUSH_ARGUMENT

    1. Replace PUSH_ARGS with a target calls hook, TARGET_PUSH_ARGUMENT, which
    takes an integer argument.  When it returns true, push instructions will
    be used to pass outgoing arguments.  If the argument is nonzero, it is
    the number of bytes to push and indicates the PUSH instruction usage is
    optional so that the backend can decide if PUSH instructions should be
    generated.  Otherwise, the argument is zero.
    2. Implement x86 target hook which returns false when the number of bytes
    to push is no less than 16 (8 for 32-bit targets) if vector load and store
    can be used.
    3. Remove target PUSH_ARGS definitions which return 0 as it is the same
    as the default.
    4. Define TARGET_PUSH_ARGUMENT of cr16 and m32c to always return true.

    gcc/

            PR target/100704
            * calls.c (expand_call): Replace PUSH_ARGS with
            targetm.calls.push_argument (0).
            (emit_library_call_value_1): Likewise.
            * defaults.h (PUSH_ARGS): Removed.
            (PUSH_ARGS_REVERSED): Replace PUSH_ARGS with
            targetm.calls.push_argument (0).
            * expr.c (block_move_libcall_safe_for_call_parm): Likewise.
            (emit_push_insn): Pass the number bytes to push to
            targetm.calls.push_argument and pass 0 if ARGS_ADDR is 0.
            * hooks.c (hook_bool_uint_true): New.
            * hooks.h (hook_bool_uint_true): Likewise.
            * rtlanal.c (nonzero_bits1): Replace PUSH_ARGS with
            targetm.calls.push_argument (0).
            * target.def (push_argument): Add a targetm.calls hook.
            * targhooks.c (default_push_argument): New.
            * targhooks.h (default_push_argument): Likewise.
            * config/bpf/bpf.h (PUSH_ARGS): Removed.
            * config/cr16/cr16.c (TARGET_PUSH_ARGUMENT): New.
            * config/cr16/cr16.h (PUSH_ARGS): Removed.
            * config/i386/i386.c (ix86_push_argument): New.
            (TARGET_PUSH_ARGUMENT): Likewise.
            * config/i386/i386.h (PUSH_ARGS): Removed.
            * config/m32c/m32c.c (TARGET_PUSH_ARGUMENT): New.
            * config/m32c/m32c.h (PUSH_ARGS): Removed.
            * config/nios2/nios2.h (PUSH_ARGS): Likewise.
            * config/pru/pru.h (PUSH_ARGS): Likewise.
            * doc/tm.texi.in: Remove PUSH_ARGS documentation.  Add
            TARGET_PUSH_ARGUMENT hook.
            * doc/tm.texi: Regenerated.

    gcc/testsuite/

            PR target/100704
            * gcc.target/i386/pr100704-1.c: New test.
            * gcc.target/i386/pr100704-2.c: Likewise.
            * gcc.target/i386/pr100704-3.c: Likewise.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Bug target/100704] Vector register isn't used to push BLKmode argument onto stack
  2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
                   ` (3 preceding siblings ...)
  2021-06-17 13:34 ` cvs-commit at gcc dot gnu.org
@ 2021-06-17 16:01 ` hjl.tools at gmail dot com
  4 siblings, 0 replies; 6+ messages in thread
From: hjl.tools at gmail dot com @ 2021-06-17 16:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100704

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
Fixed for GCC 12.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-06-17 16:01 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-20 13:58 [Bug target/100704] New: Vector register isn't used to push BLKmode argument onto stack hjl.tools at gmail dot com
2021-05-20 14:11 ` [Bug target/100704] " rguenth at gcc dot gnu.org
2021-05-20 15:33 ` hjl.tools at gmail dot com
2021-05-23 12:48 ` hjl.tools at gmail dot com
2021-06-17 13:34 ` cvs-commit at gcc dot gnu.org
2021-06-17 16:01 ` hjl.tools at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).