public inbox for gcc-bugs@sourceware.org
* [Bug target/100865] New: pass_data_constant_pool_broadcast doesn't work on TImode
@ 2021-06-02  1:53 hjl.tools at gmail dot com
  2021-06-02  2:10 ` [Bug target/100865] " crazylht at gmail dot com
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: hjl.tools at gmail dot com @ 2021-06-02  1:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100865

            Bug ID: 100865
           Summary: pass_data_constant_pool_broadcast doesn't work on
                    TImode
           Product: gcc
           Version: 11.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com
  Target Milestone: ---
            Target: x86-64

[hjl@gnu-cfl-2 gcc]$ cat /tmp/y.c 
extern char *dst;

void
foo (void)
{
  __builtin_memset (dst, 12, 16);
}
[hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S -O2  -march=skylake-avx512 /tmp/y.c
[hjl@gnu-cfl-2 gcc]$ cat y.s
        .file   "y.c"
        .text
        .p2align 4
        .globl  foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        movq    dst(%rip), %rax
        vmovdqa .LC0(%rip), %xmm0
        vmovdqu %xmm0, (%rax)
        ret
        .cfi_endproc
.LFE0:
        .size   foo, .-foo
        .section        .rodata.cst16,"aM",@progbits,16
        .align 16
.LC0:
        .quad   868082074056920076
        .quad   868082074056920076
        .ident  "GCC: (GNU) 12.0.0 20210602 (experimental)"
        .section        .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 gcc]$ 

Also, should a broadcast from a register be used to avoid the memory load?
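
For reference, the .quad value in .LC0 above is the byte 12 (0x0c) repeated
eight times, so the 16-byte constant is a pure byte broadcast. The arithmetic
can be sketched in C as below. The helper names splat_byte_u64 and
matches_memset16 are made up for this illustration and are not part of GCC.

```c
#include <stdint.h>
#include <string.h>

/* Replicate one byte into all eight bytes of a 64-bit word.
   Multiplying by 0x0101010101010101 copies the byte into each lane.  */
static uint64_t
splat_byte_u64 (uint8_t byte)
{
  return (uint64_t) byte * UINT64_C (0x0101010101010101);
}

/* Return 1 if the 16 bytes written by memset (buf, byte, 16) equal two
   copies of the splatted 64-bit word, mirroring the two .quad entries.
   All bytes are identical, so the comparison is endian-independent.  */
static int
matches_memset16 (uint8_t byte)
{
  unsigned char expected[16], actual[16];
  uint64_t word = splat_byte_u64 (byte);

  memset (expected, byte, sizeof expected);
  memcpy (actual, &word, sizeof word);
  memcpy (actual + 8, &word, sizeof word);
  return memcmp (expected, actual, sizeof actual) == 0;
}
```

splat_byte_u64 (12) yields 868082074056920076, the value emitted twice in
.LC0.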


* [Bug target/100865] pass_data_constant_pool_broadcast doesn't work on TImode
  2021-06-02  1:53 [Bug target/100865] New: pass_data_constant_pool_broadcast doesn't work on TImode hjl.tools at gmail dot com
@ 2021-06-02  2:10 ` crazylht at gmail dot com
  2021-06-02  4:25 ` crazylht at gmail dot com
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: crazylht at gmail dot com @ 2021-06-02  2:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100865

--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
(insn 6 5 9 2 (set (reg:V1TI 84)
        (mem/u/c:V1TI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S16 A128])) "test.c":5:3 1474 {movv1ti_internal}
     (expr_list:REG_EQUAL (const_vector:V1TI [
                (const_wide_int 0xc0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c)
            ])
For V1TImode, we don't know which element mode the vec_duplicate should use:
it could be (subreg:V1TI (vec_duplicate:v4si (si)) 0), (subreg:V1TI
(vec_duplicate:v8hi (hi)) 0) or (subreg:V1TI (vec_duplicate:v16qi (qi)) 0).
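
To illustrate the ambiguity: for a constant such as 0x0c0c...0c, a broadcast of
a QI, HI or SI element produces the same 128-bit pattern, so the constant alone
does not say which vec_duplicate was meant. A minimal C sketch, assuming plain
byte buffers stand in for the vector register (fill_by_element and
broadcasts_agree are hypothetical names, not GCC functions):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill a 16-byte buffer by repeating one element of ELT_SIZE bytes.  */
static void
fill_by_element (unsigned char out[16], const void *elt, size_t elt_size)
{
  for (size_t off = 0; off < 16; off += elt_size)
    memcpy (out + off, elt, elt_size);
}

/* Return 1 if 8-bit, 16-bit and 32-bit broadcasts of the replicated
   value BYTE all yield the same 128-bit pattern.  */
static int
broadcasts_agree (uint8_t byte)
{
  uint8_t qi = byte;
  uint16_t hi = (uint16_t) (byte * 0x0101u);
  uint32_t si = byte * UINT32_C (0x01010101);
  unsigned char as_qi[16], as_hi[16], as_si[16];

  fill_by_element (as_qi, &qi, sizeof qi);
  fill_by_element (as_hi, &hi, sizeof hi);
  fill_by_element (as_si, &si, sizeof si);
  return memcmp (as_qi, as_hi, 16) == 0
	 && memcmp (as_qi, as_si, 16) == 0;
}
```

broadcasts_agree (0x0c) returns 1, which is why the REG_EQUAL note above is
compatible with all three forms.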


* [Bug target/100865] pass_data_constant_pool_broadcast doesn't work on TImode
  2021-06-02  1:53 [Bug target/100865] New: pass_data_constant_pool_broadcast doesn't work on TImode hjl.tools at gmail dot com
  2021-06-02  2:10 ` [Bug target/100865] " crazylht at gmail dot com
@ 2021-06-02  4:25 ` crazylht at gmail dot com
  2021-06-03  1:47 ` hjl.tools at gmail dot com
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: crazylht at gmail dot com @ 2021-06-02  4:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100865

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
> 
> Also should broadcast from register be used to avoid memory load?
I think so, as long as the memory load is from the constant pool.


* [Bug target/100865] pass_data_constant_pool_broadcast doesn't work on TImode
  2021-06-02  1:53 [Bug target/100865] New: pass_data_constant_pool_broadcast doesn't work on TImode hjl.tools at gmail dot com
  2021-06-02  2:10 ` [Bug target/100865] " crazylht at gmail dot com
  2021-06-02  4:25 ` crazylht at gmail dot com
@ 2021-06-03  1:47 ` hjl.tools at gmail dot com
  2021-06-03  1:48 ` [Bug target/100865] Convert CONST_WIDE_INT to broadcast hjl.tools at gmail dot com
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: hjl.tools at gmail dot com @ 2021-06-03  1:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100865

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 50916
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50916&action=edit
x86: Convert CONST_WIDE_INT to broadcast in move expanders


* [Bug target/100865] Convert CONST_WIDE_INT to broadcast
  2021-06-02  1:53 [Bug target/100865] New: pass_data_constant_pool_broadcast doesn't work on TImode hjl.tools at gmail dot com
                   ` (2 preceding siblings ...)
  2021-06-03  1:47 ` hjl.tools at gmail dot com
@ 2021-06-03  1:48 ` hjl.tools at gmail dot com
  2021-06-03  2:06 ` hjl.tools at gmail dot com
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: hjl.tools at gmail dot com @ 2021-06-03  1:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100865

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
                 CC|                            |wwwhhhyyy333 at gmail dot com
     Ever confirmed|0                           |1
            Version|11.1.1                      |12.0
            Summary|pass_data_constant_pool_bro |Convert CONST_WIDE_INT to
                   |adcast doesn't work on      |broadcast
                   |TImode                      |
   Last reconfirmed|                            |2021-06-03

--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> ---
Please try this patch with SPEC CPU 2017 on SKX to see its impact on
performance.


* [Bug target/100865] Convert CONST_WIDE_INT to broadcast
  2021-06-02  1:53 [Bug target/100865] New: pass_data_constant_pool_broadcast doesn't work on TImode hjl.tools at gmail dot com
                   ` (3 preceding siblings ...)
  2021-06-03  1:48 ` [Bug target/100865] Convert CONST_WIDE_INT to broadcast hjl.tools at gmail dot com
@ 2021-06-03  2:06 ` hjl.tools at gmail dot com
  2021-07-01 15:11 ` cvs-commit at gcc dot gnu.org
  2021-07-28 15:03 ` hjl.tools at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: hjl.tools at gmail dot com @ 2021-06-03  2:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100865

--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
A small benchmark:

https://gitlab.com/x86-benchmarks/microbenchmark/-/tree/memset/broadcast

shows that broadcast is a little bit faster on Intel Core i7-8559U:

[hjl@gnu-cfl-2 microbenchmark]$ make
gcc -g -I. -O2   -c -o test.o test.c
gcc -g   -c -o memory.o memory.S
gcc -g   -c -o broadcast.o broadcast.S
gcc -o test test.o memory.o broadcast.o
./test
memory   : 99333
broadcast: 97208
[hjl@gnu-cfl-2 microbenchmark]$


* [Bug target/100865] Convert CONST_WIDE_INT to broadcast
  2021-06-02  1:53 [Bug target/100865] New: pass_data_constant_pool_broadcast doesn't work on TImode hjl.tools at gmail dot com
                   ` (4 preceding siblings ...)
  2021-06-03  2:06 ` hjl.tools at gmail dot com
@ 2021-07-01 15:11 ` cvs-commit at gcc dot gnu.org
  2021-07-28 15:03 ` hjl.tools at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-07-01 15:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100865

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:

https://gcc.gnu.org/g:edafb35bdadf309ebb9d1eddc5549f9e1ad49c09

commit r12-1958-gedafb35bdadf309ebb9d1eddc5549f9e1ad49c09
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Wed Jun 2 07:15:45 2021 -0700

    x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

    1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR
    operands to vector broadcast from an integer with AVX.
    2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which
    won't increase the stack alignment requirement and which blocks
    transformation by the combine pass.

    A small benchmark:

    https://gitlab.com/x86-benchmarks/microbenchmark/-/tree/memset/broadcast

    shows that broadcast is a little bit faster on Intel Core i7-8559U:

    $ make
    gcc -g -I. -O2   -c -o test.o test.c
    gcc -g   -c -o memory.o memory.S
    gcc -g   -c -o broadcast.o broadcast.S
    gcc -g   -c -o vec_dup_sse2.o vec_dup_sse2.S
    gcc -o test test.o memory.o broadcast.o vec_dup_sse2.o
    ./test
    memory      : 147215
    broadcast   : 121213
    vec_dup_sse2: 171366
    $

    broadcast is also smaller:

    $ size memory.o broadcast.o
       text    data     bss     dec     hex filename
        132       0       0     132      84 memory.o
        122       0       0     122      7a broadcast.o
    $

    3. Update PR 87767 tests to expect integer broadcast instead of broadcast
    from memory.
    4. Update avx512f_cond_move.c to expect integer broadcast.

    A small benchmark:

    https://gitlab.com/x86-benchmarks/microbenchmark/-/tree/vpaddd/broadcast

    shows that integer broadcast is faster than embedded memory broadcast:

    $ make
    gcc -g -I. -O2 -march=skylake-avx512   -c -o test.o test.c
    gcc -g   -c -o memory.o memory.S
    gcc -g   -c -o broadcast.o broadcast.S
    gcc -o test test.o memory.o broadcast.o
    ./test
    memory      : 425538
    broadcast   : 375260
    $

    gcc/

            PR target/100865
            * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate):
            New prototype.
            (ix86_byte_broadcast): New function.
            (ix86_convert_const_wide_int_to_broadcast): Likewise.
            (ix86_expand_move): Convert CONST_WIDE_INT to broadcast if mode
            size is 16 bytes or bigger.
            (ix86_broadcast_from_integer_constant): New function.
            (ix86_expand_vector_move): Convert CONST_WIDE_INT and CONST_VECTOR
            to broadcast if mode size is 16 bytes or bigger.
            * config/i386/i386-protos.h (ix86_gen_scratch_sse_rtx): New
            prototype.
            * config/i386/i386.c (ix86_gen_scratch_sse_rtx): New function.

    gcc/testsuite/

            PR target/100865
            * gcc.target/i386/avx512f-broadcast-pr87767-1.c: Expect integer
            broadcast.
            * gcc.target/i386/avx512f-broadcast-pr87767-5.c: Likewise.
            * gcc.target/i386/avx512vl-broadcast-pr87767-1.c: Likewise.
            * gcc.target/i386/avx512vl-broadcast-pr87767-5.c: Likewise.
            * gcc.target/i386/avx512f_cond_move.c: Also pass
            -mprefer-vector-width=512 and expect integer broadcast.
            * gcc.target/i386/pr100865-1.c: New test.
            * gcc.target/i386/pr100865-2.c: Likewise.
            * gcc.target/i386/pr100865-3.c: Likewise.
            * gcc.target/i386/pr100865-4a.c: Likewise.
            * gcc.target/i386/pr100865-4b.c: Likewise.
            * gcc.target/i386/pr100865-5a.c: Likewise.
            * gcc.target/i386/pr100865-5b.c: Likewise.
            * gcc.target/i386/pr100865-6a.c: Likewise.
            * gcc.target/i386/pr100865-6b.c: Likewise.
            * gcc.target/i386/pr100865-6c.c: Likewise.
            * gcc.target/i386/pr100865-7a.c: Likewise.
            * gcc.target/i386/pr100865-7b.c: Likewise.
            * gcc.target/i386/pr100865-7c.c: Likewise.
            * gcc.target/i386/pr100865-8a.c: Likewise.
            * gcc.target/i386/pr100865-8b.c: Likewise.
            * gcc.target/i386/pr100865-8c.c: Likewise.
            * gcc.target/i386/pr100865-9a.c: Likewise.
            * gcc.target/i386/pr100865-9b.c: Likewise.
            * gcc.target/i386/pr100865-9c.c: Likewise.
            * gcc.target/i386/pr100865-10a.c: Likewise.
            * gcc.target/i386/pr100865-10b.c: Likewise.
            * gcc.target/i386/pr100865-11a.c: Likewise.
            * gcc.target/i386/pr100865-11b.c: Likewise.
            * gcc.target/i386/pr100865-11c.c: Likewise.
            * gcc.target/i386/pr100865-12a.c: Likewise.
            * gcc.target/i386/pr100865-12b.c: Likewise.
            * gcc.target/i386/pr100865-12c.c: Likewise.
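
The commit converts a wide constant to a broadcast from an integer. One way to
picture the detection step is to look for the narrowest element that, when
repeated, reproduces the whole constant. The sketch below is only an
illustration of that idea in plain C. It is not the code of
ix86_convert_const_wide_int_to_broadcast, and the name
narrowest_broadcast_width is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Return the smallest element size in bytes (1, 2, 4 or 8) such that
   the LEN-byte constant at BYTES is that element repeated, or 0 if no
   such element exists.  LEN is assumed to be a multiple of 8 and at
   least 16, matching the "16 bytes or bigger" condition.  */
static size_t
narrowest_broadcast_width (const unsigned char *bytes, size_t len)
{
  for (size_t width = 1; width <= 8; width *= 2)
    {
      int repeats = 1;

      /* Compare every later element against the first one.  */
      for (size_t off = width; off < len && repeats; off += width)
	repeats = memcmp (bytes, bytes + off, width) == 0;
      if (repeats)
	return width;
    }
  return 0;
}
```

For the 16-byte memset constant from the original report the function returns
1, which corresponds to a byte broadcast.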


* [Bug target/100865] Convert CONST_WIDE_INT to broadcast
  2021-06-02  1:53 [Bug target/100865] New: pass_data_constant_pool_broadcast doesn't work on TImode hjl.tools at gmail dot com
                   ` (5 preceding siblings ...)
  2021-07-01 15:11 ` cvs-commit at gcc dot gnu.org
@ 2021-07-28 15:03 ` hjl.tools at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: hjl.tools at gmail dot com @ 2021-07-28 15:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100865

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #7 from H.J. Lu <hjl.tools at gmail dot com> ---
Fixed for GCC 12.
