public inbox for gcc-bugs@sourceware.org
* [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage
@ 2022-02-26 20:09 hjl.tools at gmail dot com
  2022-02-27  0:55 ` [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with " hjl.tools at gmail dot com
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: hjl.tools at gmail dot com @ 2022-02-26 20:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

            Bug ID: 104704
           Summary: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work
                    with explicit XMM7/XMM15/XMM31 usage
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com, ubizjak at gmail dot com
  Target Milestone: ---

ix86_gen_scratch_sse_rtx returns XMM7/XMM15/XMM31 as a scratch register to
prevent the RTL optimizers from removing the scratch vector register.  But this
introduces a conflict with explicit XMM7/XMM15/XMM31 usage:

[hjl@gnu-tgl-2 scratch-2]$ cat x-2.c 
/* { dg-do compile { target { ! ia32 } } } */
/* { dg-options "-O2 -march=x86-64 -mavx2" } */

#include <immintrin.h>

__m256d y, z;

int i;

__attribute__((noipa))
int
do_test (void)
{
  register int xmm15 __asm ("xmm15") = i;
  asm volatile ("" : "+v" (xmm15));
  z = y;
  register int xmm2 __asm ("xmm2") = xmm15;
  asm volatile ("" : "+v" (xmm2));
  return xmm2;
}

__attribute__((target("arch=x86-64")))
int
main (void)
{
 if (__builtin_cpu_supports ("avx2"))
   {
     i = 4;
     if (do_test () != 4)
       __builtin_abort ();
   }
  return 0;
}
[hjl@gnu-tgl-2 scratch-2]$ make x-2
/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/ -O2 -march=x86-64 -mavx2 -S x-2.c
/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc -B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/ -O2 -march=x86-64 -o x-2 x-2.s
[hjl@gnu-tgl-2 scratch-2]$ ./x-2
Aborted (core dumped)
[hjl@gnu-tgl-2 scratch-2]$
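
The failure mode can be illustrated with a toy model in plain C.  This is
only a sketch and not GCC code: the register file `regs`, the index
`SCRATCH_HARD_REG`, and the helpers `set_reg`, `copy_via_fixed_scratch`,
`copy_via_temp` and `pinned_value_after_copy` are all hypothetical names.
It shows that a copy routed through a fixed register overwrites a value the
user pinned to that register, while a copy through a fresh temporary (the
analogue of a pseudo register) does not.

```c
#include <assert.h>

enum { NUM_REGS = 32, SCRATCH_HARD_REG = 15 };

/* Toy register file; index 15 stands in for xmm15.  */
int regs[NUM_REGS];

void set_reg(int regno, int value)
{
  assert(regno >= 0 && regno < NUM_REGS);
  regs[regno] = value;
}

/* Copy *src to *dst through the fixed scratch register (hypothetical
   model of a move expanded with a hard scratch register).  */
void copy_via_fixed_scratch(const int *src, int *dst)
{
  set_reg(SCRATCH_HARD_REG, *src);
  *dst = regs[SCRATCH_HARD_REG];
}

/* Copy through a fresh temporary, the analogue of a pseudo register.  */
void copy_via_temp(const int *src, int *dst)
{
  int tmp = *src;
  *dst = tmp;
}

/* Pin 4 in "xmm15", perform z = y, and return what "xmm15" holds
   afterwards.  */
int pinned_value_after_copy(int use_fixed_scratch)
{
  int y = 7, z = 0;
  set_reg(15, 4);
  if (use_fixed_scratch)
    copy_via_fixed_scratch(&y, &z);
  else
    copy_via_temp(&y, &z);
  return regs[15];
}
```

In this model pinned_value_after_copy (1) returns 7 instead of the pinned 4,
which mirrors the abort in do_test, while pinned_value_after_copy (0)
returns 4.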


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
@ 2022-02-27  0:55 ` hjl.tools at gmail dot com
  2022-02-28  1:36 ` crazylht at gmail dot com
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: hjl.tools at gmail dot com @ 2022-02-27  0:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-02-27
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from H.J. Lu <hjl.tools at gmail dot com> ---
ix86_expand_vector_move shouldn't use ix86_gen_scratch_sse_rtx.


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
  2022-02-27  0:55 ` [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with " hjl.tools at gmail dot com
@ 2022-02-28  1:36 ` crazylht at gmail dot com
  2022-02-28  1:44 ` crazylht at gmail dot com
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: crazylht at gmail dot com @ 2022-02-28  1:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
Yes, thanks for the reproducer.


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
  2022-02-27  0:55 ` [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with " hjl.tools at gmail dot com
  2022-02-28  1:36 ` crazylht at gmail dot com
@ 2022-02-28  1:44 ` crazylht at gmail dot com
  2022-02-28  1:59 ` hjl.tools at gmail dot com
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: crazylht at gmail dot com @ 2022-02-28  1:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to H.J. Lu from comment #1)
> ix86_expand_vector_move shouldn't use ix86_gen_scratch_sse_rtx.

Is it problematic for TARGET_GEN_MEMSET_SCRATCH_RTX?


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (2 preceding siblings ...)
  2022-02-28  1:44 ` crazylht at gmail dot com
@ 2022-02-28  1:59 ` hjl.tools at gmail dot com
  2022-02-28  8:31 ` crazylht at gmail dot com
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: hjl.tools at gmail dot com @ 2022-02-28  1:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Hongtao.liu from comment #3)
> (In reply to H.J. Lu from comment #1)
> > ix86_expand_vector_move shouldn't use ix86_gen_scratch_sse_rtx.
> 
> Is it problematic for TARGET_GEN_MEMSET_SCRATCH_RTX?

It is OK as long as it is used only by the memset expander.


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (3 preceding siblings ...)
  2022-02-28  1:59 ` hjl.tools at gmail dot com
@ 2022-02-28  8:31 ` crazylht at gmail dot com
  2022-02-28  9:27 ` crazylht at gmail dot com
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: crazylht at gmail dot com @ 2022-02-28  8:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
I notice it regresses

FAIL: gcc.target/i386/incoming-11.c scan-assembler-not andl[\\t ]*\\$-16,[\\t ]*%esp


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (4 preceding siblings ...)
  2022-02-28  8:31 ` crazylht at gmail dot com
@ 2022-02-28  9:27 ` crazylht at gmail dot com
  2022-02-28  9:33 ` crazylht at gmail dot com
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: crazylht at gmail dot com @ 2022-02-28  9:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #5)
> I notice it regresses
> 
> FAIL: gcc.target/i386/incoming-11.c scan-assembler-not andl[\\t ]*\\$-16,[\\t ]*%esp

Hmm, why would replacing ix86_gen_scratch_sse_rtx with gen_reg_rtx affect
this testcase?


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (5 preceding siblings ...)
  2022-02-28  9:27 ` crazylht at gmail dot com
@ 2022-02-28  9:33 ` crazylht at gmail dot com
  2022-03-01  7:37 ` rguenth at gcc dot gnu.org
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: crazylht at gmail dot com @ 2022-02-28  9:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #6)
> (In reply to Hongtao.liu from comment #5)
> > I notice it regresses
> > 
> > FAIL: gcc.target/i386/incoming-11.c scan-assembler-not andl[\\t ]*\\$-16,[\\t ]*%esp
> 
> Hmm, why would replacing ix86_gen_scratch_sse_rtx with gen_reg_rtx affect
> this testcase?

Oh, it's just a revert of r12-2665-g7f4c3943f795fd.


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (6 preceding siblings ...)
  2022-02-28  9:33 ` crazylht at gmail dot com
@ 2022-03-01  7:37 ` rguenth at gcc dot gnu.org
  2022-03-02  5:29 ` crazylht at gmail dot com
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-03-01  7:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
           Keywords|                            |wrong-code


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (7 preceding siblings ...)
  2022-03-01  7:37 ` rguenth at gcc dot gnu.org
@ 2022-03-02  5:29 ` crazylht at gmail dot com
  2022-03-02 14:49 ` hjl.tools at gmail dot com
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: crazylht at gmail dot com @ 2022-03-02  5:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to H.J. Lu from comment #4)
> (In reply to Hongtao.liu from comment #3)
> > (In reply to H.J. Lu from comment #1)
> > > ix86_expand_vector_move shouldn't use ix86_gen_scratch_sse_rtx.
> > 
> > Is it problematic for TARGET_GEN_MEMSET_SCRATCH_RTX?
> 
> It is OK as long as it is used only by the memset expander.

Using gen_reg_rtx for TARGET_GEN_MEMSET_SCRATCH_RTX regresses:

gcc.target/i386/pieces-memset-21.c scan-assembler-not vzeroupper
gcc.target/i386/pieces-memset-3.c scan-assembler-not %[re]bp
gcc.target/i386/pieces-memset-3.c scan-assembler-not and[^\n\r]*%[re]sp
gcc.target/i386/pieces-memset-37.c scan-assembler-not %[re]bp
gcc.target/i386/pieces-memset-37.c scan-assembler-not and[^\n\r]*%[re]sp
gcc.target/i386/pieces-memset-39.c scan-assembler-not %[re]bp
gcc.target/i386/pieces-memset-39.c scan-assembler-not and[^\n\r]*%[re]sp
gcc.target/i386/pieces-memset-46.c scan-assembler-times vmovw[ \\t]+[^\n]*%xmm 1
gcc.target/i386/pieces-memset-47.c scan-assembler-times vmovw[ \\t]+[^\n]*%xmm 1
gcc.target/i386/pieces-memset-48.c scan-assembler-times vmovw[ \\t]+[^\n]*%xmm 1
gcc.target/i386/pr90773-14.c scan-assembler-times movd[\\t ]+%xmm[0-9]+, 16\\(%[^,]+\\) 1
gcc.target/i386/pr90773-17.c scan-assembler-times vmovd[\\t ]+%xmm[0-9]+, 15\\(%[^,]+\\) 1
gcc.target/i386/pr90773-5.c scan-assembler-times movq[\\t ]+%xmm[0-9]+, 13\\(%[^,]+\\) 1
unix/-m32: gcc.dg/guality/vla-1.c   -O2  -DPREVENT_OPTIMIZATION  line 24 i == 5
unix/-m32: gcc.dg/guality/vla-1.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line 24 i == 5
unix/-m32: gcc.dg/guality/vla-1.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 24 i == 5
unix/-m32: gcc.dg/guality/vla-1.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 24 sizeof (a) == 17 * sizeof (short)
unix/-m32: gcc.dg/guality/vla-1.c   -O3 -g  -DPREVENT_OPTIMIZATION  line 24 i == 5
unix/-m32: gcc.target/i386/pieces-memset-3.c scan-assembler-not and[^\n\r]*%[re]sp
unix/-m32: gcc.target/i386/pieces-memset-37.c scan-assembler-not %[re]bp
unix/-m32: gcc.target/i386/pieces-memset-37.c scan-assembler-not and[^\n\r]*%[re]sp
unix/-m32: gcc.target/i386/pieces-memset-39.c scan-assembler-not %[re]bp
unix/-m32: gcc.target/i386/pieces-memset-39.c scan-assembler-not and[^\n\r]*%[re]sp
unix/-m32: gcc.target/i386/pieces-memset-46.c scan-assembler-times vmovw[ \\t]+[^\n]*%xmm 1
unix/-m32: gcc.target/i386/pieces-memset-47.c scan-assembler-times vmovw[ \\t]+[^\n]*%xmm 1
unix/-m32: gcc.target/i386/pieces-memset-48.c scan-assembler-times vmovw[ \\t]+[^\n]*%xmm 1
unix/-m32: gcc.target/i386/pr90773-14.c scan-assembler-times movd[\\t ]+%xmm[0-9]+, 16\\(%[^,]+\\) 1
unix/-m32: gcc.target/i386/pr90773-17.c scan-assembler-times vmovd[\\t ]+%xmm[0-9]+, 15\\(%[^,]+\\) 1

The failures can be grouped into 4 categories:

1) Stack alignment is needed.
2) vzeroupper is needed.
3) RTL optimization rematerializes the vmovd from xmm as a movl of an
immediate, which seems to be more optimal:

-       vpbroadcastb    %eax, %xmm31
-       vmovdqu8        %xmm31, (%rdx)
-       vmovd   %xmm31, 15(%rdx)
+       vpbroadcastb    %eax, %xmm0
+       vmovdqu8        %xmm0, (%rdx)
+       movl    $202116108, 15(%rdx)

4) Some debug info is missing after optimization (I think that's acceptable,
though we can try to pass down the debug info; the related testcase is
gcc.dg/guality/vla-1.c).

I think 1) and 2) are acceptable since they match GCC 11's behavior, and 3) is
better than current trunk.  4) is about debuggability; I'll try to handle
it.
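
The immediate in category 3 is simply one byte replicated four times:
202116108 is 0x0C0C0C0C.  A small sketch of that arithmetic in C follows.
The helper names `splat_byte_u32` and `check_splat` are made up for
illustration, and the byte value 12 is inferred from the immediate rather
than taken from the testcase source.

```c
#include <assert.h>
#include <stdint.h>

/* Replicate one byte into all four bytes of a 32-bit value.  */
uint32_t splat_byte_u32(uint8_t byte)
{
  return (uint32_t) byte * UINT32_C(0x01010101);
}

/* Usage: the immediate stored by "movl $202116108, 15(%rdx)".  */
void check_splat(void)
{
  assert(splat_byte_u32(0x0C) == UINT32_C(0x0C0C0C0C));
  assert(splat_byte_u32(12) == UINT32_C(202116108));
}
```

Because the value is a compile-time constant, storing it as an immediate
writes the same four bytes as storing the low 32 bits of the broadcast
register.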


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (8 preceding siblings ...)
  2022-03-02  5:29 ` crazylht at gmail dot com
@ 2022-03-02 14:49 ` hjl.tools at gmail dot com
  2022-03-02 22:22 ` hjl.tools at gmail dot com
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: hjl.tools at gmail dot com @ 2022-03-02 14:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #9 from H.J. Lu <hjl.tools at gmail dot com> ---
--- pieces-memset-46.s  2022-03-02 06:44:55.845212762 -0800
+++ /export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/pieces-memset-46.s   2022-03-02 06:45:03.313188978 -0800
@@ -8,9 +8,11 @@ foo:
        .cfi_startproc
        movq    dst(%rip), %rdx
        movl    $3, %eax
-       vpbroadcastb    %eax, %zmm31
-       vmovdqu8        %zmm31, (%rdx)
-       vmovw   %xmm31, 64(%rdx)
+       vpbroadcastb    %eax, %zmm0
+       movl    $771, %eax
+       movw    %ax, 64(%rdx)
+       vmovdqu8        %zmm0, (%rdx)
+       vzeroupper
        ret
        .cfi_endproc
 .LFE0:

gen_reg_rtx generates 2 extra instructions for pieces-memset-46.c.
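
Both sequences store the same bytes: 771 is 0x0303, i.e. the memset byte 3
replicated twice, so the 16-bit store at offset 64 writes what vmovw from
the broadcast register would have written.  A minimal check of this in C,
assuming from the two stores in the diff that the function fills 66 bytes
(the function name is hypothetical):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if a 64-byte fill plus a 16-bit store of 771 at offset 64
   produces the same bytes as memset (dst, 3, 66).  */
int tail_store_matches_memset(void)
{
  unsigned char expected[66], actual[66];
  uint16_t tail = 771;                      /* movl $771, %eax */

  memset(expected, 3, sizeof expected);

  memset(actual, 0, sizeof actual);
  memset(actual, 3, 64);                    /* models vmovdqu8 %zmm0, (%rdx) */
  memcpy(actual + 64, &tail, sizeof tail);  /* models movw %ax, 64(%rdx) */

  assert(tail == (uint16_t) (3 * 0x0101));
  return memcmp(expected, actual, sizeof expected) == 0;
}
```

Both bytes of the 16-bit value are 3, so the comparison does not depend on
byte order.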


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (9 preceding siblings ...)
  2022-03-02 14:49 ` hjl.tools at gmail dot com
@ 2022-03-02 22:22 ` hjl.tools at gmail dot com
  2022-03-03  1:14 ` crazylht at gmail dot com
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: hjl.tools at gmail dot com @ 2022-03-02 22:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #10 from H.J. Lu <hjl.tools at gmail dot com> ---
Created attachment 52553
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52553&action=edit
A patch to always return pseudo register in ix86_gen_scratch_sse_rtx


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (10 preceding siblings ...)
  2022-03-02 22:22 ` hjl.tools at gmail dot com
@ 2022-03-03  1:14 ` crazylht at gmail dot com
  2022-03-03  1:28 ` crazylht at gmail dot com
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: crazylht at gmail dot com @ 2022-03-03  1:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #11 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to H.J. Lu from comment #9)
> --- pieces-memset-46.s	2022-03-02 06:44:55.845212762 -0800
> +++ /export/build/gnu/tools-build/gcc-gitlab-debug/build-x86_64-linux/gcc/pieces-memset-46.s	2022-03-02 06:45:03.313188978 -0800
> @@ -8,9 +8,11 @@ foo:
>  	.cfi_startproc
>  	movq	dst(%rip), %rdx
>  	movl	$3, %eax
> -	vpbroadcastb	%eax, %zmm31
> -	vmovdqu8	%zmm31, (%rdx)
> -	vmovw	%xmm31, 64(%rdx)
> +	vpbroadcastb	%eax, %zmm0
> +	movl	$771, %eax
> +	movw	%ax, 64(%rdx)
> +	vmovdqu8	%zmm0, (%rdx)
> +	vzeroupper
>  	ret
>  	.cfi_endproc
>  .LFE0:
> 
> gen_reg_rtx generates 2 extra instructions for pieces-memset-46.c.

It's on purpose.

;; Don't move an immediate directly to memory when the instruction
;; gets too big, or if LCP stalls are a problem for 16-bit moves.

(define_peephole2
  [(match_scratch:SWI124 2 "<r>")
   (set (match_operand:SWI124 0 "memory_operand")
        (match_operand:SWI124 1 "immediate_operand"))]
  "optimize_insn_for_speed_p ()
   && ((<MODE>mode == HImode
       && TARGET_LCP_STALL)
       || (TARGET_SPLIT_LONG_MOVES
          && get_attr_length (insn) >= ix86_cur_cost ()->large_insn))"
  [(set (match_dup 2) (match_dup 1))
   (set (match_dup 0) (match_dup 2))])
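
The condition of this peephole2 can be read as a plain predicate.  The C
sketch below restates it with ordinary parameters.  The struct and function
names are hypothetical stand-ins for the GCC internals named in the
comments and are not real GCC API; the pattern additionally needs a free
scratch register (match_scratch), which the sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the state the peephole2 condition reads.  */
struct store_imm_ctx {
  bool optimize_for_speed;   /* optimize_insn_for_speed_p () */
  bool is_himode;            /* <MODE>mode == HImode */
  bool lcp_stall;            /* TARGET_LCP_STALL */
  bool split_long_moves;     /* TARGET_SPLIT_LONG_MOVES */
  int insn_length;           /* get_attr_length (insn) */
  int large_insn;            /* ix86_cur_cost ()->large_insn */
};

/* True when "mov $imm, mem" would be rewritten as
   "mov $imm, reg; mov reg, mem".  */
bool should_split_store_imm(const struct store_imm_ctx *c)
{
  assert(c != 0);
  return c->optimize_for_speed
         && ((c->is_himode && c->lcp_stall)
             || (c->split_long_moves && c->insn_length >= c->large_insn));
}
```

Under this reading, a 16-bit immediate store such as the one in
pieces-memset-46.s is split whenever the insn is optimized for speed and the
tuning enables TARGET_LCP_STALL, which would account for the extra movl.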


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (11 preceding siblings ...)
  2022-03-03  1:14 ` crazylht at gmail dot com
@ 2022-03-03  1:28 ` crazylht at gmail dot com
  2022-03-03  8:51 ` crazylht at gmail dot com
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: crazylht at gmail dot com @ 2022-03-03  1:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #12 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to H.J. Lu from comment #10)
> Created attachment 52553 [details]
> A patch to always return pseudo register in ix86_gen_scratch_sse_rtx

For pr100865-8a.c, pr100865-9c.c and pr100865-8c.c:

+/* { dg-final { scan-assembler-times "(?:vpbroadcastd|vpshufd)\[\\t \]+\[^\n\]*, %xmm\[0-9\]+" 1 { xfail *-*-* } } } */

This can be fixed by:

 (define_insn "*vec_dupv4si"
-  [(set (match_operand:V4SI 0 "register_operand"     "=v,v,x")
+  [(set (match_operand:V4SI 0 "register_operand"     "=v,v,x,v")
        (vec_duplicate:V4SI
-         (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0")))]
+         (match_operand:SI 1 "nonimmediate_operand" "Yv,m,0,$r")))]
   "TARGET_SSE"
   "@
    %vpshufd\t{$0, %1, %0|%0, %1, 0}
    vbroadcastss\t{%1, %0|%0, %1}
-   shufps\t{$0, %0, %0|%0, %0, 0}"
-  [(set_attr "isa" "sse2,avx,noavx")
-   (set_attr "type" "sselog1,ssemov,sselog1")
-   (set_attr "length_immediate" "1,0,1")
-   (set_attr "prefix_extra" "0,1,*")
-   (set_attr "prefix" "maybe_vex,maybe_evex,orig")
-   (set_attr "mode" "TI,V4SF,V4SF")])
+   shufps\t{$0, %0, %0|%0, %0, 0}
+   #"
+  [(set_attr "isa" "sse2,avx,noavx,noavx512vl")
+   (set_attr "type" "sselog1,ssemov,sselog1,sselog1")
+   (set_attr "length_immediate" "1,0,1,1")
+   (set_attr "prefix_extra" "0,1,*,0")
+   (set_attr "prefix" "maybe_vex,maybe_evex,orig,maybe_vex")
+   (set_attr "mode" "TI,V4SF,V4SF,TI")
+   (set (attr "preferred_for_speed")
+     (cond [(eq_attr "alternative" "3")
+             (symbol_ref "TARGET_INTER_UNIT_MOVES_TO_VEC")
+          ]
+          (symbol_ref "true")))])
+
+(define_split
+  [(set (match_operand:V4SI 0 "sse_reg_operand")
+       (vec_duplicate:V4SI
+         (match_operand:SI 1 "general_reg_operand")))]
+  "TARGET_SSE && reload_completed
+   /* Disable this splitter if avx512vl_vec_dup_gprv4si insn is
+      available, because then we can broadcast from GPRs directly.  */
+   && !TARGET_AVX512VL"
+  [(const_int 0)]
+{
+  emit_insn (gen_vec_setv4si_0 (gen_lowpart (V4SImode, operands[0]),
+                               CONST0_RTX (V4SImode),
+                               gen_lowpart (SImode, operands[1])));
+  emit_insn (gen_vec_duplicatev4si (operands[0], operands[0]));
+  DONE;
+})
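
The splitter's two steps can be modelled in portable C: first place the
scalar in element 0 of an otherwise zero vector, then replicate element 0
with a pshufd-style shuffle whose immediate is 0.  This is a sketch of the
semantics only; the type `v4si` and the helpers `vec_set_0`, `pshufd` and
`vec_dup_from_gpr` are illustrative names, not the GCC expanders.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int32_t e[4]; } v4si;

/* Element 0 = x, other elements zero (the vec_set into a zero vector).  */
v4si vec_set_0(int32_t x)
{
  v4si v = { { x, 0, 0, 0 } };
  return v;
}

/* pshufd semantics: result element i is source element
   (imm >> (2 * i)) & 3.  */
v4si pshufd(v4si src, unsigned imm)
{
  v4si dst;
  assert(imm < 256);
  for (int i = 0; i < 4; i++)
    dst.e[i] = src.e[(imm >> (2 * i)) & 3];
  return dst;
}

/* Broadcast a scalar that starts out in a general register.  */
v4si vec_dup_from_gpr(int32_t x)
{
  return pshufd(vec_set_0(x), 0);
}
```

With an immediate of 0 every result element selects source element 0, so all
four lanes end up holding the scalar.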


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (12 preceding siblings ...)
  2022-03-03  1:28 ` crazylht at gmail dot com
@ 2022-03-03  8:51 ` crazylht at gmail dot com
  2022-03-04  3:02 ` cvs-commit at gcc dot gnu.org
  2022-03-04  3:03 ` hjl.tools at gmail dot com
  15 siblings, 0 replies; 17+ messages in thread
From: crazylht at gmail dot com @ 2022-03-03  8:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #13 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to H.J. Lu from comment #10)
> Created attachment 52553 [details]
> A patch to always return pseudo register in ix86_gen_scratch_sse_rtx

Please go ahead with this patch; I'll submit an incremental patch for
comment #12.


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (13 preceding siblings ...)
  2022-03-03  8:51 ` crazylht at gmail dot com
@ 2022-03-04  3:02 ` cvs-commit at gcc dot gnu.org
  2022-03-04  3:03 ` hjl.tools at gmail dot com
  15 siblings, 0 replies; 17+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-03-04  3:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

--- Comment #14 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by H.J. Lu <hjl@gcc.gnu.org>:

https://gcc.gnu.org/g:609e8c492d62d92465460eae3d43dfc4b2c68288

commit r12-7472-g609e8c492d62d92465460eae3d43dfc4b2c68288
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sat Feb 26 14:17:23 2022 -0800

    x86: Always return pseudo register in ix86_gen_scratch_sse_rtx

    ix86_gen_scratch_sse_rtx returns XMM7/XMM15/XMM31 as a scratch vector
    register to prevent RTL optimizers from removing vector register.  It
    introduces a conflict with explicit XMM7/XMM15/XMM31 usage and when it
    is called by RTL optimizers, it may introduce conflicting usages of
    XMM7/XMM15/XMM31.

    Change ix86_gen_scratch_sse_rtx to always return a pseudo register and
    xfail x86 tests which are optimized with a hard scratch register.

    gcc/

            PR target/104704
            * config/i386/i386.cc (ix86_gen_scratch_sse_rtx): Always return
            a pseudo register.

    gcc/testsuite/

            PR target/104704
            * gcc.target/i386/incoming-11.c: Xfail.
            * gcc.target/i386/pieces-memset-3.c: Likewise.
            * gcc.target/i386/pieces-memset-37.c: Likewise.
            * gcc.target/i386/pieces-memset-39.c: Likewise.
            * gcc.target/i386/pieces-memset-46.c: Likewise.
            * gcc.target/i386/pieces-memset-47.c: Likewise.
            * gcc.target/i386/pieces-memset-48.c: Likewise.
            * gcc.target/i386/pr90773-5.c: Likewise.
            * gcc.target/i386/pr90773-14.c: Likewise.
            * gcc.target/i386/pr90773-17.c: Likewise.
            * gcc.target/i386/pr100865-8a.c: Likewise.
            * gcc.target/i386/pr100865-8c.c: Likewise.
            * gcc.target/i386/pr100865-9c.c: Likewise.
            * gcc.target/i386/pieces-memset-21.c: Always expect vzeroupper.
            * gcc.target/i386/pr82941-1.c: Likewise.
            * gcc.target/i386/pr82942-1.c: Likewise.
            * gcc.target/i386/pr82990-1.c: Likewise.
            * gcc.target/i386/pr82990-3.c: Likewise.
            * gcc.target/i386/pr82990-5.c: Likewise.
            * gcc.target/i386/pr100865-11b.c: Expect vmovdqa instead of
            vmovdqa64.
            * gcc.target/i386/pr100865-12b.c: Likewise.
            * gcc.target/i386/pr100865-8b.c: Likewise.
            * gcc.target/i386/pr100865-9b.c: Likewise.
            * gcc.target/i386/pr104704-1.c: New test.
            * gcc.target/i386/pr104704-2.c: Likewise.
            * gcc.target/i386/pr104704-3.c: Likewise.
            * gcc.target/i386/pr104704-4.c: Likewise.
            * gcc.target/i386/pr104704-5.c: Likewise.
            * gcc.target/i386/pr104704-6.c: Likewise.


* [Bug target/104704] [12 Regression] ix86_gen_scratch_sse_rtx doesn't work with explicit XMM7/XMM15/XMM31 usage
  2022-02-26 20:09 [Bug target/104704] New: [12 Regression] ix86_gen_scratch_sse_rtx doesn't work explicit XMM7/XMM15/XMM31 usage hjl.tools at gmail dot com
                   ` (14 preceding siblings ...)
  2022-03-04  3:02 ` cvs-commit at gcc dot gnu.org
@ 2022-03-04  3:03 ` hjl.tools at gmail dot com
  15 siblings, 0 replies; 17+ messages in thread
From: hjl.tools at gmail dot com @ 2022-03-04  3:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104704

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #15 from H.J. Lu <hjl.tools at gmail dot com> ---
Fixed.

