* [Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem
2022-05-06 9:12 [Bug target/105504] New: Fails to break dependency for vcvtss2sd xmm, xmm, mem amonakov at gcc dot gnu.org
@ 2022-05-07 3:02 ` crazylht at gmail dot com
2022-05-07 3:16 ` crazylht at gmail dot com
` (7 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2022-05-07 3:02 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504
--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
Pass remove_partial_avx_dependency is before RA, which we have
(insn 128 127 129 22 (set (reg/v:DF 99 [ z ])
(float_extend:DF (reg/v:SF 117 [ x ]))) "test.c":43:10 163
{*extendsfdf2}
and attr avx_partial_xmm_update is false, since we don't need to handle
vcvtss2sd %xmm, %xmm, %xmm
But after RA, 117 is allocate as mem which causes partial xmm dependency.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem
2022-05-06 9:12 [Bug target/105504] New: Fails to break dependency for vcvtss2sd xmm, xmm, mem amonakov at gcc dot gnu.org
2022-05-07 3:02 ` [Bug target/105504] " crazylht at gmail dot com
@ 2022-05-07 3:16 ` crazylht at gmail dot com
2022-05-07 3:36 ` crazylht at gmail dot com
` (6 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2022-05-07 3:16 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
After set remove_partial_avx_dependency to true for register alternative, we
get
vxorps %xmm3, %xmm3, %xmm3
vmovsd .LC16(%rip), %xmm6
vmovsd .LC14(%rip), %xmm5
vcvtss2sd %xmm0, %xmm3, %xmm0
vmulsd .LC12(%rip), %xmm0, %xmm1
vroundsd $9, %xmm1, %xmm3, %xmm2
vcvttsd2siq %xmm2, %rax
vsubsd %xmm2, %xmm1, %xmm1
vfmadd231sd .LC13(%rip), %xmm0, %xmm1
vfmadd213sd .LC17(%rip), %xmm1, %xmm6
vmovsd .LC18(%rip), %xmm0
vfmadd213sd .LC19(%rip), %xmm1, %xmm0
vfmadd213sd .LC15(%rip), %xmm1, %xmm5
movq %rax, %rdx
sarq $4, %rax
vmulsd %xmm1, %xmm1, %xmm4
addq $1023, %rax
andl $15, %edx
salq $52, %rax
vmovq %rax, %xmm7
vmulsd tb.1(,%rdx,8), %xmm7, %xmm2
vfmadd132sd %xmm4, %xmm6, %xmm0
vmulsd %xmm2, %xmm1, %xmm1
vfmadd132sd %xmm4, %xmm5, %xmm0
vfmadd132sd %xmm1, %xmm2, %xmm0
vcvtsd2ss %xmm0, %xmm3, %xmm0
Also
>Also there's a potentially related issue that GCC copies the initial xmm0 value >to eax via stack in the beginning of the function:
This issue disappears(should still be latent after fix).
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem
2022-05-06 9:12 [Bug target/105504] New: Fails to break dependency for vcvtss2sd xmm, xmm, mem amonakov at gcc dot gnu.org
2022-05-07 3:02 ` [Bug target/105504] " crazylht at gmail dot com
2022-05-07 3:16 ` crazylht at gmail dot com
@ 2022-05-07 3:36 ` crazylht at gmail dot com
2022-05-07 4:04 ` crazylht at gmail dot com
` (5 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2022-05-07 3:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504
Hongtao.liu <crazylht at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |hjl.tools at gmail dot com
--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
The overhead for setting attr remove_partial_avx_dependency to true for
register alternative is:
For simple
double
foo (float a)
{
return a;
}
Now we generates:
foo:
.LFB0:
.cfi_startproc
vmovaps %xmm0, %xmm1
vxorps %xmm0, %xmm0, %xmm0
vcvtss2sd %xmm1, %xmm0, %xmm0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem
2022-05-06 9:12 [Bug target/105504] New: Fails to break dependency for vcvtss2sd xmm, xmm, mem amonakov at gcc dot gnu.org
` (2 preceding siblings ...)
2022-05-07 3:36 ` crazylht at gmail dot com
@ 2022-05-07 4:04 ` crazylht at gmail dot com
2022-05-07 8:28 ` amonakov at gcc dot gnu.org
` (4 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2022-05-07 4:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504
--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
Another possible solution is add a little bit dislike for "m" alternative(like
?m) to avoid potential spill.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem
2022-05-06 9:12 [Bug target/105504] New: Fails to break dependency for vcvtss2sd xmm, xmm, mem amonakov at gcc dot gnu.org
` (3 preceding siblings ...)
2022-05-07 4:04 ` crazylht at gmail dot com
@ 2022-05-07 8:28 ` amonakov at gcc dot gnu.org
2022-06-08 3:24 ` cvs-commit at gcc dot gnu.org
` (3 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: amonakov at gcc dot gnu.org @ 2022-05-07 8:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504
--- Comment #5 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
The strange xmm0 spill issue may affect more code, so I reported an isolated
testcase: PR 105513 (regression vs. gcc-8, the complete testcase in this PR
also does not spill with gcc-8).
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem
2022-05-06 9:12 [Bug target/105504] New: Fails to break dependency for vcvtss2sd xmm, xmm, mem amonakov at gcc dot gnu.org
` (4 preceding siblings ...)
2022-05-07 8:28 ` amonakov at gcc dot gnu.org
@ 2022-06-08 3:24 ` cvs-commit at gcc dot gnu.org
2023-08-05 15:05 ` egallager at gcc dot gnu.org
` (2 subsequent siblings)
8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-06-08 3:24 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504
--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
https://gcc.gnu.org/g:5e005393d4ff0a428c5f55b9ba7f65d6078a7cf5
commit r13-1009-g5e005393d4ff0a428c5f55b9ba7f65d6078a7cf5
Author: liuhongt <hongtao.liu@intel.com>
Date: Mon May 30 15:30:51 2022 +0800
Disparages SSE_REGS alternatives sligntly with ?v instead of *v in
*mov{si,di}_internal.
So alternative v won't be igored in record_reg_classess.
Similar for *r alternatives in some vector patterns.
It helps testcase in the PR, also RA now makes better decisions for
gcc.target/i386/extract-insert-combining.c
movd %esi, %xmm0
movd %edi, %xmm1
- movl %esi, -12(%rsp)
paddd %xmm0, %xmm1
pinsrd $0, %esi, %xmm0
paddd %xmm1, %xmm0
The patch has no big impact on SPEC2017 for both O2 and Ofast
march=native run.
And I noticed there's some changes in SPEC2017 from code like
mov mem, %eax
vmovd %eax, %xmm0
..
mov %eax, 64(%rsp)
to
vmovd mem, %xmm0
..
vmovd %xmm0, 64(%rsp)
Which should be exactly what we want?
gcc/ChangeLog:
PR target/105513
PR target/105504
* config/i386/i386.md (*movsi_internal): Change alternative
from *v to ?v.
(*movdi_internal): Ditto.
* config/i386/sse.md (vec_set<mode>_0): Change alternative *r
to ?r.
(*vec_extractv4sf_mem): Ditto.
(*vec_extracthf): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr105513-1.c: New test.
* gcc.target/i386/extract-insert-combining.c: Add new
scan-assembler-not for spill.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem
2022-05-06 9:12 [Bug target/105504] New: Fails to break dependency for vcvtss2sd xmm, xmm, mem amonakov at gcc dot gnu.org
` (5 preceding siblings ...)
2022-06-08 3:24 ` cvs-commit at gcc dot gnu.org
@ 2023-08-05 15:05 ` egallager at gcc dot gnu.org
2023-08-07 1:55 ` crazylht at gmail dot com
2023-08-12 13:50 ` egallager at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: egallager at gcc dot gnu.org @ 2023-08-05 15:05 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504
--- Comment #7 from Eric Gallager <egallager at gcc dot gnu.org> ---
(In reply to CVS Commits from comment #6)
> The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
>
> https://gcc.gnu.org/g:5e005393d4ff0a428c5f55b9ba7f65d6078a7cf5
>
> commit r13-1009-g5e005393d4ff0a428c5f55b9ba7f65d6078a7cf5
> Author: liuhongt <hongtao.liu@intel.com>
> Date: Mon May 30 15:30:51 2022 +0800
>
> Disparages SSE_REGS alternatives sligntly with ?v instead of *v in
> *mov{si,di}_internal.
>
> So alternative v won't be igored in record_reg_classess.
>
> Similar for *r alternatives in some vector patterns.
>
> It helps testcase in the PR, also RA now makes better decisions for
> gcc.target/i386/extract-insert-combining.c
>
> movd %esi, %xmm0
> movd %edi, %xmm1
> - movl %esi, -12(%rsp)
> paddd %xmm0, %xmm1
> pinsrd $0, %esi, %xmm0
> paddd %xmm1, %xmm0
>
> The patch has no big impact on SPEC2017 for both O2 and Ofast
> march=native run.
>
> And I noticed there's some changes in SPEC2017 from code like
>
> mov mem, %eax
> vmovd %eax, %xmm0
> ..
> mov %eax, 64(%rsp)
>
> to
>
> vmovd mem, %xmm0
> ..
> vmovd %xmm0, 64(%rsp)
>
> Which should be exactly what we want?
>
> gcc/ChangeLog:
>
> PR target/105513
> PR target/105504
> * config/i386/i386.md (*movsi_internal): Change alternative
> from *v to ?v.
> (*movdi_internal): Ditto.
> * config/i386/sse.md (vec_set<mode>_0): Change alternative *r
> to ?r.
> (*vec_extractv4sf_mem): Ditto.
> (*vec_extracthf): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr105513-1.c: New test.
> * gcc.target/i386/extract-insert-combining.c: Add new
> scan-assembler-not for spill.
Did this fix it?
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem
2022-05-06 9:12 [Bug target/105504] New: Fails to break dependency for vcvtss2sd xmm, xmm, mem amonakov at gcc dot gnu.org
` (6 preceding siblings ...)
2023-08-05 15:05 ` egallager at gcc dot gnu.org
@ 2023-08-07 1:55 ` crazylht at gmail dot com
2023-08-12 13:50 ` egallager at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2023-08-07 1:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504
Hongtao.liu <crazylht at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |crazylht at gmail dot com
--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Eric Gallager from comment #7)
> (In reply to CVS Commits from comment #6)
> > The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:
> >
> > https://gcc.gnu.org/g:5e005393d4ff0a428c5f55b9ba7f65d6078a7cf5
> >
> > commit r13-1009-g5e005393d4ff0a428c5f55b9ba7f65d6078a7cf5
> > Author: liuhongt <hongtao.liu@intel.com>
> > Date: Mon May 30 15:30:51 2022 +0800
> >
> > Disparages SSE_REGS alternatives sligntly with ?v instead of *v in
> > *mov{si,di}_internal.
> >
> > So alternative v won't be igored in record_reg_classess.
> >
> > Similar for *r alternatives in some vector patterns.
> >
> > It helps testcase in the PR, also RA now makes better decisions for
> > gcc.target/i386/extract-insert-combining.c
> >
> > movd %esi, %xmm0
> > movd %edi, %xmm1
> > - movl %esi, -12(%rsp)
> > paddd %xmm0, %xmm1
> > pinsrd $0, %esi, %xmm0
> > paddd %xmm1, %xmm0
> >
> > The patch has no big impact on SPEC2017 for both O2 and Ofast
> > march=native run.
> >
> > And I noticed there's some changes in SPEC2017 from code like
> >
> > mov mem, %eax
> > vmovd %eax, %xmm0
> > ..
> > mov %eax, 64(%rsp)
> >
> > to
> >
> > vmovd mem, %xmm0
> > ..
> > vmovd %xmm0, 64(%rsp)
> >
> > Which should be exactly what we want?
> >
> > gcc/ChangeLog:
> >
> > PR target/105513
> > PR target/105504
> > * config/i386/i386.md (*movsi_internal): Change alternative
> > from *v to ?v.
> > (*movdi_internal): Ditto.
> > * config/i386/sse.md (vec_set<mode>_0): Change alternative *r
> > to ?r.
> > (*vec_extractv4sf_mem): Ditto.
> > (*vec_extracthf): Ditto.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/pr105513-1.c: New test.
> > * gcc.target/i386/extract-insert-combining.c: Add new
> > scan-assembler-not for spill.
>
> Did this fix it?
Yes.
^ permalink raw reply [flat|nested] 10+ messages in thread
* [Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem
2022-05-06 9:12 [Bug target/105504] New: Fails to break dependency for vcvtss2sd xmm, xmm, mem amonakov at gcc dot gnu.org
` (7 preceding siblings ...)
2023-08-07 1:55 ` crazylht at gmail dot com
@ 2023-08-12 13:50 ` egallager at gcc dot gnu.org
8 siblings, 0 replies; 10+ messages in thread
From: egallager at gcc dot gnu.org @ 2023-08-12 13:50 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504
Eric Gallager <egallager at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution|--- |FIXED
--- Comment #9 from Eric Gallager <egallager at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #8)
> (In reply to Eric Gallager from comment #7)
> >
> > Did this fix it?
>
> Yes.
OK, closing, then
^ permalink raw reply [flat|nested] 10+ messages in thread