public inbox for gcc-bugs@sourceware.org
* [Bug target/46519] New: Missing vzeroupper
From: hjl.tools at gmail dot com @ 2010-11-17 15:09 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
Summary: Missing vzeroupper
Product: gcc
Version: 4.6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: hjl.tools@gmail.com
CC: ubizjak@gmail.com
Created attachment 22430
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22430
A testcase
/export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O3 -funroll-loops -ffast-math
-mavx -Wno-multichar -S bad.c
.L531:
vmovapd %ymm0, (%rsp)
vzeroupper
call Get_Token
movl Token_Id(%rip), %eax
vmovapd (%rsp), %ymm0
cmpl $1, %eax
je .L139
cmpl $2, %eax
je .L541
.L171:
vmovsd (%rax), %xmm9
vsubsd 32(%rsp,%rcx,8), %xmm9, %xmm8
leal 1(%rdx), %ecx
cmpl %ecx, %esi
vmovsd %xmm8, (%rax)
jg .L543
jmp .L531
.p2align 4,,10
.p2align 3
.L541:
leaq 80(%rsp), %rsi
movq %r12, %rdi
call Parse_Num_Term <<<< missing vzeroupper
movl 0(%r13), %esi
movl 80(%rsp), %eax
vmovapd (%rsp), %ymm0
cmpl %eax, %esi
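The annotated listing can be modelled with a small C sketch: walk one basic block, track whether the upper 128 bits of the ymm registers may be dirty, and count the calls reached in that state. This is only an illustration of the check, not GCC code; `insn_kind` and `calls_missing_vzeroupper` are hypothetical names, and the sketch assumes a call leaves the tracked state unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction kinds; this is not GCC's RTL. */
enum insn_kind {
    INSN_OTHER,       /* does not touch the upper 128 bits */
    INSN_AVX256,      /* uses a full ymm register, e.g. vmovapd (%rsp), %ymm0 */
    INSN_VZEROUPPER,  /* clears the upper 128 bits of all ymm registers */
    INSN_CALL         /* call to a function that may run legacy SSE code */
};

/* Scan one block. *upper_dirty carries the state in from the predecessor
   block and out to the successor block. Returns the number of calls that
   are reached while the upper halves may still be dirty.
   Assumption: a call itself does not change the tracked state. */
size_t calls_missing_vzeroupper(const enum insn_kind *insns, size_t n,
                                bool *upper_dirty)
{
    size_t missing = 0;

    assert(insns != NULL && upper_dirty != NULL);
    for (size_t i = 0; i < n; i++) {
        switch (insns[i]) {
        case INSN_AVX256:
            *upper_dirty = true;
            break;
        case INSN_VZEROUPPER:
            *upper_dirty = false;
            break;
        case INSN_CALL:
            if (*upper_dirty)
                missing++;
            break;
        case INSN_OTHER:
            break;
        }
    }
    return missing;
}
```

Feeding it a model of .L531 followed by .L541 reports no problem in the first block and one call without vzeroupper in the second, because the dirty state set by the ymm0 reload in .L531 flows into .L541 across the jump.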
* [Bug target/46519] Missing vzeroupper
From: ubizjak at gmail dot com @ 2010-11-17 18:49 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
--- Comment #1 from Uros Bizjak <ubizjak at gmail dot com> 2010-11-17 18:42:29 UTC ---
Does the patch at [1] also fix this test?
[1] http://gcc.gnu.org/ml/gcc-patches/2010-11/msg01802.html
* [Bug target/46519] Missing vzeroupper
From: hjl.tools at gmail dot com @ 2010-11-17 18:56 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> 2010-11-17 18:49:14 UTC ---
(In reply to comment #1)
> Does the patch at [1] also fix this test?
>
> [1] http://gcc.gnu.org/ml/gcc-patches/2010-11/msg01802.html
No, that is for a different case.
* [Bug target/46519] Missing vzeroupper
From: hjl.tools at gmail dot com @ 2010-11-18 0:29 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> 2010-11-18 00:26:18 UTC ---
Created attachment 22437
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22437
A testcase
With -O3 -funroll-loops -ffast-math -mavx:
movl Token_Id(%rip), %eax
vmovapd 32(%rsp), %ymm0
cmpl $1, %eax
je .L4
cmpl $2, %eax
je .L5
testl %eax, %eax
jne .L464
leaq 124(%rsp), %rsi
leaq 64(%rsp), %rdi
call _Z16Parse_Rel_FactorPdPi <<<< Missing vzeroupper
movl 124(%rsp), %esi
.L856:
vmovapd %ymm0, 32(%rsp)
call _Z9Get_Tokenv <<<< Missing vzeroupper
movl Token_Id(%rip), %eax
vmovapd 32(%rsp), %ymm0
...
.L491:
leaq 252(%rsp), %rsi
leaq 64(%rsp), %rdi
vmovapd %ymm0, 32(%rsp)
call _ZL14Parse_Rel_TermPdPi <<<< Missing vzeroupper
movl 252(%rsp), %esi
movl 224(%rsp), %edx
vmovapd 32(%rsp), %ymm0
cmpl %edx, %esi
* [Bug target/46519] Missing vzeroupper
From: hjl.tools at gmail dot com @ 2010-11-18 1:52 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> 2010-11-18 01:48:13 UTC ---
(In reply to comment #3)
> Created attachment 22437
> A testcase
>
> With -O3 -funroll-loops -ffast-math -mavx:
>
> movl Token_Id(%rip), %eax
> vmovapd 32(%rsp), %ymm0
> cmpl $1, %eax
> je .L4
> cmpl $2, %eax
> je .L5
> testl %eax, %eax
> jne .L464
> leaq 124(%rsp), %rsi
> leaq 64(%rsp), %rdi
> call _Z16Parse_Rel_FactorPdPi <<<< Missing vzeroupper
> movl 124(%rsp), %esi
>
> .L856:
> vmovapd %ymm0, 32(%rsp)
> call _Z9Get_Tokenv <<<< Missing vzeroupper
> movl Token_Id(%rip), %eax
> vmovapd 32(%rsp), %ymm0
>
> ...
> .L491:
> leaq 252(%rsp), %rsi
> leaq 64(%rsp), %rdi
> vmovapd %ymm0, 32(%rsp)
> call _ZL14Parse_Rel_TermPdPi <<<< Missing vzeroupper
> movl 252(%rsp), %esi
> movl 224(%rsp), %edx
> vmovapd 32(%rsp), %ymm0
> cmpl %edx, %esi
Some of the 256-bit vector insns are introduced by loop unrolling. Maybe
we should drop the use_avx256_p check, since it isn't reliable.
* [Bug target/46519] Missing vzeroupper
From: hjl at gcc dot gnu.org @ 2010-11-24 18:31 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
--- Comment #5 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2010-11-24 18:24:46 UTC ---
Author: hjl
Date: Wed Nov 24 18:24:39 2010
New Revision: 167124
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167124
Log:
Improve vzeroupper optimization.
gcc/
2010-11-24 H.J. Lu <hongjiu.lu@intel.com>
PR target/46519
* config/i386/i386.c (upper_128bits_state): New.
(block_info_def): Remove upper_128bits_set and done. Add state,
referenced, count, processed and rescanned.
(check_avx256_stores): Updated.
(move_or_delete_vzeroupper_2): Updated. Handle deleted BB_END.
Call note_stores only if needed. Set referenced and count.
(move_or_delete_vzeroupper_1): Updated. Set rescan_vzeroupper_p.
(rescan_move_or_delete_vzeroupper): New.
(move_or_delete_vzeroupper): Process and rescan all basic
blocks instead of predecessor blocks of all exit points.
(ix86_option_override_internal): Enable vzeroupper optimization
only for -fexpensive-optimizations and not optimizing for size.
(use_avx256_p): Removed.
(init_cumulative_args): Don't set use_avx256_p.
(ix86_function_arg): Likewise.
(ix86_expand_move): Likewise.
(ix86_expand_vector_move_misalign): Likewise.
(ix86_local_alignment): Likewise.
(ix86_minimum_alignment): Likewise.
(ix86_expand_epilogue): Don't check use_avx256_p when generating
vzeroupper.
(ix86_expand_call): Likewise.
* config/i386/i386.h (machine_function): Remove use_vzeroupper_p
and use_avx256_p. Add rescan_vzeroupper_p.
gcc/testsuite/
2010-11-24 H.J. Lu <hongjiu.lu@intel.com>
PR target/46519
* gcc.target/i386/avx-vzeroupper-10.c: Expect no avx_vzeroupper.
* gcc.target/i386/avx-vzeroupper-11.c: Likewise.
* gcc.target/i386/avx-vzeroupper-14.c: Replace -O0 with -O2.
* gcc.target/i386/avx-vzeroupper-15.c: Likewise.
* gcc.target/i386/avx-vzeroupper-16.c: Likewise.
* gcc.target/i386/avx-vzeroupper-17.c: Likewise.
* gcc.target/i386/avx-vzeroupper-20.c: New.
* gcc.target/i386/avx-vzeroupper-21.c: Likewise.
* gcc.target/i386/avx-vzeroupper-22.c: Likewise.
* gcc.target/i386/avx-vzeroupper-23.c: Likewise.
* gcc.target/i386/avx-vzeroupper-24.c: Likewise.
* gcc.target/i386/avx-vzeroupper-25.c: Likewise.
* gcc.target/i386/avx-vzeroupper-26.c: Likewise.
Added:
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-20.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-21.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-22.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-23.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-24.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-25.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-26.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/i386.h
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-10.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-11.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-14.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-15.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-16.c
trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c
* [Bug target/46519] Missing vzeroupper
From: hjl at gcc dot gnu.org @ 2010-11-24 19:46 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
--- Comment #6 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2010-11-24 19:16:48 UTC ---
Author: hjl
Date: Wed Nov 24 19:16:40 2010
New Revision: 167126
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167126
Log:
Don't check TREE_THIS_VOLATILE in ix86_expand_call.
gcc/
2010-11-24 H.J. Lu <hongjiu.lu@intel.com>
PR target/46519
* config/i386/i386.c (ix86_expand_call): Don't check
TREE_THIS_VOLATILE.
gcc/testsuite/
2010-11-24 H.J. Lu <hongjiu.lu@intel.com>
PR target/46519
* gfortran.dg/pr46519-1.f: New.
Added:
trunk/gcc/testsuite/gfortran.dg/pr46519-1.f
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/testsuite/ChangeLog
* [Bug target/46519] Missing vzeroupper
From: hjl at gcc dot gnu.org @ 2010-12-30 13:12 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
--- Comment #7 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2010-12-30 13:12:05 UTC ---
Author: hjl
Date: Thu Dec 30 13:12:02 2010
New Revision: 168342
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168342
Log:
Repeat processing all basic blocks for vzeroupper optimization.
gcc/
2010-12-30 H.J. Lu <hongjiu.lu@intel.com>
PR target/46519
* config/i386/i386.c (block_info_def): Remove referenced, count
and rescanned.
(move_or_delete_vzeroupper_2): Updated.
(move_or_delete_vzeroupper_1): Rewritten to avoid recursive call.
(rescan_move_or_delete_vzeroupper): Removed.
(move_or_delete_vzeroupper): Repeat processing all basic blocks
until no basic block state is changed to used at exit.
gcc/testsuite/
2010-12-30 H.J. Lu <hongjiu.lu@intel.com>
PR target/46519
* gfortran.dg/pr46519-2.f90: New.
Added:
trunk/gcc/testsuite/gfortran.dg/pr46519-2.f90
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/testsuite/ChangeLog
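The "repeat until no basic block state is changed to used at exit" idea in the log above can be sketched as a plain round-robin fixed-point iteration. The C below is a simplified illustration under stated assumptions, not the code in i386.c: `struct block`, `enum effect` and `propagate_upper_state` are hypothetical names, and each block is reduced to a one-word summary of its effect on the upper 128 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PREDS 4

/* Hypothetical per-block summary; GCC's real block_info differs. */
enum effect {
    TRANSPARENT,   /* exit state equals entry state */
    SETS_UPPER,    /* block leaves the upper 128 bits in use */
    CLEARS_UPPER   /* block leaves them clean, e.g. ends with vzeroupper */
};

struct block {
    enum effect effect;
    size_t n_preds;
    size_t preds[MAX_PREDS];  /* indices of predecessor blocks */
    bool used_at_exit;        /* result of the analysis */
};

/* Sweep over all blocks until a full pass changes no exit state.
   A state only ever moves from "unused" to "used", so the loop
   terminates after at most n + 1 passes. */
void propagate_upper_state(struct block *blocks, size_t n)
{
    bool changed = true;

    for (size_t i = 0; i < n; i++) {
        assert(blocks[i].n_preds <= MAX_PREDS);
        blocks[i].used_at_exit = (blocks[i].effect == SETS_UPPER);
    }

    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            struct block *b = &blocks[i];

            /* Only transparent blocks that are still "unused" can change. */
            if (b->used_at_exit || b->effect != TRANSPARENT)
                continue;
            for (size_t p = 0; p < b->n_preds; p++) {
                assert(b->preds[p] < n);
                if (blocks[b->preds[p]].used_at_exit) {
                    b->used_at_exit = true;
                    changed = true;
                    break;
                }
            }
        }
    }
}
```

A loop whose body uses 256-bit registers shows why one pass over the predecessors of exit blocks is not enough: the loop header only becomes "used" once the state of the back-edge source has been seen.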
* [Bug target/46519] Missing vzeroupper
From: hjl at gcc dot gnu.org @ 2011-01-24 17:55 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
--- Comment #8 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2011-01-24 17:30:00 UTC ---
Author: hjl
Date: Mon Jan 24 17:29:58 2011
New Revision: 169173
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169173
Log:
Visit basic blocks using the work-list based algorithm.
2011-01-24 H.J. Lu <hongjiu.lu@intel.com>
PR target/46519
* config/i386/i386.c: Include sbitmap.h and fibheap.h.
(block_info): Add scanned and prev.
(move_or_delete_vzeroupper_2): Return if the basic block
has been scanned and the upper 128bit state is unchanged
from the last scan.
(move_or_delete_vzeroupper_1): Return true if the exit
state is changed.
(move_or_delete_vzeroupper): Visit basic blocks using the
work-list based algorithm based on vt_find_locations in
var-tracking.c.
* config/i386/t-i386: Also depend on sbitmap.h and $(FIBHEAP_H).
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/t-i386
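For readers unfamiliar with work-list propagation, here is a minimal C sketch of the general technique named in the log. It is an assumption-laden illustration and not the committed code: the commit uses sbitmap and fibheap and follows vt_find_locations, whereas this sketch swaps in a plain FIFO queue, and `struct block`, `enum effect` and `worklist_propagate` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCKS 64
#define MAX_SUCCS 4

/* Hypothetical per-block summary; GCC's real block_info differs. */
enum effect { TRANSPARENT, SETS_UPPER, CLEARS_UPPER };

struct block {
    enum effect effect;
    size_t n_succs;
    size_t succs[MAX_SUCCS];  /* indices of successor blocks */
    bool used_at_entry;
    bool used_at_exit;
};

/* Exit state of a block as a function of its entry state. */
static bool transfer(const struct block *b)
{
    if (b->effect == SETS_UPPER)
        return true;
    if (b->effect == CLEARS_UPPER)
        return false;
    return b->used_at_entry;
}

/* Visit blocks from a FIFO work list; a block is revisited only when
   one of its predecessors changes its exit state to "used".
   Returns the number of block visits. */
size_t worklist_propagate(struct block *blocks, size_t n)
{
    size_t queue[MAX_BLOCKS];   /* circular buffer of block indices */
    bool queued[MAX_BLOCKS];
    size_t head = 0, count = 0, visits = 0;

    assert(n <= MAX_BLOCKS);
    for (size_t i = 0; i < n; i++) {
        assert(blocks[i].n_succs <= MAX_SUCCS);
        blocks[i].used_at_entry = false;
        blocks[i].used_at_exit = false;
        queue[count++] = i;     /* every block is visited at least once */
        queued[i] = true;
    }

    while (count > 0) {
        size_t i = queue[head];
        struct block *b = &blocks[i];

        head = (head + 1) % MAX_BLOCKS;
        count--;
        queued[i] = false;
        visits++;

        /* States only move from "unused" to "used". */
        if (!transfer(b) || b->used_at_exit)
            continue;
        b->used_at_exit = true;

        for (size_t s = 0; s < b->n_succs; s++) {
            size_t t = b->succs[s];

            assert(t < n);
            if (blocks[t].used_at_entry)
                continue;
            blocks[t].used_at_entry = true;
            if (!queued[t]) {
                queue[(head + count) % MAX_BLOCKS] = t;
                count++;
                queued[t] = true;
            }
        }
    }
    return visits;
}
```

Compared with repeating full passes over every block, a work list touches only blocks whose entry state actually changed, which matters for functions with many basic blocks. Visit order does not affect the result here, only the number of visits.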
* [Bug target/46519] Missing vzeroupper
From: hjl.tools at gmail dot com @ 2011-01-24 17:55 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
H.J. Lu <hjl.tools at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |RESOLVED
Resolution| |FIXED
Target Milestone|--- |4.6.0
--- Comment #9 from H.J. Lu <hjl.tools at gmail dot com> 2011-01-24 17:30:59 UTC ---
Fixed.
* [Bug target/46519] Missing vzeroupper
From: dnovillo at gcc dot gnu.org @ 2011-02-02 17:41 UTC
To: gcc-bugs
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519
--- Comment #10 from Diego Novillo <dnovillo at gcc dot gnu.org> 2011-02-02 17:40:46 UTC ---
Author: dnovillo
Date: Wed Feb 2 17:40:40 2011
New Revision: 169536
URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169536
Log:
Visit basic blocks using the work-list based algorithm.
2011-01-24 H.J. Lu <hongjiu.lu@intel.com>
PR target/46519
* config/i386/i386.c: Include sbitmap.h and fibheap.h.
(block_info): Add scanned and prev.
(move_or_delete_vzeroupper_2): Return if the basic block
has been scanned and the upper 128bit state is unchanged
from the last scan.
(move_or_delete_vzeroupper_1): Return true if the exit
state is changed.
(move_or_delete_vzeroupper): Visit basic blocks using the
work-list based algorithm based on vt_find_locations in
var-tracking.c.
* config/i386/t-i386: Also depend on sbitmap.h and $(FIBHEAP_H).
Modified:
branches/google/integration/gcc/ChangeLog
branches/google/integration/gcc/config/i386/i386.c
branches/google/integration/gcc/config/i386/t-i386