public inbox for gcc-bugs@sourceware.org
* [Bug target/46519] New: Missing vzeroupper
@ 2010-11-17 15:09 hjl.tools at gmail dot com
  2010-11-17 18:49 ` [Bug target/46519] " ubizjak at gmail dot com
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: hjl.tools at gmail dot com @ 2010-11-17 15:09 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

           Summary: Missing vzeroupper
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: hjl.tools@gmail.com
                CC: ubizjak@gmail.com


Created attachment 22430
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22430
A testcase

/export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O3 -funroll-loops -ffast-math
-mavx -Wno-multichar -S bad.c

.L531:
        vmovapd %ymm0, (%rsp)
        vzeroupper
        call    Get_Token
        movl    Token_Id(%rip), %eax
        vmovapd (%rsp), %ymm0
        cmpl    $1, %eax
        je      .L139
        cmpl    $2, %eax
        je      .L541

.L171:
        vmovsd  (%rax), %xmm9
        vsubsd  32(%rsp,%rcx,8), %xmm9, %xmm8
        leal    1(%rdx), %ecx
        cmpl    %ecx, %esi
        vmovsd  %xmm8, (%rax)
        jg      .L543
        jmp     .L531
        .p2align 4,,10
        .p2align 3
.L541:
        leaq    80(%rsp), %rsi 
        movq    %r12, %rdi 
        call    Parse_Num_Term <<<< missing vzeroupper
        movl    0(%r13), %esi 
        movl    80(%rsp), %eax 
        vmovapd (%rsp), %ymm0
        cmpl    %eax, %esi
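
To illustrate the property being checked here, the following C sketch walks a straight-line instruction trace and counts calls that are reached while the upper 128 bits of the ymm registers may still be in use. The `insn_kind` enum and `count_unguarded_calls` are hypothetical names made up for this sketch, not GCC internals, and the model assumes the upper halves are clean on entry and again after every call returns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction kinds; this is not GCC's RTL. */
enum insn_kind {
    INSN_OTHER,       /* does not reference a 256-bit register */
    INSN_AVX256,      /* references a 256-bit ymm register */
    INSN_VZEROUPPER,  /* clears the upper 128 bits of all ymm registers */
    INSN_CALL         /* call to a function that may run legacy SSE code */
};

/* Count calls reached while the upper halves may still be in use.
   Assumption: upper halves are clean on entry, and a callee returns
   with them clean. */
size_t count_unguarded_calls(const enum insn_kind *insns, size_t n)
{
    int upper_dirty = 0;
    size_t unguarded = 0;

    assert(insns != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (insns[i]) {
        case INSN_AVX256:
            upper_dirty = 1;
            break;
        case INSN_VZEROUPPER:
            upper_dirty = 0;
            break;
        case INSN_CALL:
            if (upper_dirty)
                unguarded++;      /* a vzeroupper was expected before this call */
            upper_dirty = 0;      /* assumption: callee returns clean */
            break;
        case INSN_OTHER:
            break;
        }
    }
    return unguarded;
}
```

In the listing above, the call at .L541 is reached from .L531 after the 256-bit reload of %ymm0, so a check like this has to follow control flow across basic blocks rather than scan one block in isolation.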



* [Bug target/46519] Missing vzeroupper
  2010-11-17 15:09 [Bug target/46519] New: Missing vzeroupper hjl.tools at gmail dot com
@ 2010-11-17 18:49 ` ubizjak at gmail dot com
  2010-11-17 18:56 ` hjl.tools at gmail dot com
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: ubizjak at gmail dot com @ 2010-11-17 18:49 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

--- Comment #1 from Uros Bizjak <ubizjak at gmail dot com> 2010-11-17 18:42:29 UTC ---
Does the patch at [1] also fix this test?

[1] http://gcc.gnu.org/ml/gcc-patches/2010-11/msg01802.html



* [Bug target/46519] Missing vzeroupper
  2010-11-17 15:09 [Bug target/46519] New: Missing vzeroupper hjl.tools at gmail dot com
  2010-11-17 18:49 ` [Bug target/46519] " ubizjak at gmail dot com
@ 2010-11-17 18:56 ` hjl.tools at gmail dot com
  2010-11-18  0:29 ` hjl.tools at gmail dot com
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: hjl.tools at gmail dot com @ 2010-11-17 18:56 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

--- Comment #2 from H.J. Lu <hjl.tools at gmail dot com> 2010-11-17 18:49:14 UTC ---
(In reply to comment #1)
> Does the patch at [1] also fix this test?
> 
> [1] http://gcc.gnu.org/ml/gcc-patches/2010-11/msg01802.html

No, that is for a different case.



* [Bug target/46519] Missing vzeroupper
  2010-11-17 15:09 [Bug target/46519] New: Missing vzeroupper hjl.tools at gmail dot com
  2010-11-17 18:49 ` [Bug target/46519] " ubizjak at gmail dot com
  2010-11-17 18:56 ` hjl.tools at gmail dot com
@ 2010-11-18  0:29 ` hjl.tools at gmail dot com
  2010-11-18  1:52 ` hjl.tools at gmail dot com
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: hjl.tools at gmail dot com @ 2010-11-18  0:29 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

--- Comment #3 from H.J. Lu <hjl.tools at gmail dot com> 2010-11-18 00:26:18 UTC ---
Created attachment 22437
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22437
A testcase

With -O3 -funroll-loops -ffast-math -mavx:

        movl    Token_Id(%rip), %eax
        vmovapd 32(%rsp), %ymm0
        cmpl    $1, %eax
        je      .L4
        cmpl    $2, %eax
        je      .L5
        testl   %eax, %eax
        jne     .L464
        leaq    124(%rsp), %rsi
        leaq    64(%rsp), %rdi
        call    _Z16Parse_Rel_FactorPdPi   <<<< Missing vzeroupper
        movl    124(%rsp), %esi

.L856:
        vmovapd %ymm0, 32(%rsp)
        call    _Z9Get_Tokenv  <<<< Missing vzeroupper
        movl    Token_Id(%rip), %eax 
        vmovapd 32(%rsp), %ymm0

...
.L491:
        leaq    252(%rsp), %rsi
        leaq    64(%rsp), %rdi 
        vmovapd %ymm0, 32(%rsp)    
        call    _ZL14Parse_Rel_TermPdPi   <<<< Missing vzeroupper
        movl    252(%rsp), %esi
        movl    224(%rsp), %edx
        vmovapd 32(%rsp), %ymm0
        cmpl    %edx, %esi



* [Bug target/46519] Missing vzeroupper
  2010-11-17 15:09 [Bug target/46519] New: Missing vzeroupper hjl.tools at gmail dot com
                   ` (2 preceding siblings ...)
  2010-11-18  0:29 ` hjl.tools at gmail dot com
@ 2010-11-18  1:52 ` hjl.tools at gmail dot com
  2010-11-24 18:31 ` hjl at gcc dot gnu.org
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: hjl.tools at gmail dot com @ 2010-11-18  1:52 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

--- Comment #4 from H.J. Lu <hjl.tools at gmail dot com> 2010-11-18 01:48:13 UTC ---
(In reply to comment #3)

Some of the 256-bit vector insns are introduced by loop unrolling. Maybe
we should drop the use_avx256_p check, since it isn't reliable.



* [Bug target/46519] Missing vzeroupper
  2010-11-17 15:09 [Bug target/46519] New: Missing vzeroupper hjl.tools at gmail dot com
                   ` (3 preceding siblings ...)
  2010-11-18  1:52 ` hjl.tools at gmail dot com
@ 2010-11-24 18:31 ` hjl at gcc dot gnu.org
  2010-11-24 19:46 ` hjl at gcc dot gnu.org
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: hjl at gcc dot gnu.org @ 2010-11-24 18:31 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

--- Comment #5 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2010-11-24 18:24:46 UTC ---
Author: hjl
Date: Wed Nov 24 18:24:39 2010
New Revision: 167124

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167124
Log:
Improve vzeroupper optimization.

gcc/

2010-11-24  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/46519
    * config/i386/i386.c (upper_128bits_state): New.
    (block_info_def): Remove upper_128bits_set and done.  Add state,
    referenced, count, processed and rescanned. 
    (check_avx256_stores): Updated.
    (move_or_delete_vzeroupper_2): Updated. Handle deleted BB_END.
    Call note_stores only if needed.  Set referenced and count.
    (move_or_delete_vzeroupper_1): Updated.  Set rescan_vzeroupper_p.
    (rescan_move_or_delete_vzeroupper): New.
    (move_or_delete_vzeroupper): Process and rescan all basic
    blocks instead of predecessor blocks of all exit points.
    (ix86_option_override_internal): Enable vzeroupper optimization
    only for -fexpensive-optimizations and not optimizing for size.
    (use_avx256_p): Removed.
    (init_cumulative_args): Don't set use_avx256_p.
    (ix86_function_arg): Likewise.
    (ix86_expand_move): Likewise.
    (ix86_expand_vector_move_misalign): Likewise.
    (ix86_local_alignment): Likewise.
    (ix86_minimum_alignment): Likewise.
    (ix86_expand_epilogue): Don't check use_avx256_p when generating
    vzeroupper.
    (ix86_expand_call): Likewise.

    * config/i386/i386.h (machine_function): Remove use_vzeroupper_p
    and use_avx256_p.  Add rescan_vzeroupper_p.

gcc/testsuite/

2010-11-24  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/46519
    * gcc.target/i386/avx-vzeroupper-10.c: Expect no avx_vzeroupper.
    * gcc.target/i386/avx-vzeroupper-11.c: Likewise.

    * gcc.target/i386/avx-vzeroupper-14.c: Replace -O0 with -O2.
    * gcc.target/i386/avx-vzeroupper-15.c: Likewise.
    * gcc.target/i386/avx-vzeroupper-16.c: Likewise.
    * gcc.target/i386/avx-vzeroupper-17.c: Likewise.

    * gcc.target/i386/avx-vzeroupper-20.c: New.
    * gcc.target/i386/avx-vzeroupper-21.c: Likewise.
    * gcc.target/i386/avx-vzeroupper-22.c: Likewise.
    * gcc.target/i386/avx-vzeroupper-23.c: Likewise.
    * gcc.target/i386/avx-vzeroupper-24.c: Likewise.
    * gcc.target/i386/avx-vzeroupper-25.c: Likewise.
    * gcc.target/i386/avx-vzeroupper-26.c: Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-20.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-21.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-22.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-23.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-24.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-25.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-26.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.h
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-10.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-11.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-14.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-15.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-16.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-17.c



* [Bug target/46519] Missing vzeroupper
  2010-11-17 15:09 [Bug target/46519] New: Missing vzeroupper hjl.tools at gmail dot com
                   ` (4 preceding siblings ...)
  2010-11-24 18:31 ` hjl at gcc dot gnu.org
@ 2010-11-24 19:46 ` hjl at gcc dot gnu.org
  2010-12-30 13:12 ` hjl at gcc dot gnu.org
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: hjl at gcc dot gnu.org @ 2010-11-24 19:46 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

--- Comment #6 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2010-11-24 19:16:48 UTC ---
Author: hjl
Date: Wed Nov 24 19:16:40 2010
New Revision: 167126

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167126
Log:
Don't check TREE_THIS_VOLATILE in ix86_expand_call.

gcc/

2010-11-24  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/46519
    * config/i386/i386.c (ix86_expand_call): Don't check
    TREE_THIS_VOLATILE.

gcc/testsuite/

2010-11-24  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/46519
    * gfortran.dg/pr46519-1.f: New.

Added:
    trunk/gcc/testsuite/gfortran.dg/pr46519-1.f
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog



* [Bug target/46519] Missing vzeroupper
  2010-11-17 15:09 [Bug target/46519] New: Missing vzeroupper hjl.tools at gmail dot com
                   ` (5 preceding siblings ...)
  2010-11-24 19:46 ` hjl at gcc dot gnu.org
@ 2010-12-30 13:12 ` hjl at gcc dot gnu.org
  2011-01-24 17:55 ` hjl.tools at gmail dot com
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: hjl at gcc dot gnu.org @ 2010-12-30 13:12 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

--- Comment #7 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2010-12-30 13:12:05 UTC ---
Author: hjl
Date: Thu Dec 30 13:12:02 2010
New Revision: 168342

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=168342
Log:
Repeat processing all basic blocks for vzeroupper optimization.

gcc/

2010-12-30  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/46519
    * config/i386/i386.c (block_info_def): Remove referenced, count
    and rescanned.
    (move_or_delete_vzeroupper_2): Updated.
    (move_or_delete_vzeroupper_1): Rewritten to avoid recursive call.
    (rescan_move_or_delete_vzeroupper): Removed.
    (move_or_delete_vzeroupper): Repeat processing all basic blocks
    until no basic block state is changed to used at exit.

gcc/testsuite/

2010-12-30  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/46519
    * gfortran.dg/pr46519-2.f90: New.

Added:
    trunk/gcc/testsuite/gfortran.dg/pr46519-2.f90
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog
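
The log above describes repeating the scan over all basic blocks until no block's exit state changes to "used". A minimal sketch of that kind of round-robin fixed-point iteration is given below. It is not the code from i386.c: the `block` struct, the three `block_effect` values and `solve_round_robin` are hypothetical simplifications, and each block is reduced to a single summary effect instead of being scanned insn by insn.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PREDS 4

/* Hypothetical two-value lattice for the upper 128 bits at block exit. */
enum upper_state { UPPER_UNUSED, UPPER_USED };

/* Hypothetical per-block summary: what the block does to the state. */
enum block_effect {
    EFFECT_TRANSPARENT,  /* exit state equals entry state */
    EFFECT_DIRTIES,      /* block leaves the upper halves in use */
    EFFECT_CLEARS        /* block ends with the upper halves cleared */
};

struct block {
    enum block_effect effect;
    int n_preds;
    int preds[MAX_PREDS];   /* indices of predecessor blocks */
    enum upper_state exit_state;
};

/* Entry state is "used" if any predecessor exits in the "used" state. */
static enum upper_state entry_state(const struct block *blocks,
                                    const struct block *b)
{
    for (int i = 0; i < b->n_preds; i++)
        if (blocks[b->preds[i]].exit_state == UPPER_USED)
            return UPPER_USED;
    return UPPER_UNUSED;
}

static enum upper_state transfer(enum block_effect effect,
                                 enum upper_state in)
{
    if (effect == EFFECT_DIRTIES)
        return UPPER_USED;
    if (effect == EFFECT_CLEARS)
        return UPPER_UNUSED;
    return in;
}

/* Re-process every block until a full pass changes nothing.
   Returns the number of passes made. */
int solve_round_robin(struct block *blocks, int n)
{
    int passes = 0;
    bool changed;

    assert(n > 0);
    for (int i = 0; i < n; i++)
        blocks[i].exit_state = UPPER_UNUSED;

    do {
        changed = false;
        passes++;
        for (int i = 0; i < n; i++) {
            enum upper_state out =
                transfer(blocks[i].effect, entry_state(blocks, &blocks[i]));
            if (out != blocks[i].exit_state) {
                blocks[i].exit_state = out;
                changed = true;
            }
        }
    } while (changed);
    return passes;
}
```

Exit states in this sketch only ever move from unused to used, so the loop terminates. A loop whose body dirties the registers needs more than one pass, because the back edge delivers the "used" state to the loop header only after the body has been processed.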



* [Bug target/46519] Missing vzeroupper
  2010-11-17 15:09 [Bug target/46519] New: Missing vzeroupper hjl.tools at gmail dot com
                   ` (7 preceding siblings ...)
  2011-01-24 17:55 ` hjl.tools at gmail dot com
@ 2011-01-24 17:55 ` hjl at gcc dot gnu.org
  2011-02-02 17:41 ` dnovillo at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: hjl at gcc dot gnu.org @ 2011-01-24 17:55 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

--- Comment #8 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> 2011-01-24 17:30:00 UTC ---
Author: hjl
Date: Mon Jan 24 17:29:58 2011
New Revision: 169173

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169173
Log:
Visit basic blocks using the work-list based algorithm.

2011-01-24  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/46519
    * config/i386/i386.c: Include sbitmap.h and fibheap.h.
    (block_info): Add scanned and prev.
    (move_or_delete_vzeroupper_2): Return if the basic block
    has been scanned and the upper 128bit state is unchanged
    from the last scan.
    (move_or_delete_vzeroupper_1): Return true if the exit
    state is changed.
    (move_or_delete_vzeroupper): Visit basic blocks using the
    work-list based algorithm based on vt_find_locations in
    var-tracking.c.

    * config/i386/t-i386: Also depend on sbitmap.h and $(FIBHEAP_H).

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/t-i386
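
The work-list approach named in the log can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the commit uses sbitmap.h and fibheap.h and follows vt_find_locations, whereas this sketch substitutes a plain FIFO queue and a boolean "queued" array, and the `block` struct, `block_effect` values and `solve_worklist` are hypothetical names. The point it shows is that a block is revisited only when the exit state of one of its predecessors has changed.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16
#define MAX_EDGES 4

enum upper_state { UPPER_UNUSED, UPPER_USED };
enum block_effect { EFFECT_TRANSPARENT, EFFECT_DIRTIES, EFFECT_CLEARS };

/* Hypothetical block summary with explicit predecessor/successor lists. */
struct block {
    enum block_effect effect;
    int n_preds, preds[MAX_EDGES];
    int n_succs, succs[MAX_EDGES];
    enum upper_state exit_state;
};

/* Propagate exit states with a work list. Returns the number of
   block visits, for comparison with a full rescan of every block. */
int solve_worklist(struct block *blocks, int n)
{
    int queue[MAX_BLOCKS];    /* circular FIFO, at most n entries */
    bool queued[MAX_BLOCKS];  /* stands in for the "in worklist" bitmap */
    int head = 0, count = 0, visits = 0;

    assert(n > 0 && n <= MAX_BLOCKS);
    for (int i = 0; i < n; i++) {
        blocks[i].exit_state = UPPER_UNUSED;
        queue[i] = i;         /* every block is visited at least once */
        queued[i] = true;
    }
    count = n;

    while (count > 0) {
        int b = queue[head];
        head = (head + 1) % n;
        count--;
        queued[b] = false;
        visits++;

        /* Entry state: "used" if any predecessor exits "used". */
        enum upper_state in = UPPER_UNUSED;
        for (int i = 0; i < blocks[b].n_preds; i++)
            if (blocks[blocks[b].preds[i]].exit_state == UPPER_USED)
                in = UPPER_USED;

        enum upper_state out = in;
        if (blocks[b].effect == EFFECT_DIRTIES)
            out = UPPER_USED;
        else if (blocks[b].effect == EFFECT_CLEARS)
            out = UPPER_UNUSED;

        if (out == blocks[b].exit_state)
            continue;         /* unchanged: successors need no rescan */

        blocks[b].exit_state = out;
        for (int i = 0; i < blocks[b].n_succs; i++) {
            int s = blocks[b].succs[i];
            if (!queued[s]) {
                queue[(head + count) % n] = s;
                count++;
                queued[s] = true;
            }
        }
    }
    return visits;
}
```

Because the `queued` flags keep each block in the queue at most once, the circular buffer never holds more than n entries. A priority order over blocks, as a heap would give, may reduce revisits further; the FIFO order here is chosen only to keep the sketch short.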



* [Bug target/46519] Missing vzeroupper
  2010-11-17 15:09 [Bug target/46519] New: Missing vzeroupper hjl.tools at gmail dot com
                   ` (6 preceding siblings ...)
  2010-12-30 13:12 ` hjl at gcc dot gnu.org
@ 2011-01-24 17:55 ` hjl.tools at gmail dot com
  2011-01-24 17:55 ` hjl at gcc dot gnu.org
  2011-02-02 17:41 ` dnovillo at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: hjl.tools at gmail dot com @ 2011-01-24 17:55 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

H.J. Lu <hjl.tools at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.6.0

--- Comment #9 from H.J. Lu <hjl.tools at gmail dot com> 2011-01-24 17:30:59 UTC ---
Fixed.



* [Bug target/46519] Missing vzeroupper
  2010-11-17 15:09 [Bug target/46519] New: Missing vzeroupper hjl.tools at gmail dot com
                   ` (8 preceding siblings ...)
  2011-01-24 17:55 ` hjl at gcc dot gnu.org
@ 2011-02-02 17:41 ` dnovillo at gcc dot gnu.org
  9 siblings, 0 replies; 11+ messages in thread
From: dnovillo at gcc dot gnu.org @ 2011-02-02 17:41 UTC
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46519

--- Comment #10 from Diego Novillo <dnovillo at gcc dot gnu.org> 2011-02-02 17:40:46 UTC ---
Author: dnovillo
Date: Wed Feb  2 17:40:40 2011
New Revision: 169536

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169536
Log:
Visit basic blocks using the work-list based algorithm.

2011-01-24  H.J. Lu  <hongjiu.lu@intel.com>

    PR target/46519
    * config/i386/i386.c: Include sbitmap.h and fibheap.h.
    (block_info): Add scanned and prev.
    (move_or_delete_vzeroupper_2): Return if the basic block
    has been scanned and the upper 128bit state is unchanged
    from the last scan.
    (move_or_delete_vzeroupper_1): Return true if the exit
    state is changed.
    (move_or_delete_vzeroupper): Visit basic blocks using the
    work-list based algorithm based on vt_find_locations in
    var-tracking.c.

    * config/i386/t-i386: Also depend on sbitmap.h and $(FIBHEAP_H).

Modified:
    branches/google/integration/gcc/ChangeLog
    branches/google/integration/gcc/config/i386/i386.c
    branches/google/integration/gcc/config/i386/t-i386

