public inbox for gcc-bugs@sourceware.org
* [Bug target/47440] New: Use LCM for vzeroupper optimization
@ 2011-01-24 17:42 hjl.tools at gmail dot com
  2011-05-19  8:56 ` [Bug target/47440] Use LCM for vzeroupper insertion ubizjak at gmail dot com
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: hjl.tools at gmail dot com @ 2011-01-24 17:42 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47440

           Summary: Use LCM for vzeroupper optimization
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: hjl.tools@gmail.com


From:

http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01923.html
http://gcc.gnu.org/ml/gcc-patches/2011-01/msg00967.html

The LCM (lazy code motion) infrastructure (see lcm.c) is suggested for placing
vzeroupper instructions at optimal points.  Targeting 4.7.



* [Bug target/47440] Use LCM for vzeroupper insertion
  2011-01-24 17:42 [Bug target/47440] New: Use LCM for vzeroupper optimization hjl.tools at gmail dot com
@ 2011-05-19  8:56 ` ubizjak at gmail dot com
  2012-03-22  9:24 ` rguenth at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: ubizjak at gmail dot com @ 2011-05-19  8:56 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47440

Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011.05.19 08:29:24
                 CC|                            |ubizjak at gmail dot com
            Version|4.6.0                       |4.7.0
   Target Milestone|---                         |4.7.0
            Summary|Use LCM for vzeroupper      |Use LCM for vzeroupper
                   |optimization                |insertion
     Ever Confirmed|0                           |1
           Severity|normal                      |enhancement

--- Comment #1 from Uros Bizjak <ubizjak at gmail dot com> 2011-05-19 08:29:24 UTC ---
Confirmed.



* [Bug target/47440] Use LCM for vzeroupper insertion
  2011-01-24 17:42 [Bug target/47440] New: Use LCM for vzeroupper optimization hjl.tools at gmail dot com
  2011-05-19  8:56 ` [Bug target/47440] Use LCM for vzeroupper insertion ubizjak at gmail dot com
@ 2012-03-22  9:24 ` rguenth at gcc dot gnu.org
  2012-07-02 14:07 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-03-22  9:24 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47440

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.7.0                       |4.7.1

--- Comment #2 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-03-22 08:27:28 UTC ---
GCC 4.7.0 is being released, adjusting target milestone.



* [Bug target/47440] Use LCM for vzeroupper insertion
  2011-01-24 17:42 [Bug target/47440] New: Use LCM for vzeroupper optimization hjl.tools at gmail dot com
  2011-05-19  8:56 ` [Bug target/47440] Use LCM for vzeroupper insertion ubizjak at gmail dot com
  2012-03-22  9:24 ` rguenth at gcc dot gnu.org
@ 2012-07-02 14:07 ` rguenth at gcc dot gnu.org
  2012-08-22 15:26 ` vbyakovl23 at gmail dot com
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-07-02 14:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47440

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.7.1                       |---



* [Bug target/47440] Use LCM for vzeroupper insertion
  2011-01-24 17:42 [Bug target/47440] New: Use LCM for vzeroupper optimization hjl.tools at gmail dot com
                   ` (2 preceding siblings ...)
  2012-07-02 14:07 ` rguenth at gcc dot gnu.org
@ 2012-08-22 15:26 ` vbyakovl23 at gmail dot com
  2012-08-23 19:16 ` vbyakovl23 at gmail dot com
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: vbyakovl23 at gmail dot com @ 2012-08-22 15:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47440

Vladimir Yakovlev <vbyakovl23 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED

--- Comment #3 from Vladimir Yakovlev <vbyakovl23 at gmail dot com> 2012-08-22 15:25:54 UTC ---
I implemented vzeroupper insertion using the mode-switching technique:
http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01429.html
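
A minimal sketch of the mode-switching idea mentioned above, not the code
from the patch. It assumes a straight-line instruction sequence and a toy
rule set: 256-bit AVX instructions need the upper halves "dirty", calls need
them "clean", and everything else does not care. All names here
(u128_state, insn_kind, mode_needed, count_vzeroupper) are hypothetical and
exist only for illustration; the real pass works on the control-flow graph
and on RTL, and real calls that pass 256-bit arguments are treated
differently.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the upper-half states; the real definitions live in
   config/i386/i386.h and are not reproduced here.  */
enum u128_state { U128_CLEAN, U128_DIRTY, U128_ANY };

/* Hypothetical instruction kinds used by this sketch only.  */
enum insn_kind { INSN_SCALAR, INSN_AVX256, INSN_CALL };

/* State an instruction requires on entry (the "mode needed").  */
enum u128_state
mode_needed (enum insn_kind kind)
{
  switch (kind)
    {
    case INSN_AVX256:
      return U128_DIRTY;   /* touches 256-bit registers */
    case INSN_CALL:
      return U128_CLEAN;   /* assumption: callee may run legacy SSE code */
    default:
      return U128_ANY;     /* does not care about the upper halves */
    }
}

/* Walk a straight-line sequence and count how many vzeroupper
   instructions a mode-switching style pass would emit.  */
int
count_vzeroupper (const enum insn_kind *insns, int n)
{
  enum u128_state cur = U128_CLEAN;   /* assumed state at function entry */
  int emitted = 0;

  assert (insns != NULL && n >= 0);
  for (int i = 0; i < n; i++)
    {
      enum u128_state need = mode_needed (insns[i]);

      if (need == U128_CLEAN && cur == U128_DIRTY)
        {
          emitted++;           /* a vzeroupper would go before insns[i] */
          cur = U128_CLEAN;
        }
      else if (need == U128_DIRTY)
        cur = U128_DIRTY;      /* the instruction itself dirties the state */
    }
  return emitted;
}
```

For the sequence AVX256, SCALAR, CALL, CALL, AVX256, CALL the function
returns 2: one vzeroupper before the first call and one before the last.
The second call needs none because the tracked state is already clean.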



* [Bug target/47440] Use LCM for vzeroupper insertion
  2011-01-24 17:42 [Bug target/47440] New: Use LCM for vzeroupper optimization hjl.tools at gmail dot com
                   ` (3 preceding siblings ...)
  2012-08-22 15:26 ` vbyakovl23 at gmail dot com
@ 2012-08-23 19:16 ` vbyakovl23 at gmail dot com
  2012-11-06 18:04 ` ubizjak at gmail dot com
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: vbyakovl23 at gmail dot com @ 2012-08-23 19:16 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47440

--- Comment #4 from Vladimir Yakovlev <vbyakovl23 at gmail dot com> 2012-08-23 19:15:58 UTC ---
As Uros recommended, I split the patch into two parts. The first, middle-end
part is here:
http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01590.html



* [Bug target/47440] Use LCM for vzeroupper insertion
  2011-01-24 17:42 [Bug target/47440] New: Use LCM for vzeroupper optimization hjl.tools at gmail dot com
                   ` (4 preceding siblings ...)
  2012-08-23 19:16 ` vbyakovl23 at gmail dot com
@ 2012-11-06 18:04 ` ubizjak at gmail dot com
  2012-11-06 18:05 ` ubizjak at gmail dot com
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: ubizjak at gmail dot com @ 2012-11-06 18:04 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47440

--- Comment #5 from Uros Bizjak <ubizjak at gmail dot com> 2012-11-06 18:03:40 UTC ---
Author: kyukhin
Date: Tue Nov  6 10:29:23 2012
New Revision: 193229

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193229
Log:
        * config/i386/i386-protos.h (emit_i387_cw_initialization): Deleted.
        (emit_vzero): Added prototype.
        (ix86_mode_entry): Likewise.
        (ix86_mode_exit): Likewise.
        (ix86_emit_mode_set): Likewise.

        * config/i386/i386.c (typedef struct block_info_def): Deleted.
        (define BLOCK_INFO): Deleted.
        (check_avx256_stores): Added checking for MEM_P.
        (move_or_delete_vzeroupper_2): Deleted.
        (move_or_delete_vzeroupper_1): Deleted.
        (move_or_delete_vzeroupper): Deleted.
        (ix86_maybe_emit_epilogue_vzeroupper): Deleted.
        (function_pass_avx256_p): Deleted.
        (ix86_function_ok_for_sibcall): Deleted disabling sibcall.
        (init_cumulative_args): Deleted initialization of avx256 fields of
        cfun->machine.
        (ix86_emit_restore_sse_regs_using_mov): Deleted vzeroupper generation.
        (ix86_expand_epilogue): Likewise.
        (ix86_avx_u128_mode_needed): New.
        (ix86_i387_mode_needed): Renamed ix86_mode_needed.
        (ix86_mode_needed): New.
        (ix86_avx_u128_mode_after): New.
        (ix86_mode_after): New.
        (ix86_avx_u128_mode_entry): New.
        (ix86_mode_entry): New.
        (ix86_avx_u128_mode_exit): New.
        (ix86_mode_exit): New.
        (ix86_emit_mode_set): New.
        (ix86_expand_call): Deleted vzeroupper generation.
        (ix86_split_call_vzeroupper): Deleted.
        (ix86_init_machine_status): Initialized optimize_mode_switching.
        (ix86_expand_special_args_builtin): Changed.
        (ix86_reorg): Deleted a call of move_or_delete_vzeroupper.

        * config/i386/i386.h  (VALID_AVX256_REG_OR_OI_MODE): New.
        (AVX_U128): New.
        (avx_u128_state): New.
        (NUM_MODES_FOR_MODE_SWITCHING): Added AVX_U128_ANY.
        (MODE_AFTER): New.
        (MODE_ENTRY): New.
        (MODE_EXIT): New.
        (EMIT_MODE_SET): Changed.
        (machine_function): Deleted avx256 fields.

        * config/i386/i386.md (UNSPEC_CALL_NEEDS_VZEROUPPER): Deleted.
        (define_insn_and_split "*call_vzeroupper"): Deleted.
        (define_insn_and_split "*call_rex64_ms_sysv_vzeroupper"): Deleted.
        (define_insn_and_split "*sibcall_vzeroupper"): Deleted.
        (define_insn_and_split "*call_pop_vzeroupper"): Deleted.
        (define_insn_and_split "*sibcall_pop_vzeroupper"): Deleted.
        (define_insn_and_split "*call_value_vzeroupper"): Deleted.
        (define_insn_and_split "*sibcall_value_vzeroupper"): Deleted.
        (define_insn_and_split "*call_value_rex64_ms_sysv_vzeroupper"):
        Deleted.
        (define_insn_and_split "*call_value_pop_vzeroupper"): Deleted.
        (define_insn_and_split "*sibcall_value_pop_vzeroupper"): Deleted.
        (define_expand "return"): Deleted vzeroupper emitting.
        (define_expand "simple_return"): Deleted.

        * config/i386/predicates.md (vzeroupper_operation): New.

        * config/i386/sse.md (avx_vzeroupper): Changed.

testsuite/ChangeLog:
        * gcc.target/i386/avx-vzeroupper-5.c: Changed scan-assembler-times.
        * gcc.target/i386/avx-vzeroupper-8.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-9.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-10.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-11.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-12.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-19.c: Likewise.
        * gcc.target/i386/avx-vzeroupper-27.c: New.



Added:
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-27.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386-protos.h
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.h
    trunk/gcc/config/i386/i386.md
    trunk/gcc/config/i386/predicates.md
    trunk/gcc/config/i386/sse.md
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-10.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-11.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-12.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-19.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-5.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-8.c
    trunk/gcc/testsuite/gcc.target/i386/avx-vzeroupper-9.c



* [Bug target/47440] Use LCM for vzeroupper insertion
  2011-01-24 17:42 [Bug target/47440] New: Use LCM for vzeroupper optimization hjl.tools at gmail dot com
                   ` (5 preceding siblings ...)
  2012-11-06 18:04 ` ubizjak at gmail dot com
@ 2012-11-06 18:05 ` ubizjak at gmail dot com
  2012-11-11 19:17 ` uros at gcc dot gnu.org
  2012-11-14 16:48 ` uros at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: ubizjak at gmail dot com @ 2012-11-06 18:05 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47440

Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86
             Status|ASSIGNED                    |RESOLVED
                URL|                            |http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00292.html
         Resolution|                            |FIXED
   Target Milestone|---                         |4.8.0

--- Comment #6 from Uros Bizjak <ubizjak at gmail dot com> 2012-11-06 18:05:21 UTC ---
Implemented in 4.8.0.



* [Bug target/47440] Use LCM for vzeroupper insertion
  2011-01-24 17:42 [Bug target/47440] New: Use LCM for vzeroupper optimization hjl.tools at gmail dot com
                   ` (6 preceding siblings ...)
  2012-11-06 18:05 ` ubizjak at gmail dot com
@ 2012-11-11 19:17 ` uros at gcc dot gnu.org
  2012-11-14 16:48 ` uros at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: uros at gcc dot gnu.org @ 2012-11-11 19:17 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47440

--- Comment #7 from uros at gcc dot gnu.org 2012-11-11 19:17:22 UTC ---
Author: uros
Date: Sun Nov 11 19:17:17 2012
New Revision: 193409

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193409
Log:
    PR target/47440
    * config/i386/i386.c (check_avx256_stores): Remove.
    (ix86_check_avx256_register): New.
    (ix86_avx_u128_mode_needed): Use ix86_check_avx256_register.
    Check the whole RTX for 256bit registers using for_each_rtx.
    (ix86_check_avx_stores): New.
    (ix86_avx_u128_mode_after): Change mode of CALL RTX to AVX_U128_CLEAN
    if there are no 256bit registers used in the function return register.
    (ix86_avx_u128_mode_entry): Use ix86_check_avx256_register.
    (ix86_avx_u128_mode_exit): Ditto.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
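
The rule described in the log above for the state after an instruction can
be sketched as follows. This is a simplified illustration under stated
assumptions, not the function from i386.c: the struct toy_insn layout, the
upper_state enum and the mode_after name are hypothetical, and the
vzeroupper case is added from general knowledge of the instruction rather
than from this log entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy states for the upper halves of the 256-bit registers.  */
enum upper_state { UPPER_CLEAN, UPPER_DIRTY };

/* Hypothetical, flattened view of an instruction; the real code
   inspects RTL patterns instead.  */
struct toy_insn
{
  bool is_vzeroupper;       /* instruction clears the upper halves */
  bool is_call;             /* instruction is a call */
  bool returns_in_256bit;   /* call result lives in a 256-bit register */
};

/* State after INSN, given the state MODE in effect while it executes.  */
enum upper_state
mode_after (enum upper_state mode, const struct toy_insn *insn)
{
  assert (insn != NULL);

  /* A vzeroupper always leaves the upper halves clean.  */
  if (insn->is_vzeroupper)
    return UPPER_CLEAN;

  /* After a call the state is known to be clean unless the return
     value is delivered in a 256-bit register.  */
  if (insn->is_call && !insn->returns_in_256bit)
    return UPPER_CLEAN;

  /* Otherwise the instruction does not change the tracked state.  */
  return mode;
}
```

With this rule a call that returns a scalar resets the tracked state to
clean, so a later call in the same block would not need another vzeroupper,
while a call returning a 256-bit vector leaves the incoming state untouched.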



* [Bug target/47440] Use LCM for vzeroupper insertion
  2011-01-24 17:42 [Bug target/47440] New: Use LCM for vzeroupper optimization hjl.tools at gmail dot com
                   ` (7 preceding siblings ...)
  2012-11-11 19:17 ` uros at gcc dot gnu.org
@ 2012-11-14 16:48 ` uros at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: uros at gcc dot gnu.org @ 2012-11-14 16:48 UTC (permalink / raw)
  To: gcc-bugs


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47440

--- Comment #8 from uros at gcc dot gnu.org 2012-11-14 16:47:43 UTC ---
Author: uros
Date: Wed Nov 14 16:47:29 2012
New Revision: 193503

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=193503
Log:
    PR target/47440
    * config/i386/i386.c (gate_insert_vzeroupper): New function.
    (rest_of_handle_insert_vzeroupper): Ditto.
    (struct rtl_opt_pass pass_insert_vzeroupper): New.
    (ix86_option_override): Register vzeroupper insertion pass here.
    (ix86_check_avx256_register): Handle SUBREGs properly.
    (ix86_init_machine_status): Remove optimize_mode_switching[AVX_U128]
    initialization.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c


