[Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs

public inbox for gcc-bugs@sourceware.org

* [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: lucier at math dot purdue.edu @ 2015-02-03 21:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

            Bug ID: 64928
           Summary: unreasonable cpu time and memory usage in "phase opt
                    and generate" with -ftest-coverage -fprofile-arcs
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lucier at math dot purdue.edu

With this compiler:

firefly:~/Downloads/gambit/lib> /pkgs/gcc-4.9.2/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/pkgs/gcc-4.9.2/bin/gcc
COLLECT_LTO_WRAPPER=/pkgs/gcc-4.9.2/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../../gcc-4.9.2/configure --prefix=/pkgs/gcc-4.9.2
Thread model: posix
gcc version 4.9.2 (GCC) 


With this command:

/pkgs/gcc-4.9.2/bin/gcc -Q -save-temps -Wno-unused -Wno-write-strings -O1
-fno-math-errno -fschedule-insns2 -fno-strict-aliasing -fno-trapping-math
-fwrapv -fomit-frame-pointer -fPIC -fno-common -mieee-fp  -fprofile-arcs
-ftest-coverage  -I"../include" -c -o "_system.o" -I. -DHAVE_CONFIG_H
-D___GAMBCDIR="\"/usr/local/Gambit-C\"" -D___SYS_TYPE_CPU="\"x86_64\""
-D___SYS_TYPE_VENDOR="\"unknown\"" -D___SYS_TYPE_OS="\"linux-gnu\""
-D___CONFIGURE_COMMAND="\"./configure 'CC=/pkgs/gcc-4.9.2/bin/gcc -Q
-save-temps' '--enable-track-scheme' '--enable-coverage'"\"
-D___OBJ_EXTENSION="\".o\"" -D___EXE_EXTENSION="\"\"" -D___BAT_EXTENSION="\"\""
-D___PRIMAL _system.c -D___LIBRARY

I get the output:

Execution times (seconds)
 phase setup             :   0.12 (100%) usr   0.00 ( 0%) sys   0.13 (100%) wall   35712 kB (100%) ggc
 TOTAL                 :   0.12             0.00             0.13               35728 kB
 btowc wctob mbrlen __signbitf __signbit __signbitl ___H__20___system
___H__23__23_type ___H__23__23_type_2d_cast ___H__23__23_subtype
___H__23__23_subtype_2d_set_21_ ___H__23__23_fixnum_3f_
___H__23__23_subtyped_3f_ ___H__23__23_subtyped_2d_mutable_3f_
___H__23__23_subtyped_2e_vector_3f_ ___H__23__23_subtyped_2e_symbol_3f_
___H__23__23_subtyped_2e_flonum_3f_ ___H__23__23_subtyped_2e_bignum_3f_
___H__23__23_special_3f_ ___H__23__23_ratnum_3f_ ___H__23__23_cpxnum_3f_
___H__23__23_structure_3f_ ___H__23__23_values_3f_ ___H__23__23_meroon_3f_
___H__23__23_jazz_3f_ ___H__23__23_frame_3f_ ___H__23__23_continuation_3f_
___H__23__23_promise_3f_ ___H__23__23_return_3f_ ___H__23__23_foreign_3f_
___H__23__23_flonum_3f_ ___H__23__23_bignum_3f_ ___H__23__23_unbound_3f_
___H__23__23_quasi_2d_append ___H__23__23_quasi_2d_list
___H__23__23_quasi_2d_cons ___H__23__23_quasi_2d_list_2d__3e_vector
___H__23__23_quasi_2d_vector ___H__23__23_case_2d_memv ___H__23__23_eqv_3f_
___H_eqv_3f_ ___H__23__23_eq_3f_ ___H_eq_3f_ ___H__23__23_bvector_2d_equal_3f_
___H__23__23_equal_3f_ ___H_equal_3f_ ___H__23__23_symbol_2d_hash
___H_symbol_2d_hash ___H__23__23_keyword_2d_hash ___H_keyword_2d_hash
___H__23__23_eq_3f__2d_hash ___H_eq_3f__2d_hash ___H__23__23_eqv_3f__2d_hash
___H_eqv_3f__2d_hash ___H__23__23_equal_3f__2d_hash ___H_equal_3f__2d_hash
___H__23__23_string_3d__3f__2d_hash ___H_string_3d__3f__2d_hash
___H__23__23_string_2d_ci_3d__3f__2d_hash ___H_string_2d_ci_3d__3f__2d_hash
___H__23__23_generic_2d_hash
___H__23__23_fail_2d_check_2d_invalid_2d_hash_2d_number_2d_exception
___H_invalid_2d_hash_2d_number_2d_exception_3f_
___H_invalid_2d_hash_2d_number_2d_exception_2d_procedure
___H_invalid_2d_hash_2d_number_2d_exception_2d_arguments
___H__23__23_raise_2d_invalid_2d_hash_2d_number_2d_exception
___H__23__23_fail_2d_check_2d_unbound_2d_table_2d_key_2d_exception
___H_unbound_2d_table_2d_key_2d_exception_3f_
___H_unbound_2d_table_2d_key_2d_exception_2d_procedure
___H_unbound_2d_table_2d_key_2d_exception_2d_arguments
___H__23__23_raise_2d_unbound_2d_table_2d_key_2d_exception
___H__23__23_gc_2d_hash_2d_table_3f_ ___H__23__23_gc_2d_hash_2d_table_2d_ref
___H__23__23_gc_2d_hash_2d_table_2d_set_21_
___H__23__23_gc_2d_hash_2d_table_2d_rehash_21_
___H__23__23_smallest_2d_prime_2d_no_2d_less_2d_than
___H__23__23_gc_2d_hash_2d_table_2d_resize_21_
___H__23__23_gc_2d_hash_2d_table_2d_allocate
___H__23__23_gc_2d_hash_2d_table_2d_for_2d_each
___H__23__23_gc_2d_hash_2d_table_2d_search
___H__23__23_gc_2d_hash_2d_table_2d_foldl ___H__23__23_mem_2d_allocated_3f_
___H__23__23_fail_2d_check_2d_table ___H_table_3f_ ___H__23__23_make_2d_table
___H_make_2d_table ___H__23__23_table_2d_get_2d_eq_2d_gcht
___H__23__23_table_2d_get_2d_gcht_2d_not_2d_mem_2d_alloc
___H__23__23_table_2d_get_2d_gcht ___H__23__23_table_2d_length
___H_table_2d_length ___H__23__23_table_2d_access ___H__23__23_table_2d_ref
___H_table_2d_ref ___H__23__23_table_2d_resize_21_
___H__23__23_table_2d_set_21_ ___H_table_2d_set_21_
___H__23__23_table_2d_search ___H_table_2d_search
___H__23__23_table_2d_for_2d_each ___H_table_2d_for_2d_each
___H__23__23_table_2d_foldl ___H__23__23_table_2d__3e_list
___H_table_2d__3e_list ___H__23__23_list_2d__3e_table ___H_list_2d__3e_table
___H__23__23_table_2d_copy ___H_table_2d_copy ___H__23__23_table_2d_merge_21_
___H_table_2d_merge_21_ ___H__23__23_table_2d_merge ___H_table_2d_merge
___H__23__23_table_2d_equal_3f_ ___H__23__23_table_2d_equal_3f__2d_hash
___H__23__23_fail_2d_check_2d_unbound_2d_serial_2d_number_2d_exception
___H_unbound_2d_serial_2d_number_2d_exception_3f_
___H_unbound_2d_serial_2d_number_2d_exception_2d_procedure
___H_unbound_2d_serial_2d_number_2d_exception_2d_arguments
___H__23__23_raise_2d_unbound_2d_serial_2d_number_2d_exception
___H__23__23_object_2d__3e_serial_2d_number ___H_object_2d__3e_serial_2d_number
___H__23__23_serial_2d_number_2d__3e_object ___H_serial_2d_number_2d__3e_object
___H__23__23_object_2d__3e_u8vector ___H_object_2d__3e_u8vector
___H__23__23_u8vector_2d__3e_object ___H_u8vector_2d__3e_object ___setup_mod
___init_mod ____20___system
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> <visibility> <early_local_cleanups> <*free_inline_summary>
<profile> <whole-program> <profile_estimate> <inline> <pure-const>
<static-var>Assembling functions:
 ___setup_mod ___init_mod ___H_u8vector_2d__3e_object
___H__23__23_u8vector_2d__3e_object ___H_object_2d__3e_u8vector
___H__23__23_object_2d__3e_u8vector {GC 298137k -> 101678k}
___H_serial_2d_number_2d__3e_object ___H__23__23_serial_2d_number_2d__3e_object
___H_object_2d__3e_serial_2d_number ___H__23__23_object_2d__3e_serial_2d_number
___H__23__23_raise_2d_unbound_2d_serial_2d_number_2d_exception
___H_unbound_2d_serial_2d_number_2d_exception_2d_arguments
___H_unbound_2d_serial_2d_number_2d_exception_2d_procedure
___H_unbound_2d_serial_2d_number_2d_exception_3f_
___H__23__23_fail_2d_check_2d_unbound_2d_serial_2d_number_2d_exception
___H__23__23_table_2d_equal_3f__2d_hash ___H__23__23_table_2d_equal_3f_
___H_table_2d_merge ___H__23__23_table_2d_merge ___H_table_2d_merge_21_
___H__23__23_table_2d_merge_21_ ___H_table_2d_copy ___H__23__23_table_2d_copy
___H_list_2d__3e_table ___H__23__23_list_2d__3e_table ___H_table_2d__3e_list
___H__23__23_table_2d__3e_list ___H__23__23_table_2d_foldl
___H_table_2d_for_2d_each ___H__23__23_table_2d_for_2d_each
___H_table_2d_search ___H__23__23_table_2d_search ___H_table_2d_set_21_
___H__23__23_table_2d_resize_21_ ___H_table_2d_ref ___H__23__23_table_2d_access
___H_table_2d_length ___H__23__23_table_2d_length
___H__23__23_table_2d_get_2d_gcht
___H__23__23_table_2d_get_2d_gcht_2d_not_2d_mem_2d_alloc
___H__23__23_table_2d_get_2d_eq_2d_gcht ___H_make_2d_table ___H_table_3f_
___H__23__23_fail_2d_check_2d_table ___H__23__23_mem_2d_allocated_3f_
___H__23__23_gc_2d_hash_2d_table_2d_foldl
___H__23__23_gc_2d_hash_2d_table_2d_search
___H__23__23_gc_2d_hash_2d_table_2d_for_2d_each
___H__23__23_gc_2d_hash_2d_table_2d_allocate
___H__23__23_gc_2d_hash_2d_table_2d_resize_21_
___H__23__23_smallest_2d_prime_2d_no_2d_less_2d_than
___H__23__23_gc_2d_hash_2d_table_3f_
___H__23__23_raise_2d_unbound_2d_table_2d_key_2d_exception
___H_unbound_2d_table_2d_key_2d_exception_2d_arguments
___H_unbound_2d_table_2d_key_2d_exception_2d_procedure
___H_unbound_2d_table_2d_key_2d_exception_3f_
___H__23__23_fail_2d_check_2d_unbound_2d_table_2d_key_2d_exception
___H__23__23_raise_2d_invalid_2d_hash_2d_number_2d_exception
___H_invalid_2d_hash_2d_number_2d_exception_2d_arguments
___H_invalid_2d_hash_2d_number_2d_exception_2d_procedure
___H_invalid_2d_hash_2d_number_2d_exception_3f_
___H__23__23_fail_2d_check_2d_invalid_2d_hash_2d_number_2d_exception
___H__23__23_generic_2d_hash ___H_string_2d_ci_3d__3f__2d_hash
___H_string_3d__3f__2d_hash ___H__23__23_string_3d__3f__2d_hash
___H_equal_3f__2d_hash ___H__23__23_equal_3f__2d_hash ___H_eqv_3f__2d_hash
___H__23__23_eqv_3f__2d_hash ___H_eq_3f__2d_hash ___H__23__23_eq_3f__2d_hash
___H_keyword_2d_hash ___H__23__23_keyword_2d_hash ___H_symbol_2d_hash
___H__23__23_symbol_2d_hash ___H_equal_3f_ ___H__23__23_equal_3f_
___H__23__23_bvector_2d_equal_3f_ ___H_eq_3f_ ___H__23__23_eq_3f_ ___H_eqv_3f_
___H__23__23_eqv_3f_ ___H__23__23_case_2d_memv ___H__23__23_quasi_2d_vector
___H__23__23_quasi_2d_list_2d__3e_vector ___H__23__23_quasi_2d_cons
___H__23__23_quasi_2d_list ___H__23__23_quasi_2d_append
___H__23__23_unbound_3f_ ___H__23__23_bignum_3f_ ___H__23__23_flonum_3f_
___H__23__23_foreign_3f_ ___H__23__23_return_3f_ ___H__23__23_promise_3f_
___H__23__23_continuation_3f_ ___H__23__23_frame_3f_ ___H__23__23_jazz_3f_
___H__23__23_meroon_3f_ ___H__23__23_values_3f_ ___H__23__23_structure_3f_
___H__23__23_cpxnum_3f_ ___H__23__23_ratnum_3f_ ___H__23__23_special_3f_
___H__23__23_subtyped_2e_bignum_3f_ ___H__23__23_subtyped_2e_flonum_3f_
___H__23__23_subtyped_2e_symbol_3f_ ___H__23__23_subtyped_2e_vector_3f_
___H__23__23_subtyped_2d_mutable_3f_ ___H__23__23_subtyped_3f_
___H__23__23_fixnum_3f_ ___H__23__23_subtype_2d_set_21_ ___H__23__23_subtype
___H__23__23_type_2d_cast ___H__23__23_type ___H__20___system
___H__23__23_gc_2d_hash_2d_table_2d_set_21_ ___H__23__23_table_2d_set_21_
___H__23__23_gc_2d_hash_2d_table_2d_rehash_21_ ___H__23__23_table_2d_ref
___H__23__23_gc_2d_hash_2d_table_2d_ref ___H__23__23_make_2d_table
___H__23__23_string_2d_ci_3d__3f__2d_hash ____20___system
_GLOBAL__sub_I_65535_0__system.c
Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1134 kB ( 0%) ggc
 phase parsing           :   0.11 ( 0%) usr   0.12 (14%) sys   0.23 ( 1%) wall    7383 kB ( 1%) ggc
 phase opt and generate  :  35.79 (100%) usr   0.73 (86%) sys  36.55 (99%) wall  513422 kB (98%) ggc
 garbage collection      :   0.08 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall       0 kB ( 0%) ggc
 dump files              :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 callgraph construction  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall    2337 kB ( 0%) ggc
 callgraph optimization  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall     399 kB ( 0%) ggc
 ipa dead code removal   :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 ipa inlining heuristics :   0.01 ( 0%) usr   0.01 ( 1%) sys   0.01 ( 0%) wall    1132 kB ( 0%) ggc
 ipa profile             :   0.00 ( 0%) usr   0.01 ( 1%) sys   0.00 ( 0%) wall    2688 kB ( 1%) ggc
 ipa pure const          :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 cfg construction        :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     416 kB ( 0%) ggc
 cfg cleanup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall      14 kB ( 0%) ggc
 trivially dead code     :   0.14 ( 0%) usr   0.00 ( 0%) sys   0.15 ( 0%) wall       0 kB ( 0%) ggc
 df scan insns           :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall      13 kB ( 0%) ggc
 df multiple defs        :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 df reaching defs        :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 df live regs            :   0.29 ( 1%) usr   0.00 ( 0%) sys   0.26 ( 1%) wall       0 kB ( 0%) ggc
 df live&initialized regs:   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.24 ( 1%) usr   0.01 ( 1%) sys   0.24 ( 1%) wall   12426 kB ( 2%) ggc
 register information    :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis          :   0.21 ( 1%) usr   0.00 ( 0%) sys   0.19 ( 1%) wall   23934 kB ( 5%) ggc
 alias stmt walking      :   0.33 ( 1%) usr   0.01 ( 1%) sys   0.28 ( 1%) wall     609 kB ( 0%) ggc
 register scan           :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     104 kB ( 0%) ggc
 rebuild jump labels     :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing           :   0.03 ( 0%) usr   0.03 ( 4%) sys   0.06 ( 0%) wall    1743 kB ( 0%) ggc
 lexical analysis        :   0.03 ( 0%) usr   0.03 ( 4%) sys   0.03 ( 0%) wall       0 kB ( 0%) ggc
 parser (global)         :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    1477 kB ( 0%) ggc
 parser function body    :   0.04 ( 0%) usr   0.06 ( 7%) sys   0.10 ( 0%) wall    3815 kB ( 1%) ggc
 inline parameters       :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      89 kB ( 0%) ggc
 tree gimplify           :   0.03 ( 0%) usr   0.01 ( 1%) sys   0.02 ( 0%) wall    5057 kB ( 1%) ggc
 tree CFG construction   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1743 kB ( 0%) ggc
 tree CFG cleanup        :   0.16 ( 0%) usr   0.00 ( 0%) sys   0.24 ( 1%) wall     300 kB ( 0%) ggc
 tree copy propagation   :   0.28 ( 1%) usr   0.00 ( 0%) sys   0.31 ( 1%) wall    3211 kB ( 1%) ggc
 tree PTA                :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     217 kB ( 0%) ggc
 tree PHI insertion      :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    2191 kB ( 0%) ggc
 tree SSA rewrite        :   0.19 ( 1%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall   17777 kB ( 3%) ggc
 tree SSA other          :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      18 kB ( 0%) ggc
 tree SSA incremental    :   0.23 ( 1%) usr   0.01 ( 1%) sys   0.26 ( 1%) wall   27481 kB ( 5%) ggc
 tree operand scan       :   0.02 ( 0%) usr   0.02 ( 2%) sys   0.05 ( 0%) wall   15630 kB ( 3%) ggc
 dominator optimization  :   0.22 ( 1%) usr   0.01 ( 1%) sys   0.22 ( 1%) wall   27417 kB ( 5%) ggc
 tree CCP                :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall     491 kB ( 0%) ggc
 tree PHI const/copy prop:   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     127 kB ( 0%) ggc
 tree split crit edges   :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     743 kB ( 0%) ggc
 tree reassociation      :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       7 kB ( 0%) ggc
 tree FRE                :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall    2875 kB ( 1%) ggc
 tree code sinking       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree forward propagate  :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     336 kB ( 0%) ggc
 tree conservative DCE   :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall      99 kB ( 0%) ggc
 tree aggressive DCE     :   0.10 ( 0%) usr   0.00 ( 0%) sys   0.08 ( 0%) wall      20 kB ( 0%) ggc
 tree DSE                :   2.80 ( 8%) usr   0.00 ( 0%) sys   2.80 ( 8%) wall       0 kB ( 0%) ggc
 tree loop invariant motion:   0.16 ( 0%) usr   0.03 ( 4%) sys   0.19 ( 1%) wall   64219 kB (12%) ggc
 scev constant prop      :   0.29 ( 1%) usr   0.00 ( 0%) sys   0.27 ( 1%) wall   12074 kB ( 2%) ggc
 complete unrolling      :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      16 kB ( 0%) ggc
 tree iv optimization    :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall     932 kB ( 0%) ggc
 tree SSA uncprop        :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 tree rename SSA copies  :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 dominance frontiers     :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall       0 kB ( 0%) ggc
 out of ssa              :   5.90 (16%) usr   0.50 (59%) sys   6.41 (17%) wall      26 kB ( 0%) ggc
 expand vars             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     866 kB ( 0%) ggc
 expand                  :   0.39 ( 1%) usr   0.02 ( 2%) sys   0.40 ( 1%) wall   87038 kB (17%) ggc
 post expand cleanups    :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     322 kB ( 0%) ggc
 forward prop            :   0.33 ( 1%) usr   0.00 ( 0%) sys   0.33 ( 1%) wall   14733 kB ( 3%) ggc
 CSE                     :   7.53 (21%) usr   0.01 ( 1%) sys   7.53 (20%) wall   30934 kB ( 6%) ggc
 dead code elimination   :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall       0 kB ( 0%) ggc
 dead store elim1        :   0.37 ( 1%) usr   0.01 ( 1%) sys   0.36 ( 1%) wall    7276 kB ( 1%) ggc
 dead store elim2        :   1.73 ( 5%) usr   0.00 ( 0%) sys   1.71 ( 5%) wall   18715 kB ( 4%) ggc
 loop init               :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     713 kB ( 0%) ggc
 loop invariant motion   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall      27 kB ( 0%) ggc
 branch prediction       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     277 kB ( 0%) ggc
 combiner                :   0.06 ( 0%) usr   0.00 ( 0%) sys   0.11 ( 0%) wall     913 kB ( 0%) ggc
 integrated RA           :   0.87 ( 2%) usr   0.00 ( 0%) sys   0.99 ( 3%) wall   48097 kB ( 9%) ggc
 LRA non-specific        :   1.61 ( 4%) usr   0.01 ( 1%) sys   1.63 ( 4%) wall   37254 kB ( 7%) ggc
 LRA virtuals elimination:   0.13 ( 0%) usr   0.00 ( 0%) sys   0.10 ( 0%) wall   15481 kB ( 3%) ggc
 LRA reload inheritance  :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      11 kB ( 0%) ggc
 LRA create live ranges  :   0.32 ( 1%) usr   0.01 ( 1%) sys   0.29 ( 1%) wall    7642 kB ( 1%) ggc
 LRA hard reg assignment :   0.69 ( 2%) usr   0.01 ( 1%) sys   0.73 ( 2%) wall       0 kB ( 0%) ggc
 reload CSE regs         :   5.74 (16%) usr   0.00 ( 0%) sys   5.73 (16%) wall   12325 kB ( 2%) ggc
 thread pro- & epilogue  :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall     465 kB ( 0%) ggc
 combine stack adjustments:   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 hard reg cprop          :   0.33 ( 1%) usr   0.00 ( 0%) sys   0.34 ( 1%) wall       4 kB ( 0%) ggc
 scheduling 2            :   1.79 ( 5%) usr   0.01 ( 1%) sys   1.77 ( 5%) wall     299 kB ( 0%) ggc
 machine dep reorg       :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 shorten branches        :   0.21 ( 1%) usr   0.00 ( 0%) sys   0.20 ( 1%) wall       0 kB ( 0%) ggc
 final                   :   0.36 ( 1%) usr   0.00 ( 0%) sys   0.34 ( 1%) wall    1508 kB ( 0%) ggc
 variable output         :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     146 kB ( 0%) ggc
 straight-line strength reduction:   0.06 ( 0%) usr   0.00 ( 0%) sys   0.06 ( 0%) wall      16 kB ( 0%) ggc
 rest of compilation     :   0.35 ( 1%) usr   0.02 ( 2%) sys   0.37 ( 1%) wall     991 kB ( 0%) ggc
 remove unused locals    :   0.07 ( 0%) usr   0.00 ( 0%) sys   0.07 ( 0%) wall       0 kB ( 0%) ggc
 address taken           :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo        :   0.19 ( 1%) usr   0.00 ( 0%) sys   0.18 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 :  35.90             0.85            36.79              521957 kB

The "phase opt and generate" phase accounts for nearly all of the CPU time
(35.79 s of 35.90 s usr) and nearly all of the GC-allocated memory (513422 kB
of 521957 kB).  With somewhat larger files, RAM usage climbs above 80 GB.

Including _system.i with this report.



* [Bug other/64928] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: lucier at math dot purdue.edu @ 2015-02-03 21:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #1 from lucier at math dot purdue.edu ---
Created attachment 34660
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34660&action=edit
Input file for bug



* [Bug other/64928] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: pinskia at gcc dot gnu.org @ 2015-02-03 21:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that "phase opt and generate" is a top-level timing area, not a single
pass.
The passes that take most of the time are:
 tree DSE                :   2.80 ( 8%) usr   0.00 ( 0%) sys   2.80 ( 8%) wall       0 kB ( 0%) ggc
 out of ssa              :   5.90 (16%) usr   0.50 (59%) sys   6.41 (17%) wall      26 kB ( 0%) ggc
 CSE                     :   7.53 (21%) usr   0.01 ( 1%) sys   7.53 (20%) wall   30934 kB ( 6%) ggc
 reload CSE regs         :   5.74 (16%) usr   0.00 ( 0%) sys   5.73 (16%) wall   12325 kB ( 2%) ggc
 scheduling 2            :   1.79 ( 5%) usr   0.01 ( 1%) sys   1.77 ( 5%) wall     299 kB ( 0%) ggc
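
Picking the expensive passes out of a report this long by eye is error-prone, so a small helper can be handy. The following is only a sketch: it assumes each per-pass row has first been rejoined onto a single line (mail wrapping splits them in two), and the names `timevar_row` and `parse_timevar_row` are made up for illustration and are not part of GCC.

```c
#include <stdio.h>
#include <string.h>

/* One per-pass row of the timing report (hypothetical helper type). */
struct timevar_row {
    char name[64];
    double usr, sys, wall;   /* seconds */
    long ggc_kb;             /* GC-allocated memory in kB */
};

/* Parses one unwrapped row such as
     " CSE   :   7.53 (21%) usr   0.01 ( 1%) sys   7.53 (20%) wall   30934 kB ( 6%) ggc"
   Returns 1 on success and 0 for anything else (for example the TOTAL row,
   which carries no percentages). */
int parse_timevar_row(const char *line, struct timevar_row *row)
{
    const char *colon = strchr(line, ':');
    if (colon == NULL)
        return 0;

    /* Copy the pass name without its leading and trailing padding. */
    while (*line == ' ')
        line++;
    size_t len = (size_t)(colon - line);
    while (len > 0 && line[len - 1] == ' ')
        len--;
    if (len == 0 || len >= sizeof row->name)
        return 0;
    memcpy(row->name, line, len);
    row->name[len] = '\0';

    /* %*d skips the percentage; %d also skips the padding in "( 1%)". */
    int n = sscanf(colon + 1,
                   " %lf (%*d%%) usr %lf (%*d%%) sys %lf (%*d%%) wall %ld kB",
                   &row->usr, &row->sys, &row->wall, &row->ggc_kb);
    return n == 4;
}
```

Feeding every row through `parse_timevar_row` and sorting the results by `usr` or by `ggc_kb` gives the kind of short list quoted above without manual scanning.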



* [Bug other/64928] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: pinskia at gcc dot gnu.org @ 2015-02-03 21:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |compile-time-hog

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think this is just an issue with computed gotos (indirect gotos).
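
For readers unfamiliar with the construct: computed gotos are the GNU C "labels as values" extension, where `&&label` yields the address of a label and `goto *expr;` jumps to it. The sketch below is a toy dispatcher written for illustration only; it is not taken from _system.c, and the opcode names and `run_program` are invented. Since an indirect goto may in principle reach any label whose address is taken, a function with many such gotos and many labels can have a very dense control-flow graph, which is one plausible (but not confirmed in this thread) reason for the slow passes.

```c
#include <stddef.h>

/* Opcodes for a toy interpreter (hypothetical, for illustration only). */
enum { OP_INC = 0, OP_DOUBLE = 1, OP_HALT = 2 };

/* Runs `program` on an accumulator that starts at 0 and returns the result.
   The program must end with OP_HALT.  Requires GCC or a compatible compiler,
   because &&label and goto *ptr are GNU extensions. */
long run_program(const unsigned char *program)
{
    /* Table of label addresses, indexed by opcode. */
    static void *const dispatch[] = { &&do_inc, &&do_double, &&do_halt };
    long acc = 0;
    size_t pc = 0;

    /* Every handler ends with its own indirect jump to the next handler. */
    goto *dispatch[program[pc++]];

do_inc:
    acc += 1;
    goto *dispatch[program[pc++]];

do_double:
    acc *= 2;
    goto *dispatch[program[pc++]];

do_halt:
    return acc;
}
```

With three handlers this is harmless; the concern raised here is about generated code where a single function contains a large number of such jumps and targets.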



* [Bug other/64928] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: lucier at math dot purdue.edu @ 2015-02-03 21:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #4 from lucier at math dot purdue.edu ---
On 02/03/2015 04:32 PM, pinskia at gcc dot gnu.org wrote:
> --- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
> Note phase opt and generate is a toplevel time area.
> The passes which take most of the time are:

I'm also concerned about excessive memory usage; the largest passes
(> 20 MB) are

  alias analysis          :   0.21 ( 1%) usr   0.00 ( 0%) sys   0.19 ( 1%) wall   23934 kB ( 5%) ggc
  tree SSA incremental    :   0.23 ( 1%) usr   0.01 ( 1%) sys   0.26 ( 1%) wall   27481 kB ( 5%) ggc
  dominator optimization  :   0.22 ( 1%) usr   0.01 ( 1%) sys   0.22 ( 1%) wall   27417 kB ( 5%) ggc
  tree loop invariant motion:   0.16 ( 0%) usr   0.03 ( 4%) sys   0.19 ( 1%) wall   64219 kB (12%) ggc
  expand                  :   0.39 ( 1%) usr   0.02 ( 2%) sys   0.40 ( 1%) wall   87038 kB (17%) ggc
  CSE                     :   7.53 (21%) usr   0.01 ( 1%) sys   7.53 (20%) wall   30934 kB ( 6%) ggc
  integrated RA           :   0.87 ( 2%) usr   0.00 ( 0%) sys   0.99 ( 3%) wall   48097 kB ( 9%) ggc
  LRA non-specific        :   1.61 ( 4%) usr   0.01 ( 1%) sys   1.63 ( 4%) wall   37254 kB ( 7%) ggc

This also affects the 4.8 branch and the mainline.



* [Bug other/64928] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: lucier at math dot purdue.edu @ 2015-02-06  5:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #5 from lucier at math dot purdue.edu ---
Created attachment 34681
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34681&action=edit
_io.i.gz: larger test file

With this compiler:

firefly:~/Downloads/gambit/lib> /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/pkgs/gcc-mainline/bin/gcc
COLLECT_LTO_WRAPPER=/pkgs/gcc-mainline/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../../gcc-devel/configure --prefix=/pkgs/gcc-mainline
--enable-languages=c --enable-checking=release
Thread model: posix
gcc version 5.0.0 20150206 (experimental) [trunk revision 220467] (GCC) 

and the input file _io.c, I find

/pkgs/gcc-mainline/bin/gcc -Q -save-temps -Wno-unused -Wno-write-strings -O1
-fno-math-errno -fschedule-insns2 -fno-strict-aliasing -fno-trapping-math
-fwrapv -fomit-frame-pointer -fPIC -fno-common -mieee-fp  -fprofile-arcs
-ftest-coverage  -I"../include" -c -o "_io.o" -I. -DHAVE_CONFIG_H
-D___GAMBCDIR="\"/usr/local/Gambit-C\"" -D___SYS_TYPE_CPU="\"x86_64\""
-D___SYS_TYPE_VENDOR="\"unknown\"" -D___SYS_TYPE_OS="\"linux-gnu\""
-D___CONFIGURE_COMMAND="\"./configure 'CC=/pkgs/gcc-mainline/bin/gcc -Q
-save-temps' '--enable-coverage' '--enable-track-scheme'"\"
-D___OBJ_EXTENSION="\".o\"" -D___EXE_EXTENSION="\"\"" -D___BAT_EXTENSION="\"\""
-D___PRIMAL _io.c -D___LIBRARY

Execution times (seconds)
 phase setup             :   0.78 (100%) usr   0.04 (100%) sys   0.83 (100%) wall  156905 kB (100%) ggc
 TOTAL                 :   0.78             0.04             0.83              156922 kB
 btowc wctob mbrlen __signbitf __signbit __signbitl ___H__20___io
___H__23__23_fail_2d_check_2d_datum_2d_parsing_2d_exception
___H_datum_2d_parsing_2d_exception_3f_
___H_datum_2d_parsing_2d_exception_2d_kind
___H_datum_2d_parsing_2d_exception_2d_readenv
___H_datum_2d_parsing_2d_exception_2d_parameters
___H__23__23_raise_2d_datum_2d_parsing_2d_exception
___H__23__23_fail_2d_check_2d_unterminated_2d_process_2d_exception
___H_unterminated_2d_process_2d_exception_3f_
___H_unterminated_2d_process_2d_exception_2d_procedure
___H_unterminated_2d_process_2d_exception_2d_arguments
___H__23__23_raise_2d_unterminated_2d_process_2d_exception
___H__23__23_fail_2d_check_2d_nonempty_2d_input_2d_port_2d_character_2d_buffer_2d_exception
___H_nonempty_2d_input_2d_port_2d_character_2d_buffer_2d_exception_3f_
___H_nonempty_2d_input_2d_port_2d_character_2d_buffer_2d_exception_2d_procedure
___H_nonempty_2d_input_2d_port_2d_character_2d_buffer_2d_exception_2d_arguments
___H__23__23_raise_2d_nonempty_2d_input_2d_port_2d_character_2d_buffer_2d_exception
___H__23__23_fail_2d_check_2d_no_2d_such_2d_file_2d_or_2d_directory_2d_exception
___H_no_2d_such_2d_file_2d_or_2d_directory_2d_exception_3f_
___H_no_2d_such_2d_file_2d_or_2d_directory_2d_exception_2d_procedure
___H_no_2d_such_2d_file_2d_or_2d_directory_2d_exception_2d_arguments
___H__23__23_raise_2d_no_2d_such_2d_file_2d_or_2d_directory_2d_exception
___H__23__23_raise_2d_os_2d_io_2d_exception
___H__23__23_raise_2d_io_2d_exception ___H__23__23_fail_2d_check_2d_settings
___H__23__23_fail_2d_check_2d_exact_2d_integer_2d_or_2d_string_2d_or_2d_settings
___H__23__23_fail_2d_check_2d_string_2d_or_2d_ip_2d_address
___H__23__23_make_2d_writeenv ___H__23__23_make_2d_readenv
___H__23__23_readenv_2d_current_2d_filepos
___H__23__23_readenv_2d_relative_2d_filepos ___H__23__23_make_2d_psettings
___H__23__23_parse_2d_psettings_21_ ___H__23__23_psettings_2d__3e_roptions
___H__23__23_psettings_2d__3e_woptions
___H__23__23_psettings_2d__3e_input_2d_readtable
___H__23__23_psettings_2d__3e_output_2d_readtable
___H__23__23_psettings_2d_options_2d__3e_options
___H__23__23_psettings_2d__3e_device_2d_flags
___H__23__23_psettings_2d__3e_permissions
___H__23__23_psettings_2d__3e_output_2d_width ___H__23__23_port_3f_
___H_port_3f_ ___H__23__23_input_2d_port_3f_ ___H_input_2d_port_3f_
___H__23__23_output_2d_port_3f_ ___H_output_2d_port_3f_
___H__23__23_fail_2d_check_2d_port ___H__23__23_fail_2d_check_2d_input_2d_port
___H__23__23_fail_2d_check_2d_output_2d_port
___H__23__23_fail_2d_check_2d_character_2d_input_2d_port
___H__23__23_fail_2d_check_2d_character_2d_output_2d_port
___H__23__23_fail_2d_check_2d_byte_2d_port
___H__23__23_fail_2d_check_2d_byte_2d_input_2d_port
___H__23__23_fail_2d_check_2d_byte_2d_output_2d_port
___H__23__23_fail_2d_check_2d_device_2d_input_2d_port
___H__23__23_fail_2d_check_2d_device_2d_output_2d_port
___H__23__23_make_2d_io_2d_condvar ___H__23__23_io_2d_condvar_3f_
___H__23__23_io_2d_condvar_2d_for_2d_writing_3f_
___H__23__23_io_2d_condvar_2d_port
___H__23__23_io_2d_condvar_2d_port_2d_set_21_
___H__23__23_make_2d_dummy_2d_port ___H_open_2d_dummy
___H__23__23_make_2d_device_2d_port ___H__23__23_make_2d_rdevice_2d_condvar
___H__23__23_make_2d_wdevice_2d_condvar
___H__23__23_make_2d_device_2d_port_2d_from_2d_single_2d_device
___H__23__23_close_2d_device ___H__23__23_input_2d_port_2d_byte_2d_position
___H_input_2d_port_2d_byte_2d_position
___H__23__23_output_2d_port_2d_byte_2d_position
___H_output_2d_port_2d_byte_2d_position
___H__23__23_device_2d_port_2d_wait_2d_for_2d_input_21_
___H__23__23_device_2d_port_2d_wait_2d_for_2d_output_21_
___H__23__23_char_2d_rbuf_2d_fill ___H__23__23_byte_2d_rbuf_2d_fill
___H__23__23_char_2d_wbuf_2d_drain_2d_no_2d_reset
___H__23__23_char_2d_wbuf_2d_drain
___H__23__23_byte_2d_wbuf_2d_drain_2d_no_2d_reset
___H__23__23_byte_2d_wbuf_2d_drain ___H__23__23_vect_2d_port_2d_options
___H__23__23_fail_2d_check_2d_vector_2d_input_2d_port
___H__23__23_fail_2d_check_2d_vector_2d_output_2d_port
___H__23__23_fail_2d_check_2d_vector_2d_or_2d_settings
___H__23__23_subvector_2d__3e_fifo ___H__23__23_fifo_2d__3e_vector
___H__23__23_open_2d_vector_2d_generic ___H__23__23_open_2d_vector
___H_open_2d_vector ___H__23__23_make_2d_vector_2d_pipe_2d_port
___H__23__23_open_2d_vector_2d_pipe_2d_generic
___H__23__23_open_2d_vector_2d_pipe ___H_open_2d_vector_2d_pipe
___H__23__23_open_2d_input_2d_vector ___H_open_2d_input_2d_vector
___H__23__23_open_2d_output_2d_vector ___H_open_2d_output_2d_vector
___H__23__23_get_2d_output_2d_vector ___H_get_2d_output_2d_vector
___H_call_2d_with_2d_input_2d_vector ___H_call_2d_with_2d_output_2d_vector
___H_with_2d_input_2d_from_2d_vector ___H_with_2d_output_2d_to_2d_vector
___H__23__23_make_2d_vector_2d_port
___H__23__23_fail_2d_check_2d_string_2d_input_2d_port
___H__23__23_fail_2d_check_2d_string_2d_output_2d_port
___H__23__23_fail_2d_check_2d_string_2d_or_2d_settings
___H__23__23_substring_2d__3e_fifo ___H__23__23_fifo_2d__3e_string
___H__23__23_open_2d_string_2d_generic ___H__23__23_open_2d_string
___H_open_2d_string ___H__23__23_make_2d_string_2d_pipe_2d_port
___H__23__23_open_2d_string_2d_pipe_2d_generic
___H__23__23_open_2d_string_2d_pipe ___H_open_2d_string_2d_pipe
___H__23__23_open_2d_input_2d_string ___H_open_2d_input_2d_string
___H__23__23_open_2d_output_2d_string ___H_open_2d_output_2d_string
___H__23__23_get_2d_output_2d_string ___H_get_2d_output_2d_string
___H_call_2d_with_2d_input_2d_string ___H_call_2d_with_2d_output_2d_string
___H_with_2d_input_2d_from_2d_string ___H_with_2d_output_2d_to_2d_string
___H__23__23_make_2d_string_2d_port
___H__23__23_fail_2d_check_2d_u8vector_2d_input_2d_port
___H__23__23_fail_2d_check_2d_u8vector_2d_output_2d_port
___H__23__23_fail_2d_check_2d_u8vector_2d_or_2d_settings
___H__23__23_subu8vector_2d__3e_fifo ___H__23__23_fifo_2d__3e_u8vector
___H__23__23_open_2d_u8vector_2d_generic ___H__23__23_open_2d_u8vector
___H_open_2d_u8vector ___H__23__23_make_2d_u8vector_2d_pipe_2d_port
___H__23__23_open_2d_u8vector_2d_pipe_2d_generic
___H__23__23_open_2d_u8vector_2d_pipe ___H_open_2d_u8vector_2d_pipe
___H__23__23_open_2d_input_2d_u8vector ___H_open_2d_input_2d_u8vector
___H__23__23_open_2d_output_2d_u8vector ___H_open_2d_output_2d_u8vector
___H__23__23_get_2d_output_2d_u8vector ___H_get_2d_output_2d_u8vector
___H_call_2d_with_2d_input_2d_u8vector ___H_call_2d_with_2d_output_2d_u8vector
___H_with_2d_input_2d_from_2d_u8vector ___H_with_2d_output_2d_to_2d_u8vector
___H__23__23_make_2d_u8vector_2d_port ___H__23__23_port_2d_of_2d_kind_3f_
___H__23__23_port_2d_kind ___H__23__23_port_2d_device ___H__23__23_port_2d_name
___H__23__23_read ___H_read
___H__23__23_write_2d_generic_2d_to_2d_character_2d_port ___H__23__23_write
___H_write ___H__23__23_display ___H_display ___H__23__23_pretty_2d_print
___H_pretty_2d_print ___H__23__23_print ___H_print ___H_println
___H__23__23_newline ___H_newline ___H__23__23_flush_2d_input_2d_buffering
___H__23__23_force_2d_output ___H_force_2d_output
___H__23__23_close_2d_input_2d_port ___H_close_2d_input_2d_port
___H__23__23_close_2d_output_2d_port ___H_close_2d_output_2d_port
___H__23__23_close_2d_port ___H_close_2d_port ___H_input_2d_port_2d_readtable
___H_input_2d_port_2d_readtable_2d_set_21_ ___H_output_2d_port_2d_readtable
___H_output_2d_port_2d_readtable_2d_set_21_
___H__23__23_input_2d_port_2d_timeout_2d_set_21_
___H_input_2d_port_2d_timeout_2d_set_21_
___H__23__23_output_2d_port_2d_timeout_2d_set_21_
___H_output_2d_port_2d_timeout_2d_set_21_
___H__23__23_port_2d_io_2d_exception_2d_handler_2d_set_21_
___H_port_2d_io_2d_exception_2d_handler_2d_set_21_
___H__23__23_input_2d_port_2d_char_2d_position
___H_input_2d_port_2d_char_2d_position
___H__23__23_output_2d_port_2d_char_2d_position
___H_output_2d_port_2d_char_2d_position
___H__23__23_input_2d_port_2d_line_2d_set_21_
___H__23__23_input_2d_port_2d_line ___H_input_2d_port_2d_line
___H__23__23_input_2d_port_2d_column_2d_set_21_
___H__23__23_input_2d_port_2d_column ___H_input_2d_port_2d_column
___H__23__23_output_2d_port_2d_line_2d_set_21_
___H__23__23_output_2d_port_2d_line ___H_output_2d_port_2d_line
___H__23__23_output_2d_port_2d_column_2d_set_21_
___H__23__23_output_2d_port_2d_column ___H_output_2d_port_2d_column
___H__23__23_output_2d_port_2d_width ___H_output_2d_port_2d_width
___H__23__23_object_2d__3e_truncated_2d_string
___H__23__23_object_2d__3e_string ___H_object_2d__3e_string
___H__23__23_string_2d__3e_limited_2d_string
___H__23__23_force_2d_limited_2d_string_21_
___H__23__23_input_2d_port_2d_characters_2d_buffered
___H_input_2d_port_2d_characters_2d_buffered ___H__23__23_char_2d_ready_3f_
___H_char_2d_ready_3f_ ___H__23__23_peek_2d_char ___H_peek_2d_char
___H__23__23_read_2d_char ___H_read_2d_char ___H__23__23_read_2d_substring
___H_read_2d_substring ___H__23__23_read_2d_line ___H_read_2d_line
___H__23__23_read_2d_all ___H_read_2d_all
___H__23__23_read_2d_all_2d_as_2d_a_2d_begin_2d_expr_2d_from_2d_path
___H__23__23_read_2d_all_2d_as_2d_a_2d_begin_2d_expr_2d_from_2d_psettings
___H__23__23_read_2d_all_2d_as_2d_a_2d_begin_2d_expr_2d_from_2d_port
___H__23__23_write_2d_char ___H_write_2d_char ___H__23__23_write_2d_substring
___H_write_2d_substring ___H__23__23_write_2d_string
___H__23__23_input_2d_port_2d_bytes_2d_buffered
___H_input_2d_port_2d_bytes_2d_buffered ___H__23__23_read_2d_u8 ___H_read_2d_u8
___H__23__23_read_2d_subu8vector ___H_read_2d_subu8vector
___H__23__23_write_2d_u8 ___H_write_2d_u8 ___H__23__23_write_2d_subu8vector
___H_write_2d_subu8vector ___H__23__23_options_2d_set_21_
___H__23__23_port_2d_settings_2d_set_21_ ___H_port_2d_settings_2d_set_21_
___H__23__23_fail_2d_check_2d_tty_2d_port ___H__23__23_tty_3f_ ___H_tty_3f_
___H__23__23_tty_2d_type_2d_set_21_ ___H_tty_2d_type_2d_set_21_
___H__23__23_tty_2d_text_2d_attributes_2d_set_21_
___H_tty_2d_text_2d_attributes_2d_set_21_ ___H__23__23_tty_2d_history
___H_tty_2d_history ___H__23__23_tty_2d_history_2d_set_21_
___H_tty_2d_history_2d_set_21_
___H__23__23_tty_2d_history_2d_max_2d_length_2d_set_21_
___H_tty_2d_history_2d_max_2d_length_2d_set_21_
___H__23__23_tty_2d_paren_2d_balance_2d_duration_2d_set_21_
___H_tty_2d_paren_2d_balance_2d_duration_2d_set_21_
___H__23__23_tty_2d_mode_2d_set_21_ ___H_tty_2d_mode_2d_set_21_
___H__23__23_fail_2d_check_2d_process_2d_port
___H__23__23_make_2d_process_2d_psettings
___H__23__23_open_2d_process_2d_generic ___H__23__23_open_2d_process
___H_open_2d_process ___H__23__23_open_2d_input_2d_process
___H_open_2d_input_2d_process ___H__23__23_open_2d_output_2d_process
___H_open_2d_output_2d_process ___H_call_2d_with_2d_input_2d_process
___H_call_2d_with_2d_output_2d_process ___H_with_2d_input_2d_from_2d_process
___H_with_2d_output_2d_to_2d_process ___H__23__23_process_2d_pid
___H_process_2d_pid ___H__23__23_process_2d_status ___H_process_2d_status
___H__23__23_fail_2d_check_2d_host_2d_info ___H_host_2d_info_3f_
___H_host_2d_info_2d_name ___H_host_2d_info_2d_aliases
___H_host_2d_info_2d_addresses ___H__23__23_host_2d_info ___H_host_2d_info
___H__23__23_host_2d_name ___H_host_2d_name
___H__23__23_string_2d_or_2d_ip_2d_address_3f_ ___H__23__23_ip_2d_address_3f_
___H__23__23_fail_2d_check_2d_service_2d_info ___H_service_2d_info_3f_
___H_service_2d_info_2d_name ___H_service_2d_info_2d_aliases
___H_service_2d_info_2d_port_2d_number ___H_service_2d_info_2d_protocol
___H__23__23_service_2d_info ___H_service_2d_info
___H__23__23_fail_2d_check_2d_protocol_2d_info ___H_protocol_2d_info_3f_
___H_protocol_2d_info_2d_name ___H_protocol_2d_info_2d_aliases
___H_protocol_2d_info_2d_number ___H__23__23_protocol_2d_info
___H_protocol_2d_info ___H__23__23_fail_2d_check_2d_network_2d_info
___H_network_2d_info_3f_ ___H_network_2d_info_2d_name
___H_network_2d_info_2d_aliases ___H_network_2d_info_2d_number
___H__23__23_network_2d_info ___H_network_2d_info
___H__23__23_fail_2d_check_2d_tcp_2d_client_2d_port
___H__23__23_make_2d_tcp_2d_psettings
___H__23__23_make_2d_tcp_2d_client_2d_port ___H__23__23_open_2d_tcp_2d_client
___H_open_2d_tcp_2d_client ___H__23__23_fail_2d_check_2d_socket_2d_info
___H_socket_2d_info_3f_ ___H_socket_2d_info_2d_family
___H_socket_2d_info_2d_port_2d_number ___H_socket_2d_info_2d_address
___H__23__23_socket_2d_info_2d_setup_21_
___H__23__23_tcp_2d_client_2d_socket_2d_info
___H__23__23_tcp_2d_client_2d_self_2d_socket_2d_info
___H_tcp_2d_client_2d_self_2d_socket_2d_info
___H__23__23_tcp_2d_client_2d_peer_2d_socket_2d_info
___H_tcp_2d_client_2d_peer_2d_socket_2d_info
___H__23__23_fail_2d_check_2d_address_2d_info ___H_address_2d_info_3f_
___H_address_2d_info_2d_family ___H_address_2d_info_2d_socket_2d_type
___H_address_2d_info_2d_protocol ___H_address_2d_info_2d_socket_2d_info
___H__23__23_net_2d_family_2d_encode ___H__23__23_net_2d_family_2d_decode
___H__23__23_net_2d_socket_2d_type_2d_encode
___H__23__23_net_2d_socket_2d_type_2d_decode
___H__23__23_net_2d_protocol_2d_encode ___H__23__23_net_2d_protocol_2d_decode
___H__23__23_address_2d_info_2d_setup_21_ ___H__23__23_address_2d_infos
___H_address_2d_infos ___H__23__23_fail_2d_check_2d_tcp_2d_server_2d_port
___H__23__23_make_2d_tcp_2d_server_2d_port
___H__23__23_process_2d_tcp_2d_server_2d_psettings
___H__23__23_open_2d_tcp_2d_server_2d_aux ___H__23__23_open_2d_tcp_2d_server
___H_open_2d_tcp_2d_server ___H__23__23_tcp_2d_server_2d_socket_2d_info
___H_tcp_2d_server_2d_socket_2d_info
___H__23__23_string_2d__3e_address_2d_and_2d_port_2d_number
___H__23__23_fail_2d_check_2d_directory_2d_port
___H__23__23_make_2d_directory_2d_psettings
___H__23__23_make_2d_directory_2d_port ___H__23__23_open_2d_directory
___H_open_2d_directory ___H__23__23_fail_2d_check_2d_event_2d_queue_2d_port
___H__23__23_make_2d_event_2d_queue_2d_port ___H__23__23_open_2d_event_2d_queue
___H_open_2d_event_2d_queue ___H__23__23_make_2d_path_2d_psettings
___H__23__23_make_2d_input_2d_path_2d_psettings
___H__23__23_open_2d_file_2d_generic
___H__23__23_open_2d_file_2d_generic_2d_from_2d_psettings
___H__23__23_path_2d_reference ___H__23__23_open_2d_file ___H_open_2d_file
___H__23__23_open_2d_input_2d_file ___H_open_2d_input_2d_file
___H__23__23_open_2d_output_2d_file ___H_open_2d_output_2d_file
___H_call_2d_with_2d_input_2d_file ___H_call_2d_with_2d_output_2d_file
___H_with_2d_input_2d_from_2d_file ___H_with_2d_output_2d_to_2d_file
___H_with_2d_input_2d_from_2d_port ___H_with_2d_output_2d_to_2d_port
___H__23__23_open_2d_predefined ___H_console_2d_port
___H__23__23_open_2d_all_2d_predefined
___H__23__23_force_2d_output_2d_on_2d_predefined ___H__23__23_make_2d_filepos
___H__23__23_filepos_2d_line ___H__23__23_filepos_2d_col
___H__23__23_fail_2d_check_2d_readtable ___H__23__23_readtable_3f_
___H_readtable_3f_ ___H__23__23_readtable_2d_copy_2d_shallow
___H__23__23_readtable_2d_copy ___H_readtable_2d_case_2d_conversion_3f_
___H_readtable_2d_case_2d_conversion_3f__2d_set
___H_readtable_2d_keywords_2d_allowed_3f_
___H_readtable_2d_keywords_2d_allowed_3f__2d_set
___H_readtable_2d_sharing_2d_allowed_3f_
___H_readtable_2d_sharing_2d_allowed_3f__2d_set
___H_readtable_2d_eval_2d_allowed_3f_
___H_readtable_2d_eval_2d_allowed_3f__2d_set
___H_readtable_2d_write_2d_extended_2d_read_2d_macros_3f_
___H_readtable_2d_write_2d_extended_2d_read_2d_macros_3f__2d_set
___H_readtable_2d_write_2d_cdr_2d_read_2d_macros_3f_
___H_readtable_2d_write_2d_cdr_2d_read_2d_macros_3f__2d_set
___H_readtable_2d_max_2d_write_2d_level
___H_readtable_2d_max_2d_write_2d_level_2d_set
___H_readtable_2d_max_2d_write_2d_length
___H_readtable_2d_max_2d_write_2d_length_2d_set
___H_readtable_2d_max_2d_unescaped_2d_char
___H_readtable_2d_max_2d_unescaped_2d_char_2d_set
___H_readtable_2d_comment_2d_handler
___H_readtable_2d_comment_2d_handler_2d_set ___H_readtable_2d_start_2d_syntax
___H_readtable_2d_start_2d_syntax_2d_set
___H__23__23_extract_2d_language_2d_and_2d_tail
___H__23__23_readtable_2d_setup_2d_for_2d_language_21_
___H__23__23_readtable_2d_setup_2d_for_2d_standard_2d_level_21_
___H__23__23_make_2d_readtable_2d_parameter ___H__23__23_start_2d_main
___H__23__23_make_2d_marktable ___H__23__23_marktable_2d_mark_21_
___H__23__23_marktable_2d_lookup_21_ ___H__23__23_marktable_2d_save
___H__23__23_marktable_2d_restore_21_
___H__23__23_might_2d_write_2d_differently_3f_ ___H__23__23_default_2d_wr
___H__23__23_wr_2d_str ___H__23__23_wr_2d_substr ___H__23__23_wr_2d_ch
___H__23__23_wr_2d_filler ___H__23__23_wr_2d_spaces ___H__23__23_wr_2d_indent
___H__23__23_shifted_2d_column ___H__23__23_wr_2d_sn
___H__23__23_wr_2d_no_2d_display ___H__23__23_wr_2d_mark
___H__23__23_wr_2d_stamp ___H__23__23_wr_2d_symbol
___H__23__23_escape_2d_symbol_3f_ ___H__23__23_escape_2d_symkey_3f_
___H__23__23_wr_2d_keyword ___H__23__23_escape_2d_keyword_3f_
___H__23__23_wr_2d_pair ___H__23__23_print_2d_marker
___H__23__23_wr_2d_one_2d_line_2d_pretty_2d_print
___H__23__23_wr_2d_fits_2d_on_2d_line ___H__23__23_wr_2d_complex
___H__23__23_wr_2d_char ___H__23__23_wr_2d_hex ___H__23__23_wr_2d_oct
___H__23__23_wr_2d_string ___H__23__23_wr_2d_escaped_2d_string
___H__23__23_reader_2d__3e_open_2d_close ___H__23__23_head_2d__3e_open_2d_close
___H__23__23_wr_2d_vector ___H__23__23_wr_2d_vector_2d_aux1
___H__23__23_wr_2d_vector_2d_aux2 ___H__23__23_wr_2d_vector_2d_aux3
___H__23__23_wr_2d_foreign ___H__23__23_explode_2d_object
___H__23__23_implode_2d_object ___H__23__23_explode_2d_structure
___H__23__23_implode_2d_structure ___H__23__23_implode_2d_frame
___H__23__23_implode_2d_continuation ___H__23__23_explode_2d_procedure
___H__23__23_explode_2d_closure ___H__23__23_explode_2d_subprocedure
___H__23__23_implode_2d_procedure
___H__23__23_implode_2d_procedure_2d_or_2d_return
___H__23__23_explode_2d_return ___H__23__23_implode_2d_return
___H__23__23_wr_2d_opaque ___H__23__23_wr_2d_serialize
___H__23__23_wr_2d_s8vector ___H__23__23_wr_2d_u8vector
___H__23__23_wr_2d_s16vector ___H__23__23_wr_2d_u16vector
___H__23__23_wr_2d_s32vector ___H__23__23_wr_2d_u32vector
___H__23__23_wr_2d_s64vector ___H__23__23_wr_2d_u64vector
___H__23__23_wr_2d_f32vector ___H__23__23_wr_2d_f64vector
___H__23__23_wr_2d_structure ___H__23__23_wr_2d_gc_2d_hash_2d_table
___H__23__23_explode_2d_gc_2d_hash_2d_table
___H__23__23_implode_2d_gc_2d_hash_2d_table ___H__23__23_wr_2d_meroon
___H__23__23_wr_2d_jazz ___H__23__23_wr_2d_frame
___H__23__23_wr_2d_continuation ___H__23__23_wr_2d_promise
___H__23__23_explode_2d_promise ___H__23__23_implode_2d_promise
___H__23__23_wr_2d_will ___H__23__23_wr_2d_procedure ___H__23__23_wr_2d_return
___H__23__23_wr_2d_box ___H__23__23_wr_2d_other ___H__23__23_eof_2d_object_3f_
___H_eof_2d_object_3f_ ___H_transcript_2d_on ___H_transcript_2d_off
___H__23__23_make_2d_chartable ___H__23__23_chartable_2d_copy
___H__23__23_chartable_2d_ref ___H__23__23_chartable_2d_set_21_
___H__23__23_readtable_2d_char_2d_delimiter_3f_
___H__23__23_readtable_2d_char_2d_delimiter_3f__2d_set_21_
___H__23__23_readtable_2d_char_2d_handler
___H__23__23_readtable_2d_char_2d_handler_2d_set_21_
___H__23__23_readtable_2d_char_2d_sharp_2d_handler
___H__23__23_readtable_2d_char_2d_sharp_2d_handler_2d_set_21_
___H__23__23_readtable_2d_char_2d_class_2d_set_21_
___H__23__23_readtable_2d_convert_2d_case
___H__23__23_readtable_2d_string_2d_convert_2d_case_21_
___H__23__23_readtable_2d_parse_2d_keyword
___H__23__23_read_2d_datum_2d_or_2d_eof
___H__23__23_read_2d_datum_2d_or_2d_label
___H__23__23_read_2d_datum_2d_or_2d_label_2d_or_2d_none
___H__23__23_read_2d_datum_2d_or_2d_label_2d_or_2d_none_2d_or_2d_dot
___H__23__23_script_2d_marker ___H__23__23_none_2d_marker
___H__23__23_dot_2d_marker ___H__23__23_label_2d_marker_3f_
___H__23__23_label_2d_marker_2d_enter_21_
___H__23__23_label_2d_marker_2d_reference
___H__23__23_label_2d_marker_2d_fixup_2d_handler_2d_add_21_
___H__23__23_label_2d_marker_2d_define
___H__23__23_label_2d_marker_2d_fixup_21_
___H__23__23_read_2d_check_2d_labels_21_ ___H__23__23_build_2d_list
___H__23__23_read_2d_next_2d_char_2d_expecting ___H__23__23_build_2d_vector
___H__23__23_build_2d_delimited_2d_string
___H__23__23_build_2d_delimited_2d_number_2f_keyword_2f_symbol
___H__23__23_string_2d__3e_number_2f_keyword_2f_symbol
___H__23__23_char_2d_octal_3f_ ___H__23__23_char_2d_hexadecimal_3f_
___H__23__23_build_2d_escaped_2d_string_2d_up_2d_to
___H__23__23_build_2d_decimal_2d_integer ___H__23__23_build_2d_read_2d_macro
___H__23__23_skip_2d_extended_2d_comment
___H__23__23_skip_2d_single_2d_line_2d_comment
___H__23__23_skip_2d_comment_2d_done ___H__23__23_read_2d_sharp
___H__23__23_read_2d_sharp_2d_aux ___H__23__23_read_2d_sharp_2d_vector
___H__23__23_read_2d_sharp_2d_char ___H__23__23_read_2d_sharp_2d_comment
___H__23__23_read_2d_sharp_2d_bang
___H__23__23_read_2d_sharp_2d_keyword_2f_symbol
___H__23__23_read_2d_sharp_2d_colon ___H__23__23_read_2d_sharp_2d_semicolon
___H__23__23_read_2d_sharp_2d_quotation ___H__23__23_read_2d_sharp_2d_ampersand
___H__23__23_read_2d_sharp_2d_dot ___H__23__23_read_2d_sharp_2d_less
___H__23__23_read_2d_sharp_2d_digit ___H__23__23_wrap ___H__23__23_wrap_2d_op
___H__23__23_wrap_2d_op0 ___H__23__23_wrap_2d_op1 ___H__23__23_wrap_2d_op1_2a_
___H__23__23_wrap_2d_op2 ___H__23__23_wrap_2d_op3 ___H__23__23_wrap_2d_op4
___H__23__23_read_2d_sharp_2d_other ___H__23__23_read_2d_whitespace
___H__23__23_read_2d_single_2d_line_2d_comment
___H__23__23_read_2d_escaped_2d_string ___H__23__23_read_2d_quotation
___H__23__23_closing_2d_parenthesis_2d_for
___H__23__23_read_2d_vector_2d_or_2d_list ___H__23__23_read_2d_list
___H__23__23_read_2d_vector ___H__23__23_read_2d_other
___H__23__23_read_2d_none ___H__23__23_read_2d_illegal ___H__23__23_read_2d_dot
___H__23__23_read_2d_number_2f_keyword_2f_symbol
___H__23__23_read_2d_assoc_2d_string_3d__3f_
___H__23__23_read_2d_string_3d__3f_ ___H__23__23_read_2d_six
___H__23__23_read_2d_six_2d_datum_2d_or_2d_eof ___H__23__23_six_2d_type_3f_
___H__23__23_make_2d_standard_2d_readtable ___setup_mod ___init_mod ____20___io
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> <visibility> <build_ssa_passes> <chkp_passes>
<opt_local_passes> <free-inline-summary> <profile> <whole-program>
<profile_estimate> <inline> <pure-const> <static-var> <single-use>
<comdats>Assembling functions:
 ___setup_mod ___init_mod ___H__23__23_make_2d_standard_2d_readtable
___H__23__23_six_2d_type_3f_ ___H__23__23_read_2d_six_2d_datum_2d_or_2d_eof {GC
1963188k -> 1911014k}^Cmakefile:150: recipe for target '_io.o' failed
make: *** [_io.o] Interrupt


When I killed it, top was reporting:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND     
 8760 lucier    20   0 37.918g 0.029t    584 D   4.7 95.6  34:11.14 cc1       

(I don't remember seeing resident memory measured in terabytes before ;-)

I'm having similar problems with the 4.8 branch.  

I'm including _io.i.gz



* [Bug other/64928] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: lucier at math dot purdue.edu @ 2015-02-06  5:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #6 from lucier at math dot purdue.edu ---
The problem does not appear with this compiler:

maclaurin-271% gcc -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-gnu-unique-object
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk
--disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre
--enable-libgcj-multifile --enable-java-maintainer-mode
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) 

so it appears to be a regression.

Brad



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2015-02-09 14:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |memory-hog
          Component|other                       |middle-end
      Known to work|                            |4.4.7
   Target Milestone|---                         |4.8.5
            Summary|Inordinate cpu time and     |[4.8/4.9/5 Regression]
                   |memory usage in "phase opt  |Inordinate cpu time and
                   |and generate" with          |memory usage in "phase opt
                   |-ftest-coverage             |and generate" with
                   |-fprofile-arcs              |-ftest-coverage
                   |                            |-fprofile-arcs

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
Given the description, I suppose that non-profiling/coverage mode is fine.



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2015-02-09 15:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-02-09
     Ever confirmed|0                           |1

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ok, so it seems the memory is used by out-of-SSA:

#5  0x0000000000c9eebc in coalesce_ssa_name ()
    at /space/rguenther/src/svn/gcc-4_9-branch/gcc/tree-ssa-coalesce.c:1330
1330      graph = build_ssa_conflict_graph (liveinfo);
(gdb) p *cl->list.htab
$10 = {entries = 0x2b19b30, size = 524287, n_elements = 77146, n_deleted = 0, 
  searches = 122189, collisions = 6508, size_prime_index = 16}

where we malloc(!) 77146 entries of size 12.

But of course the bad part is the conflict graph, with 76063 bitmaps eating up
around 1GB of memory for the first testcase (in function
___H__23__23_u8vector_2d__3e_object).

That's likely caused by the change to more aggressively coalesce anonymous
SSA names.
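
As a rough sanity check of the figures above, here is a minimal sketch in C.
It assumes that "around 1GB" means 2^30 bytes and ignores malloc's
per-allocation overhead; the helper names are hypothetical and not part of GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Figures quoted from the gdb session and the comment above. */
enum { HTAB_ELEMENTS = 77146, ENTRY_BYTES = 12, CONFLICT_BITMAPS = 76063 };

/* Payload bytes of the individually malloc'ed coalesce-list entries
   (allocator bookkeeping overhead is not counted). */
static size_t coalesce_entry_bytes(void)
{
    return (size_t)HTAB_ELEMENTS * ENTRY_BYTES;
}

/* Average bytes per conflict bitmap, given an assumed total size
   for the whole conflict graph. */
static size_t avg_bitmap_bytes(size_t graph_bytes)
{
    assert(graph_bytes > 0);
    return graph_bytes / CONFLICT_BITMAPS;
}
```

Under those assumptions the hash table entries account for under 1MB, while
each conflict bitmap averages roughly 14KB, which is consistent with the
conflict graph being the dominant cost.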



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: law at redhat dot com @ 2015-02-16 19:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Jeffrey A. Law <law at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at redhat dot com

--- Comment #10 from Jeffrey A. Law <law at redhat dot com> ---
Might want to look at PR 65076 as well, where "phase opt and generate" is
taking 89% of the compile time.  Might be a better testcase to work with.



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2015-03-05 17:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ok, so it's already calculate_live_ranges that takes much memory.  I have a
small patch to improve that somewhat.

But what we really need is to get the "must coalesce" stuff "coalesced" with
respect to both live and conflict computation.  That is, map must-coalesce
SSA vars to the same partition.  That loses the SSA corruption testing, so
it might be much more controversial (silent wrong-code instead of an ICE).

Unfortunately in the testcase there are only 2750 must-coalesces but
109493 partitions participating in the coalescing (so at least 50000
want-coalesces).

The good news is of course that we can simply choose to _not_ coalesce that
many variables, but say only the important ones.
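
A minimal sketch of the "map must-coalesce SSA vars to the same partition"
idea, using a plain union-find over SSA version numbers.  The partition_map
type and the pm_* helpers are hypothetical illustrations, not GCC's var_map
API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical partition map: parent[v] leads towards the representative
   of the partition containing SSA version v. */
struct partition_map {
    int *parent;
    int n;
};

/* Start with every SSA version in its own partition. */
static struct partition_map pm_new(int n)
{
    struct partition_map pm;
    assert(n > 0);
    pm.parent = malloc((size_t)n * sizeof *pm.parent);
    assert(pm.parent != NULL);
    pm.n = n;
    for (int i = 0; i < n; i++)
        pm.parent[i] = i;
    return pm;
}

/* Find the representative of v, shortening the chain as we go
   (path halving). */
static int pm_find(struct partition_map *pm, int v)
{
    while (pm->parent[v] != v) {
        pm->parent[v] = pm->parent[pm->parent[v]];
        v = pm->parent[v];
    }
    return v;
}

/* Record a must-coalesce pair: both versions end up in one partition. */
static int pm_union(struct partition_map *pm, int a, int b)
{
    int ra = pm_find(pm, a);
    int rb = pm_find(pm, b);
    if (ra != rb)
        pm->parent[rb] = ra;
    return ra;
}

/* Number of distinct partitions that liveness and conflict
   computation would have to track. */
static int pm_count(const struct partition_map *pm)
{
    int count = 0;
    for (int i = 0; i < pm->n; i++)
        if (pm->parent[i] == i)
            count++;
    return count;
}

static void pm_free(struct partition_map *pm)
{
    free(pm->parent);
    pm->parent = NULL;
    pm->n = 0;
}
```

With six SSA versions and must-coalesce pairs (0,1) and (1,2), versions 0..2
share one representative and four partitions remain, so the live and conflict
bitmaps would only need to cover four entities instead of six.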


* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: steven at gcc dot gnu.org @ 2015-03-05 23:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org

--- Comment #12 from Steven Bosscher <steven at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #9)
> It seems that loop invariant motion is responsible for most of the abnormals,
> thus -fno-tree-loop-im restores performance.
> 
> The loop LIM detects is of style
> 
>   <bb 6>: (header)
>   # ___fp_3(ab) = PHI <___fp_41(4), ___fp_5(21)>
>   # ___r1_7(ab) = PHI <___r1_42(4), ___r1_9(21)>
>   # ___r2_11(ab) = PHI <___r2_43(4), ___r3_17(21)>
>   # ___r3_19(ab) = PHI <___r3_44(4), ___r3_23(21)>
>   # ___r4_25 = PHI <___r4_45(4), ___r4_26(21)>
>   # gotovar.17_29 = PHI <_51(4), _69(21)>
>   goto gotovar.17_29;

Perhaps disable LIM (and maybe PRE) if the CFG has a large edge/bb ratio (i.e.
dense CFG)? There's probably no benefit in such cases anyway.
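
For context, a header block ending in "goto gotovar" is the shape GCC gives
to factored computed gotos (the GNU C labels-as-values extension).  The
sketch below is a hypothetical three-opcode interpreter, not Gambit's
generated code; the function and label names are made up.  It shows why
such code presumably yields a dense CFG: every indirect goto may reach
every address-taken label.

```c
#include <assert.h>
#include <stddef.h>

/* Runs a tiny opcode stream: 0 = increment, 1 = decrement, 2 = halt.
   Requires GNU C (gcc or clang) for &&label and goto *expr. */
static int run(const int *program)
{
    /* Table of label addresses; each opcode indexes into it. */
    static void *const dispatch[] = { &&op_inc, &&op_dec, &&op_halt };
    int acc = 0;

    assert(program != NULL);
    goto *dispatch[*program++];

op_inc:
    acc++;
    goto *dispatch[*program++];   /* may jump to any of the three labels */
op_dec:
    acc--;
    goto *dispatch[*program++];   /* likewise */
op_halt:
    return acc;
}
```

With N labels and N indirect gotos the number of possible edges grows
quadratically unless the gotos are funnelled through one dispatcher block,
which is what the quoted bb 6 appears to be.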



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: law at redhat dot com @ 2015-03-06  0:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #13 from Jeffrey A. Law <law at redhat dot com> ---
I think we've done similar things for Brad's large testcases in the past.  You
want to look at both the edge/bb density and the overall size, i.e., a high
density doesn't really hurt if the total CFG is small.

See "is_too_expensive" in gcse.c for the current heuristics to avoid trying
global opts on these kinds of testcases.
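
A combined check of that kind could look roughly like the following.  This
is a sketch only: cfg_too_expensive and both thresholds are made up for
illustration and are not the contents of is_too_expensive in gcse.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative thresholds only; not the values GCC uses. */
enum { MIN_BLOCKS_TO_CARE = 1000, MAX_EDGES_PER_BLOCK = 4 };

/* Returns true when a pass should be skipped: the CFG must be both
   large overall and dense (many edges per basic block). */
static bool cfg_too_expensive(long n_blocks, long n_edges)
{
    assert(n_blocks > 0 && n_edges >= 0);

    /* A small CFG is cheap to process no matter how dense it is. */
    if (n_blocks < MIN_BLOCKS_TO_CARE)
        return false;

    /* Large CFG: bail out only if the edge/bb ratio is high. */
    return n_edges > n_blocks * MAX_EDGES_PER_BLOCK;
}
```

Here a 10-block function with 90 edges is still optimized, while a
5000-block function is skipped only once it exceeds four edges per block
on average.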



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2015-03-06 10:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
Note that if we fix out-of-SSA coalescing (patch in testing) then RTL CSE
explodes via DF.



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2015-03-06 12:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
Author: rguenth
Date: Fri Mar  6 12:34:28 2015
New Revision: 221237

URL: https://gcc.gnu.org/viewcvs?rev=221237&root=gcc&view=rev
Log:
2015-03-06  Richard Biener  <rguenther@suse.de>

    PR middle-end/64928
    * tree-ssa-live.h (struct tree_live_info_d): Add livein_obstack
    and liveout_obstack members.
    (calculate_live_on_exit): Remove.
    (calculate_live_ranges): Change declaration.
    * tree-ssa-live.c (liveness_bitmap_obstack): Remove global var.
    (new_tree_live_info): Adjust.
    (calculate_live_ranges): Delete livein when not wanted.
    (calculate_live_ranges): Do not initialize liveness_bitmap_obstack.
    Deal with partly deleted live info.
    (loe_visit_block): Remove temporary bitmap by using
    bitmap_ior_and_compl_into.
    (live_worklist): Adjust accordingly.
    (calculate_live_on_exit): Make static.
    * tree-ssa-coalesce.c (coalesce_ssa_name): Tell calculate_live_ranges
    we do not need livein.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-coalesce.c
    trunk/gcc/tree-ssa-live.c
    trunk/gcc/tree-ssa-live.h
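
For readers unfamiliar with the bitmap API mentioned in the log:
bitmap_ior_and_compl_into (A, B, C) computes A |= (B & ~C) in place and
reports whether A changed, so no scratch bitmap is needed for the
intermediate B & ~C.  A minimal Python model of that one operation, using
sets in place of GCC bitmaps (the helper name and the example sets are
illustrative only and say nothing about how loe_visit_block uses it):

```python
def ior_and_compl_into(a, b, c):
    """Model of A |= (B & ~C) on Python sets; return True if A changed."""
    before = len(a)
    # The generator feeds elements straight into A, so the intermediate
    # set B - C is never materialized.
    a.update(x for x in b if x not in c)
    return len(a) != before


live = {1}
changed = ior_and_compl_into(live, {1, 2, 3}, {3})   # live becomes {1, 2}
```

The boolean result is what lets a worklist algorithm decide whether a block
has to be revisited.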



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2015-03-06 12:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 34974
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34974&action=edit
Patch to limit coalescing amount

The committed patch improves peak memory usage from 7.6GB to 5.8GB for the
small testcase.

The attached patch reduces memory usage from SSA coalescing further (to ~300MB)
by simply doing less coalescing.  Unfortunately the generated RTL puts a bigger
load on CSE/DF and thus we need 7.6GB again (one can possibly find an optimal
--param max-out-of-ssa-coalesce-names, but that's probably highly
testcase-specific).

In theory you can iterate on coalescing piecewise as well, but the overhead
for doing this might be too big (basically up to computing live/conflict
for each coalesce pair separately, taking into account previous coalesces).



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2015-03-06 12:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 34975
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=34975&action=edit
do not compute live/conflict for abnormal coalesces

This is the other idea of simply not computing live/conflict for abnormal
coalesces we know to always succeed.  This shrinks the following live/conflict
problem for the regular coalesces by unifying some partitions.

Doesn't help this particular testcase much.



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2015-03-06 13:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #18 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #17)
> Created attachment 34975 [details]
> do not compute live/conflict for abnormal coalesces
> 
> This is the other idea of simply not computing live/conflict for abnormal
> coalesces we know to always succeed.  This shrinks the following
> live/conflict problem for the regular coalesces by unifying some partitions.
> 
> Doesn't help this particular testcase much.

But it fixes PR63155 ...



* [Bug middle-end/64928] [4.8/4.9/5 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2015-03-18 12:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2



* [Bug middle-end/64928] [4.8/4.9/5/6 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: wellnhofer at aevum dot de @ 2015-05-20 14:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928
Bug 64928 depends on bug 66209, which changed state.

Bug 66209 Summary: Out of memory when compiling with --coverage and optimizations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66209

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE



* [Bug middle-end/64928] [4.8/4.9/5/6 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: wellnhofer at aevum dot de @ 2015-05-20 14:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Nick Wellnhofer <wellnhofer at aevum dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wellnhofer at aevum dot de

--- Comment #19 from Nick Wellnhofer <wellnhofer at aevum dot de> ---
*** Bug 66209 has been marked as a duplicate of this bug. ***



* [Bug middle-end/64928] [4.8/4.9/5/6 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2015-06-23  8:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.8.5                       |4.9.3

--- Comment #20 from Richard Biener <rguenth at gcc dot gnu.org> ---
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.



* [Bug middle-end/64928] [4.9/5/6 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: jakub at gcc dot gnu.org @ 2015-06-26 19:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #21 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 4.9.3 has been released.



* [Bug middle-end/64928] [4.9/5/6 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: jakub at gcc dot gnu.org @ 2015-06-26 20:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.9.3                       |4.9.4



* [Bug middle-end/64928] [8/9/10/11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: lucier at math dot purdue.edu @ 2020-09-29  0:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #30 from lucier at math dot purdue.edu ---
I'm coming back to this project.

I naively thought "Well, I don't need arc profiling, I'll just set
-ftest-coverage without -fprofile-arcs" but it appears that I can't do that:
the gcda files are generated by -fprofile-arcs.

It seems to me that test coverage could be implemented simply by instrumenting
each basic block in an algorithm that's linear in the number of basic blocks. 
Is it possible to do this?

Brad


* [Bug middle-end/64928] [8/9/10/11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2020-09-29  7:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #31 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to lucier from comment #30)
> I'm coming back to this project.
> 
> I naively thought "Well, I don't need arc profiling, I'll just set
> -ftest-coverage without -fprofile-arcs" but it appears that I can't do that,
> the gcda files are generated by -fprofile-arcs.
> 
> It seems to me that test coverage could be implemented simply by
> instrumenting each basic block in an algorithm that's linear in the number
> of basic blocks.  Is it possible to do this?
> 
> Brad

I don't think the instrumentation itself is the problem - it's already
doing better than one counter per block.  It's simply that the large
source runs into multiple non-linearities in core pieces of the compiler
that cannot be turned off ...
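
To make "better than one counter per block" concrete: as far as I understand
it, arc profiling only needs counters on the CFG edges that are not part of a
spanning tree, and the remaining counts follow from flow conservation.  A
minimal sketch of selecting those edges is below; the function name and the
diamond-shaped example CFG are made up, and the real implementation also
weighs edges and handles fake/abnormal edges, which this ignores.

```python
def edges_needing_counters(n_blocks, edges):
    """Return the CFG edges that are not on a spanning tree.

    Blocks are numbered 0..n_blocks-1 and edges are (src, dst) pairs.
    Only the returned edges would need a counter in this simplified model.
    """
    parent = list(range(n_blocks))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    instrumented = []
    for src, dst in edges:
        a, b = find(src), find(dst)
        if a == b:
            # The edge closes a cycle, so it is off the tree.
            instrumented.append((src, dst))
        else:
            parent[a] = b
    return instrumented


# Diamond CFG plus an artificial exit->entry edge (3 -> 0).
diamond = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 0)]
counters = edges_needing_counters(4, diamond)
```

For this 4-block, 5-edge example only two edges end up instrumented.  For a
connected CFG with V blocks and E edges the count is E - (V - 1).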


* [Bug middle-end/64928] [8/9/10/11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: lucier at math dot purdue.edu @ 2020-09-29 12:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #32 from lucier at math dot purdue.edu ---
I don't know precisely what you're saying, but it compiles fine without the
instrumentation.


* [Bug middle-end/64928] [8/9/10/11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenther at suse dot de @ 2020-09-29 13:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #33 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 29 Sep 2020, lucier at math dot purdue.edu wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928
> 
> --- Comment #32 from lucier at math dot purdue.edu ---
> I don't know precisely what you're saying, but it compiles fine without the
> instrumentation.

Yes - the instrumentation does complicate the IL but the instrumentation
should already be better than linear in the blocks.


* [Bug middle-end/64928] [8/9/10/11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: lucier at math dot purdue.edu @ 2021-03-10  2:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #34 from lucier at math dot purdue.edu ---
I decided to approach this a bit more methodically by generating a series of
synthetic programs, each twice as long as the previous, and to measure the
compilation time.  I'll attach the associated .i files here.

Each .i file was generated from a Scheme file with 2^k copies, k=1,...,5, of a
simple recursive definition of the Fibonacci function, suitably renamed.  So
these are not large files by my standards.

The short summary is that CPU time seems to grow quadratically with the length
of the code.  The required memory grows very quickly, too: I killed the
compilation with k=5 (so 32 copies of the Fibonacci function) because the
computation filled 32GB of RAM and 32GB of swap.
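
As a quick sanity check on the quadratic estimate, the growth exponent can be
read off any two consecutive measurements.  The sketch below uses the "phase
opt and generate" usr times from the fib-1.c and fib-2.c reports further down
in this message (0.67 s and 2.78 s); the helper name is made up, and it
assumes time ~ size**p with the source size doubling between the two files.

```python
import math


def growth_exponent(size_ratio, t_small, t_large):
    """Estimate p in time ~ size**p from two timings."""
    return math.log(t_large / t_small) / math.log(size_ratio)


# usr seconds for "phase opt and generate": fib-1.c vs fib-2.c
p = growth_exponent(2, 0.67, 2.78)
```

For these two data points p comes out slightly above 2.  Two points are of
course thin evidence on their own.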

Perhaps these parameterized input files might be of help.

Brad

I downloaded the git sources for gcc:

heine:~/programs/gcc/gcc-mainline> git log
commit 7eef9a66018e23677058fec421229e3fa435a1a3 (HEAD -> master, origin/master,
origin/HEAD)
Author: Joel Brobecker <brobecker@adacore.com>
Date:   Mon Mar 8 23:59:37 2021 -0300

I configured and built gcc with

heine:~/programs/gcc/gcc-mainline> /pkgs/gcc-mainline/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/pkgs/gcc-mainline/bin/gcc
COLLECT_LTO_WRAPPER=/pkgs/gcc-mainline/libexec/gcc/x86_64-pc-linux-gnu/11.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../../gcc-mainline/configure --prefix=/pkgs/gcc-mainline
--enable-languages=c --enable-checking=release
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.1 20210309 (experimental) (GCC) 

The program names are fib-1.c to fib-5.c; fib-k.c contains 2^k copies of the
Fibonacci function.
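
For anyone who wants to play with the doubling scheme without a Gambit
installation, here is a hypothetical generator that emits 2^k renamed copies
of a plain recursive fib directly as C.  Note that this is NOT how the
attached files were produced: those come from Scheme source via Gambit, and
the log below shows a single ___H_fib_2d_1 host function being compiled, so
plain C like this should not be expected to reproduce the blowup.  The
function names fib_0, fib_1, ... are invented for the example.

```python
def make_fib_source(k):
    """Return C source text with 2**k renamed copies of a recursive fib."""
    parts = []
    for i in range(2 ** k):
        parts.append(
            f"long fib_{i}(long n)\n"
            "{\n"
            f"    return n < 2 ? n : fib_{i}(n - 1) + fib_{i}(n - 2);\n"
            "}\n"
        )
    return "\n".join(parts)


# Source text for k = 3, i.e. eight copies.
src = make_fib_source(3)
```

Writing make_fib_source(k) to a file for k = 1,...,5 gives a series in which
each file is twice as long as the previous one.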

/pkgs/gcc-mainline/bin/gcc -march=native -D___CAN_IMPORT_CLIB_DYNAMICALLY  -O1 
   -Wno-unused -Wno-write-strings -Wdisabled-optimization -fwrapv
-fno-strict-aliasing -fno-trapping-math -fno-math-errno -fschedule-insns2
-fomit-frame-pointer -fPIC -fno-common -mpc64   -rdynamic -shared 
-D___SINGLE_HOST -D___DYNAMIC
-I"/home/lucier/programs/gambit/gambit-profiled/include" -o 'fib-1.o1' -Q
-fprofile-arcs -ftest-coverage -save-temps   'fib-1.c' 

Time variable                                   usr           sys          wall           GGC
 phase setup                        :   0.02 (100%)   0.00 (  0%)   0.03 (100%)  5039k (100%)
 TOTAL                              :   0.02          0.00          0.03         5049k
 btowc wctob mbrlen ___H_fib_2d_1 ___setup_mod ___init_mod
___LNK_fib_2d_1_2e_o1
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> {heap 1240k} <visibility> {heap 1240k} <build_ssa_passes>
{heap 1240k} <opt_local_passes> {heap 1240k} <remove_symbols> {heap 2468k}
<targetclone> {heap 2468k} <profile> {heap 2468k} <free-fnsummary> {heap
2468k}Streaming LTO
 <whole-program> {heap 2468k} <profile_estimate> {heap 2468k} <fnsummary> {heap
2468k} <inline> {heap 2468k} <pure-const> {heap 2468k} <modref> {heap 2468k}
<free-fnsummary> {heap 2468k} <static-var> {heap 2468k} <single-use> {heap
2468k} <comdats> {heap 2468k}Assembling functions:
 <simdclone> {heap 2468k} ___setup_mod ___init_mod ___H_fib_2d_1
___LNK_fib_2d_1_2e_o1 _sub_I_00100_0 _sub_D_00100_1
Time variable                                   usr           sys          wall           GGC
 phase setup                        :   0.00 (  0%)   0.00 (  0%)   0.01 (  1%)  1519k (  6%)
 phase parsing                      :   0.06 (  8%)   0.01 ( 20%)   0.08 ( 10%)  2072k (  8%)
 phase opt and generate             :   0.67 ( 92%)   0.04 ( 80%)   0.70 ( 89%)    22M ( 86%)
 dump files                         :   0.01 (  1%)   0.00 (  0%)   0.00 (  0%)     0  (  0%)
 callgraph functions expansion      :   0.66 ( 90%)   0.03 ( 60%)   0.69 ( 87%)    21M ( 82%)
 callgraph ipa passes               :   0.01 (  1%)   0.00 (  0%)   0.01 (  1%)   570k (  2%)
 cfg cleanup                        :   0.00 (  0%)   0.00 (  0%)   0.04 (  5%)    64  (  0%)
 trivially dead code                :   0.00 (  0%)   0.01 ( 20%)   0.00 (  0%)     0  (  0%)
 df live regs                       :   0.01 (  1%)   0.00 (  0%)   0.02 (  3%)     0  (  0%)
 df live&initialized regs           :   0.02 (  3%)   0.00 (  0%)   0.02 (  3%)     0  (  0%)
 df reg dead/unused notes           :   0.02 (  3%)   0.00 (  0%)   0.01 (  1%)   305k (  1%)
 alias analysis                     :   0.01 (  1%)   0.00 (  0%)   0.01 (  1%)  1482k (  6%)
 alias stmt walking                 :   0.02 (  3%)   0.01 ( 20%)   0.02 (  3%)  7280  (  0%)
 rebuild jump labels                :   0.01 (  1%)   0.00 (  0%)   0.00 (  0%)     0  (  0%)
 preprocessing                      :   0.02 (  3%)   0.00 (  0%)   0.01 (  1%)   240k (  1%)
 lexical analysis                   :   0.02 (  3%)   0.01 ( 20%)   0.00 (  0%)     0  (  0%)
 parser (global)                    :   0.01 (  1%)   0.00 (  0%)   0.04 (  5%)  1239k (  5%)
 parser struct body                 :   0.01 (  1%)   0.00 (  0%)   0.01 (  1%)   359k (  1%)
 parser function body               :   0.00 (  0%)   0.00 (  0%)   0.02 (  3%)   201k (  1%)
 tree gimplify                      :   0.00 (  0%)   0.01 ( 20%)   0.00 (  0%)   297k (  1%)
 tree copy propagation              :   0.01 (  1%)   0.00 (  0%)   0.01 (  1%)    13k (  0%)
 tree SSA rewrite                   :   0.00 (  0%)   0.00 (  0%)   0.01 (  1%)   356k (  1%)
 tree SSA incremental               :   0.01 (  1%)   0.00 (  0%)   0.00 (  0%)  2918k ( 11%)
 tree operand scan                  :   0.01 (  1%)   0.00 (  0%)   0.00 (  0%)   314k (  1%)
 dominator optimization             :   0.03 (  4%)   0.01 ( 20%)   0.04 (  5%)   531k (  2%)
 tree FRE                           :   0.00 (  0%)   0.00 (  0%)   0.01 (  1%)    36k (  0%)
 tree forward propagate             :   0.02 (  3%)   0.00 (  0%)   0.00 (  0%)    34k (  0%)
 tree conservative DCE              :   0.00 (  0%)   0.00 (  0%)   0.01 (  1%)  6224  (  0%)
 tree DSE                           :   0.03 (  4%)   0.00 (  0%)   0.04 (  5%)     0  (  0%)
 tree loop invariant motion         :   0.01 (  1%)   0.00 (  0%)   0.03 (  4%)  2496k (  9%)
 tree strlen optimization           :   0.01 (  1%)   0.00 (  0%)   0.01 (  1%)    83k (  0%)
 dominance computation              :   0.02 (  3%)   0.00 (  0%)   0.00 (  0%)     0  (  0%)
 out of ssa                         :   0.03 (  4%)   0.00 (  0%)   0.02 (  3%)    64k (  0%)
 expand                             :   0.00 (  0%)   0.00 (  0%)   0.01 (  1%)  2473k (  9%)
 forward prop                       :   0.02 (  3%)   0.00 (  0%)   0.02 (  3%)    81k (  0%)
 CSE                                :   0.01 (  1%)   0.00 (  0%)   0.00 (  0%)   211k (  1%)
 dead store elim2                   :   0.01 (  1%)   0.00 (  0%)   0.02 (  3%)   701k (  3%)
 loop init                          :   0.01 (  1%)   0.00 (  0%)   0.00 (  0%)    29k (  0%)
 loop fini                          :   0.00 (  0%)   0.00 (  0%)   0.01 (  1%)   116k (  0%)
 combiner                           :   0.01 (  1%)   0.00 (  0%)   0.01 (  1%)   108k (  0%)
 if-conversion                      :   0.02 (  3%)   0.00 (  0%)   0.00 (  0%)   666k (  3%)
 integrated RA                      :   0.06 (  8%)   0.00 (  0%)   0.05 (  6%)  3986k ( 15%)
 LRA non-specific                   :   0.05 (  7%)   0.00 (  0%)   0.06 (  8%)  1324k (  5%)
 LRA reload inheritance             :   0.01 (  1%)   0.00 (  0%)   0.01 (  1%)   224  (  0%)
 LRA create live ranges             :   0.09 ( 12%)   0.00 (  0%)   0.08 ( 10%)   241k (  1%)
 LRA hard reg assignment            :   0.02 (  3%)   0.00 (  0%)   0.02 (  3%)     0  (  0%)
 reload CSE regs                    :   0.02 (  3%)   0.00 (  0%)   0.02 (  3%)   368k (  1%)
 thread pro- & epilogue             :   0.01 (  1%)   0.00 (  0%)   0.00 (  0%)    10k (  0%)
 hard reg cprop                     :   0.00 (  0%)   0.00 (  0%)   0.01 (  1%)   288  (  0%)
 scheduling 2                       :   0.04 (  5%)   0.00 (  0%)   0.04 (  5%)   149k (  1%)
 shorten branches                   :   0.00 (  0%)   0.00 (  0%)   0.01 (  1%)     0  (  0%)
 final                              :   0.01 (  1%)   0.00 (  0%)   0.00 (  0%)   816k (  3%)
 initialize rtl                     :   0.00 (  0%)   0.00 (  0%)   0.01 (  1%)    12k (  0%)
 rest of compilation                :   0.00 (  0%)   0.00 (  0%)   0.02 (  3%)    66k (  0%)
 TOTAL                              :   0.73          0.05          0.79           25M


/pkgs/gcc-mainline/bin/gcc -march=native -D___CAN_IMPORT_CLIB_DYNAMICALLY  -O1 
   -Wno-unused -Wno-write-strings -Wdisabled-optimization -fwrapv
-fno-strict-aliasing -fno-trapping-math -fno-math-errno -fschedule-insns2
-fomit-frame-pointer -fPIC -fno-common -mpc64   -rdynamic -shared 
-D___SINGLE_HOST -D___DYNAMIC
-I"/home/lucier/programs/gambit/gambit-profiled/include" -o 'fib-2.o1' -Q
-fprofile-arcs -ftest-coverage -save-temps   'fib-2.c' 

Time variable                                   usr           sys          wall           GGC
 phase setup                        :   0.01 (100%)   0.02 (100%)   0.04 (100%)  7596k (100%)
 TOTAL                              :   0.01          0.02          0.04         7606k
 btowc wctob mbrlen ___H_fib_2d_2 ___setup_mod ___init_mod
___LNK_fib_2d_2_2e_o1
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> {heap 1432k} <visibility> {heap 1432k} <build_ssa_passes>
{heap 1432k} <opt_local_passes> {heap 1432k} <remove_symbols> {heap 3104k}
<targetclone> {heap 3104k} <profile> {heap 3104k} <free-fnsummary> {heap
3104k}Streaming LTO
 <whole-program> {heap 3104k} <profile_estimate> {heap 3104k} <fnsummary> {heap
3104k} <inline> {heap 3104k} <pure-const> {heap 3104k} <modref> {heap 3104k}
<free-fnsummary> {heap 3104k} <static-var> {heap 3104k} <single-use> {heap
3104k} <comdats> {heap 3104k}Assembling functions:
 <simdclone> {heap 3104k} ___setup_mod ___init_mod ___H_fib_2d_2
___LNK_fib_2d_2_2e_o1 _sub_I_00100_0 _sub_D_00100_1
Time variable                                   usr           sys          wall           GGC
 phase setup                        :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)  1519k (  2%)
 phase parsing                      :   0.04 (  1%)   0.05 ( 36%)   0.10 (  3%)  2500k (  4%)
 phase opt and generate             :   2.78 ( 99%)   0.09 ( 64%)   2.88 ( 97%)    62M ( 94%)
 callgraph construction             :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    26k (  0%)
 callgraph functions expansion      :   2.75 ( 98%)   0.09 ( 64%)   2.85 ( 96%)    61M ( 92%)
 callgraph ipa passes               :   0.02 (  1%)   0.00 (  0%)   0.02 (  1%)   939k (  1%)
 ipa pure const                     :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 cfg cleanup                        :   0.04 (  1%)   0.00 (  0%)   0.04 (  1%)    64  (  0%)
 trivially dead code                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 df scan insns                      :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)   288  (  0%)
 df reaching defs                   :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 df live regs                       :   0.07 (  2%)   0.00 (  0%)   0.10 (  3%)     0  (  0%)
 df live&initialized regs           :   0.08 (  3%)   0.00 (  0%)   0.07 (  2%)     0  (  0%)
 df reg dead/unused notes           :   0.05 (  2%)   0.01 (  7%)   0.06 (  2%)   935k (  1%)
 register information               :   0.04 (  1%)   0.00 (  0%)   0.03 (  1%)     0  (  0%)
 alias analysis                     :   0.02 (  1%)   0.00 (  0%)   0.00 (  0%)  2960k (  4%)
 alias stmt walking                 :   0.13 (  5%)   0.02 ( 14%)   0.10 (  3%)  7472  (  0%)
 rebuild jump labels                :   0.01 (  0%)   0.00 (  0%)   0.03 (  1%)     0  (  0%)
 preprocessing                      :   0.00 (  0%)   0.03 ( 21%)   0.03 (  1%)   250k (  0%)
 lexical analysis                   :   0.02 (  1%)   0.02 ( 14%)   0.06 (  2%)     0  (  0%)
 parser (global)                    :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)  1252k (  2%)
 parser struct body                 :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)   359k (  1%)
 parser function body               :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)   608k (  1%)
 inline parameters                  :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    39k (  0%)
 tree gimplify                      :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)   505k (  1%)
 tree CFG cleanup                   :   0.02 (  1%)   0.01 (  7%)   0.02 (  1%)   320k (  0%)
 tree copy propagation              :   0.04 (  1%)   0.00 (  0%)   0.05 (  2%)    24k (  0%)
 tree PTA                           :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)    13k (  0%)
 tree SSA rewrite                   :   0.02 (  1%)   0.00 (  0%)   0.02 (  1%)   605k (  1%)
 tree SSA incremental               :   0.05 (  2%)   0.00 (  0%)   0.06 (  2%)  9895k ( 14%)
 tree operand scan                  :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)   882k (  1%)
 dominator optimization             :   0.13 (  5%)   0.00 (  0%)   0.16 (  5%)  1261k (  2%)
 tree split crit edges              :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)  1410k (  2%)
 tree reassociation                 :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    48  (  0%)
 tree code sinking                  :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)  1680k (  2%)
 tree forward propagate             :   0.01 (  0%)   0.00 (  0%)   0.02 (  1%)    63k (  0%)
 tree conservative DCE              :   0.02 (  1%)   0.00 (  0%)   0.02 (  1%)  8288  (  0%)
 tree aggressive DCE                :   0.03 (  1%)   0.00 (  0%)   0.02 (  1%)    40  (  0%)
 tree DSE                           :   0.11 (  4%)   0.00 (  0%)   0.12 (  4%)     0  (  0%)
 tree loop invariant motion         :   0.09 (  3%)   0.01 (  7%)   0.09 (  3%)  7961k ( 12%)
 tree iv optimization               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)    22k (  0%)
 tree SSA uncprop                   :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 tree strlen optimization           :   0.02 (  1%)   0.00 (  0%)   0.02 (  1%)   149k (  0%)
 tree modref                        :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)  2800  (  0%)
 dominance computation              :   0.02 (  1%)   0.00 (  0%)   0.05 (  2%)     0  (  0%)
 out of ssa                         :   0.11 (  4%)   0.01 (  7%)   0.13 (  4%)   752  (  0%)
 expand                             :   0.03 (  1%)   0.00 (  0%)   0.02 (  1%)  7567k ( 11%)
 post expand cleanups               :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)    49k (  0%)
 varconst                           :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)  1024  (  0%)
 forward prop                       :   0.09 (  3%)   0.00 (  0%)   0.09 (  3%)   255k (  0%)
 CSE                                :   0.02 (  1%)   0.00 (  0%)   0.02 (  1%)   659k (  1%)
 dead code elimination              :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)     0  (  0%)
 dead store elim1                   :   0.02 (  1%)   0.00 (  0%)   0.03 (  1%)   467k (  1%)
 dead store elim2                   :   0.04 (  1%)   0.00 (  0%)   0.03 (  1%)  2157k (  3%)
 loop init                          :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)    36k (  0%)
 loop fini                          :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)   352k (  1%)
 combiner                           :   0.02 (  1%)   0.00 (  0%)   0.02 (  1%)   260k (  0%)
 if-conversion                      :   0.03 (  1%)   0.00 (  0%)   0.04 (  1%)  2511k (  4%)
 integrated RA                      :   0.21 (  7%)   0.01 (  7%)   0.22 (  7%)
 9272k ( 14%)
 LRA non-specific                   :   0.18 (  6%)   0.01 (  7%)   0.16 (  5%)
 4240k (  6%)
 LRA virtuals elimination           :   0.03 (  1%)   0.00 (  0%)   0.02 (  1%)
 1264k (  2%)
 LRA reload inheritance             :   0.04 (  1%)   0.00 (  0%)   0.04 (  1%)
    0  (  0%)
 LRA create live ranges             :   0.41 ( 15%)   0.00 (  0%)   0.44 ( 15%)
  757k (  1%)
 LRA hard reg assignment            :   0.08 (  3%)   0.01 (  7%)   0.09 (  3%)
    0  (  0%)
 reload CSE regs                    :   0.05 (  2%)   0.00 (  0%)   0.05 (  2%)
 1113k (  2%)
 thread pro- & epilogue             :   0.02 (  1%)   0.00 (  0%)   0.02 (  1%)
   10k (  0%)
 if-conversion 2                    :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 combine stack adjustments          :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 hard reg cprop                     :   0.02 (  1%)   0.00 (  0%)   0.02 (  1%)
  432  (  0%)
 scheduling 2                       :   0.11 (  4%)   0.00 (  0%)   0.12 (  4%)
  457k (  1%)
 reorder blocks                     :   0.02 (  1%)   0.00 (  0%)   0.01 (  0%)
  370k (  1%)
 shorten branches                   :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 final                              :   0.03 (  1%)   0.00 (  0%)   0.03 (  1%)
 2482k (  4%)
 straight-line strength reduction   :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
 4440  (  0%)
 rest of compilation                :   0.08 (  3%)   0.00 (  0%)   0.03 (  1%)
  179k (  0%)
 remove unused locals               :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 repair loop structures             :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
    0  (  0%)
 TOTAL                              :   2.82          0.14          2.98       
   66M

/pkgs/gcc-mainline/bin/gcc -march=native -D___CAN_IMPORT_CLIB_DYNAMICALLY  -O1 
   -Wno-unused -Wno-write-strings -Wdisabled-optimization -fwrapv
-fno-strict-aliasing -fno-trapping-math -fno-math-errno -fschedule-insns2
-fomit-frame-pointer -fPIC -fno-common -mpc64   -rdynamic -shared 
-D___SINGLE_HOST -D___DYNAMIC
-I"/home/lucier/programs/gambit/gambit-profiled/include" -o 'fib-3.o1' -Q
-fprofile-arcs -ftest-coverage -save-temps   'fib-3.c' 

Time variable                                   usr           sys          wall
          GGC
 phase setup                        :   0.04 (100%)   0.00 (  0%)   0.04 (100%)
 8613k (100%)
 TOTAL                              :   0.04          0.00          0.04       
 8624k
 btowc wctob mbrlen ___H_fib_2d_3 ___setup_mod ___init_mod
___LNK_fib_2d_3_2e_o1
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> {heap 1436k} <visibility> {heap 1436k} <build_ssa_passes>
{heap 1436k} <opt_local_passes> {heap 1436k} <remove_symbols> {heap 3060k}
<targetclone> {heap 3060k} <profile> {heap 3060k} <free-fnsummary> {heap
3060k}Streaming LTO
 <whole-program> {heap 3060k} <profile_estimate> {heap 3060k} <fnsummary> {heap
3060k} <inline> {heap 3060k} <pure-const> {heap 3060k} <modref> {heap 3060k}
<free-fnsummary> {heap 3060k} <static-var> {heap 3060k} <single-use> {heap
3060k} <comdats> {heap 3060k}Assembling functions:
 <simdclone> {heap 3060k} ___setup_mod ___init_mod ___H_fib_2d_3
___LNK_fib_2d_3_2e_o1 _sub_I_00100_0 _sub_D_00100_1
Time variable                                   usr           sys          wall
          GGC
 phase setup                        :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)
 1519k (  1%)
 phase parsing                      :   0.09 (  1%)   0.05 ( 11%)   0.14 (  1%)
 2845k (  1%)
 phase opt and generate             :  13.80 ( 99%)   0.42 ( 89%)  14.22 ( 99%)
  220M ( 98%)
 callgraph functions expansion      :  13.76 ( 99%)   0.42 ( 89%)  14.17 ( 99%)
  216M ( 97%)
 callgraph ipa passes               :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)
 1687k (  1%)
 ipa function summary               :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
  176k (  0%)
 ipa profile                        :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
  300k (  0%)
 ipa pure const                     :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 cfg construction                   :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)
   82k (  0%)
 cfg cleanup                        :   0.20 (  1%)   0.01 (  2%)   0.19 (  1%)
   64  (  0%)
 trivially dead code                :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)
    0  (  0%)
 df scan insns                      :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)
  288  (  0%)
 df reaching defs                   :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
    0  (  0%)
 df live regs                       :   0.37 (  3%)   0.00 (  0%)   0.40 (  3%)
    0  (  0%)
 df live&initialized regs           :   0.37 (  3%)   0.01 (  2%)   0.38 (  3%)
    0  (  0%)
 df reg dead/unused notes           :   0.17 (  1%)   0.01 (  2%)   0.18 (  1%)
 3229k (  1%)
 register information               :   0.15 (  1%)   0.00 (  0%)   0.17 (  1%)
    0  (  0%)
 alias analysis                     :   0.07 (  1%)   0.00 (  0%)   0.05 (  0%)
   11M (  5%)
 alias stmt walking                 :   1.02 (  7%)   0.02 (  4%)   0.93 (  6%)
 7856  (  0%)
 rebuild jump labels                :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
    0  (  0%)
 preprocessing                      :   0.03 (  0%)   0.00 (  0%)   0.04 (  0%)
  268k (  0%)
 lexical analysis                   :   0.04 (  0%)   0.02 (  4%)   0.03 (  0%)
    0  (  0%)
 parser (global)                    :   0.00 (  0%)   0.01 (  2%)   0.03 (  0%)
 1275k (  1%)
 parser struct body                 :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
  359k (  0%)
 parser function body               :   0.01 (  0%)   0.02 (  4%)   0.04 (  0%)
  911k (  0%)
 tree gimplify                      :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
  937k (  0%)
 tree CFG cleanup                   :   0.11 (  1%)   0.00 (  0%)   0.14 (  1%)
 1373k (  1%)
 tree copy propagation              :   0.17 (  1%)   0.00 (  0%)   0.17 (  1%)
   48k (  0%)
 tree PTA                           :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
   23k (  0%)
 tree SSA rewrite                   :   0.13 (  1%)   0.00 (  0%)   0.13 (  1%)
 1877k (  1%)
 tree SSA other                     :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
  952  (  0%)
 tree SSA incremental               :   0.24 (  2%)   0.01 (  2%)   0.24 (  2%)
   34M ( 15%)
 tree operand scan                  :   0.01 (  0%)   0.02 (  4%)   0.03 (  0%)
 2882k (  1%)
 dominator optimization             :   0.43 (  3%)   0.01 (  2%)   0.58 (  4%)
 4002k (  2%)
 tree CCP                           :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
   47k (  0%)
 tree split crit edges              :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
 5019k (  2%)
 tree reassociation                 :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
   48  (  0%)
 tree FRE                           :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
  110k (  0%)
 tree code sinking                  :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
 6070k (  3%)
 tree linearize phis                :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
 6432  (  0%)
 tree forward propagate             :   0.20 (  1%)   0.02 (  4%)   0.21 (  1%)
  119k (  0%)
 tree conservative DCE              :   0.06 (  0%)   0.00 (  0%)   0.05 (  0%)
   16k (  0%)
 tree aggressive DCE                :   0.08 (  1%)   0.00 (  0%)   0.07 (  0%)
   40  (  0%)
 tree DSE                           :   0.47 (  3%)   0.00 (  0%)   0.47 (  3%)
    0  (  0%)
 tree loop invariant motion         :   0.61 (  4%)   0.04 (  9%)   0.65 (  5%)
   27M ( 12%)
 complete unrolling                 :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
  544  (  0%)
 tree iv optimization               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
   47k (  0%)
 tree SSA uncprop                   :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)
    0  (  0%)
 tree strlen optimization           :   0.09 (  1%)   0.00 (  0%)   0.10 (  1%)
  281k (  0%)
 tree modref                        :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
 2800  (  0%)
 dominance computation              :   0.16 (  1%)   0.00 (  0%)   0.14 (  1%)
    0  (  0%)
 out of ssa                         :   0.72 (  5%)   0.12 ( 26%)   0.85 (  6%)
  512k (  0%)
 expand                             :   0.10 (  1%)   0.02 (  4%)   0.11 (  1%)
   25M ( 11%)
 post expand cleanups               :   0.02 (  0%)   0.00 (  0%)   0.03 (  0%)
   89k (  0%)
 forward prop                       :   0.35 (  3%)   0.01 (  2%)   0.35 (  2%)
  888k (  0%)
 CSE                                :   0.10 (  1%)   0.00 (  0%)   0.11 (  1%)
 2302k (  1%)
 dead code elimination              :   0.02 (  0%)   0.00 (  0%)   0.03 (  0%)
    0  (  0%)
 dead store elim1                   :   0.08 (  1%)   0.00 (  0%)   0.09 (  1%)
 1532k (  1%)
 dead store elim2                   :   0.13 (  1%)   0.00 (  0%)   0.14 (  1%)
 7464k (  3%)
 loop init                          :   0.08 (  1%)   0.00 (  0%)   0.11 (  1%)
   50k (  0%)
 loop invariant motion              :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)
   58k (  0%)
 loop fini                          :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)
  928k (  0%)
 combiner                           :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)
  736k (  0%)
 if-conversion                      :   0.10 (  1%)   0.00 (  0%)   0.09 (  1%)
 9292k (  4%)
 integrated RA                      :   1.16 (  8%)   0.01 (  2%)   1.15 (  8%)
   37M ( 17%)
 LRA non-specific                   :   0.93 (  7%)   0.01 (  2%)   0.95 (  7%)
   10M (  5%)
 LRA virtuals elimination           :   0.06 (  0%)   0.00 (  0%)   0.07 (  0%)
 4366k (  2%)
 LRA reload inheritance             :   0.23 (  2%)   0.00 (  0%)   0.23 (  2%)
    0  (  0%)
 LRA create live ranges             :   2.41 ( 17%)   0.00 (  0%)   2.41 ( 17%)
 2648k (  1%)
 LRA hard reg assignment            :   0.78 (  6%)   0.02 (  4%)   0.78 (  5%)
    0  (  0%)
 reload                             :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
  144  (  0%)
 reload CSE regs                    :   0.16 (  1%)   0.01 (  2%)   0.16 (  1%)
 3807k (  2%)
 thread pro- & epilogue             :   0.06 (  0%)   0.00 (  0%)   0.05 (  0%)
   10k (  0%)
 if-conversion 2                    :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 combine stack adjustments          :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)
    0  (  0%)
 hard reg cprop                     :   0.07 (  1%)   0.02 (  4%)   0.08 (  1%)
  720  (  0%)
 scheduling 2                       :   0.36 (  3%)   0.01 (  2%)   0.35 (  2%)
 1590k (  1%)
 machine dep reorg                  :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
    0  (  0%)
 reorder blocks                     :   0.06 (  0%)   0.00 (  0%)   0.05 (  0%)
 1180k (  1%)
 shorten branches                   :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)
    0  (  0%)
 final                              :   0.07 (  1%)   0.01 (  2%)   0.08 (  1%)
 8569k (  4%)
 straight-line strength reduction   :   0.02 (  0%)   0.00 (  0%)   0.03 (  0%)
 8232  (  0%)
 rest of compilation                :   0.13 (  1%)   0.03 (  6%)   0.18 (  1%)
  342k (  0%)
 remove unused locals               :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
    0  (  0%)
 address taken                      :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 TOTAL                              :  13.89          0.47         14.36       
  224M

/pkgs/gcc-mainline/bin/gcc -march=native -D___CAN_IMPORT_CLIB_DYNAMICALLY  -O1 
   -Wno-unused -Wno-write-strings -Wdisabled-optimization -fwrapv
-fno-strict-aliasing -fno-trapping-math -fno-math-errno -fschedule-insns2
-fomit-frame-pointer -fPIC -fno-common -mpc64   -rdynamic -shared 
-D___SINGLE_HOST -D___DYNAMIC
-I"/home/lucier/programs/gambit/gambit-profiled/include" -o 'fib-4.o1' -Q
-fprofile-arcs -ftest-coverage -save-temps   'fib-4.c' 

Time variable                                   usr           sys          wall
          GGC
 phase setup                        :   0.05 (100%)   0.00 (  0%)   0.06 (100%)
   10M (100%)
 TOTAL                              :   0.05          0.00          0.06       
   10M
 btowc wctob mbrlen ___H_fib_2d_4 ___setup_mod ___init_mod
___LNK_fib_2d_4_2e_o1
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> {heap 1652k} <visibility> {heap 1652k} <build_ssa_passes>
{heap 1652k} <opt_local_passes> {heap 1652k} <remove_symbols> {heap 4168k}
<targetclone> {heap 4168k} <profile> {heap 4168k} <free-fnsummary> {heap
4168k}Streaming LTO
 <whole-program> {heap 4168k} <profile_estimate> {heap 4168k} <fnsummary> {heap
4168k} <inline> {heap 4168k} <pure-const> {heap 4168k} <modref> {heap 4168k}
<free-fnsummary> {heap 4168k} <static-var> {heap 4168k} <single-use> {heap
4168k} <comdats> {heap 4168k}Assembling functions:
 <simdclone> {heap 4168k} ___setup_mod ___init_mod ___H_fib_2d_4 {GC
madv_dontneed 556k} {GC 264M -> 260M} {GC madv_dontneed 116k} {GC 526M -> 302M}
___LNK_fib_2d_4_2e_o1 _sub_I_00100_0 _sub_D_00100_1
Time variable                                   usr           sys          wall
          GGC
 phase setup                        :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
 1519k (  0%)
 phase parsing                      :   0.16 (  0%)   0.08 (  3%)   0.23 (  0%)
 4049k (  1%)
 phase lang. deferred               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
   96  (  0%)
 phase opt and generate             :  55.79 (100%)   2.22 ( 97%)  58.03 (100%)
  712M ( 99%)
 garbage collection                 :   0.38 (  1%)   0.00 (  0%)   0.38 (  1%)
    0  (  0%)
 dump files                         :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 callgraph construction             :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)
 1108k (  0%)
 callgraph optimization             :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
   19k (  0%)
 callgraph functions expansion      :  55.71 (100%)   2.21 ( 96%)  57.94 ( 99%)
  706M ( 98%)
 callgraph ipa passes               :   0.07 (  0%)   0.01 (  0%)   0.09 (  0%)
 3221k (  0%)
 ipa function summary               :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
  335k (  0%)
 ipa inlining heuristics            :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
   16  (  0%)
 ipa profile                        :   0.00 (  0%)   0.01 (  0%)   0.01 (  0%)
  605k (  0%)
 ipa pure const                     :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)
    0  (  0%)
 cfg construction                   :   0.06 (  0%)   0.00 (  0%)   0.05 (  0%)
  159k (  0%)
 cfg cleanup                        :   0.68 (  1%)   0.02 (  1%)   0.69 (  1%)
   48  (  0%)
 trivially dead code                :   0.11 (  0%)   0.00 (  0%)   0.11 (  0%)
    0  (  0%)
 df scan insns                      :   0.09 (  0%)   0.01 (  0%)   0.11 (  0%)
  288  (  0%)
 df live regs                       :   1.30 (  2%)   0.04 (  2%)   1.36 (  2%)
    0  (  0%)
 df live&initialized regs           :   1.52 (  3%)   0.03 (  1%)   1.56 (  3%)
    0  (  0%)
 df reg dead/unused notes           :   0.52 (  1%)   0.01 (  0%)   0.54 (  1%)
   11M (  2%)
 register information               :   0.34 (  1%)   0.00 (  0%)   0.34 (  1%)
    0  (  0%)
 alias analysis                     :   0.20 (  0%)   0.00 (  0%)   0.20 (  0%)
   26M (  4%)
 alias stmt walking                 :   7.31 ( 13%)   0.11 (  5%)   7.32 ( 13%)
 8624  (  0%)
 register scan                      :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
 9008  (  0%)
 rebuild jump labels                :   0.07 (  0%)   0.00 (  0%)   0.05 (  0%)
    0  (  0%)
 preprocessing                      :   0.02 (  0%)   0.02 (  1%)   0.07 (  0%)
  306k (  0%)
 lexical analysis                   :   0.06 (  0%)   0.03 (  1%)   0.10 (  0%)
    0  (  0%)
 parser (global)                    :   0.03 (  0%)   0.02 (  1%)   0.02 (  0%)
 1323k (  0%)
 parser function body               :   0.05 (  0%)   0.01 (  0%)   0.05 (  0%)
 2029k (  0%)
 inline parameters                  :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
  131k (  0%)
 tree gimplify                      :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)
 1802k (  0%)
 tree CFG construction              :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
  578k (  0%)
 tree CFG cleanup                   :   0.41 (  1%)   0.00 (  0%)   0.42 (  1%)
 5686k (  1%)
 tree copy propagation              :   0.68 (  1%)   0.00 (  0%)   0.67 (  1%)
   96k (  0%)
 tree PTA                           :   0.01 (  0%)   0.01 (  0%)   0.02 (  0%)
   43k (  0%)
 tree PHI insertion                 :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
  866k (  0%)
 tree SSA rewrite                   :   0.57 (  1%)   0.00 (  0%)   0.57 (  1%)
   10M (  1%)
 tree SSA incremental               :   1.15 (  2%)   0.05 (  2%)   1.20 (  2%)
  118M ( 16%)
 tree operand scan                  :   0.10 (  0%)   0.06 (  3%)   0.25 (  0%)
   10M (  1%)
 dominator optimization             :   3.64 (  7%)   0.04 (  2%)   3.82 (  7%)
   13M (  2%)
 tree CCP                           :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
   94k (  0%)
 tree split crit edges              :   0.04 (  0%)   0.00 (  0%)   0.03 (  0%)
   18M (  3%)
 tree reassociation                 :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)
   48  (  0%)
 tree FRE                           :   0.01 (  0%)   0.00 (  0%)   0.03 (  0%)
  208k (  0%)
 tree code sinking                  :   0.07 (  0%)   0.00 (  0%)   0.07 (  0%)
   18M (  3%)
 tree linearize phis                :   0.04 (  0%)   0.00 (  0%)   0.03 (  0%)
 6432  (  0%)
 tree backward propagate            :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 tree forward propagate             :   1.65 (  3%)   0.01 (  0%)   1.66 (  3%)
  232k (  0%)
 tree conservative DCE              :   0.29 (  1%)   0.00 (  0%)   0.29 (  0%)
   31k (  0%)
 tree aggressive DCE                :   0.30 (  1%)   0.00 (  0%)   0.24 (  0%)
   40  (  0%)
 tree DSE                           :   1.88 (  3%)   0.00 (  0%)   1.89 (  3%)
    0  (  0%)
 tree loop invariant motion         :   5.00 (  9%)   0.15 (  7%)   5.10 (  9%)
  103M ( 14%)
 tree iv optimization               :   0.01 (  0%)   0.01 (  0%)   0.02 (  0%)
   95k (  0%)
 tree SSA uncprop                   :   0.13 (  0%)   0.00 (  0%)   0.15 (  0%)
    0  (  0%)
 tree strlen optimization           :   0.62 (  1%)   0.00 (  0%)   0.62 (  1%)
  547k (  0%)
 tree modref                        :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
 2800  (  0%)
 dominance frontiers                :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)
    0  (  0%)
 dominance computation              :   0.58 (  1%)   0.02 (  1%)   0.59 (  1%)
    0  (  0%)
 out of ssa                         :   5.62 ( 10%)   1.11 ( 48%)   6.73 ( 12%)
 2049k (  0%)
 expand vars                        :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
  407k (  0%)
 expand                             :   0.39 (  1%)   0.01 (  0%)   0.42 (  1%)
   92M ( 13%)
 post expand cleanups               :   0.12 (  0%)   0.00 (  0%)   0.13 (  0%)
  169k (  0%)
 lower subreg                       :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 forward prop                       :   1.25 (  2%)   0.05 (  2%)   1.29 (  2%)
 3301k (  0%)
 CSE                                :   0.28 (  1%)   0.00 (  0%)   0.27 (  0%)
 8571k (  1%)
 dead code elimination              :   0.08 (  0%)   0.00 (  0%)   0.08 (  0%)
    0  (  0%)
 dead store elim1                   :   0.32 (  1%)   0.00 (  0%)   0.32 (  1%)
 5493k (  1%)
 dead store elim2                   :   0.41 (  1%)   0.00 (  0%)   0.43 (  1%)
   23M (  3%)
 loop analysis                      :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)
    0  (  0%)
 loop init                          :   0.20 (  0%)   0.00 (  0%)   0.21 (  0%)
   62k (  0%)
 loop fini                          :   0.07 (  0%)   0.02 (  1%)   0.10 (  0%)
 3776k (  1%)
 combiner                           :   0.22 (  0%)   0.00 (  0%)   0.22 (  0%)
 2378k (  0%)
 if-conversion                      :   0.38 (  1%)   0.01 (  0%)   0.37 (  1%)
   36M (  5%)
 integrated RA                      :   5.43 ( 10%)   0.02 (  1%)   5.44 (  9%)
   96M ( 13%)
 LRA non-specific                   :   3.61 (  6%)   0.01 (  0%)   3.64 (  6%)
   21M (  3%)
 LRA virtuals elimination           :   0.18 (  0%)   0.01 (  0%)   0.16 (  0%)
   15M (  2%)
 LRA create live ranges             :   3.08 (  6%)   0.01 (  0%)   3.09 (  5%)
 2027k (  0%)
 LRA hard reg assignment            :   0.07 (  0%)   0.00 (  0%)   0.07 (  0%)
    0  (  0%)
 reload                             :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
  144  (  0%)
 reload CSE regs                    :   0.51 (  1%)   0.00 (  0%)   0.51 (  1%)
   13M (  2%)
 thread pro- & epilogue             :   0.10 (  0%)   0.00 (  0%)   0.11 (  0%)
 9680  (  0%)
 if-conversion 2                    :   0.05 (  0%)   0.00 (  0%)   0.02 (  0%)
   24  (  0%)
 combine stack adjustments          :   0.04 (  0%)   0.00 (  0%)   0.03 (  0%)
    0  (  0%)
 hard reg cprop                     :   0.21 (  0%)   0.10 (  4%)   0.31 (  1%)
 3288  (  0%)
 scheduling 2                       :   1.36 (  2%)   0.04 (  2%)   1.38 (  2%)
 5904k (  1%)
 machine dep reorg                  :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)
    0  (  0%)
 reorder blocks                     :   0.19 (  0%)   0.00 (  0%)   0.23 (  0%)
 4176k (  1%)
 shorten branches                   :   0.14 (  0%)   0.00 (  0%)   0.14 (  0%)
    0  (  0%)
 final                              :   0.27 (  0%)   0.01 (  0%)   0.29 (  0%)
   31M (  4%)
 straight-line strength reduction   :   0.10 (  0%)   0.00 (  0%)   0.10 (  0%)
   33k (  0%)
 rest of compilation                :   0.93 (  2%)   0.24 ( 10%)   1.15 (  2%)
 1158k (  0%)
 remove unused locals               :   0.07 (  0%)   0.00 (  0%)   0.07 (  0%)
    0  (  0%)
 address taken                      :   0.09 (  0%)   0.00 (  0%)   0.09 (  0%)
    0  (  0%)
 repair loop structures             :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)
    0  (  0%)
 TOTAL                              :  55.95          2.30         58.28       
  718M


heine:~/programs/gambit/gambit-profiled> /pkgs/gcc-mainline/bin/gcc
-march=native -D___CAN_IMPORT_CLIB_DYNAMICALLY  -O1     -Wno-unused
-Wno-write-strings -Wdisabled-optimization -fwrapv -fno-strict-aliasing
-fno-trapping-math -fno-math-errno -fschedule-insns2 -fomit-frame-pointer -fPIC
-fno-common -mpc64   -rdynamic -shared  -D___SINGLE_HOST -D___DYNAMIC
-I"/home/lucier/programs/gambit/gambit-profiled/include" -o 'fib-5.o1' -Q
-fprofile-arcs -ftest-coverage -save-temps   'fib-5.c' 

Time variable                                   usr           sys          wall
          GGC
 phase setup                        :   0.08 (100%)   0.02 (100%)   0.13 ( 93%)
   22M (100%)
 TOTAL                              :   0.08          0.02          0.14       
   22M
 btowc wctob mbrlen ___H_fib_2d_5 ___setup_mod ___init_mod
___LNK_fib_2d_5_2e_o1
Analyzing compilation unit
Performing interprocedural optimizations
 <*free_lang_data> {heap 2884k} <visibility> {heap 2884k} <build_ssa_passes>
{heap 2884k} <opt_local_passes> {heap 3032k} <remove_symbols> {heap 7436k}
<targetclone> {heap 7436k} <profile> {heap 7436k} <free-fnsummary> {heap
7436k}Streaming LTO
 <whole-program> {heap 7436k} <profile_estimate> {heap 7436k} <fnsummary> {heap
7436k} <inline> {heap 7436k} <pure-const> {heap 7436k} <modref> {heap 7436k}
<free-fnsummary> {heap 7436k} <static-var> {heap 7436k} <single-use> {heap
7436k} <comdats> {heap 7436k}Assembling functions:
 <simdclone> {heap 7436k} ___setup_mod ___init_mod ___H_fib_2d_5gcc: fatal
error: Killed signal terminated program cc1
compilation terminated.


* [Bug middle-end/64928] [8/9/10/11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: lucier at math dot purdue.edu @ 2021-03-10  2:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #35 from lucier at math dot purdue.edu ---
Created attachment 50345
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50345&action=edit
Parametrized input files for test coverage testing.

These are the .i files that go with my previous comment.


* [Bug middle-end/64928] [8/9/10/11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: rguenth at gcc dot gnu.org @ 2021-03-10  9:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #36 from Richard Biener <rguenth at gcc dot gnu.org> ---
So the issue is still the same.  One thing I noticed is that store-motion also
adds a flag for each counter update to avoid introducing store data races.
-fallow-store-data-races mitigates that part and speeds up the compilation
quite a bit.  If threads are involved you'd want -fprofile-update=atomic,
which causes store-motion to give up, and compile time is then great
overall.
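
To make the flag mechanism concrete, here is a hand-written C sketch of the
transformation. It is an illustration only, not GCC's actual output, and the
names `counter`, `count_naive` and `count_store_motion` are hypothetical. The
first function updates a memory-resident counter inside the loop, much as an
edge-counter update would. The second shows roughly what store motion does
when it must not introduce store data races: the value lives in a temporary,
and a flag records whether the loop ever stored, so the store after the loop
happens only if the original code would have stored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one profile counter. */
static long counter;

/* Before store motion: the counter is loaded and stored in memory on
   every iteration that takes the branch. */
void count_naive(const int *taken, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (taken[i])
            counter = counter + 1;
}

/* After store motion (illustrative): the value is kept in a temporary,
   and `stored` remembers whether any store happened, so the store after
   the loop is not introduced on executions that never wrote. */
void count_store_motion(const int *taken, size_t n)
{
    long tmp = counter;
    bool stored = false;

    for (size_t i = 0; i < n; i++)
        if (taken[i]) {
            tmp = tmp + 1;
            stored = true;
        }

    if (stored)
        counter = tmp;
}
```

Every counter moved this way needs its own temporary and its own flag, so the
extra state grows with the number of counters in the function.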

The original trigger of the regression is likely the marking of the profile
counters as not aliased.  We might want to introduce another flag to say that
store data races are not a consideration for a particular decl (maybe even
have some user-visible attribute for this).

Otherwise re-confirmed (I stripped options down to -O -fPIC -fprofile-arcs
-ftest-coverage):

rguenther@ryzen:/tmp> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/gcc -S -O
-fPIC -fprofile-arcs -ftest-coverage fib-2.o1-fib-2.i
1.84user 0.05system 0:01.90elapsed 99%CPU (0avgtext+0avgdata
160764maxresident)k
0inputs+0outputs (0major+58129minor)pagefaults 0swaps
rguenther@ryzen:/tmp> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/gcc -S -O
-fPIC -fprofile-arcs -ftest-coverage fib-3.o1-fib-3.i 
10.15user 0.17system 0:10.32elapsed 99%CPU (0avgtext+0avgdata
726688maxresident)k
0inputs+0outputs (0major+265008minor)pagefaults 0swaps
rguenther@ryzen:/tmp> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/gcc -S -O
-fPIC -fprofile-arcs -ftest-coverage fib-4.o1-fib-4.i 
43.60user 1.06system 0:44.68elapsed 99%CPU (0avgtext+0avgdata
6107260maxresident)k
0inputs+0outputs (0major+1765217minor)pagefaults 0swaps
rguenther@ryzen:/tmp> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/gcc -S -O
-fPIC -fprofile-arcs -ftest-coverage fib-5.o1-fib-5.i 
gcc: fatal error: Killed signal terminated program cc1
compilation terminated.
Command exited with non-zero status 1
143.09user 3.93system 2:28.29elapsed 99%CPU (0avgtext+0avgdata
24636148maxresident)k
37504inputs+0outputs (31major+6133278minor)pagefaults 0swaps

On the last one, which runs OOM, adding -fallow-store-data-races gives

rguenther@ryzen:/tmp> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/gcc -S -O
-fPIC -fprofile-arcs -ftest-coverage fib-5.o1-fib-5.i -fallow-store-data-races
123.06user 0.45system 2:03.59elapsed 99%CPU (0avgtext+0avgdata
1777700maxresident)k
57304inputs+0outputs (68major+535127minor)pagefaults 0swaps

and -fprofile-update=atomic

rguenther@ryzen:/tmp> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/gcc -S -O
-fPIC -fprofile-arcs -ftest-coverage fib-5.o1-fib-5.i -fprofile-update=atomic 
0.61user 0.02system 0:00.63elapsed 100%CPU (0avgtext+0avgdata
73236maxresident)k
72inputs+0outputs (0major+18284minor)pagefaults 0swaps

and -fno-tree-loop-im

rguenther@ryzen:/tmp> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/gcc -S -O
-fPIC -fprofile-arcs -ftest-coverage fib-5.o1-fib-5.i -fno-tree-loop-im      
1.06user 0.01system 0:01.07elapsed 99%CPU (0avgtext+0avgdata 90672maxresident)k
0inputs+0outputs (0major+24331minor)pagefaults 0swaps
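
As a quick check on how fast the cost grows, the sketch below computes the
step-to-step ratios of user time and peak resident memory for the four runs
without extra flags (fib-2 to fib-5) reported earlier in this message. The
numbers are copied from the /usr/bin/time output; the helpers `growth_ratios`
and `print_growth` are hypothetical names used only for illustration. Note
that the fib-5 figures come from a run that was killed before it finished.

```c
#include <stddef.h>
#include <stdio.h>

/* User time (seconds) and peak resident set size (KB) for fib-2 .. fib-5,
   copied from the /usr/bin/time output above.  The fib-5 run was killed. */
static const double user_seconds[] = { 1.84, 10.15, 43.60, 143.09 };
static const double maxresident_kb[] = { 160764, 726688, 6107260, 24636148 };

/* Write values[i + 1] / values[i] into ratios[i] for i in 0 .. n - 2. */
void growth_ratios(const double *values, size_t n, double *ratios)
{
    for (size_t i = 0; i + 1 < n; i++)
        ratios[i] = values[i + 1] / values[i];
}

/* Print one line per step from fib-k to fib-(k+1). */
void print_growth(void)
{
    double t[3], m[3];

    growth_ratios(user_seconds, 4, t);
    growth_ratios(maxresident_kb, 4, m);
    for (size_t i = 0; i < 3; i++)
        printf("fib-%zu -> fib-%zu: time x%.2f, memory x%.2f\n",
               i + 2, i + 3, t[i], m[i]);
}
```

Calling `print_growth()` shows time growing by roughly 3x to 5.5x per step
and peak memory by roughly 4x to 8.4x per step.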

I still wonder whether you can produce an even smaller testcase whose CFG
can be visualized.  Unfortunately the source is mechanically generated
and hard to follow.  Perhaps a testcase that retains the basic structure
but ends up with just a few (two, or at least fewer than 10) computed gotos?


* [Bug middle-end/64928] [8/9/10/11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
From: lucier at math dot purdue.edu @ 2021-03-10 14:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #37 from lucier at math dot purdue.edu ---
Created attachment 50352
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50352&action=edit
Smaller parameterized test file

This file is generated from a single copy of the Fibonacci function and is
otherwise simplified a bit.  I believe it has two computed gotos.

* [Bug middle-end/64928] [8/9/10/11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
  2015-02-03 21:09 [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs lucier at math dot purdue.edu
                   ` (30 preceding siblings ...)
  2021-03-10 14:16 ` lucier at math dot purdue.edu
@ 2021-03-10 15:06 ` rguenth at gcc dot gnu.org
  2021-05-14  9:47 ` [Bug middle-end/64928] [9/10/11/12 " jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-03-10 15:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #38 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 50354
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50354&action=edit
SVG of the CFG at LIM

This is an SVG of the CFG as created by dot at the point of the first LIM pass.

The CFG isn't too special and I guess a switch instead of the computed goto
would present us with the same issues.

I suppose putting a hard limit on the number of stores to move and then
ordering candidates based on their importance (execution frequency) is the
way to go.
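
The idea of a hard limit plus frequency ordering can be sketched as follows.
This is only an illustration of the selection policy described above, not
GCC's implementation: StoreCandidate, its fields and select_stores are
hypothetical names, and a real pass would take frequencies from its own
profile estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreCandidate:
    name: str       # hypothetical label for a memory reference
    frequency: int  # hypothetical estimated execution count


def select_stores(candidates, limit):
    """Keep at most `limit` candidates, most frequently executed first.

    Python's sort is stable, so candidates with equal frequency keep
    their original relative order.
    """
    if limit < 0:
        raise ValueError("limit must be non-negative")
    ranked = sorted(candidates, key=lambda c: c.frequency, reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    pool = [
        StoreCandidate("counter_a", 5),
        StoreCandidate("counter_b", 100),
        StoreCandidate("counter_c", 100),
        StoreCandidate("counter_d", 1),
    ]
    for cand in select_stores(pool, 2):
        print(cand.name, cand.frequency)
```

With such a cap the work spent on store motion is bounded by the limit rather
than by the number of candidates, at the price of leaving the less frequently
executed stores where they are.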

* [Bug middle-end/64928] [9/10/11/12 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
  2015-02-03 21:09 [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs lucier at math dot purdue.edu
                   ` (31 preceding siblings ...)
  2021-03-10 15:06 ` rguenth at gcc dot gnu.org
@ 2021-05-14  9:47 ` jakub at gcc dot gnu.org
  2021-06-01  8:06 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-05-14  9:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.5                         |9.4

--- Comment #39 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 8 branch is being closed.

* [Bug middle-end/64928] [9/10/11/12 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
  2015-02-03 21:09 [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs lucier at math dot purdue.edu
                   ` (32 preceding siblings ...)
  2021-05-14  9:47 ` [Bug middle-end/64928] [9/10/11/12 " jakub at gcc dot gnu.org
@ 2021-06-01  8:06 ` rguenth at gcc dot gnu.org
  2022-05-27  9:35 ` [Bug middle-end/64928] [10/11/12/13 " rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-06-01  8:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.4                         |9.5

--- Comment #40 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9.4 is being released, retargeting bugs to GCC 9.5.

* [Bug middle-end/64928] [10/11/12/13 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
  2015-02-03 21:09 [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs lucier at math dot purdue.edu
                   ` (33 preceding siblings ...)
  2021-06-01  8:06 ` rguenth at gcc dot gnu.org
@ 2022-05-27  9:35 ` rguenth at gcc dot gnu.org
  2022-06-28 10:31 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-05-27  9:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.5                         |10.4

--- Comment #41 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 9 branch is being closed.

* [Bug middle-end/64928] [10/11/12/13 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
  2015-02-03 21:09 [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs lucier at math dot purdue.edu
                   ` (34 preceding siblings ...)
  2022-05-27  9:35 ` [Bug middle-end/64928] [10/11/12/13 " rguenth at gcc dot gnu.org
@ 2022-06-28 10:31 ` jakub at gcc dot gnu.org
  2023-07-07 10:30 ` [Bug middle-end/64928] [11/12/13/14 " rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-06-28 10:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.4                        |10.5

--- Comment #42 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 10.4 is being released, retargeting bugs to GCC 10.5.

* [Bug middle-end/64928] [11/12/13/14 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
  2015-02-03 21:09 [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs lucier at math dot purdue.edu
                   ` (35 preceding siblings ...)
  2022-06-28 10:31 ` jakub at gcc dot gnu.org
@ 2023-07-07 10:30 ` rguenth at gcc dot gnu.org
  2023-09-28  7:06 ` [Bug middle-end/64928] [11 " rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  39 siblings, 0 replies; 41+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-07-07 10:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.5                        |11.5

--- Comment #43 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10 branch is being closed.

* [Bug middle-end/64928] [11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
  2015-02-03 21:09 [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs lucier at math dot purdue.edu
                   ` (36 preceding siblings ...)
  2023-07-07 10:30 ` [Bug middle-end/64928] [11/12/13/14 " rguenth at gcc dot gnu.org
@ 2023-09-28  7:06 ` rguenth at gcc dot gnu.org
  2023-10-02  0:26 ` lucier at math dot purdue.edu
  2023-10-04  6:39 ` rguenth at gcc dot gnu.org
  39 siblings, 0 replies; 41+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-09-28  7:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|rguenth at gcc dot gnu.org         |unassigned at gcc dot gnu.org
      Known to fail|                            |11.4.0
                 CC|                            |rguenth at gcc dot gnu.org
      Known to work|                            |12.1.0, 13.1.0, 14.0
             Status|ASSIGNED                    |NEW
            Summary|[11/12/13/14 Regression]    |[11 Regression] Inordinate
                   |Inordinate cpu time and     |cpu time and memory usage
                   |memory usage in "phase opt  |in "phase opt and generate"
                   |and generate" with          |with -ftest-coverage
                   |-ftest-coverage             |-fprofile-arcs
                   |-fprofile-arcs              |

--- Comment #44 from Richard Biener <rguenth at gcc dot gnu.org> ---
I tried the first input file with GCC 13.2 on a Ryzen 9 7900X and get a memory
usage of 105MB and a compile time of 1.1s.  The larger testcase needs 360MB peak
and 6.3s to compile.  Both show a mostly flat -ftime-report profile.

Upping to -O2 shows the same memory peak but 13.1s for the larger testcase.  We
then see

 PRE                                :   2.09 ( 16%)   0.01 (  1%)   2.15 ( 15%)   288k (  0%)

as the biggest thing sticking out (similar for the small testcase).
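
For comparing such rows across compiler versions it can help to split them
into fields.  A minimal sketch, assuming the four value columns are usr, sys,
wall and GGC in that order (which matches the -ftime-report header of recent
GCC releases as far as I know, but is not shown in this comment);
parse_time_report_row is a hypothetical helper name.

```python
import re


def parse_time_report_row(row):
    """Split one -ftime-report row into its name and four value columns."""
    flat = " ".join(row.split())  # undo the line wrapping introduced by the mail
    name, sep, rest = flat.rpartition(":")
    if not sep:
        raise ValueError("no ':' separator found in row")
    # Each cell looks like "2.09 ( 16%)" or "288k ( 0%)".
    cells = re.findall(r"([\d.]+)([kMG]?)\s*\(\s*(\d+)%\)", rest)
    if len(cells) != 4:
        raise ValueError("expected four value columns")
    usr, sys_, wall, ggc = cells
    return {
        "name": name.strip(),
        "usr": float(usr[0]),    # assumed column order: usr, sys, wall, GGC
        "usr_pct": int(usr[2]),
        "sys": float(sys_[0]),
        "wall": float(wall[0]),
        "ggc": ggc[0] + ggc[1],  # memory is kept as text, e.g. "288k"
    }


if __name__ == "__main__":
    row = (" PRE                                :   2.09 ( 16%)   0.01 (  1%)"
           "   2.15 ( 15%)\n  288k (  0%)")
    print(parse_time_report_row(row))
```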

I think we've come a long way here.  GCC 12.3 behaves the same.  With GCC 11.4
I stopped the larger testcase at -O2 after 3 minutes; the small testcase at -O1
takes 44s and 5GB of memory.

Fixed for GCC 12+.  I'm not going to try to identify what to backport (I
usually backported compile-time/memory-usage improvements when reasonable, so
I suspect this was a bigger change).

* [Bug middle-end/64928] [11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
  2015-02-03 21:09 [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs lucier at math dot purdue.edu
                   ` (37 preceding siblings ...)
  2023-09-28  7:06 ` [Bug middle-end/64928] [11 " rguenth at gcc dot gnu.org
@ 2023-10-02  0:26 ` lucier at math dot purdue.edu
  2023-10-04  6:39 ` rguenth at gcc dot gnu.org
  39 siblings, 0 replies; 41+ messages in thread
From: lucier at math dot purdue.edu @ 2023-10-02  0:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #45 from lucier at math dot purdue.edu ---
I confirm that I no longer have this problem with

> gcc-12 -v
Using built-in specs.
COLLECT_GCC=gcc-12
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/12/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
12.3.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-12/README.Bugs
--enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-12
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --enable-libphobos-checking=release
--with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch
--disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64
--with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-offload-targets=nvptx-none=/build/gcc-12-ALHxjy/gcc-12-12.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-ALHxjy/gcc-12-12.3.0/debian/tmp-gcn/usr
--enable-offload-defaulted --without-cuda-driver --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.3.0 (Ubuntu 12.3.0-1ubuntu1~22.04) 

A different example procedure still took > 45 minutes and > 3.5 GB to compile
with -ftest-coverage -fprofile-arcs (it had finished when I came back from
lunch) but it was quite large (even by my standards!).

If this is a "won't fix" for earlier versions of gcc, then I'm OK with closing
this PR.

* [Bug middle-end/64928] [11 Regression] Inordinate cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs
  2015-02-03 21:09 [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs lucier at math dot purdue.edu
                   ` (38 preceding siblings ...)
  2023-10-02  0:26 ` lucier at math dot purdue.edu
@ 2023-10-04  6:39 ` rguenth at gcc dot gnu.org
  39 siblings, 0 replies; 41+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-10-04  6:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64928

--- Comment #46 from Richard Biener <rguenth at gcc dot gnu.org> ---
It'll get closed when we close the GCC 11 branch; until then there's still the
opportunity for somebody to bisect what fixed it in GCC 12, in case it was
something trivial.

Thread overview: 41+ messages
2015-02-03 21:09 [Bug other/64928] New: unreasonable cpu time and memory usage in "phase opt and generate" with -ftest-coverage -fprofile-arcs lucier at math dot purdue.edu
2015-02-03 21:11 ` [Bug other/64928] Inordinate " lucier at math dot purdue.edu
2015-02-03 21:33 ` pinskia at gcc dot gnu.org
2015-02-03 21:35 ` pinskia at gcc dot gnu.org
2015-02-03 21:49 ` lucier at math dot purdue.edu
2015-02-06  5:07 ` lucier at math dot purdue.edu
2015-02-06  5:08 ` lucier at math dot purdue.edu
2015-02-09 14:31 ` [Bug middle-end/64928] [4.8/4.9/5 Regression] " rguenth at gcc dot gnu.org
2015-02-09 15:07 ` rguenth at gcc dot gnu.org
2015-02-16 19:57 ` law at redhat dot com
2015-03-05 17:22 ` rguenth at gcc dot gnu.org
2015-03-05 23:07 ` steven at gcc dot gnu.org
2015-03-06  0:45 ` law at redhat dot com
2015-03-06 10:53 ` rguenth at gcc dot gnu.org
2015-03-06 12:35 ` rguenth at gcc dot gnu.org
2015-03-06 12:47 ` rguenth at gcc dot gnu.org
2015-03-06 12:53 ` rguenth at gcc dot gnu.org
2015-03-06 13:01 ` rguenth at gcc dot gnu.org
2015-03-18 12:54 ` rguenth at gcc dot gnu.org
2015-05-20 14:49 ` [Bug middle-end/64928] [4.8/4.9/5/6 " wellnhofer at aevum dot de
2015-05-20 14:49 ` wellnhofer at aevum dot de
2015-06-23  8:19 ` rguenth at gcc dot gnu.org
2015-06-26 19:56 ` [Bug middle-end/64928] [4.9/5/6 " jakub at gcc dot gnu.org
2015-06-26 20:28 ` jakub at gcc dot gnu.org
2020-09-29  0:14 ` [Bug middle-end/64928] [8/9/10/11 " lucier at math dot purdue.edu
2020-09-29  7:09 ` rguenth at gcc dot gnu.org
2020-09-29 12:17 ` lucier at math dot purdue.edu
2020-09-29 13:06 ` rguenther at suse dot de
2021-03-10  2:10 ` lucier at math dot purdue.edu
2021-03-10  2:13 ` lucier at math dot purdue.edu
2021-03-10  9:47 ` rguenth at gcc dot gnu.org
2021-03-10 14:16 ` lucier at math dot purdue.edu
2021-03-10 15:06 ` rguenth at gcc dot gnu.org
2021-05-14  9:47 ` [Bug middle-end/64928] [9/10/11/12 " jakub at gcc dot gnu.org
2021-06-01  8:06 ` rguenth at gcc dot gnu.org
2022-05-27  9:35 ` [Bug middle-end/64928] [10/11/12/13 " rguenth at gcc dot gnu.org
2022-06-28 10:31 ` jakub at gcc dot gnu.org
2023-07-07 10:30 ` [Bug middle-end/64928] [11/12/13/14 " rguenth at gcc dot gnu.org
2023-09-28  7:06 ` [Bug middle-end/64928] [11 " rguenth at gcc dot gnu.org
2023-10-02  0:26 ` lucier at math dot purdue.edu
2023-10-04  6:39 ` rguenth at gcc dot gnu.org
