public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug rtl-optimization/23490] [3.4/4.0/4.1 Regression] Long compile time for array initializer with inlined constructor
       [not found] <bug-23490-2744@http.gcc.gnu.org/bugzilla/>
@ 2005-10-31  5:17 ` mmitchel at gcc dot gnu dot org
  2005-11-03 20:06 ` hubicka at gcc dot gnu dot org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2005-10-31  5:17 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #8 from mmitchel at gcc dot gnu dot org  2005-10-31 05:17 -------
I'd like to see this fixed.  Is there any throttling we can do here?


-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P2                          |P4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23490


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug rtl-optimization/23490] [3.4/4.0/4.1 Regression] Long compile time for array initializer with inlined constructor
       [not found] <bug-23490-2744@http.gcc.gnu.org/bugzilla/>
  2005-10-31  5:17 ` [Bug rtl-optimization/23490] [3.4/4.0/4.1 Regression] Long compile time for array initializer with inlined constructor mmitchel at gcc dot gnu dot org
@ 2005-11-03 20:06 ` hubicka at gcc dot gnu dot org
  2005-11-03 20:20 ` falk at debian dot org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-11-03 20:06 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #9 from hubicka at gcc dot gnu dot org  2005-11-03 20:06 -------
For reload CSE there is --param max-cselib-memory-locations=10
that cuts it's time to 1%, overall.
Since this testcase is sort of designed to excercise the worst case behaviour,
I think it is not too bad...


alias analysis has:
/* Maximum length of pbi->mem_set_list before we start dropping
   new elements on the floor.  */
#define MAX_MEM_SET_LIST_LEN    100
perhaps unifying all these lists would make sense.  Would that be applicable
for 4.1? (Ie I would have to pick some common value, like 300 - cselib defaults
to 500)

CSE has:
      /* If we have processed 1,000 insns, flush the hash table to
         avoid extreme quadratic behavior.  We must not include NOTEs
         in the count since there may be more of them when generating
         debugging information.  If we clear the table at different
         times, code generated with -g -O might be different than code
         generated with -O but not -g.

         ??? This is a real kludge and needs to be done some other way.
         Perhaps for 2.9.  */

I am not sure if I can fit memory specific limit in this (it would be nice to
simply throw away old entries rather than flushing it all), or we can just keep
it.
None of these algorithms would grow ad infimum and manifest worse behaviour
than in this testcase, so I would not consider it P2

Honza


-- 

hubicka at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hubicka at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23490


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug rtl-optimization/23490] [3.4/4.0/4.1 Regression] Long compile time for array initializer with inlined constructor
       [not found] <bug-23490-2744@http.gcc.gnu.org/bugzilla/>
  2005-10-31  5:17 ` [Bug rtl-optimization/23490] [3.4/4.0/4.1 Regression] Long compile time for array initializer with inlined constructor mmitchel at gcc dot gnu dot org
  2005-11-03 20:06 ` hubicka at gcc dot gnu dot org
@ 2005-11-03 20:20 ` falk at debian dot org
  2005-11-03 21:23 ` hubicka at ucw dot cz
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: falk at debian dot org @ 2005-11-03 20:20 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #10 from falk at debian dot org  2005-11-03 20:20 -------
Created an attachment (id=10134)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=10134&action=view)
Preprocessed test case

This is not a designed test case, it's distilled from a real-world application
(libqd). I've attached the original source. g++ -O2 takes 103s for 33667 lines
(about 1M, not really very large for preprocessed C++).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23490


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug rtl-optimization/23490] [3.4/4.0/4.1 Regression] Long compile time for array initializer with inlined constructor
       [not found] <bug-23490-2744@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2005-11-03 20:20 ` falk at debian dot org
@ 2005-11-03 21:23 ` hubicka at ucw dot cz
  2005-11-05  0:55 ` hubicka at gcc dot gnu dot org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: hubicka at ucw dot cz @ 2005-11-03 21:23 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #11 from hubicka at ucw dot cz  2005-11-03 21:23 -------
Subject: Re:  [3.4/4.0/4.1 Regression] Long compile time for array initializer
with inlined constructor

Hmm, OK, this adds the neccesary knobs so you can trottle the parameters
even further.  Sadly this brings quite dificult to tune parameters since
reducing them considerably might easilly ruin code quality and I am
unsure how we want to proceed... (IE keep them high and change your
makefiles or try to reduce them to something more scalable)

Honza

2005-11-03  Jan Hubicka  <jh@suse.cz>
        * doc/invoke.texi (max-predicted-iterations, max-cse-insns,
        max-flow-memory-location): Document.
        * flow.c: Include params.h
        (MAX_MEM_SET_LIST_LEN): Kill.
        (add_to_mem_set_list): Use new param.
        * cse.c (cse_basic_block): Replace 1000 by new param.
        * params.def (PARAM_MAX_PREDICTED_ITERATIONS, PARAM_MAX_CSE_INSNS,
        PARAM_MAX_FLOW_MEMORY_LOCATIONS): New.
        * predict.c (predict_loops): Use new param.
        * predict.def (MAX_PRED_LOOP_ITERATIONS): Remove.
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi     (revision 106422)
+++ doc/invoke.texi     (working copy)
@@ -5934,6 +5938,13 @@ given basic block needs to have to be co
 Select fraction of the maximal frequency of executions of basic block in
 function given basic block needs to have to be considered hot

+@item max-predicted-iterations
+The maximum number of loop iterations we predict statically.  This is useful
+in cases where function contain single loop with known bound and other loop
+with unknown.  We predict the known number of iterations correctly, while
+the unknown nummber of iterations average to roughly 10.  This means that the
+loop without bounds would appear artifically cold relative to the other one.
+
 @item tracer-dynamic-coverage
 @itemx tracer-dynamic-coverage-feedback

@@ -5971,6 +5982,9 @@ order to make tracer effective.

 Maximum number of basic blocks on path that cse considers.  The default is 10.

+@item max-cse-insns
+The maximum instructions CSE process before flushing. The default is 1000.
+
 @item global-var-threshold

 Counts the number of function calls (@var{n}) and the number of
@@ -6031,6 +6045,10 @@ value is 100.
 The maximum number of memory locations cselib should take into acount.
 Increasing values mean more aggressive optimization, making the compile time
 increase with probably slightly better performance.  The default value is 500.
+
+@item max-flow-memory-location
+Similar as @option{max-cselib-memory-location} but for dataflow liveness.
+The default value is 100.

 @item reorder-blocks-duplicate
 @itemx reorder-blocks-duplicate-feedback
Index: flow.c
===================================================================
--- flow.c      (revision 106422)
+++ flow.c      (working copy)
@@ -141,6 +141,7 @@ Software Foundation, 51 Franklin Street,
 #include "obstack.h"
 #include "splay-tree.h"
 #include "tree-pass.h"
+#include "params.h"

 #ifndef HAVE_epilogue
 #define HAVE_epilogue 0
@@ -283,10 +284,6 @@ static int ndead;

 static int *reg_deaths;

-/* Maximum length of pbi->mem_set_list before we start dropping
-   new elements on the floor.  */
-#define MAX_MEM_SET_LIST_LEN   100
-
 /* Forward declarations */
 static int verify_wide_reg_1 (rtx *, void *);
 static void verify_wide_reg (int, basic_block);
@@ -630,7 +627,7 @@ update_life_info (sbitmap blocks, enum u

          /* We repeat regardless of what cleanup_cfg says.  If there were
             instructions deleted above, that might have been only a
-            partial improvement (see MAX_MEM_SET_LIST_LEN usage).
+            partial improvement (see PARAM_MAX_FLOW_MEMORY_LOCATIONS  usage).
             Further improvement may be possible.  */
          cleanup_cfg (CLEANUP_EXPENSIVE);

@@ -2515,7 +2512,7 @@ add_to_mem_set_list (struct propagate_bl
        }
     }

-  if (pbi->mem_set_list_len < MAX_MEM_SET_LIST_LEN)
+  if (pbi->mem_set_list_len < PARAM_VALUE (PARAM_MAX_FLOW_MEMORY_LOCATIONS))
     {
 #ifdef AUTO_INC_DEC
       /* Store a copy of mem, otherwise the address may be
Index: cse.c
===================================================================
--- cse.c       (revision 106422)
+++ cse.c       (working copy)
@@ -6890,7 +6890,7 @@ cse_basic_block (rtx from, rtx to, struc

         ??? This is a real kludge and needs to be done some other way.
         Perhaps for 2.9.  */
-      if (code != NOTE && num_insns++ > 1000)
+      if (code != NOTE && num_insns++ > PARAM_VALUE (PARAM_MAX_CSE_INSNS))
        {
          flush_hash_table ();
          num_insns = 0;
Index: predict.c
===================================================================
--- predict.c   (revision 106422)
+++ predict.c   (working copy)
@@ -624,8 +624,9 @@ predict_loops (struct loops *loops_info,
              niter = desc.niter + 1;
              if (niter == 0)        /* We might overflow here.  */
                niter = desc.niter;
-             if (niter > MAX_PRED_LOOP_ITERATIONS)
-               niter = MAX_PRED_LOOP_ITERATIONS;
+             if (niter
+                 > (unsigned int) PARAM_VALUE
(PARAM_MAX_PREDICTED_ITERATIONS))
+               niter = PARAM_VALUE (PARAM_MAX_PREDICTED_ITERATIONS);

              prob = (REG_BR_PROB_BASE
                      - (REG_BR_PROB_BASE + niter /2) / niter);
@@ -653,19 +654,17 @@ predict_loops (struct loops *loops_info,
              if (TREE_CODE (niter) == INTEGER_CST)
                {
                  int probability;
+                 int max = PARAM_VALUE (PARAM_MAX_PREDICTED_ITERATIONS);
                  if (host_integerp (niter, 1)
                      && tree_int_cst_lt (niter,
-                                         build_int_cstu (NULL_TREE,
-                                                MAX_PRED_LOOP_ITERATIONS -
1)))
+                                         build_int_cstu (NULL_TREE, max - 1)))
                    {
                      HOST_WIDE_INT nitercst = tree_low_cst (niter, 1) + 1;
                      probability = ((REG_BR_PROB_BASE + nitercst / 2)
                                     / nitercst);
                    }
                  else
-                   probability = ((REG_BR_PROB_BASE
-                                   + MAX_PRED_LOOP_ITERATIONS / 2)
-                                  / MAX_PRED_LOOP_ITERATIONS);
+                   probability = ((REG_BR_PROB_BASE + max / 2) / max);

                  predict_edge (exits[j], PRED_LOOP_ITERATIONS, probability);
                }
Index: params.def
===================================================================
--- params.def  (revision 106422)
+++ params.def  (working copy)
@@ -309,6 +309,22 @@ DEFPARAM(HOT_BB_FREQUENCY_FRACTION,
         "hot-bb-frequency-fraction",
         "Select fraction of the maximal frequency of executions of basic block
in function given basic block needs to have to be considered hot",
         1000, 0, 0)
+
+/* For guessed profiles, the loops having unknown number of iterations
+   are predicted to iterate relatively few (10) times at average.
+   For functions containing one loop with large known number of iterations
+   and other loops having unbounded loops we would end up predicting all
+   the other loops cold that is not usually the case.  So we need to
artifically
+   flatten the profile.  
+
+   We need to cut the maximal predicted iterations to large enought iterations
+   so the loop appears important, but safely within HOT_BB_COUNT_FRACTION
+   range.  */
+
+DEFPARAM(PARAM_MAX_PREDICTED_ITERATIONS,
+        "max-predicted-iterations",
+        "The maximum number of loop iterations we predict statically",
+        100, 0, 0)
 DEFPARAM(TRACER_DYNAMIC_COVERAGE_FEEDBACK,
         "tracer-dynamic-coverage-feedback",
         "The percentage of function, weighted by execution frequency, that
must be covered by trace formation. Used when profile feedback is available",
@@ -363,6 +379,10 @@ DEFPARAM(PARAM_MAX_CSE_PATH_LENGTH,
         "max-cse-path-length",
         "The maximum length of path considered in cse",
         10, 0, 0)
+DEFPARAM(PARAM_MAX_CSE_INSNS,
+        "max-flow-memory-locations",
+        "The maximum instructions CSE process before flushing",
+        1000, 0, 0)

 /* The cost of expression in loop invariant motion that is considered
    expensive.  */
@@ -417,6 +437,10 @@ DEFPARAM(PARAM_MAX_CSELIB_MEMORY_LOCATIO
         "max-cselib-memory-locations",
         "The maximum memory locations recorded by cselib",
         500, 0, 0)
+DEFPARAM(PARAM_MAX_FLOW_MEMORY_LOCATIONS,
+        "max-flow-memory-locations",
+        "The maximum memory locations recorded by flow",
+        100, 0, 0)

 #ifdef ENABLE_GC_ALWAYS_COLLECT
 # define GGC_MIN_EXPAND_DEFAULT 0
Index: predict.def
===================================================================
--- predict.def (revision 106422)
+++ predict.def (working copy)
@@ -58,18 +58,6 @@ DEF_PREDICTOR (PRED_UNCONDITIONAL, "unco
 DEF_PREDICTOR (PRED_LOOP_ITERATIONS, "loop iterations", PROB_ALWAYS,
               PRED_FLAG_FIRST_MATCH)

-/* For guessed profiles, the loops having unknown number of iterations
-   are predicted to iterate relatively few (10) times at average.
-   For functions containing one loop with large known number of iterations
-   and other loops having unbounded loops we would end up predicting all
-   the other loops cold that is not usually the case.  So we need to
artifically
-   flatten the profile.  
-
-   We need to cut the maximal predicted iterations to large enought iterations
-   so the loop appears important, but safely within HOT_BB_COUNT_FRACTION
-   range.  */
-#define MAX_PRED_LOOP_ITERATIONS 100
-
 /* Hints dropped by user via __builtin_expect feature.  */
 DEF_PREDICTOR (PRED_BUILTIN_EXPECT, "__builtin_expect", PROB_VERY_LIKELY,
               PRED_FLAG_FIRST_MATCH)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23490


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug rtl-optimization/23490] [3.4/4.0/4.1 Regression] Long compile time for array initializer with inlined constructor
       [not found] <bug-23490-2744@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2005-11-03 21:23 ` hubicka at ucw dot cz
@ 2005-11-05  0:55 ` hubicka at gcc dot gnu dot org
  2005-11-07 20:57 ` [Bug rtl-optimization/23490] [3.4/4.0 " pinskia at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: hubicka at gcc dot gnu dot org @ 2005-11-05  0:55 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #12 from hubicka at gcc dot gnu dot org  2005-11-05 00:55 -------
Subject: Bug 23490

Author: hubicka
Date: Sat Nov  5 00:55:23 2005
New Revision: 106520

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=106520
Log:
        PR rtl-optimization/23490
        * doc/invoke.texi (max-predicted-iterations, max-cse-insns,
        max-flow-memory-location): Document.
        * flow.c: Include params.h
        (MAX_MEM_SET_LIST_LEN): Kill.
        (add_to_mem_set_list): Use new param.
        * cse.c (cse_basic_block): Replace 1000 by new param.
        * params.def (PARAM_MAX_PREDICTED_ITERATIONS, PARAM_MAX_CSE_INSNS,
        PARAM_MAX_FLOW_MEMORY_LOCATIONS): New.
        * predict.c (predict_loops): Use new param.
        * predict.def (MAX_PRED_LOOP_ITERATIONS): Remove.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cse.c
    trunk/gcc/doc/invoke.texi
    trunk/gcc/flow.c
    trunk/gcc/params.def
    trunk/gcc/predict.c
    trunk/gcc/predict.def


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23490


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug rtl-optimization/23490] [3.4/4.0 Regression] Long compile time for array initializer with inlined constructor
       [not found] <bug-23490-2744@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2005-11-05  0:55 ` hubicka at gcc dot gnu dot org
@ 2005-11-07 20:57 ` pinskia at gcc dot gnu dot org
  2006-03-11  3:20 ` mmitchel at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-11-07 20:57 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #13 from pinskia at gcc dot gnu dot org  2005-11-07 20:56 -------
Fixed at least on the mainline.


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[3.4/4.0/4.1 Regression]    |[3.4/4.0 Regression] Long
                   |Long compile time for array |compile time for array
                   |initializer with inlined    |initializer with inlined
                   |constructor                 |constructor


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23490


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug rtl-optimization/23490] [3.4/4.0 Regression] Long compile time for array initializer with inlined constructor
       [not found] <bug-23490-2744@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2005-11-07 20:57 ` [Bug rtl-optimization/23490] [3.4/4.0 " pinskia at gcc dot gnu dot org
@ 2006-03-11  3:20 ` mmitchel at gcc dot gnu dot org
  2007-02-03 15:37 ` [Bug rtl-optimization/23490] [4.0 " gdr at gcc dot gnu dot org
  2008-05-03  0:10 ` pinskia at gcc dot gnu dot org
  8 siblings, 0 replies; 9+ messages in thread
From: mmitchel at gcc dot gnu dot org @ 2006-03-11  3:20 UTC (permalink / raw)
  To: gcc-bugs



-- 

mmitchel at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.0.3                       |4.0.4


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23490


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug rtl-optimization/23490] [4.0 Regression] Long compile time for array initializer with inlined constructor
       [not found] <bug-23490-2744@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2006-03-11  3:20 ` mmitchel at gcc dot gnu dot org
@ 2007-02-03 15:37 ` gdr at gcc dot gnu dot org
  2008-05-03  0:10 ` pinskia at gcc dot gnu dot org
  8 siblings, 0 replies; 9+ messages in thread
From: gdr at gcc dot gnu dot org @ 2007-02-03 15:37 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #14 from gdr at gcc dot gnu dot org  2007-02-03 15:37 -------
Fixed in GCC-4.1.0 and higher


-- 

gdr at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|4.0.4                       |4.1.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23490


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug rtl-optimization/23490] [4.0 Regression] Long compile time for array initializer with inlined constructor
       [not found] <bug-23490-2744@http.gcc.gnu.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2007-02-03 15:37 ` [Bug rtl-optimization/23490] [4.0 " gdr at gcc dot gnu dot org
@ 2008-05-03  0:10 ` pinskia at gcc dot gnu dot org
  8 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2008-05-03  0:10 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #15 from pinskia at gcc dot gnu dot org  2008-05-03 00:09 -------
hehe:
 tree DSE              :   4.79 (14%) usr   0.05 ( 6%) sys   5.11 (14%) wall   
   2 kB ( 0%) ggc
 tree PRE              :   9.79 (29%) usr   0.12 (14%) sys  10.32 (29%) wall   
2016 kB ( 1%) ggc
 tree FRE              :   9.68 (29%) usr   0.10 (12%) sys  10.09 (28%) wall   
 112 kB ( 0%) ggc


Note this is with checking enabled so they might not be good numbers.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23490


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-05-03  0:10 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-23490-2744@http.gcc.gnu.org/bugzilla/>
2005-10-31  5:17 ` [Bug rtl-optimization/23490] [3.4/4.0/4.1 Regression] Long compile time for array initializer with inlined constructor mmitchel at gcc dot gnu dot org
2005-11-03 20:06 ` hubicka at gcc dot gnu dot org
2005-11-03 20:20 ` falk at debian dot org
2005-11-03 21:23 ` hubicka at ucw dot cz
2005-11-05  0:55 ` hubicka at gcc dot gnu dot org
2005-11-07 20:57 ` [Bug rtl-optimization/23490] [3.4/4.0 " pinskia at gcc dot gnu dot org
2006-03-11  3:20 ` mmitchel at gcc dot gnu dot org
2007-02-03 15:37 ` [Bug rtl-optimization/23490] [4.0 " gdr at gcc dot gnu dot org
2008-05-03  0:10 ` pinskia at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).