public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale)
@ 2004-11-21 14:59 pinskia at gcc dot gnu dot org
  2004-11-24 11:15 ` [Bug tree-optimization/18595] " steven at gcc dot gnu dot org
                   ` (55 more replies)
  0 siblings, 56 replies; 59+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2004-11-21 14:59 UTC (permalink / raw)
  To: gcc-bugs

Using the program in PR 10060, to generate 1000 and 2000, I see that "IV OPT" goes from:
 tree iv optimization  :   1.79 (14%) usr   0.15 ( 5%) sys   2.60 (13%) wall
to
 tree iv optimization  :   6.92 (13%) usr   0.29 ( 5%) sys   7.30 (12%) wall

which looks like PHI iv opt does not scale O(n).

This is not as importran as PR 18594 though (which is the same testcase).

-- 
           Summary: IV-OPTS is slow (and does not scale)
           Product: gcc
           Version: 4.0.0
            Status: UNCONFIRMED
          Keywords: compile-time-hog
          Severity: minor
          Priority: P2
         Component: tree-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: pinskia at gcc dot gnu dot org
                CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] IV-OPTS is slow (and does not scale)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
@ 2004-11-24 11:15 ` steven at gcc dot gnu dot org
  2004-11-25 11:16 ` [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3) belyshev at lubercy dot com
                   ` (54 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-11-24 11:15 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2004-11-24 11:14 -------
if IV opts doesn't scale, then why is the compile time ~14% in both
cases?

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
  2004-11-24 11:15 ` [Bug tree-optimization/18595] " steven at gcc dot gnu dot org
@ 2004-11-25 11:16 ` belyshev at lubercy dot com
  2004-11-27  1:11 ` steven at gcc dot gnu dot org
                   ` (53 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: belyshev at lubercy dot com @ 2004-11-25 11:16 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From belyshev at lubercy dot com  2004-11-25 11:16 -------
use this awk script to generate testcase (first arg is number of loops):
----------------------------------------------------------------------------------------
BEGIN {
	ORS=""
	print "int f ()\n{\tint "
	for (j = 0; j < ARGV [1]; j++)
		print "j" j ", "
	print "a;\n\ta = 0;\n"
	print "\tfor (j0 = 0; j0 < 2; j0 ++)\n"
	for (j = 1; j < ARGV [1]; j++)
		print "\tfor (j" j " = j" j-1 "; j" j " < 2; j" j" ++)\n"
	print "\ta += "
	for (j = 0; j < ARGV [1]-1; j++)
		print "j" j " + "
	print "j" j ";\n\treturn a;\n}\n"
}
----------------------------------------------------------------------------------------

N loops   Time, s
 5        0.05
10        0.17
15        0.38
20        1.14
25        2.81
30        4.68
35        7.52
40        13.6
45        21.8
50        25.6
55        33.9


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|minor                       |normal
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1
           Keywords|                            |memory-hog
      Known to fail|                            |4.0.0
   Last reconfirmed|0000-00-00 00:00:00         |2004-11-25 11:16:30
               date|                            |
            Summary|IV-OPTS is slow (and does   |[4.0 Regression] IV-OPTS is
                   |not scale)                  |O(N^3)
   Target Milestone|---                         |4.0.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
  2004-11-24 11:15 ` [Bug tree-optimization/18595] " steven at gcc dot gnu dot org
  2004-11-25 11:16 ` [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3) belyshev at lubercy dot com
@ 2004-11-27  1:11 ` steven at gcc dot gnu dot org
  2004-11-27  1:45 ` belyshev at lubercy dot com
                   ` (52 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-11-27  1:11 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2004-11-27 01:10 -------
Of course it's not very useful to claim that some algorithm
is O(N^x) when you don't say what N is...

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (2 preceding siblings ...)
  2004-11-27  1:11 ` steven at gcc dot gnu dot org
@ 2004-11-27  1:45 ` belyshev at lubercy dot com
  2004-11-27 21:17 ` neroden at gcc dot gnu dot org
                   ` (51 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: belyshev at lubercy dot com @ 2004-11-27  1:45 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From belyshev at lubercy dot com  2004-11-27 01:45 -------
(In reply to comment #3)
> Of course it's not very useful to claim that some algorithm
> is O(N^x) when you don't say what N is...

N is number of nested loops.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (3 preceding siblings ...)
  2004-11-27  1:45 ` belyshev at lubercy dot com
@ 2004-11-27 21:17 ` neroden at gcc dot gnu dot org
  2004-11-28 18:43 ` rakdver at gcc dot gnu dot org
                   ` (50 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: neroden at gcc dot gnu dot org @ 2004-11-27 21:17 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
OtherBugsDependingO|                            |18693
              nThis|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (4 preceding siblings ...)
  2004-11-27 21:17 ` neroden at gcc dot gnu dot org
@ 2004-11-28 18:43 ` rakdver at gcc dot gnu dot org
  2004-12-22 23:40 ` belyshev at depni dot sinp dot msu dot ru
                   ` (49 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2004-11-28 18:43 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at gcc dot gnu dot org  2004-11-28 18:42 -------
IVOPTs should behave quadratically on this type of tests; I do not know where
does the extra O(N) factor come from, and I would not expect the times to grow
this fast.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |rakdver at gcc dot gnu dot
                   |dot org                     |org
             Status|NEW                         |ASSIGNED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (5 preceding siblings ...)
  2004-11-28 18:43 ` rakdver at gcc dot gnu dot org
@ 2004-12-22 23:40 ` belyshev at depni dot sinp dot msu dot ru
  2004-12-23  1:26 ` steven at gcc dot gnu dot org
                   ` (48 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: belyshev at depni dot sinp dot msu dot ru @ 2004-12-22 23:40 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From belyshev at depni dot sinp dot msu dot ru  2004-12-22 23:40 -------
It seems that extra O(N) factor comes from high memory usage. For smaller N
(0..30) compiler behaves like O(2) and for larger (> 50) like O(2.5 .. 3).

gnuplot> fit [0:30] A*x**k+B "data" via A,k,B
[snip]
Final set of parameters            Asymptotic Standard Error
=======================            ==========================

A               = 0.000465461      +/- 2.625e-05    (5.64%)
k               = 1.90612          +/- 0.01652      (0.8667%)
B               = 0.0328722        +/- 0.0007677    (2.335%)

gnuplot> fit [50:175] A*x**k+B "data" via A,k,B
[snip]
Final set of parameters            Asymptotic Standard Error
=======================            ==========================

A               = 4.76329e-06      +/- 3.493e-07    (7.332%)
k               = 3.0033           +/- 0.01405      (0.4678%)
B               = 0.392746         +/- 0.02972      (7.567%)

For N=175 memory usage on my machine exceedes 600 MB

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2004-11-25 11:16:30         |2004-12-22 23:40:41
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (6 preceding siblings ...)
  2004-12-22 23:40 ` belyshev at depni dot sinp dot msu dot ru
@ 2004-12-23  1:26 ` steven at gcc dot gnu dot org
  2004-12-23  1:29 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (47 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2004-12-23  1:26 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2004-12-23 01:26 -------
hmmm maybe the extra O(N) comes from O(N) bitmap operations? (Just guessing) 
 

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (7 preceding siblings ...)
  2004-12-23  1:26 ` steven at gcc dot gnu dot org
@ 2004-12-23  1:29 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-23 14:18 ` steven at gcc dot gnu dot org
                   ` (46 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2004-12-23  1:29 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2004-12-23 01:29 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> hmmm maybe the extra O(N) comes from O(N) bitmap operations? (Just guessing) 

that might be the case, but I don't think it is likely (also just
guessing :-)  As far as I can tell all bitmaps in ivopts should be small
for this testcase.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (8 preceding siblings ...)
  2004-12-23  1:29 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-23 14:18 ` steven at gcc dot gnu dot org
  2005-01-23 15:01   ` Daniel Berlin
  2005-01-23 14:22 ` steven at gcc dot gnu dot org
                   ` (45 subsequent siblings)
  55 siblings, 1 reply; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-23 14:18 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-01-23 14:18 -------
It's definitely not the bitmaps: 
 time   seconds   seconds    calls   s/call   s/call  name 
 10.28     25.14    25.14 24356740     0.00     0.00  ggc_alloc_stat 
  5.39     38.32    13.18  3133403     0.00     0.00  instantiate_parameters_1 
  5.25     51.15    12.83 34260186     0.00     0.00  htab_find_slot_with_hash 
  4.41     61.94    10.79 40175413     0.00     0.00  build_int_cst_wide 
  4.35     72.58    10.64  1139473     0.00     0.00  chrec_convert 
  3.79     81.86     9.28     1810     0.01     0.01  htab_empty 
  3.63     90.73     8.87 22863854     0.00     0.00  build3_stat 
  3.41     99.07     8.34      800     0.01     0.01  for_each_insn_in_loop 
  2.74    105.76     6.69   415264     0.00     0.00  follow_ssa_edge 
  2.64    112.21     6.45  3085512     0.00     0.00  
find_interesting_uses_stmt 
  2.14    117.45     5.24  3197505     0.00     0.00  record_invariant 
  2.13    122.66     5.21  9695984     0.00     0.00  
analyze_scalar_evolution_1 
  1.65    126.70     4.04 11769469     0.00     0.00  comptypes 
  1.64    130.70     4.00 11710741     0.00     0.00  fold_convert_const 
  1.63    134.68     3.98 23551936     0.00     0.00  make_node_stat 
  1.44    138.19     3.51 11464308     0.00     0.00  force_fit_type 
  1.28    141.31     3.12 12009984     0.00     0.00  fold_convert 
  1.25    144.37     3.06 21935414     0.00     0.00  eq_scev_info 
  1.18    147.26     2.89  9695984     0.00     0.00  analyze_scalar_evolution 
  1.09    149.93     2.67   326865     0.00     0.00  chrec_fold_plus_1 
  1.01    152.41     2.48      200     0.01     0.94  
tree_ssa_iv_optimize_loop 
  1.01    154.88     2.47 29473273     0.00     0.00  is_gimple_min_invariant 
  0.99    157.30     2.43 26291603     0.00     0.00  flow_bb_inside_loop_p 
  0.98    159.70     2.40 22605936     0.00     0.00  flow_loop_nested_p 
  0.95    162.03     2.33  3135046     0.00     0.00  htab_delete 
  0.95    164.35     2.32   164459     0.00     0.00  htab_expand 
 

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (9 preceding siblings ...)
  2005-01-23 14:18 ` steven at gcc dot gnu dot org
@ 2005-01-23 14:22 ` steven at gcc dot gnu dot org
  2005-01-23 14:23 ` steven at gcc dot gnu dot org
                   ` (44 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-23 14:22 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-01-23 14:22 -------
More profile data: 
 
----------------------------------------------- 
                0.00  187.04       1/1           execute_pass_list [6] 
[7]     76.5    0.00  187.04       1         tree_ssa_iv_optimize [7] 
                2.48  184.56     200/200         tree_ssa_iv_optimize_loop [8] 
                0.00    0.00       1/201         free_loop_data [349] 
                0.00    0.00       2/30137       bitmap_obstack_free [178] 
                0.00    0.00     201/7042572     xcalloc [127] 
                0.00    0.00       3/2369        varray_init [1063] 
                0.00    0.00       2/36950       bitmap_obstack_alloc [794] 
----------------------------------------------- 
                2.48  184.56     200/200         tree_ssa_iv_optimize [7] 
[8]     76.5    2.48  184.56     200         tree_ssa_iv_optimize_loop [8] 
               30.61  134.21  309011/311017      simple_iv <cycle 2> [35] 
                6.45    9.73 3085512/3085512     find_interesting_uses_stmt 
[25] 
                0.04    1.03     200/202         scev_reset [93] 
                0.00    0.76     200/200         determine_use_iv_costs [107] 
                0.00    0.50     200/200         rewrite_uses [142] 
                0.00    0.35     200/202         loop_commit_inserts [181] 
                0.11    0.11   20100/20100       find_interesting_uses_cond 
[241] 
                0.02    0.09     200/311017      number_of_iterations_exit 
<cycle 2> [41] 
                0.06    0.04     200/201         free_loop_data [349] 
                0.00    0.09     200/200         create_new_ivs [360] 
                0.00    0.08     200/401         get_loop_body_in_dom_order 
[274] 
                0.00    0.06     200/2601        get_loop_body [111] 
                0.05    0.00     200/200         remove_unused_ivs [440] 
                0.02    0.03   19703/99904       
find_interesting_uses_outer_or_nonlin [223] 
                0.00    0.04    2004/2804        add_candidate [413] 
                0.00    0.02     200/200         find_optimal_iv_set [609] 
                0.00    0.02   20507/40407       set_iv [503] 
                0.00    0.02     800/800         add_iv_value_candidates [689] 
                0.01    0.01   80200/26291603     flow_bb_inside_loop_p [44] 
                0.01    0.00     800/30137       bitmap_obstack_free [178] 
                0.00    0.00    1006/5012        force_var_cost [590] 
                0.00    0.00     200/200         iv_ca_free [1067] 
                0.00    0.00     201/3005        add_candidate_1 [530] 
                0.00    0.00   20708/370258      zero_p [517] 
                0.00    0.00    1399/3525413     get_iv [69] 
                0.00    0.00     402/12009984     fold_convert [17] 
                0.00    0.00    1198/3329551     is_gimple_reg [71] 
                0.00    0.00    1202/40175413     build_int_cst_wide [26] 
                0.00    0.00     200/200         single_dom_exit [1344] 
                0.00    0.00     402/132285      find_edge [436] 
                0.00    0.00    1003/8376514     bitmap_set_bit [92] 
                0.00    0.00   20507/20507       contains_abnormal_ssa_name_p 
[1427] 
                0.00    0.00     201/57945       int_cst_value [804] 
                0.00    0.00    1202/22889750     build_int_cst [141] 
                0.00    0.00    1006/5034        add_cost [1463] 
                0.00    0.00    1007/22165       cst_and_fits_in_hwi [1748] 
                0.00    0.00     402/1401        loop_latch_edge [2001] 
                0.00    0.00     201/1597        loop_preheader_edge [1984] 
----------------------------------------------- 
[9]     67.8   30.81  135.08  311017+24191806 <cycle 2 as a whole> [9] 
               13.18   43.06 3133403+18788021     instantiate_parameters_1 
<cycle 2> [11] 
                5.21   44.15 9695984             analyze_scalar_evolution_1 
<cycle 2> [12] 
                0.22   20.27  703698             interpret_rhs_modify_expr 
<cycle 2> [22] 
                2.89    7.22 9695984             analyze_scalar_evolution 
<cycle 2> [32] 
                0.57    6.74  313469+81390       follow_ssa_edge_in_rhs <cycle 
2> [38] 
                6.69    0.56  415264+1352605     follow_ssa_edge <cycle 2> 
[39] 
                0.22    4.74   21906             number_of_iterations_exit 
<cycle 2> [41] 
                0.05    1.16   30495             number_of_iterations_in_loop 
<cycle 2> [86] 
                0.04    0.07   43784             instantiate_parameters <cycle 
2> [347] 
                0.05    0.01   65718             
simplify_using_outer_evolutions <cycle 2> [441] 
                0.05    0.00   30295             
compute_overall_effect_of_inner_loop <cycle 2> [459] 
----------------------------------------------- 
                             22088230             chrec_convert [10] 
                0.95    4.12  101392/1139473     instantiate_parameters_1 
<cycle 2> [11] 
                1.23    5.36  131992/1139473     follow_ssa_edge_in_rhs <cycle 
2> [38] 
                3.78   16.43  404782/1139473     interpret_rhs_modify_expr 
<cycle 2> [22] 
                4.68   20.35  501307/1139473     analyze_scalar_evolution_1 
<cycle 2> [12] 
[10]    23.3   10.64   46.26 1139473+22088230 chrec_convert [10] 
                3.01   17.81 11575896/12009984     fold_convert [17] 
                4.15   13.69 10684608/22863854     build3_stat [13] 
                2.97    1.31 11044115/40175413     build_int_cst_wide [26] 
                0.76    0.94  340478/806987      fold <cycle 7> [47] 
                1.37    0.00 23227703/23318505     chrec_type [77] 
                0.25    0.00 11044115/22889750     build_int_cst [141] 
                             22088230             chrec_convert [10] 
----------------------------------------------- 
                             18788021             instantiate_parameters_1 
<cycle 2> [11] 
                               43784             instantiate_parameters <cycle 
2> [347] 
                             3089619             simple_iv <cycle 2> [35] 
[11]    23.0   13.18   43.06 3133403+18788021 instantiate_parameters_1 <cycle 
2> [11] 
                1.28    8.71 11465180/11465180     set_instantiated_value [33] 
                2.25    7.43 5793593/22863854     build3_stat [13] 
                0.85    5.09  104374/326865      chrec_fold_plus_1 [24] 
                0.95    4.12  101392/1139473     chrec_convert [10] 
                1.51    0.67 5628417/40175413     build_int_cst_wide [26] 
                1.92    0.03 5773179/5820723     htab_find_with_hash [67] 
                1.84    0.00 21921424/29473273     is_gimple_min_invariant 
[56] 
                0.82    0.70 8874753/26291603     flow_bb_inside_loop_p [44] 
                0.90    0.28 5773179/5784085     htab_find [88] 
                1.14    0.00 5732590/6202135     bitmap_clear_bit [85] 
                0.85    0.00 5732590/5756097     find_common_loop [104] 
                0.78    0.00 5732590/8376514     bitmap_set_bit [92] 
                0.67    0.00 5732590/10054625     bitmap_bit_p [89] 
                0.13    0.00 5628417/22889750     build_int_cst [141] 
                0.05    0.06   21892/806987      fold <cycle 7> [47] 
                0.01    0.03   21892/609615      build2_stat [98] 
                0.01    0.01   82482/3352077     chrec_fold_plus [122] 
                0.00    0.00   21892/2867796     chrec_fold_minus [154] 
                             5732590             analyze_scalar_evolution 
<cycle 2> [32] 
                             18788021             instantiate_parameters_1 
<cycle 2> [11] 
----------------------------------------------- 
                             9695984             analyze_scalar_evolution 
<cycle 2> [32] 
[12]    20.2    5.21   44.15 9695984         analyze_scalar_evolution_1 <cycle 
2> [12] 
                4.68   20.35  501307/1139473     chrec_convert [10] 
                1.65    9.88  202391/326865      chrec_fold_plus_1 [24] 
                0.45    4.61 6482915/15734611     find_var_scev_info [29] 
                0.88    0.75 9534492/26291603     flow_bb_inside_loop_p [44] 
                0.17    0.55  428920/22863854     build3_stat [13] 
                0.05    0.06   21785/806987      fold <cycle 7> [47] 
                0.01    0.03  202389/3352077     chrec_fold_plus [122] 
                0.02    0.00   70699/70699       chrec_merge [642] 
                0.01    0.00   70699/1575094     loop_phi_node_p [202] 
                0.00    0.00       2/2867796     chrec_fold_minus [154] 
                              703698             interpret_rhs_modify_expr 
<cycle 2> [22] 
                               70699             follow_ssa_edge <cycle 2> 
[39] 
----------------------------------------------- 
 

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (10 preceding siblings ...)
  2005-01-23 14:22 ` steven at gcc dot gnu dot org
@ 2005-01-23 14:23 ` steven at gcc dot gnu dot org
  2005-01-23 14:24 ` steven at gcc dot gnu dot org
                   ` (43 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-23 14:23 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-01-23 14:23 -------
Looks to me like Seb may want to look at this bug too... 

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |spop at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (11 preceding siblings ...)
  2005-01-23 14:23 ` steven at gcc dot gnu dot org
@ 2005-01-23 14:24 ` steven at gcc dot gnu dot org
  2005-01-23 15:02 ` dberlin at dberlin dot org
                   ` (42 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-23 14:24 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-01-23 14:24 -------
One more, because I also run out of memory pretty quickly: 
----------------------------------------------- 
                0.00    0.00       1/22863854     c_build_bind_expr [1608] 
                0.00    0.00       1/22863854     thread_across_edge [392] 
                0.00    0.00     199/22863854     
estimate_numbers_of_iterations [184] 
                0.00    0.00     200/22863854     c_finish_loop [754] 
                0.00    0.01    9801/22863854     follow_ssa_edge_in_rhs 
<cycle 2> [38] 
                0.02    0.06   50602/22863854     add_to_evolution [324] 
                0.10    0.32  247913/22863854     simple_iv <cycle 2> [35] 
                0.17    0.55  428920/22863854     analyze_scalar_evolution_1 
<cycle 2> [12] 
                2.19    7.24 5648016/22863854     chrec_fold_plus_1 [24] 
                2.25    7.43 5793593/22863854     instantiate_parameters_1 
<cycle 2> [11] 
                4.15   13.69 10684608/22863854     chrec_convert [10] 
[13]    15.6    8.87   29.30 22863854         build3_stat [13] 
                3.86   25.44 22863854/23551936     make_node_stat [15] 
----------------------------------------------- 
 

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2005-01-23 14:18 ` steven at gcc dot gnu dot org
@ 2005-01-23 15:01   ` Daniel Berlin
  0 siblings, 0 replies; 59+ messages in thread
From: Daniel Berlin @ 2005-01-23 15:01 UTC (permalink / raw)
  To: steven at gcc dot gnu dot org; +Cc: gcc-bugs

I believe seb/zdenek already submitted patches for speeding up scev quite 
recently, with the goal of alleviating this problem.
I'm pretty sure they have not been applied yet.



^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (12 preceding siblings ...)
  2005-01-23 14:24 ` steven at gcc dot gnu dot org
@ 2005-01-23 15:02 ` dberlin at dberlin dot org
  2005-01-23 15:06 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (41 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2005-01-23 15:02 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dberlin at gcc dot gnu dot org  2005-01-23 15:01 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

I believe seb/zdenek already submitted patches for speeding up scev quite 
recently, with the goal of alleviating this problem.
I'm pretty sure they have not been applied yet.




-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (13 preceding siblings ...)
  2005-01-23 15:02 ` dberlin at dberlin dot org
@ 2005-01-23 15:06 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-23 16:38 ` rakdver at gcc dot gnu dot org
                   ` (40 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-23 15:06 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-23 15:06 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> I believe seb/zdenek already submitted patches for speeding up scev quite 
> recently, with the goal of alleviating this problem.
> I'm pretty sure they have not been applied yet.

the Sebastian's patch is not in yet; it might help a bit in this case, a
believe, although I suspect a real source of the problem is elsewhere.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (14 preceding siblings ...)
  2005-01-23 15:06 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-23 16:38 ` rakdver at gcc dot gnu dot org
  2005-01-23 17:04 ` rakdver at gcc dot gnu dot org
                   ` (39 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2005-01-23 16:38 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at gcc dot gnu dot org  2005-01-23 16:38 -------
Sebastian's patch does not help significantly :-(

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (15 preceding siblings ...)
  2005-01-23 16:38 ` rakdver at gcc dot gnu dot org
@ 2005-01-23 17:04 ` rakdver at gcc dot gnu dot org
  2005-01-23 17:14 ` rakdver at gcc dot gnu dot org
                   ` (38 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2005-01-23 17:04 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at gcc dot gnu dot org  2005-01-23 17:04 -------
One extra N factor seems to be coming from simple_iv
(analyze_scalar_evolution_in_loop).  The function was created before
loop closed ssa form, under assumptions of lcssa it should be possible
to get rid of this.  I am working on patch

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (16 preceding siblings ...)
  2005-01-23 17:04 ` rakdver at gcc dot gnu dot org
@ 2005-01-23 17:14 ` rakdver at gcc dot gnu dot org
  2005-01-24  1:40 ` rakdver at gcc dot gnu dot org
                   ` (37 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2005-01-23 17:14 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at gcc dot gnu dot org  2005-01-23 17:14 -------
Also did not help much.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (17 preceding siblings ...)
  2005-01-23 17:14 ` rakdver at gcc dot gnu dot org
@ 2005-01-24  1:40 ` rakdver at gcc dot gnu dot org
  2005-01-24  1:46 ` rakdver at gcc dot gnu dot org
                   ` (36 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2005-01-24  1:40 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at gcc dot gnu dot org  2005-01-24 01:40 -------
The patch that causes ivopts to reset just the relevant parts of the scev cache
(just regtesting) instead of clearing it completely helps a bit -- the compile
time for N=100 gets about 2x better and memory consumption about 10x.

Still there remain some inefficiences within the scev analysis itself.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (18 preceding siblings ...)
  2005-01-24  1:40 ` rakdver at gcc dot gnu dot org
@ 2005-01-24  1:46 ` rakdver at gcc dot gnu dot org
  2005-01-24  1:49   ` Daniel Berlin
  2005-01-24  1:49 ` dberlin at dberlin dot org
                   ` (35 subsequent siblings)
  55 siblings, 1 reply; 59+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2005-01-24  1:46 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at gcc dot gnu dot org  2005-01-24 01:46 -------
On a side note, PRE also seems to have problems with the testcase.  With the
patch mentioned above, the largest consumers of compile time are ivopts (45%)
and pre (20%).

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dberlin at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2005-01-24  1:46 ` rakdver at gcc dot gnu dot org
@ 2005-01-24  1:49   ` Daniel Berlin
  0 siblings, 0 replies; 59+ messages in thread
From: Daniel Berlin @ 2005-01-24  1:49 UTC (permalink / raw)
  To: rakdver at gcc dot gnu dot org; +Cc: gcc-bugs



On Sun, 24 Jan 2005, rakdver at gcc dot gnu dot org wrote:

>
> ------- Additional Comments From rakdver at gcc dot gnu dot org  2005-01-24 01:46 -------
> On a side note, PRE also seems to have problems with the testcase.  With the
> patch mentioned above, the largest consumers of compile time are ivopts (45%)
> and pre (20%).

Uh, there was a bug filed about this, and i fixed it, last i looked.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (19 preceding siblings ...)
  2005-01-24  1:46 ` rakdver at gcc dot gnu dot org
@ 2005-01-24  1:49 ` dberlin at dberlin dot org
  2005-01-24  1:58 ` dberlin at dberlin dot org
                   ` (34 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2005-01-24  1:49 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dberlin at gcc dot gnu dot org  2005-01-24 01:49 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)



On Sun, 24 Jan 2005, rakdver at gcc dot gnu dot org wrote:

>
> ------- Additional Comments From rakdver at gcc dot gnu dot org  2005-01-24 01:46 -------
> On a side note, PRE also seems to have problems with the testcase.  With the
> patch mentioned above, the largest consumers of compile time are ivopts (45%)
> and pre (20%).

Uh, there was a bug filed about this, and i fixed it, last i looked.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (20 preceding siblings ...)
  2005-01-24  1:49 ` dberlin at dberlin dot org
@ 2005-01-24  1:58 ` dberlin at dberlin dot org
  2005-01-24 21:36 ` sebastian dot pop at cri dot ensmp dot fr
                   ` (33 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2005-01-24  1:58 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dberlin at gcc dot gnu dot org  2005-01-24 01:57 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

>> On a side note, PRE also seems to have problems with the testcase.  With the
>> patch mentioned above, the largest consumers of compile time are ivopts (45%)
>> and pre (20%).
>
> Uh, there was a bug filed about this, and i fixed it, last i looked.

Here it is.
Bug 18673.

N=100 takes .25 seconds now.

I can't make PRE much faster than that, that's almost all hash function 
time.
--Dan


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (21 preceding siblings ...)
  2005-01-24  1:58 ` dberlin at dberlin dot org
@ 2005-01-24 21:36 ` sebastian dot pop at cri dot ensmp dot fr
  2005-01-24 22:17 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (32 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: sebastian dot pop at cri dot ensmp dot fr @ 2005-01-24 21:36 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-24 21:36 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> 
> Still there remain some inefficiences within the scev analysis itself.
> 

Zdenek, have you tried to revert the patch that caches the
instantiation information?  This could speed up things a little.  

On a side note, I wonder how many times we're using this instantiation
cache, in other words whether we pay a high price for the scev
analysis for some (probably) rare cases...



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (22 preceding siblings ...)
  2005-01-24 21:36 ` sebastian dot pop at cri dot ensmp dot fr
@ 2005-01-24 22:17 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-24 22:25 ` dberlin at dberlin dot org
                   ` (31 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-24 22:17 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-24 22:16 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> > Still there remain some inefficiences within the scev analysis itself.
> > 
> 
> Zdenek, have you tried to revert the patch that caches the
> instantiation information?  This could speed up things a little.  
>
> On a side note, I wonder how many times we're using this instantiation
> cache, in other words whether we pay a high price for the scev
> analysis for some (probably) rare cases...

Adding the instantiation cache was compile time neutral on normal code,
so I don't think the effect on scev cost is significant.

The problem is that we end up calling the instantiate_parameters_1
function 83214+2453273 (outside + recursive) times (for N=100).
We also call analyze_scalar_evolution_1 1146317 times.  Many of these
calls create POLYNOMIAL_CHREC nodes (2894131 calls to build3_stat).
Large part of 5142869 of build_int_cst_wide calls is very likely also
due to scev analysis.  This is not going to be cheap.  Removing the
instantiation cache definitely would not decrease the number of
instantiate_parameters_1 calls (might increase it, in fact, although
I don't know how significantly).

One part of the problem is that loop optimizers need to throw away the
scev caches after each change to loops, which leads to recounting the
information too much.  Allowing to invalidate only changed parts helps,
but seems to be relatively costly on normal code -- you need to scan the
hash table for things that refer to removed or from some other reason
invalidated ssa names, which is slow.  And to make things worse, this
change of course means that most of the information is left in the hash
table, i.e. the size of the table grows and these scans get slower and
slower.  The alternative -- checking the validity of entries only when
querying the cache -- is not possible, since we reuse the released
ssa names, so we would be unable to recognize the validity of the
information.

Other part is that scev tries to be too clever.  Without need to
represent nonaffine induction variables (that we do not use anywhere)
we could use more memory efficient representation of ivs.
Without need to handle symbolic references (that we also do not use
anywhere, we could store evolutions in a fully instantiated form, and
we would not need instantiate_parameters/resolve_mixers functions at all.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (23 preceding siblings ...)
  2005-01-24 22:17 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-24 22:25 ` dberlin at dberlin dot org
  2005-01-24 23:09 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (30 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2005-01-24 22:25 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dberlin at gcc dot gnu dot org  2005-01-24 22:25 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

>
> Other part is that scev tries to be too clever.  Without need to
> represent nonaffine induction variables (that we do not use anywhere)
> we could use more memory efficient representation of ivs.
> Without need to handle symbolic references (that we also do not use
> anywhere, we could store evolutions in a fully instantiated form, and
> we would not need instantiate_parameters/resolve_mixers functions atall.

Uh, symbolic references are or will be used to do data dependence when 
MEM_REF and ARRAY_REF couldn't be generated from the pointers.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (24 preceding siblings ...)
  2005-01-24 22:25 ` dberlin at dberlin dot org
@ 2005-01-24 23:09 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-24 23:12 ` dberlin at dberlin dot org
                   ` (29 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-24 23:09 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-24 23:09 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> > Other part is that scev tries to be too clever.  Without need to
> > represent nonaffine induction variables (that we do not use anywhere)
> > we could use more memory efficient representation of ivs.
> > Without need to handle symbolic references (that we also do not use
> > anywhere, we could store evolutions in a fully instantiated form, and
> > we would not need instantiate_parameters/resolve_mixers functions atall.
> 
> Uh, symbolic references are or will be used to do data dependence when 
> MEM_REF and ARRAY_REF couldn't be generated from the pointers.

How?  If the reference is left in symbolic form, it means that you know
nothing about its value, so the only thing you can do with it is to
check its equality/inequality, in code like

while (...)
  {
    i = something_weird ();

    a[i] = ...;  (a)
    a[i+1] = ...;  (b)
    a[i] = ...;  (c)
  }

to find that (b) is independent on (a) and (c) and to find the
dependence between (a) and (c), but you do not need scev for this --
value numbering info is enough.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (25 preceding siblings ...)
  2005-01-24 23:09 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-24 23:12 ` dberlin at dberlin dot org
  2005-01-25  0:27 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (28 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2005-01-24 23:12 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dberlin at gcc dot gnu dot org  2005-01-24 23:12 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)



On Mon, 24 Jan 2005, rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:

>
> ------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-24 23:09 -------
> Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)
>
>>> Other part is that scev tries to be too clever.  Without need to
>>> represent nonaffine induction variables (that we do not use anywhere)
>>> we could use more memory efficient representation of ivs.
>>> Without need to handle symbolic references (that we also do not use
>>> anywhere, we could store evolutions in a fully instantiated form, and
>>> we would not need instantiate_parameters/resolve_mixers functions atall.
>>
>> Uh, symbolic references are or will be used to do data dependence when
>> MEM_REF and ARRAY_REF couldn't be generated from the pointers.
>
> How?  If the reference is left in symbolic form, it means that you know
> nothing about its value,

Wrong.

We could know things about it's value, such as ranges. We just don't know 
an exact number.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (26 preceding siblings ...)
  2005-01-24 23:12 ` dberlin at dberlin dot org
@ 2005-01-25  0:27 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-25  0:39 ` dberlin at dberlin dot org
                   ` (27 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-25  0:27 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 00:27 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> >>> Other part is that scev tries to be too clever.  Without need to
> >>> represent nonaffine induction variables (that we do not use anywhere)
> >>> we could use more memory efficient representation of ivs.
> >>> Without need to handle symbolic references (that we also do not use
> >>> anywhere, we could store evolutions in a fully instantiated form, and
> >>> we would not need instantiate_parameters/resolve_mixers functions atall.
> >>
> >> Uh, symbolic references are or will be used to do data dependence when
> >> MEM_REF and ARRAY_REF couldn't be generated from the pointers.
> >
> > How?  If the reference is left in symbolic form, it means that you know
> > nothing about its value,
> 
> Wrong.
> 
> We could know things about it's value, such as ranges. We just don't know 
> an exact number.

OK.  This is work for VRP, not scev.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (27 preceding siblings ...)
  2005-01-25  0:27 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-25  0:39 ` dberlin at dberlin dot org
  2005-01-25  0:50 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (26 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2005-01-25  0:39 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dberlin at gcc dot gnu dot org  2005-01-25 00:39 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)


>>>>
>>>> Uh, symbolic references are or will be used to do data dependence when
>>>> MEM_REF and ARRAY_REF couldn't be generated from the pointers.
>>>
>>> How?  If the reference is left in symbolic form, it means that you know
>>> nothing about its value,
>>
>> Wrong.
>>
>> We could know things about it's value, such as ranges. We just don't know
>> an exact number.
>
> OK.  This is work for VRP, not scev.

And once we have it, we can talk about removing the symbolic stuff from 
SCEV :)



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (29 preceding siblings ...)
  2005-01-25  0:50 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-25  0:50 ` dberlin at dberlin dot org
  2005-01-25  0:52 ` dberlin at dberlin dot org
                   ` (24 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2005-01-25  0:50 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dberlin at gcc dot gnu dot org  2005-01-25 00:50 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)



On Mon, 25 Jan 2005, rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:

>
> ------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 00:49 -------
> Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)
>
>>>>>> Uh, symbolic references are or will be used to do data dependence when
>>>>>> MEM_REF and ARRAY_REF couldn't be generated from the pointers.
>>>>>
>>>>> How?  If the reference is left in symbolic form, it means that you know
>>>>> nothing about its value,
>>>>
>>>> Wrong.
>>>>
>>>> We could know things about it's value, such as ranges. We just don't know
>>>> an exact number.
>>>
>>> OK.  This is work for VRP, not scev.
>>
>> And once we have it, we can talk about removing the symbolic stuff from
>> SCEV :)
>
> Once we use any value range info from SCEV, we can speak about leaving
> symbolic references in SCEV until we have a proper VRP :-)
See autovec branch.




-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (28 preceding siblings ...)
  2005-01-25  0:39 ` dberlin at dberlin dot org
@ 2005-01-25  0:50 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-25  0:50 ` dberlin at dberlin dot org
                   ` (25 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-25  0:50 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 00:49 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> >>>> Uh, symbolic references are or will be used to do data dependence when
> >>>> MEM_REF and ARRAY_REF couldn't be generated from the pointers.
> >>>
> >>> How?  If the reference is left in symbolic form, it means that you know
> >>> nothing about its value,
> >>
> >> Wrong.
> >>
> >> We could know things about it's value, such as ranges. We just don't know
> >> an exact number.
> >
> > OK.  This is work for VRP, not scev.
> 
> And once we have it, we can talk about removing the symbolic stuff from 
> SCEV :)

Once we use any value range info from SCEV, we can speak about leaving
symbolic references in SCEV until we have a proper VRP :-)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (30 preceding siblings ...)
  2005-01-25  0:50 ` dberlin at dberlin dot org
@ 2005-01-25  0:52 ` dberlin at dberlin dot org
  2005-01-25  1:11 ` rakdver at gcc dot gnu dot org
                   ` (23 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2005-01-25  0:52 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dberlin at gcc dot gnu dot org  2005-01-25 00:52 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> See autovec branch.


You could also look at recent patches posted by sebastian and i for the 
autovect branch that have been adding this support.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (31 preceding siblings ...)
  2005-01-25  0:52 ` dberlin at dberlin dot org
@ 2005-01-25  1:11 ` rakdver at gcc dot gnu dot org
  2005-01-25  1:15 ` dberlin at dberlin dot org
                   ` (22 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at gcc dot gnu dot org @ 2005-01-25  1:11 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at gcc dot gnu dot org  2005-01-25 01:11 -------
Which one? I cannot find anything relevant in changelog.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (32 preceding siblings ...)
  2005-01-25  1:11 ` rakdver at gcc dot gnu dot org
@ 2005-01-25  1:15 ` dberlin at dberlin dot org
  2005-01-25  1:24 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (21 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2005-01-25  1:15 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dberlin at gcc dot gnu dot org  2005-01-25 01:15 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)


> Which one? I cannot find anything relevant in changelog.
>


         * tree-data-ref.c (analyze_subscript_affine_affine): Implement a
         solution for the FIXME concerning the symbolic affine
         dependence testing; remove the FIXME.
         (can_use_analyze_subscript_affine_affine): New function.
         (analyze_siv_subscript): Use it.

and


2004-12-15  Daniel Berlin  <dberlin@dberlin.org>

         * Makefile.in (tree-chrec.o): Add cfgloop.h

         * tree-chrec.c: Add cfgloop.h, tree-flow.h.
         (evolution_function_is_invariant_p): New function.
         (evolution_function_is_affine_multivariate_p): Use
         evolution_function_is_invariant_p instead of
         evolution_function_is_constant_p.

         * tree-chrec.h: Add prototype for
         evolution_function_is_invariant_p.
         (evolution_function_is_affine_p): Use
         evolution_function_is_invariant_p.

         * tree-data-ref.c (analyze_overlapping_iterations): chrecs that
         are equal overlap on every iteration.

This stuff is just simple symbolic differencing, and checking of 
invariantness of symbols,
But it is indeed starting to se the symbolic scev info



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (33 preceding siblings ...)
  2005-01-25  1:15 ` dberlin at dberlin dot org
@ 2005-01-25  1:24 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-25  1:28 ` steven at gcc dot gnu dot org
                   ` (20 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-25  1:24 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 01:24 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> > Which one? I cannot find anything relevant in changelog.
> >
> 
> 
>          * tree-data-ref.c (analyze_subscript_affine_affine): Implement a
>          solution for the FIXME concerning the symbolic affine
>          dependence testing; remove the FIXME.
>          (can_use_analyze_subscript_affine_affine): New function.
>          (analyze_siv_subscript): Use it.
> 
> and
> 
> 
> 2004-12-15  Daniel Berlin  <dberlin@dberlin.org>
> 
>          * Makefile.in (tree-chrec.o): Add cfgloop.h
> 
>          * tree-chrec.c: Add cfgloop.h, tree-flow.h.
>          (evolution_function_is_invariant_p): New function.
>          (evolution_function_is_affine_multivariate_p): Use
>          evolution_function_is_invariant_p instead of
>          evolution_function_is_constant_p.
> 
>          * tree-chrec.h: Add prototype for
>          evolution_function_is_invariant_p.
>          (evolution_function_is_affine_p): Use
>          evolution_function_is_invariant_p.
> 
>          * tree-data-ref.c (analyze_overlapping_iterations): chrecs that
>          are equal overlap on every iteration.
> 
> This stuff is just simple symbolic differencing, and checking of 
> invariantness of symbols,
> But it is indeed starting to se the symbolic scev info

Ugh... OK, if you think it is a right way...  Anyway, I am seriously
considering resurrecting the simple iv analysis for purposes of passes
that are not interested in this fancy stuff.  Overhead of scev is
probably acceptable if it is only used for this type of analysis,
but for other purposes it is clearly overkill.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (34 preceding siblings ...)
  2005-01-25  1:24 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-25  1:28 ` steven at gcc dot gnu dot org
  2005-01-25  1:42 ` dberlin at dberlin dot org
                   ` (19 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: steven at gcc dot gnu dot org @ 2005-01-25  1:28 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From steven at gcc dot gnu dot org  2005-01-25 01:28 -------
Do remember that this bug is about bad behavior with deeply nested loops 
and we already have other algorithms that are quadratic in the loop nest 
depth.  So I really wouldn't consider this to be a very serious problem, 
but rather something that could be improved. 
 
It is a shame that Seb has so far not commented on the behavior of his 
scev algorithm with respect to the loop nest depth.  Is it expected to 
be cubic in the loop nest depth?  Perhaps he can improve the algorithm? 
 

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (35 preceding siblings ...)
  2005-01-25  1:28 ` steven at gcc dot gnu dot org
@ 2005-01-25  1:42 ` dberlin at dberlin dot org
  2005-01-25  1:52 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (18 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: dberlin at dberlin dot org @ 2005-01-25  1:42 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From dberlin at gcc dot gnu dot org  2005-01-25 01:42 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

>>
>>          * tree-data-ref.c (analyze_overlapping_iterations): chrecs that
>>          are equal overlap on every iteration.
>>
>> This stuff is just simple symbolic differencing, and checking of
>> invariantness of symbols,
>> But it is indeed starting to se the symbolic scev info
>
> Ugh... OK, if you think it is a right way...  Anyway, I am seriously
> considering resurrecting the simple iv analysis for purposes of passes
> that are not interested in this fancy stuff.  Overhead of scev is
> probably acceptable if it is only used for this type of analysis,
> but for other purposes it is clearly overkill.

Uh, almost all high level loop optimizations are going to do very badly in 
the depth of the loop nest.

The fact that scev doesn't do so well on a 100 deep loop nest doesn't 
really concern me at all.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (36 preceding siblings ...)
  2005-01-25  1:42 ` dberlin at dberlin dot org
@ 2005-01-25  1:52 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-25  1:57 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (17 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-25  1:52 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 01:52 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> Do remember that this bug is about bad behavior with deeply nested loops 
> and we already have other algorithms that are quadratic in the loop nest 
> depth.  So I really wouldn't consider this to be a very serious problem, 
> but rather something that could be improved. 
>  
> It is a shame that Seb has so far not commented on the behavior of his 
> scev algorithm with respect to the loop nest depth.  Is it expected to 
> be cubic in the loop nest depth?  Perhaps he can improve the algorithm? 

the algorithm itself is not cubic with loop depth, but worst case linear
(per query) (*). This time complexity is acceptable, IMHO.

(*) I hope; scev is a mess of mutualy recursive functions --
analyze_scalar_evolution calling number_of_iterations calling
instantiate_parameters calling analyze_scalar_evolution again, with
instantiate_parameters hacked so that the infinite cycle cannot occur is
my favourite one.  Nobody can say anything sure about behavior of scev
-- it is not even well defined what analyze_scalar_evolutions will
return to you, unless you call instantiate_parameters or resolve_mixers
on the result.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (37 preceding siblings ...)
  2005-01-25  1:52 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-25  1:57 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-25 10:32 ` sebastian dot pop at cri dot ensmp dot fr
                   ` (16 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-25  1:57 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 01:57 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> >>          * tree-data-ref.c (analyze_overlapping_iterations): chrecs that
> >>          are equal overlap on every iteration.
> >>
> >> This stuff is just simple symbolic differencing, and checking of
> >> invariantness of symbols,
> >> But it is indeed starting to se the symbolic scev info
> >
> > Ugh... OK, if you think it is a right way...  Anyway, I am seriously
> > considering resurrecting the simple iv analysis for purposes of passes
> > that are not interested in this fancy stuff.  Overhead of scev is
> > probably acceptable if it is only used for this type of analysis,
> > but for other purposes it is clearly overkill.
> 
> Uh, almost all high level loop optimizations are going to do very badly in 
> the depth of the loop nest.

Remove "almost" and "high level" from this sentence and I will agree
with you.  This really does not worry me that much.

> The fact that scev doesn't do so well on a 100 deep loop nest doesn't 
> really concern me at all.

Me neither.  Its time and memory consumption on normal testcases however
is not entirely satisfactory (not critical, but there definitely is a
place for improvement), and implementation could (should?) also be
significantly cleaned up.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (38 preceding siblings ...)
  2005-01-25  1:57 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-25 10:32 ` sebastian dot pop at cri dot ensmp dot fr
  2005-01-25 10:41 ` sebastian dot pop at cri dot ensmp dot fr
                   ` (15 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: sebastian dot pop at cri dot ensmp dot fr @ 2005-01-25 10:32 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-25 10:32 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> Adding the instantiation cache was compile time neutral on normal code,
> so I don't think the effect on scev cost is significant.
> 

How that?  You end up querying and caching an evolution for every
instantiation of an SSA_NAME!  

I will quantify the number of queries wrt the number of times this
information is useful.  I think that with my patch, the instantiation
cache information is used zero times on a bootstrap.

> The problem is that we end up calling the instantiate_parameters_1
> function 83214+2453273 (outside + recursive) times (for N=100).
> We also call analyze_scalar_evolution_1 1146317 times.  Many of these
> calls create POLYNOMIAL_CHREC nodes (2894131 calls to build3_stat).
> Large part of 5142869 of build_int_cst_wide calls is very likely also
> due to scev analysis.  This is not going to be cheap.  Removing the
> instantiation cache definitely would not decrease the number of
> instantiate_parameters_1 calls (might increase it, in fact, although
> I don't know how significantly).
> 

This also could be a bad use of the scev analysis.

For Steven: you can have a O(N**3) algorithm very simply:

  loop O(N) times
    loop O(N) times
      algorithm in O(N)

> One part of the problem is that loop optimizers need to throw away the
> scev caches after each change to loops, which leads to recounting the
> information too much.  Allowing to invalidate only changed parts helps,
> but seems to be relatively costly on normal code -- you need to scan the
> hash table for things that refer to removed or from some other reason
> invalidated ssa names, which is slow.  

Just set the SSA_NAME to not_analyzed_yet or NULL_TREE, and the next time
you'll ask for the evolution of this SSA_NAME you'll analyze the
evolution from scratch.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (39 preceding siblings ...)
  2005-01-25 10:32 ` sebastian dot pop at cri dot ensmp dot fr
@ 2005-01-25 10:41 ` sebastian dot pop at cri dot ensmp dot fr
  2005-01-25 11:02 ` sebastian dot pop at cri dot ensmp dot fr
                   ` (14 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: sebastian dot pop at cri dot ensmp dot fr @ 2005-01-25 10:41 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-25 10:39 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> 
> How?  If the reference is left in symbolic form, it means that you know
> nothing about its value, so the only thing you can do with it is to
> check its equality/inequality, in code like
> 
> while (...)
>   {
>     i = something_weird ();
> 
>     a[i] = ...;  (a)
>     a[i+1] = ...;  (b)
>     a[i] = ...;  (c)
>   }
> 

The following is probably a more frequent case:

i = 0
x = something_weird () or a function parameter
loop
  i = i + 1 
  a[i] = ...
  ... = a[i + x]
endloop

where you end with a symbolic distance vector.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (40 preceding siblings ...)
  2005-01-25 10:41 ` sebastian dot pop at cri dot ensmp dot fr
@ 2005-01-25 11:02 ` sebastian dot pop at cri dot ensmp dot fr
  2005-01-25 11:15 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (13 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: sebastian dot pop at cri dot ensmp dot fr @ 2005-01-25 11:02 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-25 11:02 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> 
> (*) I hope; scev is a mess of mutualy recursive functions --
> analyze_scalar_evolution calling number_of_iterations calling
> instantiate_parameters calling analyze_scalar_evolution again, with
> instantiate_parameters hacked so that the infinite cycle cannot occur is
> my favourite one.  Nobody can say anything sure about behavior of scev
> -- it is not even well defined what analyze_scalar_evolutions will
> return to you, 

It returns to you an evolution that might contain SSA_NAMEs.

> unless you call instantiate_parameters or resolve_mixers
> on the result.
> 

And once you call instantiate_parameters on the result you're
guaranteed that what you get does contain only determined constants,
or otherwise the result is chrec_dont_know.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (41 preceding siblings ...)
  2005-01-25 11:02 ` sebastian dot pop at cri dot ensmp dot fr
@ 2005-01-25 11:15 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-25 11:16 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (12 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-25 11:15 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 11:14 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> > Adding the instantiation cache was compile time neutral on normal code,
> > so I don't think the effect on scev cost is significant.
> > 
> 
> How that?  You end up querying and caching an evolution for every
> instantiation of an SSA_NAME!  

which is pretty cheap (constant time operation); note that we do an
equivalent lookup in the analyze_scalar_evolution call in any case, also
without causing any compile time problems.  I haven't measured any slowdown on
normal testcases.

> I will quantify the number of queries wrt the number of times this
> information is useful.  I think that with my patch, the instantiation
> cache information is used zero times on a bootstrap.

I don't think so.  Please come up with some numbers, otherwise this
discussion is pointless.

> > The problem is that we end up calling the instantiate_parameters_1
> > function 83214+2453273 (outside + recursive) times (for N=100).
> > We also call analyze_scalar_evolution_1 1146317 times.  Many of these
> > calls create POLYNOMIAL_CHREC nodes (2894131 calls to build3_stat).
> > Large part of 5142869 of build_int_cst_wide calls is very likely also
> > due to scev analysis.  This is not going to be cheap.  Removing the
> > instantiation cache definitely would not decrease the number of
> > instantiate_parameters_1 calls (might increase it, in fact, although
> > I don't know how significantly).
> > 
> 
> This also could be a bad use of the scev analysis.

partly -- see the analysis at the PR.  However one O(n) per query comes
from scev itself.

> For Steven: you can have a O(N**3) algorithm very simply:
> 
>   loop O(N) times
>     loop O(N) times
>       algorithm in O(N)
> 
> > One part of the problem is that loop optimizers need to throw away the
> > scev caches after each change to loops, which leads to recounting the
> > information too much.  Allowing to invalidate only changed parts helps,
> > but seems to be relatively costly on normal code -- you need to scan the
> > hash table for things that refer to removed or from some other reason
> > invalidated ssa names, which is slow.  
> 
> Just set the SSA_NAME to not_analyzed_yet or NULL_TREE, and the next time
> you'll ask for the evolution of this SSA_NAME you'll analyze the
> evolution from scratch.

This does not work, of course.  When the ssa name is removed, you must
remove all references to it from the cache.  Otherwise you will ICE when
you try to process the relevant entry in (say) instantiate_parameters.
What's worse, the ssa name may get reused by ssa name management, thus
causing you not even be able to note the change, and thus possibly to
misscompilations.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (42 preceding siblings ...)
  2005-01-25 11:15 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-25 11:16 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-25 11:29 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (11 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-25 11:16 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 11:16 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> > How?  If the reference is left in symbolic form, it means that you know
> > nothing about its value, so the only thing you can do with it is to
> > check its equality/inequality, in code like
> > 
> > while (...)
> >   {
> >     i = something_weird ();
> > 
> >     a[i] = ...;  (a)
> >     a[i+1] = ...;  (b)
> >     a[i] = ...;  (c)
> >   }
> > 
> 
> The following is probably a more frequent case:
> 
> i = 0
> x = something_weird () or a function parameter
> loop
>   i = i + 1 
>   a[i] = ...
>   ... = a[i + x]
> endloop
> 
> where you end with a symbolic distance vector.

But here x is a loop invariant.  Nothing would change here if we were
keeping the evolutions fully instantiated.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (43 preceding siblings ...)
  2005-01-25 11:16 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-25 11:29 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-25 12:03 ` sebastian dot pop at cri dot ensmp dot fr
                   ` (10 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-25 11:29 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 11:29 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> > (*) I hope; scev is a mess of mutualy recursive functions --
> > analyze_scalar_evolution calling number_of_iterations calling
> > instantiate_parameters calling analyze_scalar_evolution again, with
> > instantiate_parameters hacked so that the infinite cycle cannot occur is
> > my favourite one.  Nobody can say anything sure about behavior of scev
> > -- it is not even well defined what analyze_scalar_evolutions will
> > return to you, 
> 
> It returns to you an evolution that might contain SSA_NAMEs.

Hmm... if this is all you can tell, I start fear even more; this does
not define any semantics at all :-)

More seriously -- which of the possibilities?  If I have loops like

while (...)
  {
    while (...)
      {
        x_1 = something ();
      }
    x_2 = phi (x_1);
    x_3 = x_2 + 1;
  }

What will analyze_scalar_evolutions return for x_3? There are (at least)
three possible valid values:

x_3
x_2 + 1
x_1 + 1

In this example probably the last one will be chosen, but in more
complicated cases you cannot tell (without simulating what scev analysis
does).  It might be better to not export analyze_scalar_evolution at
all and to force it to be used through
instantiate_parameters/resolve_mixers, that have at least a well defined
semantics of return value -- I hope.  Till recently I believed that I
made the functions to behave reasonably in cases when values defined
inside loop are used outside of it (through chain of lcssa phi nodes),
but my attempt to remove the hacks that ensure sane behavior of the
results for ivopts causes misscompilations; I am investigating the cause
now.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (44 preceding siblings ...)
  2005-01-25 11:29 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-25 12:03 ` sebastian dot pop at cri dot ensmp dot fr
  2005-01-25 12:44 ` sebastian dot pop at cri dot ensmp dot fr
                   ` (9 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: sebastian dot pop at cri dot ensmp dot fr @ 2005-01-25 12:03 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-25 12:03 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> 
> ------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 11:14 -------
> Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)
> 
> > rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> > > Adding the instantiation cache was compile time neutral on normal code,
> > > so I don't think the effect on scev cost is significant.
> > > 
> > 
> > How that?  You end up querying and caching an evolution for every
> > instantiation of an SSA_NAME!  
> 
> which is pretty cheap (constant time operation); note that we do an
> equivalent lookup in the analyze_scalar_evolution call in any case, also
> without causing any compile time problems.  I haven't measured any slowdown on
> normal testcases.
> 
> > I will quantify the number of queries wrt the number of times this
> > information is useful.  I think that with my patch, the instantiation
> > cache information is used zero times on a bootstrap.
> 
> I don't think so.  Please come up with some numbers, otherwise this
> discussion is pointless.
> 

during a bootstrap: 

instantiation cache queries: 1146452

instantiation cache uses: 247452 of which 143977 were scev_not_known,
the other were SSA_NAMEs.

this was counted with a grep|wc with the following patch: 

Index: tree-scalar-evolution.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v
retrieving revision 2.16
diff -d -u -p -r2.16 tree-scalar-evolution.c
--- tree-scalar-evolution.c	18 Jan 2005 11:36:26 -0000	2.16
+++ tree-scalar-evolution.c	25 Jan 2005 12:00:57 -0000
@@ -1898,8 +1898,15 @@ get_instantiated_value (htab_t cache, tr
   pattern.var = version;
   info = htab_find (cache, &pattern);
 
+  fprintf (stdout, "IC_query \n");
+
   if (info)
-    return info->chrec;
+    {
+      fprintf (stdout, "IC_used_once \n");
+      print_generic_expr (stdout, info->chrec, 0);
+      fprintf (stdout, "\n");
+      return info->chrec;
+    }
   else
     return NULL_TREE;
 }
@@ -1915,6 +1922,8 @@ set_instantiated_value (htab_t cache, tr
   pattern.var = version;
   slot = htab_find_slot (cache, &pattern, INSERT);
 
+  fprintf (stdout, "IC_query \n");
+
   if (*slot)
     info = *slot;
   else
@@ -1990,7 +1999,8 @@ instantiate_parameters_1 (struct loop *l
 	 result again.  */
       bitmap_set_bit (already_instantiated, SSA_NAME_VERSION (chrec));
       res = analyze_scalar_evolution (def_loop, chrec);
-      res = instantiate_parameters_1 (loop, res, allow_superloop_chrecs, cache);
+      if  (res != chrec_dont_know)
+	res = instantiate_parameters_1 (loop, res, allow_superloop_chrecs, cache);
       bitmap_clear_bit (already_instantiated, SSA_NAME_VERSION (chrec));
 
       /* Store the correct value to the cache.  */
@@ -2000,8 +2010,14 @@ instantiate_parameters_1 (struct loop *l
     case POLYNOMIAL_CHREC:
       op0 = instantiate_parameters_1 (loop, CHREC_LEFT (chrec),
 				      allow_superloop_chrecs, cache);
+      if (op0 == chrec_dont_know)
+	return chrec_dont_know;
+
       op1 = instantiate_parameters_1 (loop, CHREC_RIGHT (chrec),
 				      allow_superloop_chrecs, cache);
+      if (op1 == chrec_dont_know)
+	return chrec_dont_know;
+
       if (CHREC_LEFT (chrec) != op0
 	  || CHREC_RIGHT (chrec) != op1)
 	chrec = build_polynomial_chrec (CHREC_VARIABLE (chrec), op0, op1);
@@ -2010,8 +2026,14 @@ instantiate_parameters_1 (struct loop *l
     case PLUS_EXPR:
       op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
 				      allow_superloop_chrecs, cache);
+      if (op0 == chrec_dont_know)
+	return chrec_dont_know;
+
       op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
 				      allow_superloop_chrecs, cache);
+      if (op1 == chrec_dont_know)
+	return chrec_dont_know;
+
       if (TREE_OPERAND (chrec, 0) != op0
 	  || TREE_OPERAND (chrec, 1) != op1)
       	chrec = chrec_fold_plus (TREE_TYPE (chrec), op0, op1);
@@ -2020,8 +2042,14 @@ instantiate_parameters_1 (struct loop *l
     case MINUS_EXPR:
       op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
 				      allow_superloop_chrecs, cache);
+      if (op0 == chrec_dont_know)
+	return chrec_dont_know;
+
       op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
 				      allow_superloop_chrecs, cache);
+      if (op1 == chrec_dont_know)
+        return chrec_dont_know;
+
       if (TREE_OPERAND (chrec, 0) != op0
 	  || TREE_OPERAND (chrec, 1) != op1)
         chrec = chrec_fold_minus (TREE_TYPE (chrec), op0, op1);
@@ -2030,8 +2058,14 @@ instantiate_parameters_1 (struct loop *l
     case MULT_EXPR:
       op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
 				      allow_superloop_chrecs, cache);
+      if (op0 == chrec_dont_know)
+        return chrec_dont_know;
+
       op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
 				      allow_superloop_chrecs, cache);
+      if (op1 == chrec_dont_know)
+        return chrec_dont_know;
+
       if (TREE_OPERAND (chrec, 0) != op0
 	  || TREE_OPERAND (chrec, 1) != op1)
 	chrec = chrec_fold_multiply (TREE_TYPE (chrec), op0, op1);
@@ -2065,13 +2099,17 @@ instantiate_parameters_1 (struct loop *l
     case 3:
       op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
 				      allow_superloop_chrecs, cache);
+      if (op0 == chrec_dont_know)
+	return chrec_dont_know;
+
       op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
 				      allow_superloop_chrecs, cache);
+      if (op1 == chrec_dont_know)
+	return chrec_dont_know;
+
       op2 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 2),
 				      allow_superloop_chrecs, cache);
-      if (op0 == chrec_dont_know
-	  || op1 == chrec_dont_know
-	  || op2 == chrec_dont_know)
+      if (op2 == chrec_dont_know)
         return chrec_dont_know;
 
       if (op0 == TREE_OPERAND (chrec, 0)
@@ -2085,10 +2123,12 @@ instantiate_parameters_1 (struct loop *l
     case 2:
       op0 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 0),
 				      allow_superloop_chrecs, cache);
+      if (op0 == chrec_dont_know)
+	return chrec_dont_know;
+
       op1 = instantiate_parameters_1 (loop, TREE_OPERAND (chrec, 1),
 				      allow_superloop_chrecs, cache);
-      if (op0 == chrec_dont_know
-	  || op1 == chrec_dont_know)
+      if (op1 == chrec_dont_know)
         return chrec_dont_know;
 
       if (op0 == TREE_OPERAND (chrec, 0)



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (45 preceding siblings ...)
  2005-01-25 12:03 ` sebastian dot pop at cri dot ensmp dot fr
@ 2005-01-25 12:44 ` sebastian dot pop at cri dot ensmp dot fr
  2005-01-25 15:54 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (8 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: sebastian dot pop at cri dot ensmp dot fr @ 2005-01-25 12:44 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-25 12:44 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> More seriously -- which of the possibilities?  If I have loops like
> 
> while (...)
>   {
>     while (...)
>       {
>         x_1 = something ();
>       }
>     x_2 = phi (x_1);
>     x_3 = x_2 + 1;
>   }
> 
> What will analyze_scalar_evolutions return for x_3? There are (at least)
> three possible valid values:
> 
> x_3

This would be the answer if analyze_scalar_evolutions would be the
identity function.  If you want, you could change analyze_scalar_evolutions
such that it behaves like that, and decide that the instantiation do
the rest of the work (I mean moving the code that is currently in
analyze_scalar_evolutions to the instantiation phase).

> x_2 + 1

If you decide to reconstruct the tree expression, there is no reason
to stop on a phi node that has a single argument.  Why would you like
to get this answer as the reconstructed tree?

> x_1 + 1
> 

IMO this would be the answer, although I didn't checked.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (46 preceding siblings ...)
  2005-01-25 12:44 ` sebastian dot pop at cri dot ensmp dot fr
@ 2005-01-25 15:54 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-25 16:27 ` sebastian dot pop at cri dot ensmp dot fr
                   ` (7 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-25 15:54 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-25 15:54 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> > More seriously -- which of the possibilities?  If I have loops like
> > 
> > while (...)
> >   {
> >     while (...)
> >       {
> >         x_1 = something ();
> >       }
> >     x_2 = phi (x_1);
> >     x_3 = x_2 + 1;
> >   }
> > 
> > What will analyze_scalar_evolutions return for x_3? There are (at least)
> > three possible valid values:
> > 
> > x_3
> 
> This would be the answer if analyze_scalar_evolutions would be the
> identity function.  If you want, you could change analyze_scalar_evolutions
> such that it behaves like that, and decide that the instantiation do
> the rest of the work (I mean moving the code that is currently in
> analyze_scalar_evolutions to the instantiation phase).
> 
> > x_2 + 1
> 
> If you decide to reconstruct the tree expression, there is no reason
> to stop on a phi node that has a single argument.  Why would you like
> to get this answer as the reconstructed tree?

because this answer preserves loop closed ssa form -- the answer x_1 + 1
copy propagates the value outsied of the loop.  In some applications
this choice could make sense.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (47 preceding siblings ...)
  2005-01-25 15:54 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-25 16:27 ` sebastian dot pop at cri dot ensmp dot fr
  2005-01-25 19:16 ` pinskia at gcc dot gnu dot org
                   ` (6 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: sebastian dot pop at cri dot ensmp dot fr @ 2005-01-25 16:27 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-25 16:26 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> > If you decide to reconstruct the tree expression, there is no reason
> > to stop on a phi node that has a single argument.  Why would you like
> > to get this answer as the reconstructed tree?
> 
> because this answer preserves loop closed ssa form -- the answer x_1 + 1
> copy propagates the value outsied of the loop.  In some applications
> this choice could make sense.
> 

Okay, if it makes sense, you could modify the analyzer such that it
stops reconstructing the tree once it is on the phi following a loop.
This won't change the information provided after the instantiation.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (48 preceding siblings ...)
  2005-01-25 16:27 ` sebastian dot pop at cri dot ensmp dot fr
@ 2005-01-25 19:16 ` pinskia at gcc dot gnu dot org
  2005-01-27 13:18 ` sebastian dot pop at cri dot ensmp dot fr
                   ` (5 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-01-25 19:16 UTC (permalink / raw)
  To: gcc-bugs



-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |minor
           Priority|P2                          |P3


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (49 preceding siblings ...)
  2005-01-25 19:16 ` pinskia at gcc dot gnu dot org
@ 2005-01-27 13:18 ` sebastian dot pop at cri dot ensmp dot fr
  2005-01-27 15:00 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (4 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: sebastian dot pop at cri dot ensmp dot fr @ 2005-01-27 13:18 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-27 13:18 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

With the following patch I got some speedup for depth 100.

from: 
 tree iv optimization  :   2.62 (62%) usr   0.27 (82%) sys   2.92 (62%) wall
 tree iv optimization  :   2.33 (59%) usr   0.25 (83%) sys   2.63 (54%) wall

to:
 tree iv optimization  :   1.19 (46%) usr   0.04 (40%) sys   1.26 (45%) wall
 tree iv optimization  :   1.21 (47%) usr   0.05 (45%) sys   1.30 (46%) wall

Basically we're reseting too much information each time scev_reset is
called.  It would even be possible to reset only the part of the scev
database that contains symbols instead of erasing everything.

Do somebody know how to iterate over all the elements of the hash
setting to NULL_TREE the elements that answer true to
chrec_contains_symbols?

Index: tree-scalar-evolution.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v
retrieving revision 2.16
diff -d -u -p -r2.16 tree-scalar-evolution.c
--- tree-scalar-evolution.c	18 Jan 2005 11:36:26 -0000	2.16
+++ tree-scalar-evolution.c	27 Jan 2005 13:12:36 -0000
@@ -2490,7 +2490,7 @@ scev_reset (void)
   for (i = 1; i < current_loops->num; i++)
     {
       loop = current_loops->parray[i];
-      if (loop)
+      if (loop && chrec_contains_symbols (loop->nb_iterations))
 	loop->nb_iterations = NULL_TREE;
     }
 }


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (50 preceding siblings ...)
  2005-01-27 13:18 ` sebastian dot pop at cri dot ensmp dot fr
@ 2005-01-27 15:00 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-27 15:12 ` sebastian dot pop at cri dot ensmp dot fr
                   ` (3 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-27 15:00 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-27 15:00 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> ------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-27 13:18 -------
> Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)
> 
> With the following patch I got some speedup for depth 100.
> 
> from: 
>  tree iv optimization  :   2.62 (62%) usr   0.27 (82%) sys   2.92 (62%) wall
>  tree iv optimization  :   2.33 (59%) usr   0.25 (83%) sys   2.63 (54%) wall
> 
> to:
>  tree iv optimization  :   1.19 (46%) usr   0.04 (40%) sys   1.26 (45%) wall
>  tree iv optimization  :   1.21 (47%) usr   0.05 (45%) sys   1.30 (46%) wall
> 
> Basically we're reseting too much information each time scev_reset is
> called.  It would even be possible to reset only the part of the scev
> database that contains symbols instead of erasing everything.
> 
> Do somebody know how to iterate over all the elements of the hash
> setting to NULL_TREE the elements that answer true to
> chrec_contains_symbols?

the patch is below (in stronger form -- only removing entries that
contain invalidated symbols), and indeed helps with this testcase.
It however causes measurable slow down on normal code (see analysis
in one of the previous mails).

Note that your patch is not entirely correct -- in case loop is
unrolled, its number of iterations is changed, so in this case
scev_reset should clear its number of iterations unconditionally
(I think something similar occurs with vectorizer, so we need to
be a bit careful).

Index: tree-loop-linear.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-loop-linear.c,v
retrieving revision 2.6
diff -c -3 -p -r2.6 tree-loop-linear.c
*** tree-loop-linear.c	18 Jan 2005 11:36:24 -0000	2.6
--- tree-loop-linear.c	24 Jan 2005 01:34:04 -0000
*************** linear_transform_loops (struct loops *lo
*** 373,379 ****
        free_data_refs (datarefs);
      }
    free_df ();
!   scev_reset ();
    rewrite_into_loop_closed_ssa ();
  #ifdef ENABLE_CHECKING
    verify_loop_closed_ssa ();
--- 373,379 ----
        free_data_refs (datarefs);
      }
    free_df ();
!   scev_reset (NULL);
    rewrite_into_loop_closed_ssa ();
  #ifdef ENABLE_CHECKING
    verify_loop_closed_ssa ();
Index: tree-scalar-evolution.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.c,v
retrieving revision 2.16
diff -c -3 -p -r2.16 tree-scalar-evolution.c
*** tree-scalar-evolution.c	18 Jan 2005 11:36:26 -0000	2.16
--- tree-scalar-evolution.c	24 Jan 2005 01:34:05 -0000
*************** scev_initialize (struct loops *loops)
*** 2475,2484 ****
        loops->parray[i]->nb_iterations = NULL_TREE;
  }
  
! /* Cleans up the information cached by the scalar evolutions analysis.  */
  
  void
! scev_reset (void)
  {
    unsigned i;
    struct loop *loop;
--- 2475,2539 ----
        loops->parray[i]->nb_iterations = NULL_TREE;
  }
  
! /* Returns true if EXPR references one of the ssa names in set
!    INVALIDATED_NAMES.  */
! 
! static bool
! tree_references_names (tree expr, bitmap invalidated_names)
! {
!   if (!expr)
!     return false;
! 
!   if (TREE_CODE (expr) == SSA_NAME)
!     return bitmap_bit_p (invalidated_names, SSA_NAME_VERSION (expr));
! 
!   switch (TREE_CODE_LENGTH (TREE_CODE (expr)))
!     {
!     case 4:
!       if (tree_references_names (TREE_OPERAND (expr, 3), invalidated_names))
! 	return true;
!     case 3:
!       if (tree_references_names (TREE_OPERAND (expr, 2), invalidated_names))
! 	return true;
!     case 2:
!       if (tree_references_names (TREE_OPERAND (expr, 1), invalidated_names))
! 	return true;
!     case 1:
!       if (tree_references_names (TREE_OPERAND (expr, 0), invalidated_names))
! 	return true;
!     case 0:
!       return false;
! 
!     default:
!       /* Don't worry about handling strange cases.  This function is only used
! 	 in a way in that it does not matter if we return true here.  */
!       return true;
!     }
! }
! 
! /* If the SLOT in the scev cache contains ssa name belonging to the set
!    passed in DATA, the function removes it from the cache.  Callback
!    for htab_traverse.  */
! 
! static int
! invalidate_names_reference (void **slot, void *data)
! {
!   bitmap invalidated_names = data;
!   struct scev_info_str *elt = *slot;
! 
!   if (tree_references_names (elt->var, invalidated_names)
!       || tree_references_names (elt->chrec, invalidated_names))
!     htab_clear_slot (scalar_evolution_info, slot);
! 
!   return 1;
! }
! 
! /* Cleans up the information cached by the scalar evolutions analysis.
!    If INVALIDATED_NAMES is provided, only the references to invalidated
!    ssa names are removed.  */
  
  void
! scev_reset (bitmap invalidated_names)
  {
    unsigned i;
    struct loop *loop;
*************** scev_reset (void)
*** 2486,2497 ****
    if (!scalar_evolution_info || !current_loops)
      return;
  
!   htab_empty (scalar_evolution_info);
    for (i = 1; i < current_loops->num; i++)
      {
        loop = current_loops->parray[i];
!       if (loop)
! 	loop->nb_iterations = NULL_TREE;
      }
  }
  
--- 2541,2564 ----
    if (!scalar_evolution_info || !current_loops)
      return;
  
!   if (invalidated_names)
!     htab_traverse (scalar_evolution_info, invalidate_names_reference,
! 		   invalidated_names);
!   else
!     htab_empty (scalar_evolution_info);
! 
    for (i = 1; i < current_loops->num; i++)
      {
        loop = current_loops->parray[i];
!       if (!loop
! 	  || !loop->nb_iterations)
! 	continue;
! 	  
!       if (invalidated_names
! 	  && !tree_references_names (loop->nb_iterations, invalidated_names))
! 	continue;
! 
!       loop->nb_iterations = NULL_TREE;
      }
  }
  
Index: tree-scalar-evolution.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-scalar-evolution.h,v
retrieving revision 2.3
diff -c -3 -p -r2.3 tree-scalar-evolution.h
*** tree-scalar-evolution.h	17 Nov 2004 22:06:00 -0000	2.3
--- tree-scalar-evolution.h	24 Jan 2005 01:34:05 -0000
*************** extern tree number_of_iterations_in_loop
*** 26,32 ****
  extern tree get_loop_exit_condition (struct loop *);
  
  extern void scev_initialize (struct loops *loops);
! extern void scev_reset (void);
  extern void scev_finalize (void);
  extern tree analyze_scalar_evolution (struct loop *, tree);
  extern tree instantiate_parameters (struct loop *, tree);
--- 26,32 ----
  extern tree get_loop_exit_condition (struct loop *);
  
  extern void scev_initialize (struct loops *loops);
! extern void scev_reset (bitmap);
  extern void scev_finalize (void);
  extern tree analyze_scalar_evolution (struct loop *, tree);
  extern tree instantiate_parameters (struct loop *, tree);
Index: tree-ssa-loop-ivcanon.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivcanon.c,v
retrieving revision 2.5
diff -c -3 -p -r2.5 tree-ssa-loop-ivcanon.c
*** tree-ssa-loop-ivcanon.c	1 Oct 2004 18:26:31 -0000	2.5
--- tree-ssa-loop-ivcanon.c	24 Jan 2005 01:34:05 -0000
*************** canonicalize_induction_variables (struct
*** 262,268 ****
  
    /* Clean up the information about numbers of iterations, since brute force
       evaluation could reveal new information.  */
!   scev_reset ();
  
  #if 0
    /* The necessary infrastructure is not in yet.  */
--- 262,268 ----
  
    /* Clean up the information about numbers of iterations, since brute force
       evaluation could reveal new information.  */
!   scev_reset (NULL);
  
  #if 0
    /* The necessary infrastructure is not in yet.  */
*************** tree_unroll_loops_completely (struct loo
*** 294,300 ****
  
    /* Clean up the information about numbers of iterations, since complete
       unrolling might have invalidated it.  */
!   scev_reset ();
  
  #if 0
    /* The necessary infrastructure is not in yet.  */
--- 294,300 ----
  
    /* Clean up the information about numbers of iterations, since complete
       unrolling might have invalidated it.  */
!   scev_reset (NULL);
  
  #if 0
    /* The necessary infrastructure is not in yet.  */
Index: tree-ssa-loop-ivopts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-ivopts.c,v
retrieving revision 2.42
diff -c -3 -p -r2.42 tree-ssa-loop-ivopts.c
*** tree-ssa-loop-ivopts.c	19 Jan 2005 22:50:04 -0000	2.42
--- tree-ssa-loop-ivopts.c	24 Jan 2005 01:34:05 -0000
*************** struct ivopts_data
*** 208,213 ****
--- 208,216 ----
    /* The bitmap of indices in version_info whose value was changed.  */
    bitmap relevant;
  
+   /* The set of ssa names whose scev info might get invalidated.  */
+   bitmap invalidated_names;
+ 
    /* The maximum invariant id.  */
    unsigned max_inv_id;
  
*************** tree_ssa_iv_optimize_init (struct loops 
*** 651,656 ****
--- 654,660 ----
    data->version_info = xcalloc (data->version_info_size,
  				sizeof (struct version_info));
    data->relevant = BITMAP_XMALLOC ();
+   data->invalidated_names = BITMAP_XMALLOC ();
    data->important_candidates = BITMAP_XMALLOC ();
    data->max_inv_id = 0;
  
*************** remove_unused_ivs (struct ivopts_data *d
*** 5016,5022 ****
  	  && !info->inv_id
  	  && !info->iv->have_use_for
  	  && !info->preserve_biv)
! 	remove_statement (SSA_NAME_DEF_STMT (info->iv->ssa_name), true);
      }
  }
  
--- 5020,5029 ----
  	  && !info->inv_id
  	  && !info->iv->have_use_for
  	  && !info->preserve_biv)
! 	{
! 	  bitmap_set_bit (data->invalidated_names, j);
! 	  remove_statement (SSA_NAME_DEF_STMT (info->iv->ssa_name), true);
! 	}
      }
  }
  
*************** free_loop_data (struct ivopts_data *data
*** 5041,5046 ****
--- 5048,5054 ----
        info->inv_id = 0;
      }
    bitmap_clear (data->relevant);
+   bitmap_clear (data->invalidated_names);
    bitmap_clear (data->important_candidates);
  
    for (i = 0; i < n_iv_uses (data); i++)
*************** tree_ssa_iv_optimize_finalize (struct lo
*** 5104,5109 ****
--- 5112,5118 ----
    free_loop_data (data);
    free (data->version_info);
    BITMAP_XFREE (data->relevant);
+   BITMAP_XFREE (data->invalidated_names);
    BITMAP_XFREE (data->important_candidates);
  
    VARRAY_FREE (decl_rtl_to_reset);
*************** tree_ssa_iv_optimize_loop (struct ivopts
*** 5177,5183 ****
    /* We have changed the structure of induction variables; it might happen
       that definitions in the scev database refer to some of them that were
       eliminated.  */
!   scev_reset ();
  
  finish:
    free_loop_data (data);
--- 5186,5192 ----
    /* We have changed the structure of induction variables; it might happen
       that definitions in the scev database refer to some of them that were
       eliminated.  */
!   scev_reset (data->invalidated_names);
  
  finish:
    free_loop_data (data);
Index: tree-ssa-loop-manip.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop-manip.c,v
retrieving revision 2.21
diff -c -3 -p -r2.21 tree-ssa-loop-manip.c
*** tree-ssa-loop-manip.c	29 Nov 2004 17:53:47 -0000	2.21
--- tree-ssa-loop-manip.c	24 Jan 2005 01:34:05 -0000
*************** tree_duplicate_loop_to_header_edge (stru
*** 633,639 ****
    	set_phi_def_stmts (bb->rbi->original);
      }
  
!   scev_reset ();
  #ifdef ENABLE_CHECKING
    verify_loop_closed_ssa ();
  #endif
--- 633,639 ----
    	set_phi_def_stmts (bb->rbi->original);
      }
  
!   scev_reset (NULL);
  #ifdef ENABLE_CHECKING
    verify_loop_closed_ssa ();
  #endif
Index: tree-ssa-loop.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-loop.c,v
retrieving revision 2.23
diff -c -3 -p -r2.23 tree-ssa-loop.c
*** tree-ssa-loop.c	26 Nov 2004 06:42:25 -0000	2.23
--- tree-ssa-loop.c	24 Jan 2005 01:34:05 -0000
*************** tree_ssa_loop_bounds (void)
*** 306,312 ****
      return;
  
    estimate_numbers_of_iterations (current_loops);
!   scev_reset ();
  }
  
  struct tree_opt_pass pass_record_bounds =
--- 306,312 ----
      return;
  
    estimate_numbers_of_iterations (current_loops);
!   scev_reset (NULL);
  }
  
  struct tree_opt_pass pass_record_bounds =
Index: tree-vectorizer.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-vectorizer.c,v
retrieving revision 2.62
diff -c -3 -p -r2.62 tree-vectorizer.c
*** tree-vectorizer.c	18 Jan 2005 11:36:29 -0000	2.62
--- tree-vectorizer.c	24 Jan 2005 01:34:05 -0000
*************** vect_do_peeling_for_loop_bound (loop_vec
*** 3258,3264 ****
    vect_update_ivs_after_vectorizer (loop, ratio_mult_vf_name, update_e); 
  
    /* After peeling we have to reset scalar evolution analyzer.  */
!   scev_reset ();
  
    return;
  }
--- 3258,3264 ----
    vect_update_ivs_after_vectorizer (loop, ratio_mult_vf_name, update_e); 
  
    /* After peeling we have to reset scalar evolution analyzer.  */
!   scev_reset (NULL);
  
    return;
  }
*************** vect_do_peeling_for_alignment (loop_vec_
*** 3442,3448 ****
    vect_update_inits_of_drs (loop_vinfo, niters_of_prolog_loop);
  
    /* After peeling we have to reset scalar evolution analyzer.  */
!   scev_reset ();
  
    return;
  }
--- 3442,3448 ----
    vect_update_inits_of_drs (loop_vinfo, niters_of_prolog_loop);
  
    /* After peeling we have to reset scalar evolution analyzer.  */
!   scev_reset (NULL);
  
    return;
  }


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (51 preceding siblings ...)
  2005-01-27 15:00 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-27 15:12 ` sebastian dot pop at cri dot ensmp dot fr
  2005-01-27 19:11 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
                   ` (2 subsequent siblings)
  55 siblings, 0 replies; 59+ messages in thread
From: sebastian dot pop at cri dot ensmp dot fr @ 2005-01-27 15:12 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-27 15:12 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> the patch is below (in stronger form -- only removing entries that
> contain invalidated symbols), and indeed helps with this testcase.

Many thanks.

> It however causes measurable slow down on normal code (see analysis
> in one of the previous mails).
> 

I see. 

Another idea: would it be possible to insert the invalidated names
during the optimization pass instead of invalidating all the symbols?
This would avoid the first pass over the scev database that search for
symbols.

> Note that your patch is not entirely correct -- in case loop is
> unrolled, its number of iterations is changed, so in this case
> scev_reset should clear its number of iterations unconditionally

or the unroller should do that work on the unrolled loops?



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (52 preceding siblings ...)
  2005-01-27 15:12 ` sebastian dot pop at cri dot ensmp dot fr
@ 2005-01-27 19:11 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
  2005-01-27 20:40 ` sebastian dot pop at cri dot ensmp dot fr
  2005-02-14 11:12 ` pinskia at gcc dot gnu dot org
  55 siblings, 0 replies; 59+ messages in thread
From: rakdver at atrey dot karlin dot mff dot cuni dot cz @ 2005-01-27 19:11 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From rakdver at atrey dot karlin dot mff dot cuni dot cz  2005-01-27 19:10 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

> Another idea: would it be possible to insert the invalidated names
> during the optimization pass instead of invalidating all the symbols?
> This would avoid the first pass over the scev database that search for
> symbols.

I don't understand this proposal.

> > Note that your patch is not entirely correct -- in case loop is
> > unrolled, its number of iterations is changed, so in this case
> > scev_reset should clear its number of iterations unconditionally
> 
> or the unroller should do that work on the unrolled loops?

This seems to be the right approach.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (53 preceding siblings ...)
  2005-01-27 19:11 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
@ 2005-01-27 20:40 ` sebastian dot pop at cri dot ensmp dot fr
  2005-02-14 11:12 ` pinskia at gcc dot gnu dot org
  55 siblings, 0 replies; 59+ messages in thread
From: sebastian dot pop at cri dot ensmp dot fr @ 2005-01-27 20:40 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From sebastian dot pop at cri dot ensmp dot fr  2005-01-27 20:38 -------
Subject: Re:  [4.0 Regression] IV-OPTS is O(N^3)

rakdver at atrey dot karlin dot mff dot cuni dot cz wrote:
> > Another idea: would it be possible to insert the invalidated names
> > during the optimization pass instead of invalidating all the symbols?
> > This would avoid the first pass over the scev database that search for
> > symbols.
> 
> I don't understand this proposal.
> 

Anyway, I see that I was wrong: even if you mark the SSA_NAMEs that
are removed in an optimization, and that you invalidate only those
symbols, you still have to walk the scev database to ensure that there
is no other evolution that references the removed SSA_NAMEs.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3)
  2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
                   ` (54 preceding siblings ...)
  2005-01-27 20:40 ` sebastian dot pop at cri dot ensmp dot fr
@ 2005-02-14 11:12 ` pinskia at gcc dot gnu dot org
  55 siblings, 0 replies; 59+ messages in thread
From: pinskia at gcc dot gnu dot org @ 2005-02-14 11:12 UTC (permalink / raw)
  To: gcc-bugs


------- Additional Comments From pinskia at gcc dot gnu dot org  2005-02-13 23:14 -------
Moving to 4.1 as I don't care much if this gets fixed.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|4.0.0                       |4.1.0


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18595


^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2005-02-13 23:15 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-11-21 14:59 [Bug tree-optimization/18595] New: IV-OPTS is slow (and does not scale) pinskia at gcc dot gnu dot org
2004-11-24 11:15 ` [Bug tree-optimization/18595] " steven at gcc dot gnu dot org
2004-11-25 11:16 ` [Bug tree-optimization/18595] [4.0 Regression] IV-OPTS is O(N^3) belyshev at lubercy dot com
2004-11-27  1:11 ` steven at gcc dot gnu dot org
2004-11-27  1:45 ` belyshev at lubercy dot com
2004-11-27 21:17 ` neroden at gcc dot gnu dot org
2004-11-28 18:43 ` rakdver at gcc dot gnu dot org
2004-12-22 23:40 ` belyshev at depni dot sinp dot msu dot ru
2004-12-23  1:26 ` steven at gcc dot gnu dot org
2004-12-23  1:29 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-23 14:18 ` steven at gcc dot gnu dot org
2005-01-23 15:01   ` Daniel Berlin
2005-01-23 14:22 ` steven at gcc dot gnu dot org
2005-01-23 14:23 ` steven at gcc dot gnu dot org
2005-01-23 14:24 ` steven at gcc dot gnu dot org
2005-01-23 15:02 ` dberlin at dberlin dot org
2005-01-23 15:06 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-23 16:38 ` rakdver at gcc dot gnu dot org
2005-01-23 17:04 ` rakdver at gcc dot gnu dot org
2005-01-23 17:14 ` rakdver at gcc dot gnu dot org
2005-01-24  1:40 ` rakdver at gcc dot gnu dot org
2005-01-24  1:46 ` rakdver at gcc dot gnu dot org
2005-01-24  1:49   ` Daniel Berlin
2005-01-24  1:49 ` dberlin at dberlin dot org
2005-01-24  1:58 ` dberlin at dberlin dot org
2005-01-24 21:36 ` sebastian dot pop at cri dot ensmp dot fr
2005-01-24 22:17 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-24 22:25 ` dberlin at dberlin dot org
2005-01-24 23:09 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-24 23:12 ` dberlin at dberlin dot org
2005-01-25  0:27 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-25  0:39 ` dberlin at dberlin dot org
2005-01-25  0:50 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-25  0:50 ` dberlin at dberlin dot org
2005-01-25  0:52 ` dberlin at dberlin dot org
2005-01-25  1:11 ` rakdver at gcc dot gnu dot org
2005-01-25  1:15 ` dberlin at dberlin dot org
2005-01-25  1:24 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-25  1:28 ` steven at gcc dot gnu dot org
2005-01-25  1:42 ` dberlin at dberlin dot org
2005-01-25  1:52 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-25  1:57 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-25 10:32 ` sebastian dot pop at cri dot ensmp dot fr
2005-01-25 10:41 ` sebastian dot pop at cri dot ensmp dot fr
2005-01-25 11:02 ` sebastian dot pop at cri dot ensmp dot fr
2005-01-25 11:15 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-25 11:16 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-25 11:29 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-25 12:03 ` sebastian dot pop at cri dot ensmp dot fr
2005-01-25 12:44 ` sebastian dot pop at cri dot ensmp dot fr
2005-01-25 15:54 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-25 16:27 ` sebastian dot pop at cri dot ensmp dot fr
2005-01-25 19:16 ` pinskia at gcc dot gnu dot org
2005-01-27 13:18 ` sebastian dot pop at cri dot ensmp dot fr
2005-01-27 15:00 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-27 15:12 ` sebastian dot pop at cri dot ensmp dot fr
2005-01-27 19:11 ` rakdver at atrey dot karlin dot mff dot cuni dot cz
2005-01-27 20:40 ` sebastian dot pop at cri dot ensmp dot fr
2005-02-14 11:12 ` pinskia at gcc dot gnu dot org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).