* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
@ 2023-05-26 9:28 ` sjames at gcc dot gnu.org
2023-05-26 9:36 ` sjames at gcc dot gnu.org
` (18 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: sjames at gcc dot gnu.org @ 2023-05-26 9:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #1 from Sam James <sjames at gcc dot gnu.org> ---
I let perf spin for a while and got this w/ 13:
```
$ perf record gcc-13 -O2 -fipa-pta -c packet-rnsap.c.i
[^C'd after ~2 minutes]
$ perf report
43.18% cc1 cc1 [.] bitmap_ior_into
40.43% cc1 cc1 [.] bitmap_bit_p
9.65% cc1 cc1 [.] bitmap_set_bit
2.87% cc1 cc1 [.] solve_graph
0.98% cc1 cc1 [.] solution_set_expand
0.80% cc1 cc1 [.] add_graph_edge
0.39% cc1 cc1 [.] find
0.04% cc1 cc1 [.]
symbol_table::decl_assembler_name_hash
0.04% cc1 libc.so.6 [.] malloc
0.03% cc1 cc1 [.] ggc_set_mark
0.03% cc1 cc1 [.] verify_ssa
0.03% cc1 cc1 [.] gt_ggc_mx_lang_tree_node
0.03% cc1 cc1 [.] walk_gimple_op
0.03% cc1 cc1 [.] walk_tree_1
0.03% cc1 cc1 [.]
base_pool_allocator<memory_block_pool>::allocate
0.03% cc1 cc1 [.] solve_constraints
0.02% cc1 cc1 [.]
bitmap_list_insert_element_after
0.02% cc1 cc1 [.] verify_gimple_call
0.02% cc1 libc.so.6 [.] cfree
0.02% cc1 cc1 [.] ggc_internal_alloc
0.02% cc1 cc1 [.] verify_gimple_in_cfg
0.02% cc1 cc1 [.] equiv_class_lookup_or_add
0.02% cc1 cc1 [.] verify_gimple_stmt
0.02% cc1 [unknown] [k] 0xffffffff89000b90
0.02% cc1 cc1 [.] do_per_function
0.01% cc1 cc1 [.]
hash_table<default_hash_traits<tree_node*>, false, xcallocator>::verify
0.01% cc1 cc1 [.] contains_struct_check
0.01% cc1 cc1 [.]
hash_table<hash_map<nofree_string_hash, align_flags,
simple_hashmap_traits<default_hash_traits<nofree_string_hash>, align_flags>
>::hash_entry, false, xcallocator>::verify
0.01% cc1 cc1 [.] free_dominance_info
0.01% cc1 cc1 [.]
operands_scanner::get_expr_operands
0.01% cc1 cc1 [.] verify_expr_location_1
0.01% cc1 cc1 [.] _cpp_lex_direct
0.01% cc1 cc1 [.] verify_location
0.01% cc1 cc1 [.] (anonymous
namespace)::dom_info::calc_dfs_tree_nonrec
0.01% cc1 cc1 [.] (anonymous
namespace)::dom_info::calc_idoms
[...]
```
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
2023-05-26 9:28 ` [Bug ipa/109983] " sjames at gcc dot gnu.org
@ 2023-05-26 9:36 ` sjames at gcc dot gnu.org
2023-05-26 10:38 ` slyfox at gcc dot gnu.org
` (17 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: sjames at gcc dot gnu.org @ 2023-05-26 9:36 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
Sam James <sjames at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords| |compile-time-hog
--- Comment #2 from Sam James <sjames at gcc dot gnu.org> ---
Note that compilation (even with checking off) is kind of slow even without
-fipa-pta, but of course it doesn't hang then.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
2023-05-26 9:28 ` [Bug ipa/109983] " sjames at gcc dot gnu.org
2023-05-26 9:36 ` sjames at gcc dot gnu.org
@ 2023-05-26 10:38 ` slyfox at gcc dot gnu.org
2023-05-26 10:48 ` sjames at gcc dot gnu.org
` (16 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: slyfox at gcc dot gnu.org @ 2023-05-26 10:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
Sergei Trofimovich <slyfox at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |slyfox at gcc dot gnu.org
--- Comment #3 from Sergei Trofimovich <slyfox at gcc dot gnu.org> ---
Quoting `man gcc`:
-fipa-pta
Perform interprocedural pointer analysis and interprocedural
modification
and reference analysis. This option can cause excessive memory and
compile-time usage on large compilation units. It is not enabled by
default at any optimization level.
Chances are `gcc` succeeds in a longer time: O(minutes). For me it finishes in
3 minutes on gcc-14. Enabled extra checkers (verify_*) probably make it slower.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (2 preceding siblings ...)
2023-05-26 10:38 ` slyfox at gcc dot gnu.org
@ 2023-05-26 10:48 ` sjames at gcc dot gnu.org
2023-05-26 11:04 ` slyfox at gcc dot gnu.org
` (15 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: sjames at gcc dot gnu.org @ 2023-05-26 10:48 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #4 from Sam James <sjames at gcc dot gnu.org> ---
I know what the option does, but:
1. It's substantially slower in 12/13/14, with or without checking. If that's
expected, that's fine, but someone has to say if it is.
2. With default checking (=release) on 12.2.1 20230428, it's taken over 5
minutes.
3. Even lots of checking (yes,rtl,extra) on 11, it takes about 10 seconds.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (3 preceding siblings ...)
2023-05-26 10:48 ` sjames at gcc dot gnu.org
@ 2023-05-26 11:04 ` slyfox at gcc dot gnu.org
2023-05-26 18:17 ` slyfox at gcc dot gnu.org
` (14 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: slyfox at gcc dot gnu.org @ 2023-05-26 11:04 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #5 from Sergei Trofimovich <slyfox at gcc dot gnu.org> ---
> For me it finishes in 3 minutes on gcc-14.
I'll take it back. It does not finish for me in 10 minutes on gcc-14. Don't
know where I picked the number.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (4 preceding siblings ...)
2023-05-26 11:04 ` slyfox at gcc dot gnu.org
@ 2023-05-26 18:17 ` slyfox at gcc dot gnu.org
2023-05-26 20:03 ` slyfox at gcc dot gnu.org
` (13 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: slyfox at gcc dot gnu.org @ 2023-05-26 18:17 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #6 from Sergei Trofimovich <slyfox at gcc dot gnu.org> ---
Created attachment 55174
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55174&action=edit
packet-rnsap-shrunk-slightly.c.i.xz
packet-rnsap-shrunk-slightly.c.i.xz is a slightly shrunk version of the
original.
It exhibits 10x slowdown and has a reasonable compilation completion time.
Might be useful to explore as is or bisect gcc:
$ gcc -O2 -c packet-rnsap-shrunk-slightly.c.i -o bug.o -fipa-pta
-Wno-deprecated-declarations -fno-ipa-pta >/dev/null 2>&1
real 0m0,657s
user 0m0,626s
sys 0m0,026s
$ gcc -O2 -c packet-rnsap-shrunk-slightly.c.i -o bug.o -fipa-pta
-Wno-deprecated-declarations -fipa-pta >/dev/null 2>&1
real 0m6,120s
user 0m6,065s
sys 0m0,045s
-ftime-report says 'ipa points-to' takes 88%.
-fdump-ipa-all-details creates 2.0G bug.i.092i.pta2 file (the rest of files are
unred 5M).
I suspect it's a pathology in solving a huge `proto_reg_handoff_rnsap()` graph.
Some variables have up to 5000 PT entries.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (5 preceding siblings ...)
2023-05-26 18:17 ` slyfox at gcc dot gnu.org
@ 2023-05-26 20:03 ` slyfox at gcc dot gnu.org
2023-05-26 20:09 ` pinskia at gcc dot gnu.org
` (12 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: slyfox at gcc dot gnu.org @ 2023-05-26 20:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #7 from Sergei Trofimovich <slyfox at gcc dot gnu.org> ---
Original packet-rnsap.c.i.xz takes 27 minutes to compile for me.
The hack below cuts this time down to 9 minutes (slashes 60% of runtime).
The considerable amount of time is spent looking up the bitmaps for graph edges
to extract and solve PT facts.
I'd say there is a room for micro-optimization to turn bitmap to something
slightly smarter than a linked list. It will not improve the runtime too much.
Another option could be to put a limit on edge count (say, controlled by a
`param`) which `gcc` could use to fallback on conservative value.
--- a/gcc/bitmap.h
+++ b/gcc/bitmap.h
@@ -283,7 +283,7 @@ typedef unsigned long BITMAP_WORD;
/* Number of words to use for each element in the linked list. */
#ifndef BITMAP_ELEMENT_WORDS
-#define BITMAP_ELEMENT_WORDS ((128 + BITMAP_WORD_BITS - 1) / BITMAP_WORD_BITS)
+#define BITMAP_ELEMENT_WORDS ((8192 + BITMAP_WORD_BITS - 1) /
BITMAP_WORD_BITS)
#endif
/* Number of bits in each actual element of a bitmap. */
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (6 preceding siblings ...)
2023-05-26 20:03 ` slyfox at gcc dot gnu.org
@ 2023-05-26 20:09 ` pinskia at gcc dot gnu.org
2023-05-30 7:25 ` rguenth at gcc dot gnu.org
` (11 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-05-26 20:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Sergei Trofimovich from comment #7)
> Original packet-rnsap.c.i.xz takes 27 minutes to compile for me.
>
> The hack below cuts this time down to 9 minutes (slashes 60% of runtime).
Or maybe it should be moved over to use sbitmap rather than bitmap ...
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (7 preceding siblings ...)
2023-05-26 20:09 ` pinskia at gcc dot gnu.org
@ 2023-05-30 7:25 ` rguenth at gcc dot gnu.org
2023-05-30 8:39 ` rguenth at gcc dot gnu.org
` (10 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-30 7:25 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2023-05-30
Target Milestone|--- |12.4
Ever confirmed|0 |1
Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org
Status|UNCONFIRMED |ASSIGNED
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
IPA PTA (the PTA solver in general) is a known hog, it got significantly worse
for testcases involving a lot of calls with Honzas re-org to improve PTA around
calls.
My suggestion is to not enable -fipa-pta if you hit such issue or in general if
you don't know it pays off with a good runtime performance boost.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (8 preceding siblings ...)
2023-05-30 7:25 ` rguenth at gcc dot gnu.org
@ 2023-05-30 8:39 ` rguenth at gcc dot gnu.org
2023-05-30 9:08 ` rguenth at gcc dot gnu.org
` (9 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-30 8:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
Unsurprisingly, for the full testcase on current trunk with release checking:
ipa points-to :1080.80 (100%) 0.56 ( 47%)1081.52 (100%)
9739k ( 4%)
TOTAL :1083.27 1.20 1084.62
261M
1083.27user 1.21system 18:04.64elapsed 99%CPU (0avgtext+0avgdata
2778064maxresident)k
0inputs+9168outputs (0major+712792minor)pagefaults 0swaps
Note there are PRs showing PTA itself is slow for large functions, IPA PTA
isn't very much special here.
The bitmap data structure isn't the main issue, the main issue is the
working set becomes too big and the iteration isn't optimized for memory
locality. The biggest hits tends to be the bitmap_ior_into done in
solve_add_graph_edge which is necessary to support the solver working
on incremental solution changes only (we only have 'changed' on nodes,
not on graph edges which are only represented implicitly).
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (9 preceding siblings ...)
2023-05-30 8:39 ` rguenth at gcc dot gnu.org
@ 2023-05-30 9:08 ` rguenth at gcc dot gnu.org
2023-05-31 10:37 ` rguenth at gcc dot gnu.org
` (8 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-30 9:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 55213
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55213&action=edit
prototype patch
Trying to fix that (again - I usually break correctness here :/) this
would result in
ipa points-to : 170.66 ( 99%) 0.28 ( 28%) 170.95 ( 98%)
1428k ( 1%)
TOTAL : 173.10 0.99 174.12
253M
so still slow but a lot faster. What the patch definitely misses is
handling the graph node merging cases, properly merging new_edges[] for
the merged nodes. Note it's also trading memory locality against the
possibility of needing more solver iterations (it probably pays off for
large working sets but might hurt small ones).
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (10 preceding siblings ...)
2023-05-30 9:08 ` rguenth at gcc dot gnu.org
@ 2023-05-31 10:37 ` rguenth at gcc dot gnu.org
2023-05-31 10:49 ` rguenth at gcc dot gnu.org
` (7 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-31 10:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
Bad:
Samples: 4M of event 'cycles:u', Event count (approx.): 6574968845607
Overhead Samples Command Shared Object Symbol
59.44% 2803407 cc1 cc1 [.] bitmap_ior_into
23.71% 1119062 cc1 cc1 [.] bitmap_bit_p
12.30% 583673 cc1 cc1 [.] bitmap_set_bit
1.13% 53698 cc1 cc1 [.] find
1.09% 51770 cc1 cc1 [.] solve_constraints
0.55% 26265 cc1 cc1 [.] add_graph_edge
0.52% 24726 cc1 cc1 [.] solve_add_graph_edge
0.34% 16328 cc1 cc1 [.]
find_what_var_points_to
0.23% 11259 cc1 cc1 [.] solution_set_expand
0.14% 7015 cc1 cc1 [.] ipa_pta_execute
0.13% 6296 cc1 cc1 [.] topo_visit
0.06% 2708 cc1 cc1 [.] auto_var_in_fn_p
1157768.85 msec task-clock:u # 1.000 CPUs
utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
712310 page-faults:u # 615.244 /sec
6492736607297 cycles:u # 5.608 GHz
(83.33%)
629989262 stalled-cycles-frontend:u # 0.01% frontend
cycles idle (83.33%)
1079567639 stalled-cycles-backend:u # 0.02% backend
cycles idle (83.33%)
3618200824104 instructions:u # 0.56 insn per
cycle
# 0.00 stalled cycles per
insn (83.33%)
976059766707 branches:u # 843.052 M/sec
(83.33%)
2658800911 branch-misses:u # 0.27% of all
branches (83.33%)
1158.070038218 seconds time elapsed
1157.021441000 seconds user
0.743804000 seconds sys
Samples: 4M of event 'cache-misses:u', Event count (approx.): 30071477326
Overhead Samples Command Shared Object Symbol
67.54% 2723606 cc1 cc1 [.] bitmap_ior_into
16.50% 994335 cc1 cc1 [.] bitmap_bit_p
3.33% 121544 cc1 cc1 [.] bitmap_set_bit
Good:
Samples: 390K of event 'cycles:u', Event count (approx.): 532564428833
Overhead Samples Command Shared Object Symbol
73.09% 280949 cc1 cc1 [.] bitmap_set_bit
3.70% 14919 cc1 cc1 [.] find
3.54% 13591 cc1 cc1 [.]
find_what_var_points_to
3.48% 13952 cc1 cc1 [.] bitmap_bit_p
2.26% 8969 cc1 cc1 [.] bitmap_ior_into
2.24% 9027 cc1 cc1 [.] add_graph_edge
2.03% 8176 cc1 cc1 [.] solve_add_graph_edge
1.75% 7072 cc1 cc1 [.] solution_set_expand
1.41% 5620 cc1 cc1 [.] ipa_pta_execute
1.34% 5406 cc1 cc1 [.] solve_constraints
98853.84 msec task-clock:u # 1.000 CPUs
utilized
0 context-switches:u # 0.000 /sec
0 cpu-migrations:u # 0.000 /sec
558875 page-faults:u # 5.654 K/sec
533990826593 cycles:u # 5.402 GHz
(83.33%)
387751641 stalled-cycles-frontend:u # 0.07% frontend
cycles idle (83.33%)
236682949 stalled-cycles-backend:u # 0.04% backend
cycles idle (83.33%)
1084066822948 instructions:u # 2.03 insn per
cycle
# 0.00 stalled cycles per
insn (83.33%)
276015818957 branches:u # 2.792 G/sec
(83.34%)
899265351 branch-misses:u # 0.33% of all
branches (83.33%)
98.878776190 seconds time elapsed
98.459437000 seconds user
0.395912000 seconds sys
Samples: 140K of event 'cache-misses:u', Event count (approx.): 4272440656
Overhead Samples Command Shared Object Symbol
17.92% 6124 cc1 cc1 [.] ipa_pta_execute
11.93% 28566 cc1 cc1 [.] bitmap_set_bit
10.48% 14489 cc1 cc1 [.] find
But the key observation is from the stats. In the bad case we have
Points-to Stats:
Total vars: 108033
Non-pointer vars: 59
Statically unified vars: 20147
Dynamically unified vars: 0
Iterations: 4
Number of edges: 141447426
Number of implicit edges: 215676
Points to sets created:15930
and the good case sees
Points-to Stats:
Total vars: 108033
Non-pointer vars: 59
Statically unified vars: 20147
Dynamically unified vars: 0
Iterations: 10
Number of edges: 582729
Number of implicit edges: 215676
Points to sets created:15930
so we have a _lot_ more (242x) edges in the slow case (but also only 4
solver iterations) while patched we somehow do with less edges (and
thus way less propagations aka bitmap_ior) but need 2.5x the number of
solver iterations.
The final points-to solutions are 1:1 identical (semantically, I did
not look at the ESCAPED set here).
The key is the heuristic in add_graph_edge triggers way more often:
/* The graph solving process does not avoid "triangles", thus
there can be multiple paths from a node to another involving
intermediate other nodes. That causes extra copying which is
most difficult to avoid when the intermediate node is ESCAPED
because there are no edges added from ESCAPED. Avoid
adding the direct edge FROM -> TO when we have FROM -> ESCAPED
and TO contains ESCAPED.
??? Note this is only a heuristic, it does not prevent the
situation from occuring. The heuristic helps PR38474 and
PR99912 significantly. */
if (to < FIRST_REF_NODE
&& bitmap_bit_p (graph->succs[from], find (escaped_id))
&& bitmap_bit_p (get_varinfo (find (to))->solution, escaped_id))
{
stats.num_avoided_edges++;
return false;
}
I've added num_avoided_edges and for the fast case we see
Number of avoided edges: 870215056 (that's 6x the number of edges in the slow
case!)
For the slow case we have
Number of avoided edges: 1113769444
which is even more. But I think that should be the only reason the
number of edges created differs. Converging more slowly means
executing ESCAPED = *ESCAPED more often which creates edges to
escaped earlier and thus possibly triggering the above. But it
also means the 'to' solution might not yet contain ESCAPED ...
ICK. I can get it to
ipa points-to : 0.40 ( 14%) 0.03 ( 5%) 0.43 ( 12%)
1950k ( 1%)
TOTAL : 2.87 0.61 3.49
253M
when I avoid defering copies to ESCAPED itself (that makes the first
ESCAPED = *ESCAPED do more but delays increasing other solutions
to followup iterations). In fact it probably special-cases the
ESCAPED = *ESCAPED complex constraint which we also locally iterate
until convergence before visiting other constraints.
Note the initial graph tends to be very unconnected thus the order
is somewhat random but the way we compute the topological sorting
and process the worklist ensures ESCAPED = *ESCAPED is done as something
very last in the first iteration (if there are no existing edges to
ESCAPED - I think we do have those for calls though).
It does help to order the ESCAPED connected component first:
Points-to Stats:
Total vars: 108033
Non-pointer vars: 59
Statically unified vars: 20147
Dynamically unified vars: 0
Iterations: 3
Number of edges: 239596
Number of implicit edges: 215676
Points to sets created:15930
ipa points-to : 0.28 ( 10%) 0.05 ( 8%) 0.34 ( 10%)
2191k ( 1%)
TOTAL : 2.85 0.60 3.47
253M
I'm testing this approach now.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (11 preceding siblings ...)
2023-05-31 10:37 ` rguenth at gcc dot gnu.org
@ 2023-05-31 10:49 ` rguenth at gcc dot gnu.org
2023-05-31 11:27 ` cvs-commit at gcc dot gnu.org
` (6 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-31 10:49 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Known to work| |11.3.0
--- Comment #13 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 11.3 also takes ~3s to compile this so the patch would fix the regression
in full. It's also remotely simple enough to be backported I think.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (12 preceding siblings ...)
2023-05-31 10:49 ` rguenth at gcc dot gnu.org
@ 2023-05-31 11:27 ` cvs-commit at gcc dot gnu.org
2023-05-31 11:27 ` rguenth at gcc dot gnu.org
` (5 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-31 11:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #14 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:95e5c38a98cc64a797b1d766a20f8c0d0c807a74
commit r14-1429-g95e5c38a98cc64a797b1d766a20f8c0d0c807a74
Author: Richard Biener <rguenther@suse.de>
Date: Wed May 31 12:07:42 2023 +0200
ipa/109983 - (IPA) PTA speedup
This improves the edge avoidance heuristic by re-ordering the
topological sort of the graph to make sure the component with
the ESCAPED node is processed first. This improves the number
of created edges which directly correlates with the number
of bitmap_ior_into calls from 141447426 to 239596 and the
compile-time from 1083s to 3s. It also improves the compile-time
for the related PR109143 from 81s to 27s.
I've modernized the topological sorting API on the way as well.
PR ipa/109983
PR tree-optimization/109143
* tree-ssa-structalias.cc (struct topo_info): Remove.
(init_topo_info): Likewise.
(free_topo_info): Likewise.
(compute_topo_order): Simplify API, put the component
with ESCAPED last so it's processed first.
(topo_visit): Adjust.
(solve_graph): Likewise.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (13 preceding siblings ...)
2023-05-31 11:27 ` cvs-commit at gcc dot gnu.org
@ 2023-05-31 11:27 ` rguenth at gcc dot gnu.org
2023-05-31 19:27 ` [Bug ipa/109983] [12/13 " sjames at gcc dot gnu.org
` (4 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-05-31 11:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #15 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, for me GCC 12.3 takes 170s for the compile.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (14 preceding siblings ...)
2023-05-31 11:27 ` rguenth at gcc dot gnu.org
@ 2023-05-31 19:27 ` sjames at gcc dot gnu.org
2023-06-06 8:42 ` rguenth at gcc dot gnu.org
` (3 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: sjames at gcc dot gnu.org @ 2023-05-31 19:27 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #16 from Sam James <sjames at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #9)
> My suggestion is to not enable -fipa-pta if you hit such issue or in general
> if you don't know it pays off with a good runtime performance boost.
Many thanks richi. I've been telling people this informally but I'll try to
make it a bit clearer somewhere as well. We don't recommend using -fipa-pta but
some adventurous users pick it up.
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (15 preceding siblings ...)
2023-05-31 19:27 ` [Bug ipa/109983] [12/13 " sjames at gcc dot gnu.org
@ 2023-06-06 8:42 ` rguenth at gcc dot gnu.org
2023-06-23 10:29 ` cvs-commit at gcc dot gnu.org
` (2 subsequent siblings)
19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-06-06 8:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Priority|P3 |P2
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12/13 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (16 preceding siblings ...)
2023-06-06 8:42 ` rguenth at gcc dot gnu.org
@ 2023-06-23 10:29 ` cvs-commit at gcc dot gnu.org
2023-11-27 11:34 ` [Bug ipa/109983] [12 " cvs-commit at gcc dot gnu.org
2023-11-27 11:38 ` rguenth at gcc dot gnu.org
19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-06-23 10:29 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #17 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-13 branch has been updated by Richard Biener
<rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:a737da4885b7038fdd7d6646deaa1644efff2257
commit r13-7466-ga737da4885b7038fdd7d6646deaa1644efff2257
Author: Richard Biener <rguenther@suse.de>
Date: Wed May 31 12:07:42 2023 +0200
ipa/109983 - (IPA) PTA speedup
This improves the edge avoidance heuristic by re-ordering the
topological sort of the graph to make sure the component with
the ESCAPED node is processed first. This improves the number
of created edges which directly correlates with the number
of bitmap_ior_into calls from 141447426 to 239596 and the
compile-time from 1083s to 3s. It also improves the compile-time
for the related PR109143 from 81s to 27s.
I've modernized the topological sorting API on the way as well.
PR ipa/109983
PR tree-optimization/109143
* tree-ssa-structalias.cc (struct topo_info): Remove.
(init_topo_info): Likewise.
(free_topo_info): Likewise.
(compute_topo_order): Simplify API, put the component
with ESCAPED last so it's processed first.
(topo_visit): Adjust.
(solve_graph): Likewise.
(cherry picked from commit 95e5c38a98cc64a797b1d766a20f8c0d0c807a74)
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (17 preceding siblings ...)
2023-06-23 10:29 ` cvs-commit at gcc dot gnu.org
@ 2023-11-27 11:34 ` cvs-commit at gcc dot gnu.org
2023-11-27 11:38 ` rguenth at gcc dot gnu.org
19 siblings, 0 replies; 21+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-11-27 11:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
--- Comment #18 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-12 branch has been updated by Richard Biener
<rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:5a381334e2b0faac75c059504155aa4a8f8809ac
commit r12-10005-g5a381334e2b0faac75c059504155aa4a8f8809ac
Author: Richard Biener <rguenther@suse.de>
Date: Wed May 31 12:07:42 2023 +0200
ipa/109983 - (IPA) PTA speedup
This improves the edge avoidance heuristic by re-ordering the
topological sort of the graph to make sure the component with
the ESCAPED node is processed first. This improves the number
of created edges which directly correlates with the number
of bitmap_ior_into calls from 141447426 to 239596 and the
compile-time from 1083s to 3s. It also improves the compile-time
for the related PR109143 from 81s to 27s.
I've modernized the topological sorting API on the way as well.
PR ipa/109983
PR tree-optimization/109143
* tree-ssa-structalias.cc (struct topo_info): Remove.
(init_topo_info): Likewise.
(free_topo_info): Likewise.
(compute_topo_order): Simplify API, put the component
with ESCAPED last so it's processed first.
(topo_visit): Adjust.
(solve_graph): Likewise.
(cherry picked from commit 95e5c38a98cc64a797b1d766a20f8c0d0c807a74)
^ permalink raw reply [flat|nested] 21+ messages in thread
* [Bug ipa/109983] [12 regression] Wireshark compilation hangs with -O2 -fipa-pta
2023-05-26 9:26 [Bug ipa/109983] New: [12/13/14 regression] Wireshark compilation hangs with -O2 -fipa-pta sjames at gcc dot gnu.org
` (18 preceding siblings ...)
2023-11-27 11:34 ` [Bug ipa/109983] [12 " cvs-commit at gcc dot gnu.org
@ 2023-11-27 11:38 ` rguenth at gcc dot gnu.org
19 siblings, 0 replies; 21+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-11-27 11:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109983
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|ASSIGNED |RESOLVED
Known to work| |12.3.1
--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed.
^ permalink raw reply [flat|nested] 21+ messages in thread