From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
	id 655533858CD1; Thu, 29 Jun 2023 19:35:48 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 655533858CD1
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1688067348;
	bh=J2NVrJflkwRxymFNPoiGTxVxYJ4Ww+IwVkczx1jGyV4=;
	h=From:To:Subject:Date:From;
	b=I8ZkqckBk5mXi61JPH2bQ1Ym+S8ybzMnGT+fPyXJrZfiMokb0XV1svNFXCk2sG5LF
	 3CpYYdMry6QUUU7GQZvmurO55MSQ/9u7HCH2wccbK5SSH2xLwW8RW5ufttnacEbgL8
	 XypDDENcMnj3naZwK83oYQepSOT25kgj5tw7K+aM=
From: "sjames at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c/110489] New: Slow building virtual.c.i from p11-kit
Date: Thu, 29 Jun 2023 19:35:47 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: c
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: compile-time-hog
X-Bugzilla-Severity: normal
X-Bugzilla-Who: sjames at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status keywords bug_severity priority component assigned_to reporter target_milestone attachments.created
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110489

            Bug ID: 110489
           Summary: Slow building virtual.c.i from p11-kit
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: compile-time-hog
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

Created attachment 55430
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55430&action=edit
virtual.c.i.xz

I fear
this is a degenerate case as it's a largeish generated file (made during
the build process), but it's noticeable enough for me to raise it anyway.

When building p11-kit, I noticed a handful of files took considerably longer to
build. This is with release checking.

The standout seems to be `virtual.c.i`:
```
$ time gcc -c virtual.c.i -O2 -fPIC

real    0m12.429s
user    0m12.137s
sys     0m0.238s

$ gcc --version
gcc (Gentoo 12.3.1_p20230526 p2) 12.3.1 20230526
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -c virtual.c.i -O2 -pipe -fPIC -ftime-report

Time variable                                   usr           sys          wall           GGC
 phase setup                        :   0.00 (  0%)   0.00 (  0%)   0.00 (  0%)  1326k (  0%)
 phase parsing                      :   1.24 (  5%)   1.01 ( 20%)   2.26 (  7%)    53M (  9%)
 phase lang. deferred               :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)     96 (  0%)
 phase opt and generate             :  24.16 ( 95%)   4.08 ( 80%)  28.74 ( 93%)   570M ( 91%)
 phase finalize                     :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)      0 (  0%)
 garbage collection                 :   0.08 (  0%)   0.00 (  0%)   0.08 (  0%)      0 (  0%)
 dump files                         :   1.07 (  4%)   0.24 (  5%)   1.58 (  5%)      0 (  0%)
 callgraph construction             :   0.14 (  1%)   0.01 (  0%)   0.22 (  1%)    18M (  3%)
 callgraph optimization             :   0.36 (  1%)   0.12 (  2%)   0.50 (  2%)    13k (  0%)
 callgraph functions expansion      :  20.14 ( 79%)   3.12 ( 61%)  23.71 ( 76%)   390M ( 62%)
 callgraph ipa passes               :   3.61 ( 14%)   0.84 ( 17%)   4.48 ( 14%)   132M ( 21%)
 ipa function summary               :   0.14 (  1%)   0.04 (  1%)   0.12 (  0%)    13M (  2%)
 ipa dead code removal              :   0.09 (  0%)   0.00 (  0%)   0.07 (  0%)      0 (  0%)
 ipa devirtualization               :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 ipa cp                             :   0.14 (  1%)   0.04 (  1%)   0.11 (  0%)  6933k (  1%)
 ipa inlining heuristics            :   0.14 (  1%)   0.02 (  0%)   0.10 (  0%)   135k (  0%)
 ipa function splitting             :   0.39 (  2%)   0.06 (  1%)   0.34 (  1%)    40M (  7%)
 ipa comdats                        :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 ipa reference                      :   0.03 (  0%)   0.00 (  0%)   0.03 (  0%)      0 (  0%)
 ipa profile                        :   0.02 (  0%)   0.00 (  0%)   0.03 (  0%)      0 (  0%)
 ipa pure const                     :   0.07 (  0%)   0.02 (  0%)   0.16 (  1%)   2416 (  0%)
 ipa icf                            :   0.14 (  1%)   0.00 (  0%)   0.14 (  0%)   8112 (  0%)
 ipa SRA                            :   0.10 (  0%)   0.00 (  0%)   0.11 (  0%)  1116k (  0%)
 ipa free lang data                 :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)      0 (  0%)
 ipa free inline summary            :   0.02 (  0%)   0.00 (  0%)   0.04 (  0%)      0 (  0%)
 ipa modref                         :   0.08 (  0%)   0.00 (  0%)   0.08 (  0%)  2874k (  0%)
 cfg construction                   :   0.02 (  0%)   0.02 (  0%)   0.06 (  0%)  1323k (  0%)
 cfg cleanup                        :   0.23 (  1%)   0.06 (  1%)   0.34 (  1%)  3108k (  0%)
 trivially dead code                :   0.07 (  0%)   0.02 (  0%)   0.14 (  0%)      0 (  0%)
 df scan insns                      :   0.20 (  1%)   0.03 (  1%)   0.25 (  1%)   282k (  0%)
 df reaching defs                   :   0.21 (  1%)   0.03 (  1%)   0.37 (  1%)      0 (  0%)
 df live regs                       :   0.35 (  1%)   0.08 (  2%)   0.31 (  1%)      0 (  0%)
 df live&initialized regs           :   0.23 (  1%)   0.01 (  0%)   0.17 (  1%)      0 (  0%)
 df must-initialized regs           :   0.06 (  0%)   0.00 (  0%)   0.03 (  0%)      0 (  0%)
 df use-def / def-use chains        :   0.10 (  0%)   0.02 (  0%)   0.17 (  1%)      0 (  0%)
 df reg dead/unused notes           :   0.40 (  2%)   0.02 (  0%)   0.32 (  1%)  6334k (  1%)
 register information               :   0.14 (  1%)   0.01 (  0%)   0.06 (  0%)      0 (  0%)
 alias analysis                     :   0.46 (  2%)   0.03 (  1%)   0.31 (  1%)    11M (  2%)
 alias stmt walking                 :   0.13 (  1%)   0.04 (  1%)   0.13 (  0%)    18k (  0%)
 register scan                      :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)   137k (  0%)
 rebuild jump labels                :   0.06 (  0%)   0.00 (  0%)   0.06 (  0%)      0 (  0%)
 preprocessing                      :   0.32 (  1%)   0.23 (  5%)   0.60 (  2%)  4145k (  1%)
 lexical analysis                   :   0.48 (  2%)   0.37 (  7%)   0.70 (  2%)      0 (  0%)
 parser (global)                    :   0.10 (  0%)   0.07 (  1%)   0.25 (  1%)    14M (  2%)
 parser struct body                 :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)   181k (  0%)
 parser function body               :   0.34 (  1%)   0.34 (  7%)   0.71 (  2%)    34M (  6%)
 early inlining heuristics          :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)    50k (  0%)
 inline parameters                  :   0.16 (  1%)   0.02 (  0%)   0.16 (  1%)  9064k (  1%)
 integration                        :   0.42 (  2%)   0.02 (  0%)   0.44 (  1%)    17M (  3%)
 tree gimplify                      :   0.19 (  1%)   0.05 (  1%)   0.20 (  1%)    20M (  3%)
 tree eh                            :   0.00 (  0%)   0.01 (  0%)   0.00 (  0%)      0 (  0%)
 tree CFG construction              :   0.00 (  0%)   0.01 (  0%)   0.05 (  0%)    11M (  2%)
 tree CFG cleanup                   :   0.34 (  1%)   0.10 (  2%)   0.46 (  1%)    68k (  0%)
 tree tail merge                    :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)   7696 (  0%)
 tree VRP                           :   0.39 (  2%)   0.10 (  2%)   0.58 (  2%)    14M (  2%)
 tree Early VRP                     :   0.12 (  0%)   0.05 (  1%)   0.22 (  1%)  6698k (  1%)
 tree copy propagation              :   0.06 (  0%)   0.00 (  0%)   0.13 (  0%)    432 (  0%)
 tree PTA                           :   0.80 (  3%)   0.18 (  4%)   1.10 (  4%)  6475k (  1%)
 tree SSA other                     :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)      0 (  0%)
 tree SSA rewrite                   :   0.12 (  0%)   0.02 (  0%)   0.12 (  0%)    10M (  2%)
 tree SSA incremental               :   0.23 (  1%)   0.04 (  1%)   0.19 (  1%)  3738k (  1%)
 tree operand scan                  :   0.07 (  0%)   0.05 (  1%)   0.16 (  1%)    14M (  2%)
 dominator optimization             :   0.62 (  2%)   0.15 (  3%)   0.53 (  2%)  4427k (  1%)
 backwards jump threading           :   0.27 (  1%)   0.01 (  0%)   0.31 (  1%)    38k (  0%)
 tree SRA                           :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)      0 (  0%)
 isolate eroneous paths             :   0.01 (  0%)   0.00 (  0%)   0.03 (  0%)      0 (  0%)
 tree CCP                           :   0.50 (  2%)   0.08 (  2%)   0.50 (  2%)   696k (  0%)
 tree reassociation                 :   0.04 (  0%)   0.02 (  0%)   0.04 (  0%)     48 (  0%)
 tree PRE                           :   0.34 (  1%)   0.05 (  1%)   0.35 (  1%)  8448k (  1%)
 tree FRE                           :   0.46 (  2%)   0.06 (  1%)   0.55 (  2%)  5354k (  1%)
 tree code sinking                  :   0.07 (  0%)   0.02 (  0%)   0.02 (  0%)   270k (  0%)
 tree linearize phis                :   0.02 (  0%)   0.00 (  0%)   0.04 (  0%)  6316k (  1%)
 tree backward propagate            :   0.02 (  0%)   0.00 (  0%)   0.04 (  0%)      0 (  0%)
 tree forward propagate             :   0.12 (  0%)   0.05 (  1%)   0.19 (  1%)    41k (  0%)
 tree phiprop                       :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 tree conservative DCE              :   0.10 (  0%)   0.02 (  0%)   0.17 (  1%)   7296 (  0%)
 tree aggressive DCE                :   0.14 (  1%)   0.04 (  1%)   0.13 (  0%)    12M (  2%)
 tree buildin call DCE              :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)      0 (  0%)
 tree DSE                           :   0.03 (  0%)   0.00 (  0%)   0.09 (  0%)    15k (  0%)
 PHI merge                          :   0.01 (  0%)   0.01 (  0%)   0.01 (  0%)    48k (  0%)
 tree loop optimization             :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 tree loop invariant motion         :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)      0 (  0%)
 complete unrolling                 :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)    12k (  0%)
 tree slp vectorization             :   0.18 (  1%)   0.04 (  1%)   0.23 (  1%)    12M (  2%)
 tree copy headers                  :   0.02 (  0%)   0.00 (  0%)   0.00 (  0%)   7904 (  0%)
 tree SSA uncprop                   :   0.05 (  0%)   0.00 (  0%)   0.05 (  0%)      0 (  0%)
 tree NRV optimization              :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)   139k (  0%)
 tree switch conversion             :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 tree switch lowering               :   0.00 (  0%)   0.00 (  0%)   0.05 (  0%)      0 (  0%)
 gimple CSE sin/cos                 :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 gimple widening/fma detection      :   0.01 (  0%)   0.00 (  0%)   0.02 (  0%)      0 (  0%)
 tree strlen optimization           :   0.07 (  0%)   0.01 (  0%)   0.12 (  0%)  6325k (  1%)
 tree modref                        :   0.14 (  1%)   0.04 (  1%)   0.16 (  1%)    10M (  2%)
 dominance frontiers                :   0.03 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 dominance computation              :   0.50 (  2%)   0.13 (  3%)   0.82 (  3%)      0 (  0%)
 control dependences                :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 out of ssa                         :   0.06 (  0%)   0.02 (  0%)   0.11 (  0%)  1061k (  0%)
 expand vars                        :   0.05 (  0%)   0.01 (  0%)   0.03 (  0%)  2414k (  0%)
 expand                             :   0.54 (  2%)   0.09 (  2%)   0.55 (  2%)    38M (  6%)
 post expand cleanups               :   0.03 (  0%)   0.02 (  0%)   0.08 (  0%)  3084k (  0%)
 lower subreg                       :   0.01 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 jump                               :   0.04 (  0%)   0.02 (  0%)   0.03 (  0%)      0 (  0%)
 forward prop                       :   0.42 (  2%)   0.04 (  1%)   0.50 (  2%)   226k (  0%)
 CSE                                :   0.31 (  1%)   0.03 (  1%)   0.38 (  1%)   237k (  0%)
 dead code elimination              :   0.13 (  1%)   0.02 (  0%)   0.07 (  0%)      0 (  0%)
 dead store elim1                   :   0.12 (  0%)   0.03 (  1%)   0.22 (  1%)  2042k (  0%)
 dead store elim2                   :   0.16 (  1%)   0.01 (  0%)   0.18 (  1%)  2333k (  0%)
 loop analysis                      :   0.04 (  0%)   0.00 (  0%)   0.04 (  0%)      0 (  0%)
 loop init                          :   0.36 (  1%)   0.10 (  2%)   0.44 (  1%)    33M (  5%)
 loop fini                          :   0.09 (  0%)   0.04 (  1%)   0.13 (  0%)      0 (  0%)
 CPROP                              :   0.43 (  2%)   0.08 (  2%)   0.50 (  2%)  1842k (  0%)
 PRE                                :   0.13 (  1%)   0.04 (  1%)   0.17 (  1%)   3816 (  0%)
 CSE 2                              :   0.25 (  1%)   0.04 (  1%)   0.34 (  1%)   311k (  0%)
 branch prediction                  :   0.11 (  0%)   0.03 (  1%)   0.07 (  0%)   623k (  0%)
 combiner                           :   0.39 (  2%)   0.04 (  1%)   0.57 (  2%)  4948k (  1%)
 if-conversion                      :   0.09 (  0%)   0.01 (  0%)   0.12 (  0%)    86k (  0%)
 mode switching                     :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)      0 (  0%)
 integrated RA                      :   1.75 (  7%)   0.11 (  2%)   2.10 (  7%)   147M ( 24%)
 LRA non-specific                   :   0.39 (  2%)   0.05 (  1%)   0.50 (  2%)   391k (  0%)
 LRA virtuals elimination           :   0.02 (  0%)   0.01 (  0%)   0.06 (  0%)   112k (  0%)
 LRA reload inheritance             :   0.03 (  0%)   0.00 (  0%)   0.07 (  0%)      0 (  0%)
 LRA create live ranges             :   0.09 (  0%)   0.02 (  0%)   0.07 (  0%)   8232 (  0%)
 LRA hard reg assignment            :   0.01 (  0%)   0.00 (  0%)   0.03 (  0%)      0 (  0%)
 reload                             :   0.02 (  0%)   0.01 (  0%)   0.05 (  0%)   141k (  0%)
 reload CSE regs                    :   0.34 (  1%)   0.11 (  2%)   0.49 (  2%)  3089k (  0%)
 ree                                :   0.05 (  0%)   0.00 (  0%)   0.09 (  0%)    19k (  0%)
 thread pro- & epilogue             :   0.32 (  1%)   0.01 (  0%)   0.32 (  1%)    12M (  2%)
 if-conversion 2                    :   0.07 (  0%)   0.02 (  0%)   0.04 (  0%)      0 (  0%)
 combine stack adjustments          :   0.01 (  0%)   0.00 (  0%)   0.04 (  0%)      0 (  0%)
 peephole 2                         :   0.21 (  1%)   0.02 (  0%)   0.17 (  1%)  1225k (  0%)
 hard reg cprop                     :   0.10 (  0%)   0.01 (  0%)   0.15 (  0%)    552 (  0%)
 scheduling 2                       :   1.55 (  6%)   0.14 (  3%)   1.35 (  4%)  2653k (  0%)
 machine dep reorg                  :   0.11 (  0%)   0.04 (  1%)   0.12 (  0%)      0 (  0%)
 reorder blocks                     :   0.10 (  0%)   0.02 (  0%)   0.21 (  1%)   724k (  0%)
 shorten branches                   :   0.12 (  0%)   0.02 (  0%)   0.14 (  0%)      0 (  0%)
 reg stack                          :   0.00 (  0%)   0.00 (  0%)   0.03 (  0%)      0 (  0%)
 final                              :   0.33 (  1%)   0.07 (  1%)   0.59 (  2%)    11M (  2%)
 variable output                    :   0.02 (  0%)   0.00 (  0%)   0.04 (  0%)   183k (  0%)
 tree if-combine                    :   0.03 (  0%)   0.00 (  0%)   0.00 (  0%)     80 (  0%)
 if to switch conversion            :   0.02 (  0%)   0.00 (  0%)   0.02 (  0%)      0 (  0%)
 straight-line strength reduction   :   0.05 (  0%)   0.00 (  0%)   0.08 (  0%)    864 (  0%)
 store merging                      :   0.01 (  0%)   0.00 (  0%)   0.06 (  0%)    728 (  0%)
 initialize rtl                     :   0.02 (  0%)   0.00 (  0%)   0.01 (  0%)    12k (  0%)
 address lowering                   :   0.00 (  0%)   0.00 (  0%)   0.02 (  0%)    360 (  0%)
 access analysis                    :   0.12 (  0%)   0.03 (  1%)   0.20 (  1%)    16k (  0%)
 early local passes                 :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 unaccounted optimizations          :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)      0 (  0%)
 rest of compilation                :   1.76 (  7%)   0.34 (  7%)   1.82 (  6%)    14M (  2%)
 unaccounted post reload            :   0.01 (  0%)   0.00 (  0%)   0.00 (  0%)      0 (  0%)
 unaccounted late compilation       :   0.00 (  0%)   0.00 (  0%)   0.01 (  0%)      0 (  0%)
 remove unused locals               :   0.08 (  0%)   0.04 (  1%)   0.11 (  0%)      0 (  0%)
 address taken                      :   0.07 (  0%)   0.01 (  0%)   0.15 (  0%)      0 (  0%)
 rebuild frequencies                :   0.04 (  0%)   0.02 (  0%)   0.10 (  0%)    568 (  0%)
 repair loop structures             :   0.03 (  0%)   0.01 (  0%)   0.02 (  0%)    112 (  0%)
 TOTAL                              :  25.40          5.09         31.03          625M
```
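
To rank the passes in a report like the one above, a small script can sort the rows by wall time. This is only a minimal sketch: it assumes each row has the `name : usr (p%) sys (p%) wall (p%) GGC (p%)` layout shown above, the helper names `parse_time_report` and `slowest` are illustrative and not part of GCC, and the sample embeds just a few rows copied from the report. The `phase` and `callgraph` rows are coarse totals that appear to overlap the individual pass rows, so the columns should not be summed across rows.

```python
import re

# One report row: "<name> : usr ( p%) sys ( p%) wall ( p%) ...".
# The header and the TOTAL line carry no percentages, so they do not match.
ROW = re.compile(
    r"^\s*(?P<name>[^:]+?)\s*:\s*"
    r"(?P<usr>\d+\.\d+)\s*\(\s*\d+%\)\s*"
    r"(?P<sys>\d+\.\d+)\s*\(\s*\d+%\)\s*"
    r"(?P<wall>\d+\.\d+)\s*\(\s*\d+%\)"
)


def parse_time_report(text):
    """Return (name, usr, sys, wall) tuples for every row that matches ROW."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append(
                (m["name"], float(m["usr"]), float(m["sys"]), float(m["wall"]))
            )
    return rows


def slowest(rows, n=3):
    """Return the n rows with the largest wall time, slowest first."""
    return sorted(rows, key=lambda r: r[3], reverse=True)[:n]


# A few rows copied from the report above (illustrative subset only).
SAMPLE = """\
Time variable                                   usr           sys          wall           GGC
 phase opt and generate             :  24.16 ( 95%)   4.08 ( 80%)  28.74 ( 93%)   570M ( 91%)
 callgraph functions expansion      :  20.14 ( 79%)   3.12 ( 61%)  23.71 ( 76%)   390M ( 62%)
 tree PTA                           :   0.80 (  3%)   0.18 (  4%)   1.10 (  4%)  6475k (  1%)
 integrated RA                      :   1.75 (  7%)   0.11 (  2%)   2.10 (  7%)   147M ( 24%)
 TOTAL                              :  25.40          5.09         31.03          625M
"""

if __name__ == "__main__":
    for name, usr, sys_, wall in slowest(parse_time_report(SAMPLE)):
        print(f"{wall:6.2f}s wall  {name}")
```

In practice the sample string would be replaced by the saved compiler output; as far as I know GCC writes the time report to stderr, so that stream is the one to capture.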