From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 8B76D3861962; Thu, 9 Jul 2020 09:34:09 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8B76D3861962 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1594287249; bh=f12LkZfw3evCHp+h/u1B4Xh5Hijjj3X1QHpbsI9IvG0=; h=From:To:Subject:Date:In-Reply-To:References:From; b=uBDFZf5IXVteEdf1xaku0s2oKNMga+TLQQPjgeS6HLV8J5iEaQpXxcMq0F2b6HcdG 4kbIswQ7XmXE4HOyU/kQLkKhZ2SMxFDYVnNdHpSFSssM2eJxieoHBu9kq565GN3EWB A/CtNqkyq0JmOPUW8wvbDvtdQeycJLWaUVoOE10Q= From: "rguenth at gcc dot gnu.org" To: gcc-bugs@gcc.gnu.org Subject: [Bug debug/78288] Compile time hog (with var-tracking) for libsanitizer/asan/asan_interceptors.cc Date: Thu, 09 Jul 2020 09:34:09 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: debug X-Bugzilla-Version: unknown X-Bugzilla-Keywords: compile-time-hog X-Bugzilla-Severity: normal X-Bugzilla-Who: rguenth at gcc dot gnu.org X-Bugzilla-Status: UNCONFIRMED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Jul 2020 09:34:09 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D78288 --- Comment #4 from Richard Biener --- As update with todays trunk 96b7f495f9269d5448822e4fc28882 I see for not bootstrapped build of asan_interceptors.cc on x86_64 with -fno-checking: callgraph ipa passes : 1.24 ( 11%) 0.11 ( 14%) 1.36 ( = 11%) 117240 kB ( 19%) integrated RA : 0.47 ( 4%) 0.06 ( 8%) 0.35 ( = 3%) 48666 kB ( 8%) variable tracking : 0.15 ( 1%) 0.00 ( 0%) 0.21 ( = 2%) 14895 kB ( 2%) var-tracking dataflow : 1.64 ( 14%) 0.00 ( 0%) 1.57 ( = 13%) 158 kB ( 0%) var-tracking emit : 0.30 ( 3%) 0.01 ( 1%) 0.32 ( = 3%) 9859 kB ( 2%) TOTAL : 11.51 0.76 12.27=20= =20=20=20=20=20=20 625036 kB so this might not be the very best example for var-tracking slowness. I do have a simple patch that improves it to variable tracking : 0.21 ( 2%) 0.00 ( 0%) 0.22 ( = 2%) 14895 kB ( 2%) var-tracking dataflow : 0.95 ( 8%) 0.00 ( 0%) 0.94 ( = 8%) 157 kB ( 0%) var-tracking emit : 0.31 ( 3%) 0.00 ( 0%) 0.37 ( = 3%) 9859 kB ( 2%) though which is done by changing the vt_find_locations iteration scheme from worklist/pending to immediately following backedges for cache locality reasons and to avoid processing parts of the function we'd eventually have to revisit anyway due to changed DF on the backedge destination. diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c index 899a5c0290d..7929b0b39eb 100644 --- a/gcc/var-tracking.c +++ b/gcc/var-tracking.c @@ -7209,22 +7209,18 @@ vt_find_locations (void) if (e->dest =3D=3D EXIT_BLOCK_PTR_FOR_FN (cfun)) continue; - if (bitmap_bit_p (visited, e->dest->index)) - { - if (!bitmap_bit_p (in_pending, e->dest->index)) - { - /* Send E->DEST to next round. */ - bitmap_set_bit (in_pending, e->dest->index); - pending->insert (bb_order[e->dest->index], - e->dest); - } - } - else if (!bitmap_bit_p (in_worklist, e->dest->index)) + if (!bitmap_bit_p (in_worklist, e->dest->index)) { /* Add E->DEST to current round. */ bitmap_set_bit (in_worklist, e->dest->index); worklist->insert (bb_order[e->dest->index], e->dest); + /* Clear visited in case we already visited + the block. We're following backedges + immediately to improve cache locality + and avoid wasting cycles on blocks we'd + have to re-visit eventually. */ + bitmap_clear_bit (visited, e->dest->index); } } } all the worklist/fibheap/visited tracking could be also simplified down to a single worklist bitmap (not sbitmap) indexed by bb_order and extracting elements via bitmap_first_set_bit. But that's of course unrelated and not part of the performance issue (the visited bitmap can be elimiated though). It looks like this two-worklist approach is there from the very beginning, also the use of a fibheap.=