From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2a07:de40:b251:101:10:150:64:2]) by sourceware.org (Postfix) with ESMTPS id 2066C3858C98 for ; Thu, 4 Apr 2024 13:32:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2066C3858C98 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 2066C3858C98 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a07:de40:b251:101:10:150:64:2 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712237548; cv=none; b=fgFIdhFAe8F/CAd+45HRPBlhuZeWwACzLEB1e8n1qWEZ4VQkIp5tMo5q5MwLXj2/ECQ8c82AMprRnMXgXeHcKH9bw7Q9mWa5QcWsXvJcse7JPTnl1BmAvmRtW/g4A+Q12PhU5e6hUPjqVG6O2oEbAimUYrEuhpmgNn7bBJqQiQE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712237548; c=relaxed/simple; bh=hTHNQiNjyvGIWaksjC81mO7F2r3oO3Rnu2I/m3QLAcM=; h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date: From:To:Subject:Message-ID:MIME-Version; b=S5ErB+q/8RARNdj2WE0DhbTFnPfVjmBNfoUIMEzoOBegfaQdz6cYEO+yBwcEy3hubn4Xutt2kBxxWIR3ojan6UXcWSBQCEmSvgWwfS7Ubumvc6jZCXZu0iS5CiNxcze+rn9BbC/VCOCRHRIol3LKFcJxLLZW2pgKusQea2UYaKs= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from knuth.suse.de (unknown [10.168.5.16]) by smtp-out2.suse.de (Postfix) with ESMTP id E8ECE5D9CF; Thu, 4 Apr 2024 13:32:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1712237545; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=BqNgegh1oScxSb9dJ7bSuRIQdvS/EDyRaVyGIRj1ELk=; b=tMeZcXyIuSK89rtLhNl0l2/q3XJTeex8UCyeJXys5qv9uXwjAONkwt+rF7bOEkPWaAW5qr 5jOi781PAg7kCP9EUXfQCh0k/hKAZgaoj/V+JrAOKi5KmGO4uYleVcyKyWTldeni88HXgq 1QZiIij4Pl6+9Ti9/9F15HzSlzPjj6k= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1712237545; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=BqNgegh1oScxSb9dJ7bSuRIQdvS/EDyRaVyGIRj1ELk=; b=rE4qLvJT1SfRGFRcX6kWhhb/U4qNkQVuwbtJGpofJoDdlbej+7qpi0KdgiX0FchqUqM2F9 He0Q0BamRSGIyCCg== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1712237543; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=BqNgegh1oScxSb9dJ7bSuRIQdvS/EDyRaVyGIRj1ELk=; b=h6bYjZaUO1yFVGtFyJfNfW+MzZ35oWYs7hL0a2Mrv+arlVZOKfweuKKjylZ0oMqGZpzqxP mtekDhTsCmazZLvmup4R9cOdzACSEuo/TiGVw+uvL5xtYtRXcZ+5DZm7tvvQ5S8HquT2WN GIQJqSIiz7m6QHuuXcsY7ht4YgM6rDY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1712237543; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=BqNgegh1oScxSb9dJ7bSuRIQdvS/EDyRaVyGIRj1ELk=; b=KT11+VIOriSEGhpNYOavUK22WvVIscUJOQOLnUUKAhToCueYlFw7ASKzBfpOPtpe9Cku4a kuePrZLnJ5PNcoDw== Received: by knuth.suse.de (Postfix, from userid 10510) id DB233345F09; Thu, 4 Apr 2024 15:32:23 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by knuth.suse.de (Postfix) with ESMTP id CB186345F08; Thu, 4 Apr 2024 15:32:23 +0200 (CEST) Date: Thu, 4 Apr 2024 15:32:23 +0200 (CEST) From: Michael Matz To: Richard Biener cc: gcc-patches@gcc.gnu.org, Jakub Jelinek Subject: Re: [PATCH] middle-end/114579 - speed up add_scope_conflicts Message-ID: <55d933ac-ae20-3a76-b6bc-62b298825499@suse.de> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Level: X-Spamd-Result: default: False [-3.19 / 50.00]; BAYES_HAM(-3.00)[100.00%]; FAKE_REPLY(1.00)[]; NEURAL_HAM_LONG(-0.99)[-0.989]; NEURAL_HAM_SHORT(-0.20)[-0.998]; RCVD_NO_TLS_LAST(0.10)[]; MIME_GOOD(-0.10)[text/plain]; MISSING_XM_UA(0.00)[]; MIME_TRACE(0.00)[0:+]; TO_DN_SOME(0.00)[]; ARC_NA(0.00)[]; FROM_HAS_DN(0.00)[]; MID_RHS_MATCH_FROM(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FROM_EQ_ENVFROM(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; RCVD_COUNT_TWO(0.00)[2]; FUZZY_BLOCKED(0.00)[rspamd.com]; TO_MATCH_ENVRCPT_ALL(0.00)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[cfgexpand.cc:url,knuth.suse.de:helo] X-Spam-Score: -3.19 X-Spam-Status: No, score=-9.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hello, On Thu, 4 Apr 2024, Richard Biener wrote: > I have reworded the comment before the all-to-all conflict recording > since it seemed to be confusing and missing the point - but maybe I > am also missing something here. The point of the comment was to explain how this differs from building a conflict graph for named objects without indirect accesses, i.e. how e.g. a conflict graph for register allocation is done. There conflicts are added at the points where objects are defined (between that def and all currently live objects), and _only_ there. I.e. not at CFG merge points and not at uses. So something like so: foreach INSN in block backwards: foreach DEF in INSN: foreach O in active: add_conflict (DEF, O); remove (DEF from active); That's for the more usual backwards direction, but it would be similar for forward one. The point is that this is the only place where conflicts are added, not at CFG splits/merges. The above relies on being able to see all places where the objects are written. With indirect accesses that's not the case anymore: ptr = cond : &foo : &bar; // MENTION foo, bar *ptr = 42; // no DEF of a named object escape (ptr); CLOBBER (foo) CLOBBER (bar) We somewhere need to record the conflict between foo and bar. Conservatively we did so at the first real instruction of a block. As minor optimization we also start a block in a mode where we just record mentions, and switch to also recording conflicts at the first real insn. (An empty block, or just PHIs do not cause conflicts in themself) This need for an N-to-N conflict generation at some point is the major difference between building the conflict graph for named vs. potentionally indirect objects, and the comment tried to explain that from the perspective of someone familiar with conflict graph building in regalloc :-) Where exactly these N-to-N conflicts are generated doesn't matter so much, as long as everything is there. The current way was the most obvious one. The observation that these N-to-N conflicts only need to be done on the commonly disjoint sets of the inputs (which is trivially empty for a single predecessor) is correct, i.e. only doing it with more than a single predecessor makes sense. This requires adding the N-to-Ns always even in mostly emtpy blocks, for the danger of that being missed in the successor block, as you are doing. So, fine with me, also with the changed comment if you think it explains stuff better :) Ciao, Michael. > > Nevertheless for the testcase in the PR the compile-time spend in > add_scope_conflicts at -O1 drops from previously 67s (39%) to 10s (9%). > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK for trunk? > > Thanks, > Richard. > > PR middle-end/114579 > * cfgexpand.cc (add_scope_conflicts_1): Record all-to-all > conflicts only when there's a CFG merge but for all CFG merges. > --- > gcc/cfgexpand.cc | 46 +++++++++++++++++++++++++++++++++------------- > 1 file changed, 33 insertions(+), 13 deletions(-) > > diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc > index eef565eddb5..fa48a4c633f 100644 > --- a/gcc/cfgexpand.cc > +++ b/gcc/cfgexpand.cc > @@ -640,21 +640,26 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict) > { > if (for_conflict && visit == visit_op) > { > - /* If this is the first real instruction in this BB we need > - to add conflicts for everything live at this point now. > - Unlike classical liveness for named objects we can't > - rely on seeing a def/use of the names we're interested in. > - There might merely be indirect loads/stores. We'd not add any > - conflicts for such partitions. */ > + /* When we are inheriting live variables from our predecessors > + through a CFG merge we might not see an actual mention of > + the variables to record the approprate conflict as defs/uses > + might be through indirect stores/loads. For this reason > + we have to make sure each live variable conflicts with > + each other. When there's just a single predecessor the > + set of conflicts is already up-to-date. > + We perform this delayed at the first real instruction to > + allow clobbers starting this block to remove variables from > + the set of live variables. */ > bitmap_iterator bi; > unsigned i; > - EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi) > - { > - class stack_var *a = &stack_vars[i]; > - if (!a->conflicts) > - a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack); > - bitmap_ior_into (a->conflicts, work); > - } > + if (EDGE_COUNT (bb->preds) > 1) > + EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi) > + { > + class stack_var *a = &stack_vars[i]; > + if (!a->conflicts) > + a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack); > + bitmap_ior_into (a->conflicts, work); > + } > visit = visit_conflict; > } > walk_stmt_load_store_addr_ops (stmt, work, visit, visit, visit); > @@ -662,6 +667,21 @@ add_scope_conflicts_1 (basic_block bb, bitmap work, bool for_conflict) > add_scope_conflicts_2 (USE_FROM_PTR (use_p), work, visit); > } > } > + > + /* When there was no real instruction but there's a CFG merge we need > + to add the conflicts now. */ > + if (for_conflict && visit == visit_op && EDGE_COUNT (bb->preds) > 1) > + { > + bitmap_iterator bi; > + unsigned i; > + EXECUTE_IF_SET_IN_BITMAP (work, 0, i, bi) > + { > + class stack_var *a = &stack_vars[i]; > + if (!a->conflicts) > + a->conflicts = BITMAP_ALLOC (&stack_var_bitmap_obstack); > + bitmap_ior_into (a->conflicts, work); > + } > + } > } > > /* Generate stack partition conflicts between all partitions that are >