From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id 2F8B83858D3C for ; Wed, 22 Nov 2023 14:20:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2F8B83858D3C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.cz ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 2F8B83858D3C Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=195.135.220.28 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700662854; cv=none; b=hpcYASSn20ZeiSzLESliahc7DMTafgMwx2dgGE5U7klIa2NJAxFOS9nlz/YFmw3oqmMhAsZAEpk04tDVmWwuICGV42rS6AINhnaFtxcaWMv6nZ+G8Kl2RZm88n0sTlv+kXdivpJmmhnmOALZkuD1TEUQn3rexkpVxmM5YP3ytv8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700662854; c=relaxed/simple; bh=u+28ET/BjFgM/V9ahq8zSckSuQc4vNXh1dm+2bRcCHI=; h=DKIM-Signature:DKIM-Signature:Date:From:To:Subject:Message-ID: MIME-Version; b=Z1LDNwMPk5O21J0jm3vbRtdEiScYrmraTULlVq6bktjAGaV02L/E3SSukLIHEWMc+JkTjhFC1T11xYuT5hFBznW4m7CePzdjBEsQagphafhg48TZTz3gMmR33xd3rXAD39qAH2y1pqTNH+L5HxObifq1EPzykNF6gX/xL78OzHM= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 1F08421978; Wed, 22 Nov 2023 14:20:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1700662851; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=bi4hDzf29RjKDF1fq6LHOrsXs+18HD4lOOl5yTZvcvE=; b=llvUaxqPSZM41aVesi6r7Rt4FNS6Pd+bpUINTcvSEb4OM0ggyOlDzCo/pWcb3XLxdSUME+ NoFnsjhHWBx3cAkfmofxOd+LEwXoLw2c4GkCEzUyCbsb/FDyk0v+0N+iZqu6lmqtQXHUMQ s9AATpnkI0WUghDnG07y43kwbMPqpp4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1700662851; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=bi4hDzf29RjKDF1fq6LHOrsXs+18HD4lOOl5yTZvcvE=; b=V2yR1dPl1TgKaNpqZBusAagiOrJIXi1zKKCaHJ2/Qq1Km9LTuaEh7nN74D2wtG/DWGS/LL jZXUGFZoO77KMGAw== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 0399113461; Wed, 22 Nov 2023 14:20:50 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id rSplO0IOXmUABQAAMHmgww (envelope-from ); Wed, 22 Nov 2023 14:20:50 +0000 Date: Wed, 22 Nov 2023 15:20:50 +0100 From: Filip Kastl To: Richard Biener Cc: gcc-patches@gcc.gnu.org, hubicka@ucw.cz Subject: Re: [PATCH v2] A new copy propagation and PHI elimination pass Message-ID: References: <571578n9-pn64-qqqq-q59s-866548669143@fhfr.qr> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <571578n9-pn64-qqqq-q59s-866548669143@fhfr.qr> Authentication-Results: smtp-out1.suse.de; none X-Spam-Level: X-Spam-Score: -4.30 X-Spamd-Result: default: False [-4.30 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; TO_DN_SOME(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; NEURAL_HAM_LONG(-1.00)[-1.000]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; BAYES_HAM(-3.00)[100.00%] X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Richard, > Can you name the new file gimple-ssa-sccopy.cc please? Yes, no problem. Btw, I thought that it is standard that gimple ssa passes have the tree-ssa- prefix. Do I understand it correctly that this is not true and many tree-ssa-*.cc passes should actually be named gimple-ssa-*.cc but remain tree-ssa-*.cc for historical reasons? >> + 3 A set of PHI statements that only refer to each other or to one other >> + value. >> + >> + _8 = PHI <_9, _10>; >> + _9 = PHI <_8, _10>; >> + _10 = PHI <_8, _9, _1>; > > this case necessarily involves a cyclic CFG, so maybe say > > "This is a lightweight SSA copy propagation pass that is able to handle > cycles optimistically, eliminating PHIs within those." > > ? Or is this a mis-characterization? I'm not sure what you mean here. Yes, this case always involves a cyclic CFG. Is it weird that a lightweight pass is able to handle cyclic CFG and therefore you suggest to comment this fact and say that the pass handles cycles optimistically? I'm not sure if optimistic is a good word to characterize the pass. I'd expect an "optimistic" pass to make assumptions which may not be true and therefore not always all redundancies it can. This pass however should achieve all that it sets out to do. > It might be nice to optimize SCCs of size 1 somehow, not sure how > many times these appear - possibly prevent them from even entering > the SCC discovery? Maybe that could be done. I would have to think about it and make sure it doesn't break anything. I'd prefer to get this version into upstream and then possibly post this upgrade later. Btw, SCCs of size of size 1 appear all the time. Those are the cases 1 and 2 described in the comment at the beginning of the file. > I'll note that while you are working with stmts everywhere that > you are really bound to using SSA defs and those would already > nicely have numbers (the SSA_NAME_VERSION). In principle the > SCC lattice could be pre-allocated once, indexed by > SSA_NAME_VERSION and you could keep a "generation" number > indicating what SCC discovery round it belongs to (aka the > set_using). I see. I could allocate a vertex struct for each statement only once when the pass is invoked instead of allocating the structs each time tarjan_compute_sccs is called. Will do that. I'm not sure if I want to use SSA_NAME_VERSION for indexing an vec/array with all those vertex structs. Many SSA names will be defined neither by PHI nor by a copy assignment statement. If I created a vertex struct for every SSA name I would allocate a lot of extra memory. > There's a old SCC finding algorithm working on the SSA graph > in the old SCC based value-numbering, for example on the > gcc 7 branch in tree-ssa-sccvn.c:DFS > For reading it would be nice to put the SCC finding in its > own class. Okay, I'll do that. > > + } > > + } > > + > > + if (!stack.is_empty ()) > > + gcc_unreachable (); > > + > > + /* Clear copy stmts' 'using' flags. */ > > + for (vertex v : vs) > > + { > > + gimple *s = v.stmt; > > + tarjan_clear_using (s); > > + } > > + > > + return sccs; > > +} > > + > > +/* Could this statement potentially be a copy statement? > > + > > + This pass only considers statements for which this function returns 'true'. > > + Those are basically PHI functions and assignment statements similar to > > + > > + _2 = _1; > > + or > > + _2 = 5; */ > > + > > +static bool > > +stmt_may_generate_copy (gimple *stmt) > > +{ > > + if (gimple_code (stmt) == GIMPLE_PHI) > > + { > > + gphi *phi = as_a (stmt); > > + > > + /* No OCCURS_IN_ABNORMAL_PHI SSA names in lhs nor rhs. */ > > + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi))) > > + return false; > > + > > + unsigned i; > > + for (i = 0; i < gimple_phi_num_args (phi); i++) > > + { > > + tree op = gimple_phi_arg_def (phi, i); > > + if (TREE_CODE (op) == SSA_NAME > > + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op)) > > + return false; > > + } > > When there's more than one non-SSA PHI argument and they are not > the same then the stmt also cannot be a copy, right? > > > + return true; > > + } Do I understand you correctly that you propose to put another check here? Something like unsigned nonssa_args_num = 0; unsigned i; for (i = 0; i < gimple_phi_num_args (phi); i++) { tree op = gimple_phi_arg_def (phi, i); if (TREE_CODE (op) == SSA_NAME) { nonssa_args_num++; if (nonssa_args_num >= 2) return false; if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op)) return false; } } > > + > > + if (gimple_code (stmt) != GIMPLE_ASSIGN) > > + return false; > > + > > + /* If the statement has volatile operands, it won't generate a > > + useful copy. */ > > + if (gimple_has_volatile_ops (stmt)) > > + return false; > > + > > + /* Statements with loads and/or stores will never generate a useful copy. */ > > + if (gimple_store_p (stmt) || gimple_assign_load_p (stmt)) > > + return false; > > + > > + if (!gimple_assign_single_p (stmt)) > > + return false; > > + > > + tree lhs = gimple_assign_lhs (stmt); > > + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) > > + return false; > > + > > + /* If the assignment is from a constant it generates a useful copy. */ > > + if (is_gimple_min_invariant (gimple_assign_rhs1 (stmt))) > > + return true; > > + > > + tree rhs = single_ssa_tree_operand (stmt, SSA_OP_USE); > > + > > + if (!is_gimple_val (gimple_assign_rhs1 (stmt))) > > + return false; > > + > > + /* rhs shouldn't flow through any abnormal edges. */ > > + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs)) > > + return false; > > + > > + if (!rhs) > > + return false; > > so this means > > _1 = 0; > > doesn't generate a copy? But you allowed those in PHIs? > > Generally I wanted to mention there's gimple_assign_ssa_name_copy_p > which exactly matches _1 = _2 (and nothing more). There cannot > be volatiles or side-effects in those, no stores or loads > (but SSA_NAME_OCCURS_IN_ABNORMAL_PHI can appear). > > Can you use that and remove the unneeded checks above? I'm pretty sure that _1 = 0; generates a copy. The lines if (is_gimple_min_invariant (gimple_assign_rhs1 (stmt))) return true; ensure that. I think I have even considered using gimple_assign_ssa_name_copy_p at some point but decided to instead write more elementary checks. I think I did that so that I am able to mark assignments of constants like _1 = 0; as generating copy statements too. > > + /* If rhs and lhs are pointers, alignment of lhs and rhs should be the same. > > + Usage of __builtin_assume_aligned can cause alignment of lhs to be greater > > + than alignment of rhs. In that case we don't want to propagate rhs since > > + we would lose the alignment information. */ > > The same might be true for value-ranges. I think it would be better to > do it like copy-prop does and in cycles propagate the info to the > copy-of variable we substitute with. For non-cyclic cases we cannot > generally copy flow-sensitive info. This sounds complicated. I already thought about propagating the alignment info. But detecting if it is part of a cycle or not... right now I'm not sure how I would go about doing that. Thanks for the note that also value-range info needs to be handled. > A more conservative test would be to compare > SSA_NAME_PTR_INFO for pointers and SSA_NAME_RANGE_INFO for non-pointers. Let's do this more conservative check. Then later I could upgrade the pass to do the aligment and value-range info propagation. > > +static auto_vec > > +get_all_stmt_may_generate_copy (void) > > +{ > > + auto_vec result; > > + > > + basic_block bb; > > + FOR_EACH_BB_FN (bb, cfun) > > + { > > + gimple_stmt_iterator gsi; > > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > > + { > > + gimple *s = gsi_stmt (gsi); > > + if (stmt_may_generate_copy (s)) > > + result.safe_push (s); > > + } > > + > > + gphi_iterator pi; > > + for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi)) > > + { > > + gimple *s = pi.phi (); > > + if (stmt_may_generate_copy (s)) > > + result.safe_push (s); > > + } > > + } > > I wonder if this can do stmt-to-vertices on-the-fly, thus not > build the vector of gimple *. That is, you build very many > vec<>s (probably not many in practice, but ...) If I understand correctly, you're suggesting to check if a statement may generate copy during runtime of the sccopy algorithm. I guess that could be done but I think that it would make the code less readable. And it wouldn't reduce the number of used vec<>s that much -- get_all_stmt_may_generate_copy is run just once during the execution of the pass so it only creates one vec<>. Correct me if I didn't get what you meant here. > If I understand correctly sofar you'll never replace a SSA name by > a constant, so REPLACE_BY is always an SSA_NAME. No, I certainly do replace SSA names by constants. > I wonder if you ever hit the need to do any of the fancy cleanups. I originally didn't have them then I ran the GCC testsuite and some tests were failing so I added the cleanups. > In fact I suggest you use the existing replace_uses_by API instead > of re-inventing it. Oh nice. I didn't know that replace_uses_by API already exists in GCC. Will look into that. > > + /* Remove all propagated statements. */ > > + FOR_EACH_BB_FN (bb, cfun) > > + { > > + gimple_stmt_iterator gsi; > > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);) > You generally should remove stmts back-to-front to make > debug stmt generation work. To not re-invent the wheel > it might be simplest to use simple_dce_from_worklist with > the dead stmts defs as seeds (just remove set/is_stmt_dead > and track dead SSA defs in a bitmap). > > > + { > > + gimple *stmt = gsi_stmt (gsi); > > + if (is_stmt_dead (stmt)) > > + gsi_remove (&gsi, true); > > + else > > + gsi_next (&gsi); > > this also fails to release the SSA def in the stmt. > > release_defs (stmt); > > would do that (or simple_dce_from_worklist). Interesting. Wouldn't expect that the order of removing statements would have influence on debug information. Alright. Will rewrite this part of the pass to use simple_dce_from_worklist. > > + } > > + > > + gphi_iterator pi; > > + for (pi = gsi_start_phis (bb); !gsi_end_p (pi);) > > + { > > + gphi *stmt = pi.phi (); > > + > > + if (is_stmt_dead (stmt)) > > + remove_phi_node (&pi, true); > > + else > > + gsi_next (&pi); > > + } > > + } > > + > > + /* More cleanup. */ > > + FOR_EACH_BB_FN (bb, cfun) > > + gimple_purge_dead_eh_edges (bb); > > You only removed copies, I'm sure this should neve do anything? I think some testsuite testcases fail if I don't include this eh cleanup in my pass. I don't understand eh edges very well but my explanation for this is that I may replace an SSA name with a constant and ensure that a statement won't throw any exceptions anymore. For example in a x = 10 / y; statement I may replace y by 5 which ensures that we won't have to handle zero division exception for this statement. > As a general comment I wonder if we can simplify most of the > pass by starting SSA based SCC discovery from a copy stmt and > whether we can reduce the number of vec<>s that are allocated > and possibly re-allocated many times? I'm not sure what you mean by "starting SCC discovery from a copy stmt" here. Thanks for the review! Filip