From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) by sourceware.org (Postfix) with ESMTPS id 505973858CD1 for ; Fri, 8 Dec 2023 15:15:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 505973858CD1 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.cz ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 505973858CD1 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702048541; cv=none; b=dx19pm1bEw10rfJs5R2Johaha1daZHWGRNUECRlWPFTpnGYt1D0qLUvVvlMXF4IXrjDxB+Lhuc+C+71IcD4fA+NQYB+4zJ2SKNNgkC9kgtRNsPt/+uVJRc5g+pJw0NnFDWQJ9K41hz05uTLm3oRd9pNxwPjz62Jsfh3UMUd7rec= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1702048541; c=relaxed/simple; bh=2IbmPf26HvDOEwWqi0R7wZ+/Rdy6LhKTDrlagIQuuE4=; h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date: From:To:Subject:Message-ID:MIME-Version; b=HtnDrfSzFeLrzcGeODN2bbeCCDclZ8Ux7qYxHLKZwXRipZ0uG/hsvBXbj4xwkD3WDBUSoiv5cjPmDbd9uGXb2kivvWfOySOx0LMgvA818644rbLwA8zoISh8av1sQl2jBUT12huddgpIQZeU+V3CjVb7JCNY2lJ/iLcoYsy76eQ= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id AB9291F451; Fri, 8 Dec 2023 15:15:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1702048534; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zEPs/ed1x6uZBrSzJrgiyLLaOtgvZ6T7JLTxOZwCGM0=; b=SU+S8tSWWXYcSCnhi+mzFMgDJu9q8XRgUZLjTn4lMQ8rqAtFd328bAu+uBACUhW0CfoLDR YPg+sdJluh7H5+uJsAGhPHxzTB56xrn47LVQrsQ5a25AoSVCxP9YMBtEkBlpCGOneys/+M OTmvpl3uK7gz3kraGsN85ylJOCNtlcE= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1702048534; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zEPs/ed1x6uZBrSzJrgiyLLaOtgvZ6T7JLTxOZwCGM0=; b=xzhIkHU61Ue1J9HPzeDdxNDd9HHhRDW8fBPjhbCmHbQ0+yl62MZmVXeSaT0obMb5huHu9p 0DdeDAphd+IfGqBQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1702048534; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zEPs/ed1x6uZBrSzJrgiyLLaOtgvZ6T7JLTxOZwCGM0=; b=SU+S8tSWWXYcSCnhi+mzFMgDJu9q8XRgUZLjTn4lMQ8rqAtFd328bAu+uBACUhW0CfoLDR YPg+sdJluh7H5+uJsAGhPHxzTB56xrn47LVQrsQ5a25AoSVCxP9YMBtEkBlpCGOneys/+M OTmvpl3uK7gz3kraGsN85ylJOCNtlcE= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1702048534; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zEPs/ed1x6uZBrSzJrgiyLLaOtgvZ6T7JLTxOZwCGM0=; b=xzhIkHU61Ue1J9HPzeDdxNDd9HHhRDW8fBPjhbCmHbQ0+yl62MZmVXeSaT0obMb5huHu9p 0DdeDAphd+IfGqBQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 8BEF912FF7; Fri, 8 Dec 2023 15:15:34 +0000 (UTC) Received: from dovecot-director2.suse.de ([10.150.64.162]) by imap1.dmz-prg2.suse.org with ESMTPSA id js8UIRYzc2U8dwAAD6G6ig (envelope-from ); Fri, 08 Dec 2023 15:15:34 +0000 Date: Fri, 8 Dec 2023 16:15:34 +0100 From: Filip Kastl To: gcc-patches@gcc.gnu.org Cc: hubicka@ucw.cz, Richard Biener Subject: [PATCH v3] A new copy propagation and PHI elimination pass Message-ID: References: <571578n9-pn64-qqqq-q59s-866548669143@fhfr.qr> <39489or3-7659-5306-4133-6p3n2pr58p99@fhfr.qr> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <39489or3-7659-5306-4133-6p3n2pr58p99@fhfr.qr> X-Spam-Score: 7.87 X-Spamd-Bar: + Authentication-Results: smtp-out2.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=SU+S8tSW; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=xzhIkHU6; dmarc=none; spf=softfail (smtp-out2.suse.de: 2a07:de40:b281:104:10:150:64:97 is neither permitted nor denied by domain of fkastl@suse.cz) smtp.mailfrom=fkastl@suse.cz X-Rspamd-Server: rspamd2 X-Spamd-Result: default: False [1.36 / 50.00]; RCVD_VIA_SMTP_AUTH(0.00)[]; TO_DN_SOME(0.00)[]; R_SPF_SOFTFAIL(4.60)[~all]; RCVD_COUNT_THREE(0.00)[3]; MID_RHS_MATCH_FROMTLD(0.00)[]; DKIM_TRACE(0.00)[suse.cz:+]; MX_GOOD(-0.01)[]; NEURAL_HAM_SHORT(-0.14)[-0.710]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; BAYES_HAM(-3.00)[100.00%]; ARC_NA(0.00)[]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_ALL(0.00)[]; NEURAL_HAM_LONG(-0.99)[-0.985]; MIME_GOOD(-0.10)[text/plain]; DMARC_NA(1.20)[suse.cz]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; DBL_BLOCKED_OPENRESOLVER(0.00)[gnu.org:url,suse.cz:dkim,suse.cz:email]; FUZZY_BLOCKED(0.00)[rspamd.com]; RCVD_TLS_ALL(0.00)[] X-Spam-Score: 1.36 X-Rspamd-Queue-Id: AB9291F451 X-Spam-Flag: NO X-Spam-Status: No, score=-11.8 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: > > Hi, > > > > this is a patch that I submitted two months ago as an RFC. I added some polish > > since. > > > > It is a new lightweight pass that removes redundant PHI functions and as a > > bonus does basic copy propagation. With Jan Hubička we measured that it is able > > to remove usually more than 5% of all PHI functions when run among early passes > > (sometimes even 13% or more). Those are mostly PHI functions that would be > > later optimized away but with this pass it is possible to remove them early > > enough so that they don't get streamed when runing LTO (and also potentially > > inlined at multiple places). It is also able to remove some redundant PHIs > > that otherwise would still be present during RTL expansion. > > > > Jakub Jelínek was concerned about debug info coverage so I compiled cc1plus > > with and without this patch. These are the sizes of .debug_info and > > .debug_loclists > > > > .debug_info without patch 181694311 > > .debug_info with patch 181692320 > > +0.0011% change > > > > .debug_loclists without patch 47934753 > > .debug_loclists with patch 47934966 > > -0.0004% change > > > > I wanted to use dwlocstat to compare debug coverages but didn't manage to get > > the program working on my machine sadly. Hope this suffices. Seems to me that > > my patch doesn't have a significant impact on debug info. > > > > Bootstraped and tested* on x86_64-pc-linux-gnu. > > > > * One testcase (pr79691.c) did regress. However that is because the test is > > dependent on a certain variable not being copy propagated. I will go into more > > detail about this in a reply to this mail. > > > > Ok to commit? > > This is a second version of the patch. In this version, I modified the > pr79691.c testcase so that it works as intended with other changes from the > patch. > > The pr79691.c testcase checks that we get constants from snprintf calls and > that they simplify into a single constant. The testcase doesn't account for > the fact that this constant may be further copy propagated which is exactly > what happens with this patch applied. > > Bootstrapped and tested on x86_64-pc-linux-gnu. > > Ok to commit? This is the third version of the patch. In this version, I adressed most of Richards remarks about the second version. Here is a summary of changes I made: - Rename the pass from tree-ssa-sccopy.cc to gimple-ssa-sccopy.cc - Use simple_dce_from_worklist to remove propagated statements - Use existing replace_uses API instead of reinventing it - This allowed me to get rid of some now redundant cleanup code - Encapsulate the SCC finding into a class - Rework stmt_may_generate_copy to get rid of redundant checks - Add check that PHI doesn't contain two non-SSA-name values to stmt_may_generate_copy - Regarding alignment and value ranges in stmt_may_generate_copy: For now use the conservative check that Richard suggested - Index array of vertices that SCC discovery uses by SSA name version numbers instead of numbering statements myself. I didn't make any changes based on these remarks: 1 It might be nice to optimize SCCs of size 1 somehow, not sure how many times these appear - possibly prevent them from even entering the SCC discovery? It would be nice. But the only way to do this that I see right now is to first propagate SCCs of size 1 and then the rest. This would mean adding a new copy propagation procedure. It wouldn't be a trivial procedure. Efficiency of the pass relies on having SCCs topogically sorted so this procedure would have to implement some topological sort algorithm. This could be done. It could save allocating some vec<>s (right now, SCCs of size 1 are represented by a vec<> with a single element). But is it worth it to optimize the pass this way right now? If possible, I'd like to see that the pass works and sort out any problems people encounter with it before I start optimizing it. 2 Instead of collecting all stmts that may generate a copy at the beginning of the pass into a vec<>, let the SCC discovery check that statements may generate a copy on the fly. This would be a big change to the pass, it would require a lot of reworking. I'm also not sure if this would help reduce the number of allocated vec<>s that much because I'll still want to represent SCCs by vec<>s. Again - its possible I'll want to rework the pass in this way in the future but I'd like to leave it as it is for now. 3 Add a comment saying that the pass is doing optimistic copy propagation I don't think the pass works in an optimistic way. It doesn't assume that all variables are copies of each other at any point. It instead identifies copy statements (or PHI SCCs that act as copy statements) and then for each of these it propagates: By that I mean if a copy statement says that _3 is a copy of _2, then it replaces all uses of _3 by _2. But its possible that I still misinterpret what 'optimistic' means. If mentioning that it works in an optimistic way would help readability of the pass, I'm happy to add that comment. Richard, is the patch ok as it is now? Or do you think that making changes 1, 2 or 3 is necessary? Or are there any other issues which should be adressed? Filip -- >8 -- This patch adds the strongly-connected copy propagation (SCCOPY) pass. It is a lightweight GIMPLE copy propagation pass that also removes some redundant PHI statements. It handles degenerate PHIs, e.g.: _5 = PHI <_1>; _6 = PHI <_6, _6, _1, _1>; _7 = PHI <16, _7>; // Replaces occurences of _5 and _6 by _1 and _7 by 16 It also handles more complicated situations, e.g.: _8 = PHI <_9, _10>; _9 = PHI <_8, _10>; _10 = PHI <_8, _9, _1>; // Replaces occurences of _8, _9 and _10 by _1 gcc/ChangeLog: * Makefile.in: Added sccopy pass. * passes.def: Added sccopy pass before LTO streaming and before RTL expanstion. * tree-pass.h (make_pass_sccopy): Added sccopy pass. * gimple-ssa-sccopy.cc: New file. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pr79691.c: Updated scan-tree-dump to account for additional copy propagation this patch adds. * gcc.dg/sccopy-1.c: New test. Signed-off-by: Filip Kastl --- gcc/Makefile.in | 1 + gcc/gimple-ssa-sccopy.cc | 687 ++++++++++++++++++++++++ gcc/passes.def | 3 + gcc/testsuite/gcc.dg/sccopy-1.c | 78 +++ gcc/testsuite/gcc.dg/tree-ssa/pr79691.c | 2 +- gcc/tree-pass.h | 1 + 6 files changed, 771 insertions(+), 1 deletion(-) create mode 100644 gcc/gimple-ssa-sccopy.cc create mode 100644 gcc/testsuite/gcc.dg/sccopy-1.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 753f2f36618..2e6f2fef1ba 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1497,6 +1497,7 @@ OBJS = \ gimple-ssa-backprop.o \ gimple-ssa-isolate-paths.o \ gimple-ssa-nonnull-compare.o \ + gimple-ssa-sccopy.o \ gimple-ssa-split-paths.o \ gimple-ssa-store-merging.o \ gimple-ssa-strength-reduction.o \ diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc new file mode 100644 index 00000000000..b74f7a62569 --- /dev/null +++ b/gcc/gimple-ssa-sccopy.cc @@ -0,0 +1,687 @@ +/* Strongly-connected copy propagation pass for the GNU compiler. + Copyright (C) 2023 Free Software Foundation, Inc. + Contributed by Filip Kastl + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "tree.h" +#include "gimple.h" +#include "tree-pass.h" +#include "ssa.h" +#include "gimple-iterator.h" +#include "vec.h" +#include "hash-set.h" +#include +#include "ssa-iterators.h" +#include "gimple-fold.h" +#include "gimplify.h" +#include "tree-cfg.h" +#include "tree-eh.h" +#include "builtins.h" +#include "tree-ssa-dce.h" +#include "fold-const.h" + +/* Strongly connected copy propagation pass. + + This is a lightweight copy propagation pass that is also able to eliminate + redundant PHI statements. The pass considers the following types of copy + statements: + + 1 An assignment statement with a single argument. + + _3 = _2; + _4 = 5; + + 2 A degenerate PHI statement. A degenerate PHI is a PHI that only refers to + itself or one other value. + + _5 = PHI <_1>; + _6 = PHI <_6, _6, _1, _1>; + _7 = PHI <16, _7>; + + 3 A set of PHI statements that only refer to each other or to one other + value. + + _8 = PHI <_9, _10>; + _9 = PHI <_8, _10>; + _10 = PHI <_8, _9, _1>; + + All of these statements produce copies and can be eliminated from the + program. For a copy statement we identify the value it creates a copy of + and replace references to the statement with the value -- we propagate the + copy. + + _3 = _2; // Replace all occurences of _3 by _2 + + _8 = PHI <_9, _10>; + _9 = PHI <_8, _10>; + _10 = PHI <_8, _9, _1>; // Replace all occurences of _8, _9 and _10 by _1 + + To find all three types of copy statements we use an algorithm based on + strongly-connected components (SCCs) in dataflow graph. The algorithm was + introduced in an article from 2013[1]. We describe the algorithm bellow. + + To identify SCCs we implement the Robert Tarjan's SCC algorithm. For the + SCC computation we wrap potential copy statements in the 'vertex' struct. + To each of these statements we also assign a vertex number ('vxnum'). Since + the main algorithm has to be able to compute SCCs of subgraphs of the whole + dataflow graph we use GIMPLE stmt flags to prevent Tarjan's algorithm from + leaving the subgraph. + + References: + + [1] Simple and Efficient Construction of Static Single Assignmemnt Form, + Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791, + Section 3.2. */ + +/* Bitmap tracking statements which were propagated to be removed at the end of + the pass. */ + +static bitmap dead_stmts; + +/* State of vertex during SCC discovery. + + unvisited Vertex hasn't yet been popped from worklist. + vopen DFS has visited vertex for the first time. Vertex has been put + on Tarjan stack. + closed DFS has backtracked through vertex. At this point, vertex + doesn't have any unvisited neighbors. + in_scc Vertex has been popped from Tarjan stack. */ + +enum vstate +{ + unvisited, + vopen, + closed, + in_scc +}; + +/* Information about a vertex. Used by SCC discovery. */ + +struct vertex +{ + bool active; /* scc_discovery::compute_sccs () only considers a subgraph of + the whole dataflow graph. It uses this flag so that it knows + which vertices are part of this subgraph. */ + vstate state; + unsigned index; + unsigned lowlink; +}; + +/* SCC discovery. + + Used to find SCCs in a dataflow graph. Implements Tarjan's SCC + algorithm. */ + +class scc_discovery +{ +public: + scc_discovery (); + ~scc_discovery (); + auto_vec> compute_sccs (auto_vec &stmts); + +private: + unsigned curr_generation = 0; + vertex* vertices; /* Indexed by SSA_NAME_VERSION. */ + auto_vec worklist; /* DFS stack. */ + auto_vec stack; /* Tarjan stack. */ + + void visit_neighbor (tree neigh_tree, unsigned parent_vxnum); +}; + +scc_discovery::scc_discovery () +{ + /* Create vertex struct for each SSA name. */ + vertices = XNEWVEC (struct vertex, num_ssa_names); + unsigned i = 0; + for (i = 0; i < num_ssa_names; i++) + vertices[i].active = false; +} + +scc_discovery::~scc_discovery () +{ + XDELETEVEC (vertices); +} + +/* Part of 'scc_discovery::compute_sccs()'. */ + +void +scc_discovery::visit_neighbor (tree neigh_tree, unsigned parent_version) +{ + if (TREE_CODE (neigh_tree) != SSA_NAME) + return; /* Skip any neighbor that isn't an SSA name. */ + unsigned neigh_version = SSA_NAME_VERSION (neigh_tree); + + /* Skip neighbors outside the subgraph that Tarjan currently works + with. */ + if (!vertices[neigh_version].active) + return; + + vstate neigh_state = vertices[neigh_version].state; + vstate parent_state = vertices[parent_version].state; + if (parent_state == vopen) /* We're currently opening parent. */ + { + /* Put unvisited neighbors on worklist. Update lowlink of parent + vertex according to indices of neighbors present on stack. */ + switch (neigh_state) + { + case unvisited: + worklist.safe_push (neigh_version); + break; + case vopen: + case closed: + vertices[parent_version].lowlink = + std::min (vertices[parent_version].lowlink, + vertices[neigh_version].index); + break; + case in_scc: + /* Ignore these edges. */ + break; + } + } + else if (parent_state == closed) /* We're currently closing parent. */ + { + /* Update lowlink of parent vertex according to lowlinks of + children of parent (in terms of DFS tree). */ + if (neigh_state == closed) + { + vertices[parent_version].lowlink = + std::min (vertices[parent_version].lowlink, + vertices[neigh_version].lowlink); + } + } +} + +/* Compute SCCs in dataflow graph on given statements 'stmts'. Ignore + statements outside 'stmts'. Return the SCCs in a reverse topological + order. + + stmt_may_generate_copy () must be true for all statements from 'stmts'! */ + +auto_vec> +scc_discovery::compute_sccs (auto_vec &stmts) +{ + auto_vec> sccs; + + for (gimple *stmt : stmts) + { + unsigned i; + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + i = SSA_NAME_VERSION (gimple_assign_lhs (stmt)); + break; + case GIMPLE_PHI: + i = SSA_NAME_VERSION (gimple_phi_result (stmt)); + break; + default: + gcc_unreachable (); + } + + vertices[i].index = 0; + vertices[i].lowlink = 0; + vertices[i].state = unvisited; + vertices[i].active = true; /* Mark the subgraph we'll be working on so + that we don't leave it. */ + + worklist.safe_push (i); + } + + /* Worklist loop. */ + unsigned curr_index = 0; + while (!worklist.is_empty ()) + { + unsigned i = worklist.pop (); + gimple *stmt = SSA_NAME_DEF_STMT (ssa_name (i)); + vstate state = vertices[i].state; + + if (state == unvisited) + { + vertices[i].state = vopen; + + /* Assign index to this vertex. */ + vertices[i].index = curr_index; + vertices[i].lowlink = curr_index; + curr_index++; + + /* Put vertex on stack and also on worklist to be closed later. */ + stack.safe_push (i); + worklist.safe_push (i); + } + else if (state == vopen) + vertices[i].state = closed; + + /* Visit neighbors of this vertex. */ + tree op; + gphi *phi; + switch (gimple_code (stmt)) + { + case GIMPLE_PHI: + phi = as_a (stmt); + unsigned j; + for (j = 0; j < gimple_phi_num_args (phi); j++) + { + op = gimple_phi_arg_def (phi, j); + visit_neighbor (op, i); + } + break; + case GIMPLE_ASSIGN: + op = gimple_assign_rhs1 (stmt); + visit_neighbor (op, i); + break; + default: + gcc_unreachable (); + } + + /* If we've just closed a root vertex of an scc, pop scc from stack. */ + if (state == vopen && vertices[i].lowlink == vertices[i].index) + { + vec scc = vNULL; + + unsigned j; + do + { + j = stack.pop (); + scc.safe_push (SSA_NAME_DEF_STMT (ssa_name (j))); + vertices[j].state = in_scc; + } + while (j != i); + + sccs.safe_push (scc); + } + } + + if (!stack.is_empty ()) + gcc_unreachable (); + + /* Clear 'active' flags. */ + for (gimple *stmt : stmts) + { + unsigned i; + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + i = SSA_NAME_VERSION (gimple_assign_lhs (stmt)); + break; + case GIMPLE_PHI: + i = SSA_NAME_VERSION (gimple_phi_result (stmt)); + break; + default: + gcc_unreachable (); + } + + vertices[i].active = false; + } + + return sccs; +} + +/* Could this statement potentially be a copy statement? + + This pass only considers statements for which this function returns 'true'. + Those are basically PHI functions and assignment statements similar to + + _2 = _1; + or + _2 = 5; */ + +static bool +stmt_may_generate_copy (gimple *stmt) +{ + /* A PHI may generate a copy. */ + if (gimple_code (stmt) == GIMPLE_PHI) + { + gphi *phi = as_a (stmt); + + /* No OCCURS_IN_ABNORMAL_PHI SSA names in lhs nor rhs. */ + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi))) + return false; + + unsigned i; + for (i = 0; i < gimple_phi_num_args (phi); i++) + { + tree op = gimple_phi_arg_def (phi, i); + if (TREE_CODE (op) == SSA_NAME + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op)) + return false; + } + + /* If PHI has more than one unique non-SSA arguments, it won't generate a + copy. */ + tree const_op = NULL_TREE; + for (i = 0; i < gimple_phi_num_args (phi); i++) + { + tree op = gimple_phi_arg_def (phi, i); + if (TREE_CODE (op) != SSA_NAME) + { + if (const_op && !operand_equal_p (op, const_op)) + return false; + const_op = op; + } + } + + return true; + } + + /* Or a statement of type _2 = _1; OR _2 = 5; may generate a copy. */ + + if (!gimple_assign_single_p (stmt)) + return false; + + tree lhs = gimple_assign_lhs (stmt); + tree rhs = gimple_assign_rhs1 (stmt); + + if (TREE_CODE (lhs) != SSA_NAME) + return false; + + /* lhs shouldn't flow through any abnormal edges. */ + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) + return false; + + if (is_gimple_min_invariant (rhs)) + return true; /* A statement of type _2 = 5;. */ + + if (TREE_CODE (rhs) != SSA_NAME) + return false; + + /* rhs shouldn't flow through any abnormal edges. */ + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs)) + return false; + + /* It is possible that lhs has more alignment or value range information. By + propagating we would lose this information. So in the case that alignment + or value range information differs, we are conservative and do not + propagate. + + FIXME: Propagate alignment and value range info the same way copy-prop + does. */ + if (POINTER_TYPE_P (TREE_TYPE (lhs)) + && POINTER_TYPE_P (TREE_TYPE (rhs)) + && SSA_NAME_PTR_INFO (lhs) != SSA_NAME_PTR_INFO (rhs)) + return false; + if (!POINTER_TYPE_P (TREE_TYPE (lhs)) + && !POINTER_TYPE_P (TREE_TYPE (rhs)) + && SSA_NAME_RANGE_INFO (lhs) != SSA_NAME_RANGE_INFO (rhs)) + return false; + + return true; /* A statement of type _2 = _1;. */ +} + +/* Return all statements in cfun that could generate copies. All statements + for which stmt_may_generate_copy returns 'true'. */ + +static auto_vec +get_all_stmt_may_generate_copy (void) +{ + auto_vec result; + + basic_block bb; + FOR_EACH_BB_FN (bb, cfun) + { + gimple_stmt_iterator gsi; + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *s = gsi_stmt (gsi); + if (stmt_may_generate_copy (s)) + { + result.safe_push (s); + } + } + + gphi_iterator pi; + for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi)) + { + gimple *s = pi.phi (); + if (stmt_may_generate_copy (s)) + { + result.safe_push (s); + } + } + } + + return result; +} + +/* For each statement from given SCC, replace its usages by value + VAL. */ + +static void +replace_scc_by_value (vec scc, tree val) +{ + for (gimple *stmt : scc) + { + tree name = gimple_get_lhs (stmt); + replace_uses_by (name, val); + bitmap_set_bit (dead_stmts, SSA_NAME_VERSION (name)); + } + + if (dump_file) + fprintf (dump_file, "Replacing SCC of size %d\n", scc.length ()); +} + +/* Part of 'sccopy_propagate ()'. */ + +static void +sccopy_visit_op (tree op, hash_set &outer_ops, + hash_set &scc_set, bool &is_inner, + tree &last_outer_op) +{ + bool op_in_scc = false; + + if (TREE_CODE (op) == SSA_NAME) + { + gimple *op_stmt = SSA_NAME_DEF_STMT (op); + if (scc_set.contains (op_stmt)) + op_in_scc = true; + } + + if (!op_in_scc) + { + outer_ops.add (op); + last_outer_op = op; + is_inner = false; + } +} + +/* Main function of this pass. Find and propagate all three types of copy + statements (see pass description above). + + This is an implementation of an algorithm from the paper Simple and + Efficient Construction of Static Single Assignmemnt Form[1]. It is based + on strongly-connected components (SCCs) in dataflow graph. The original + algorithm only considers PHI statements. We extend it to also consider + assignment statements of type _2 = _1;. + + The algorithm is based on this definition of a set of redundant PHIs[1]: + + A non-empty set P of PHI functions is redundant iff the PHI functions just + reference each other or one other value + + It uses this lemma[1]: + + Let P be a redundant set of PHI functions. Then there is a + strongly-connected component S subset of P that is also redundant. + + The algorithm works in this way: + + 1 Find SCCs + 2 For each SCC S in topological order: + 3 Construct set 'inner' of statements that only have other statements + from S on their right hand side + 4 Construct set 'outer' of values that originate outside S and appear on + right hand side of some statement from S + 5 If |outer| = 1, outer only contains a value v. Statements in S only + refer to each other or to v -- they are redundant. Propagate v. + Else, recurse on statements in inner. + + The implementation is non-recursive. + + References: + + [1] Simple and Efficient Construction of Static Single Assignmemnt Form, + Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791, + Section 3.2. */ + +static void +sccopy_propagate () +{ + auto_vec useful_stmts = get_all_stmt_may_generate_copy (); + scc_discovery discovery; + + auto_vec> worklist; + worklist = discovery.compute_sccs (useful_stmts); + + while (!worklist.is_empty ()) + { + vec scc = worklist.pop (); + + auto_vec inner; + hash_set outer_ops; + tree last_outer_op = NULL_TREE; + + /* Prepare hash set of PHIs in scc to query later. */ + hash_set scc_set; + for (gimple *stmt : scc) + scc_set.add (stmt); + + for (gimple *stmt : scc) + { + bool is_inner = true; + + gphi *phi; + tree op; + + switch (gimple_code (stmt)) + { + case GIMPLE_PHI: + phi = as_a (stmt); + unsigned j; + for (j = 0; j < gimple_phi_num_args (phi); j++) + { + op = gimple_phi_arg_def (phi, j); + sccopy_visit_op (op, outer_ops, scc_set, is_inner, + last_outer_op); + } + break; + case GIMPLE_ASSIGN: + op = gimple_assign_rhs1 (stmt); + sccopy_visit_op (op, outer_ops, scc_set, is_inner, + last_outer_op); + break; + default: + gcc_unreachable (); + } + + if (is_inner) + { + inner.safe_push (stmt); + } + } + + if (outer_ops.elements () == 1) + { + /* The only operand in outer_ops. */ + tree outer_op = last_outer_op; + replace_scc_by_value (scc, outer_op); + } + else if (outer_ops.elements () > 1) + { + /* Add inner sccs to worklist. */ + auto_vec> inner_sccs = + discovery.compute_sccs (inner); + for (vec inner_scc : inner_sccs) + worklist.safe_push (inner_scc); + } + else + gcc_unreachable (); + + scc.release (); + } +} + +/* Called when pass execution starts. */ + +static void +init_sccopy (void) +{ + /* For propagated statements. */ + dead_stmts = BITMAP_ALLOC (NULL); +} + +/* Called before pass execution ends. */ + +static void +finalize_sccopy (void) +{ + /* Remove all propagated statements. */ + simple_dce_from_worklist (dead_stmts); + BITMAP_FREE (dead_stmts); + + /* Propagating a constant may create dead eh edges. */ + basic_block bb; + FOR_EACH_BB_FN (bb, cfun) + gimple_purge_dead_eh_edges (bb); +} + +namespace { + +const pass_data pass_data_sccopy = +{ + GIMPLE_PASS, /* type */ + "sccopy", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_NONE, /* tv_id */ + ( PROP_cfg | PROP_ssa ), /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */ +}; + +class pass_sccopy : public gimple_opt_pass +{ +public: + pass_sccopy (gcc::context *ctxt) + : gimple_opt_pass (pass_data_sccopy, ctxt) + {} + + /* opt_pass methods: */ + virtual bool gate (function *) { return true; } + virtual unsigned int execute (function *); + opt_pass * clone () final override { return new pass_sccopy (m_ctxt); } +}; // class pass_sccopy + +unsigned +pass_sccopy::execute (function *) +{ + init_sccopy (); + sccopy_propagate (); + finalize_sccopy (); + return 0; +} + +} // anon namespace + +gimple_opt_pass * +make_pass_sccopy (gcc::context *ctxt) +{ + return new pass_sccopy (ctxt); +} diff --git a/gcc/passes.def b/gcc/passes.def index 1e1950bdb39..fa6c5a2c9fa 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -100,6 +100,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_if_to_switch); NEXT_PASS (pass_convert_switch); NEXT_PASS (pass_cleanup_eh); + NEXT_PASS (pass_sccopy); NEXT_PASS (pass_profile); NEXT_PASS (pass_local_pure_const); NEXT_PASS (pass_modref); @@ -368,6 +369,7 @@ along with GCC; see the file COPYING3. If not see However, this also causes us to misdiagnose cases that should be real warnings (e.g., testsuite/gcc.dg/pr18501.c). */ NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */); + NEXT_PASS (pass_sccopy); NEXT_PASS (pass_tail_calls); /* Split critical edges before late uninit warning to reduce the number of false positives from it. */ @@ -409,6 +411,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_sancov); NEXT_PASS (pass_asan); NEXT_PASS (pass_tsan); + NEXT_PASS (pass_sccopy); /* ??? We do want some kind of loop invariant motion, but we possibly need to adjust LIM to be more friendly towards preserving accurate debug information here. */ diff --git a/gcc/testsuite/gcc.dg/sccopy-1.c b/gcc/testsuite/gcc.dg/sccopy-1.c new file mode 100644 index 00000000000..1e61a6b320e --- /dev/null +++ b/gcc/testsuite/gcc.dg/sccopy-1.c @@ -0,0 +1,78 @@ +/* { dg-do compile } */ +/* { dg-options "-fgimple -fdump-tree-sccopy -O2" } */ +/* { dg-final { scan-tree-dump "Replacing SCC of size 2" "sccopy1" } } */ + +int __GIMPLE (ssa, startwith ("sccopy")) +main () +{ + int a; + int y; + int x; + int _1; + int _2; + int _13; + + __BB(2): + if (x_7(D) == 5) + goto __BB3; + else + goto __BB4; + + __BB(3): + a_10 = x_7(D); + goto __BB5; + + __BB(4): + a_9 = y_8(D); + goto __BB5; + + __BB(5): + a_3 = __PHI (__BB3: a_10, __BB4: a_9); + if (x_7(D) == y_8(D)) + goto __BB6; + else + goto __BB11; + + __BB(6): + a_11 = a_3 + 1; + goto __BB7; + + __BB(7): + a_4 = __PHI (__BB6: a_11, __BB11: a_6); +label1: + if (x_7(D) != y_8(D)) + goto __BB8; + else + goto __BB10; + + __BB(8): + goto __BB9; + + __BB(9): + a_12 = __PHI (__BB8: a_4, __BB10: a_5); + goto __BB10; + + __BB(10,loop_header(1)): + a_5 = __PHI (__BB7: a_4, __BB9: a_12); +label2: + _1 = y_8(D) * 2; + if (x_7(D) == _1) + goto __BB9; + else + goto __BB11; + + __BB(11): + a_6 = __PHI (__BB5: a_3, __BB10: a_5); + _2 = x_7(D) * 3; + if (y_8(D) == _2) + goto __BB7; + else + goto __BB12; + + __BB(12): + _13 = 0; + return _13; + +} + + diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr79691.c b/gcc/testsuite/gcc.dg/tree-ssa/pr79691.c index bf889318c06..43770c95bca 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr79691.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr79691.c @@ -34,4 +34,4 @@ int f4 (int i) /* { dg-final { scan-tree-dump-times "sprintf" 1 "optimized" } } { dg-final { scan-tree-dump-times "snprintf" 1 "optimized" } } - { dg-final { scan-tree-dump " = 9;" "optimized" } } */ + { dg-final { scan-tree-dump "return 9;" "optimized" } } */ diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 09e6ada5b2f..94a48461590 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -399,6 +399,7 @@ extern gimple_opt_pass *make_pass_iv_optimize (gcc::context *ctxt); extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_sccopy (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_split_paths (gcc::context *ctxt); extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt); -- 2.42.0