From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29])
    by sourceware.org (Postfix) with ESMTPS id 7A9DF3858404
    for ; Wed, 10 Nov 2021 07:17:59 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7A9DF3858404
Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512)
    (No client certificate requested)
    by smtp-out2.suse.de (Postfix) with ESMTPS id 41D7D1FD3F;
    Wed, 10 Nov 2021 07:17:58 +0000 (UTC)
Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74])
    (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
    key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512)
    (No client certificate requested)
    by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 200F813B52;
    Wed, 10 Nov 2021 07:17:58 +0000 (UTC)
Received: from dovecot-director2.suse.de ([192.168.254.65])
    by imap2.suse-dmz.suse.de with ESMTPSA id yC3EBiZyi2G5YAAAMHmgww
    (envelope-from ); Wed, 10 Nov 2021 07:17:58 +0000
Date: Wed, 10 Nov 2021 08:17:57 +0100 (CET)
From: Richard Biener
To: Tamar Christina
cc: Richard Sandiford , Richard Biener via Gcc-patches , nd
Subject: RE: [PATCH]middle-end Add an RPO pass after successful vectorization
In-Reply-To:
Message-ID: <5471312p-6898-1523-7240-n852s4801q7n@fhfr.qr>
References: <9nnp8so9-p3nq-r26-3098-s96334191030@fhfr.qr>
    <045975-5o48-87ns-70pr-39r47q12o3p6@fhfr.qr>
    <33781on-6390-o5np-43s6-9n473q10oq57@fhfr.qr>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="-1463801166-1829396552-1636528678=:8346"
X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, DKIM_SIGNED,
    DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE,
    SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 10 Nov 2021 07:18:01 -0000

  This message is in MIME format.  The first part should be readable text,
  while the remaining parts are likely unreadable without MIME-aware tools.

---1463801166-1829396552-1636528678=:8346
Content-Type: text/plain; charset=US-ASCII

On Tue, 9 Nov 2021, Tamar Christina wrote:

> > > +      bitmap_set_bit (exit_bbs, single_exit (loop)->dest->index);
> > > +      bitmap_set_bit (exit_bbs, loop->latch->index);
> >
> > treating the latch as exit is probably premature optimization (yes, it's empty).
> >
> > > +
> > > +      do_rpo_vn (cfun, loop_preheader_edge (loop), exit_bbs);
> > > +
> > > +      BITMAP_FREE (exit_bbs);
> >
> > ... deallocation can go.  Note I wonder whether, if we are already spinning up
> > VN, we should include the preheader in the operation?
> > We regularly end up emitting redundant vector initializers that could be
> > cleaned up earlier this way.
>
> I've change it to include the preheader but it looks like this breaks bootstrap on both
> x86 and AArch64.
>
> On x86 the following testcase
>
> double matmul_c8_vanilla_bbase_0;
> double *matmul_c8_vanilla_dest;
> matmul_c8_vanilla_x;
> matmul_c8_vanilla() {
>   for (; matmul_c8_vanilla_x; matmul_c8_vanilla_x++)
>     matmul_c8_vanilla_dest[matmul_c8_vanilla_x] += matmul_c8_vanilla_bbase_0;
> }
>
> ICEs with -std=gnu11 -ffast-math -ftree-vectorize -O2 with:
>
> internal compiler error: tree check: expected ssa_name, have var_decl in SSA_VAL, at tree-ssa-sccvn.c:535
> 0x80731c tree_check_failed(tree_node const*, char const*, int, char const*, ...)
>         ../gcc-dsg/gcc/tree.c:8689
> 0x7ebda2 tree_check(tree_node*, char const*, int, char const*, tree_code)
>         ../gcc-dsg/gcc/tree.h:3433
> 0x7ebda2 SSA_VAL(tree_node*, bool*)
>         ../gcc-dsg/gcc/tree-ssa-sccvn.c:535
> 0x7ebda2 vuse_ssa_val
>         ../gcc-dsg/gcc/tree-ssa-sccvn.c:553
> 0x7ebda2 vn_reference_lookup(tree_node*, tree_node*, vn_lookup_kind, vn_reference_s**, bool, tree_node**, tree_node*)
>         ../gcc-dsg/gcc/tree-ssa-sccvn.c:3664
> 0x10d8ca5 visit_reference_op_load
>         ../gcc-dsg/gcc/tree-ssa-sccvn.c:5166
> 0x10d8ca5 visit_stmt
>         ../gcc-dsg/gcc/tree-ssa-sccvn.c:5615
> 0x10d976c process_bb
>         ../gcc-dsg/gcc/tree-ssa-sccvn.c:7344
> 0x10dafe5 do_rpo_vn
>         ../gcc-dsg/gcc/tree-ssa-sccvn.c:7942
> 0x10dc828 do_rpo_vn(function*, edge_def*, bitmap_head*)
>         ../gcc-dsg/gcc/tree-ssa-sccvn.c:8039
> 0x119c39c vectorize_loops()
>         ../gcc-dsg/gcc/tree-vectorizer.c:1304

OK, as I thought, this is .MEMs not in SSA form.  We're later fixing
that up, so maybe try the attached, which re-orders the postprocessing
after vectorization to do that earlier.

Richard.

> on AArch64 this one ICEs with -ffast-math -ftree-vectorize -O2
>
> _Complex *a;
> _Complex b;
> c, d;
> fn1() {
>   _Complex e;
>   for (; c; ++c)
>     e = d * a[c];
>   b = e;
> }
>
> With the message
>
> internal compiler error: tree check: expected ssa_name, have var_decl in VN_INFO, at tree-ssa-sccvn.c:451
> 0x734073 tree_check_failed(tree_node const*, char const*, int, char const*, ...)
>         ../../gcc-fsf/gcc/tree.c:8691
> 0x10e2e2f tree_check(tree_node*, char const*, int, char const*, tree_code)
>         ../../gcc-fsf/gcc/tree.h:3433
> 0x10e2e2f VN_INFO(tree_node*)
>         ../../gcc-fsf/gcc/tree-ssa-sccvn.c:451
> 0x10ed223 process_bb
>         ../../gcc-fsf/gcc/tree-ssa-sccvn.c:7331
> 0x10eea43 do_rpo_vn
>         ../../gcc-fsf/gcc/tree-ssa-sccvn.c:7944
> 0x10efe2b do_rpo_vn(function*, edge_def*, bitmap_head*)
>         ../../gcc-fsf/gcc/tree-ssa-sccvn.c:8039
> 0x11c436b vectorize_loops()
>         ../../gcc-fsf/gcc/tree-vectorizer.c:1304
>
> Any ideas?
>
> Thanks,
> Tamar
>
> >
> > Otherwise the change looks OK.
> >
>
> --- inline copy of patch ---
>
> diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
> index edb7538a67f00cd80a608ee82510cf437fe88083..029d59016c9652f87d80fc5500f89532c79a66d0 100644
> --- a/gcc/tree-vectorizer.c
> +++ b/gcc/tree-vectorizer.c
> @@ -81,7 +81,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-pretty-print.h"
>  #include "opt-problem.h"
>  #include "internal-fn.h"
> -
> +#include "tree-ssa-sccvn.h"
>
>  /* Loop or bb location, with hotness information.  */
>  dump_user_location_t vect_location;
> @@ -1298,6 +1298,17 @@ vectorize_loops (void)
>        if (has_mask_store
>            && targetm.vectorize.empty_mask_is_expensive (IFN_MASK_STORE))
>          optimize_mask_stores (loop);
> +
> +      auto_bitmap exit_bbs;
> +      /* Perform local CSE, this esp. helps because we emit code for
> +         predicates that need to be shared for optimal predicate usage.
> +         However reassoc will re-order them and prevent CSE from working
> +         as it should.  CSE only the loop body, not the entry.  */
> +      bitmap_set_bit (exit_bbs, single_exit (loop)->dest->index);
> +
> +      edge entry = EDGE_PRED (loop_preheader_edge (loop)->src, 0);
> +      do_rpo_vn (cfun, entry, exit_bbs);
> +
>        loop->aux = NULL;
>      }
>
>

-- 
Richard Biener
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409
Nuernberg, Germany; GF: Felix Imend
---1463801166-1829396552-1636528678=:8346
Content-Type: text/plain; charset=US-ASCII; name=p
Content-Transfer-Encoding: BASE64
Content-Description: patch
Content-Disposition: attachment; filename=p

ZGlmZiAtLWdpdCBhL2djYy90cmVlLXZlY3Rvcml6ZXIuYyBiL2djYy90cmVl LXZlY3Rvcml6ZXIuYw0KaW5kZXggYTJlMTNhY2I2ZDIuLjc4ODgzYjA1M2Ix IDEwMDY0NA0KLS0tIGEvZ2NjL3RyZWUtdmVjdG9yaXplci5jDQorKysgYi9n Y2MvdHJlZS12ZWN0b3JpemVyLmMNCkBAIC04MSw3ICs4MSw3IEBAIGFsb25n IHdpdGggR0NDOyBzZWUgdGhlIGZpbGUgQ09QWUlORzMuICBJZiBub3Qgc2Vl DQogI2luY2x1ZGUgImdpbXBsZS1wcmV0dHktcHJpbnQuaCINCiAjaW5jbHVk ZSAib3B0LXByb2JsZW0uaCINCiAjaW5jbHVkZSAiaW50ZXJuYWwtZm4uaCIN Ci0NCisjaW5jbHVkZSAidHJlZS1zc2Etc2Njdm4uaCINCiANCiAvKiBMb29w IG9yIGJiIGxvY2F0aW9uLCB3aXRoIGhvdG5lc3MgaW5mb3JtYXRpb24uICAq Lw0KIGR1bXBfdXNlcl9sb2NhdGlvbl90IHZlY3RfbG9jYXRpb247DQpAQCAt MTI3OCwyMyArMTI3OCw2IEBAIHZlY3Rvcml6ZV9sb29wcyAodm9pZCkNCiAJ ICB9DQogICAgICAgfQ0KIA0KLSAgZm9yIChpID0gMTsgaSA8IG51bWJlcl9v Zl9sb29wcyAoY2Z1bik7IGkrKykNCi0gICAgew0KLSAgICAgIGxvb3BfdmVj X2luZm8gbG9vcF92aW5mbzsNCi0gICAgICBib29sIGhhc19tYXNrX3N0b3Jl Ow0KLQ0KLSAgICAgIGxvb3AgPSBnZXRfbG9vcCAoY2Z1biwgaSk7DQotICAg ICAgaWYgKCFsb29wIHx8ICFsb29wLT5hdXgpDQotCWNvbnRpbnVlOw0KLSAg ICAgIGxvb3BfdmluZm8gPSAobG9vcF92ZWNfaW5mbykgbG9vcC0+YXV4Ow0K LSAgICAgIGhhc19tYXNrX3N0b3JlID0gTE9PUF9WSU5GT19IQVNfTUFTS19T VE9SRSAobG9vcF92aW5mbyk7DQotICAgICAgZGVsZXRlIGxvb3BfdmluZm87 DQotICAgICAgaWYgKGhhc19tYXNrX3N0b3JlDQotCSAgJiYgdGFyZ2V0bS52 ZWN0b3JpemUuZW1wdHlfbWFza19pc19leHBlbnNpdmUgKElGTl9NQVNLX1NU T1JFKSkNCi0Jb3B0aW1pemVfbWFza19zdG9yZXMgKGxvb3ApOw0KLSAgICAg IGxvb3AtPmF1eCA9IE5VTEw7DQotICAgIH0NCi0NCiAgIC8qIEZvbGQgSUZO X0dPTVBfU0lNRF97VkYsTEFORSxMQVNUX0xBTkUsT1JERVJFRF97U1RBUlQs
RU5EfX0gYnVpbHRpbnMuICAqLw0KICAgaWYgKGNmdW4tPmhhc19zaW1kdWlk X2xvb3BzKQ0KICAgICB7DQpAQCAtMTMwMiwxNCArMTI4NSwxMiBAQCB2ZWN0 b3JpemVfbG9vcHMgKHZvaWQpDQogICAgICAgLyogQXZvaWQgc3RhbGUgU0NF ViBjYWNoZSBlbnRyaWVzIGZvciB0aGUgU0lNRF9MQU5FIGRlZnMuICAqLw0K ICAgICAgIHNjZXZfcmVzZXQgKCk7DQogICAgIH0NCi0NCiAgIC8qIFNocmlu ayBhbnkgIm9tcCBhcnJheSBzaW1kIiB0ZW1wb3JhcnkgYXJyYXlzIHRvIHRo ZQ0KICAgICAgYWN0dWFsIHZlY3Rvcml6YXRpb24gZmFjdG9ycy4gICovDQog ICBpZiAoc2ltZF9hcnJheV90b19zaW1kdWlkX2h0YWIpDQogICAgIHNocmlu a19zaW1kX2FycmF5cyAoc2ltZF9hcnJheV90b19zaW1kdWlkX2h0YWIsIHNp bWR1aWRfdG9fdmZfaHRhYik7DQogICBkZWxldGUgc2ltZHVpZF90b192Zl9o dGFiOw0KICAgY2Z1bi0+aGFzX3NpbWR1aWRfbG9vcHMgPSBmYWxzZTsNCi0g IHZlY3Rfc2xwX2ZpbmkgKCk7DQogDQogICBpZiAobnVtX3ZlY3Rvcml6ZWRf bG9vcHMgPiAwKQ0KICAgICB7DQpAQCAtMTMxNyw5ICsxMjk4LDM5IEBAIHZl Y3Rvcml6ZV9sb29wcyAodm9pZCkNCiAJID8/PyAgQWxzbyB3aGlsZSB3ZSB0 cnkgaGFyZCB0byB1cGRhdGUgbG9vcC1jbG9zZWQgU1NBIGZvcm0gd2UgZmFp bA0KIAkgdG8gcHJvcGVybHkgZG8gdGhpcyBpbiBzb21lIGNvcm5lci1jYXNl cyAoc2VlIFBSNTYyODYpLiAgKi8NCiAgICAgICByZXdyaXRlX2ludG9fbG9v cF9jbG9zZWRfc3NhIChOVUxMLCBUT0RPX3VwZGF0ZV9zc2Ffb25seV92aXJ0 dWFscyk7DQotICAgICAgcmV0dXJuIFRPRE9fY2xlYW51cF9jZmc7DQorICAg ICAgcmV0IHw9IFRPRE9fY2xlYW51cF9jZmc7DQogICAgIH0NCiANCisgIGZv ciAoaSA9IDE7IGkgPCBudW1iZXJfb2ZfbG9vcHMgKGNmdW4pOyBpKyspDQor ICAgIHsNCisgICAgICBsb29wX3ZlY19pbmZvIGxvb3BfdmluZm87DQorICAg ICAgYm9vbCBoYXNfbWFza19zdG9yZTsNCisNCisgICAgICBsb29wID0gZ2V0 X2xvb3AgKGNmdW4sIGkpOw0KKyAgICAgIGlmICghbG9vcCB8fCAhbG9vcC0+ YXV4KQ0KKwljb250aW51ZTsNCisgICAgICBsb29wX3ZpbmZvID0gKGxvb3Bf dmVjX2luZm8pIGxvb3AtPmF1eDsNCisgICAgICBoYXNfbWFza19zdG9yZSA9 IExPT1BfVklORk9fSEFTX01BU0tfU1RPUkUgKGxvb3BfdmluZm8pOw0KKyAg ICAgIGRlbGV0ZSBsb29wX3ZpbmZvOw0KKyAgICAgIGlmIChoYXNfbWFza19z dG9yZQ0KKwkgICYmIHRhcmdldG0udmVjdG9yaXplLmVtcHR5X21hc2tfaXNf ZXhwZW5zaXZlIChJRk5fTUFTS19TVE9SRSkpDQorCW9wdGltaXplX21hc2tf c3RvcmVzIChsb29wKTsNCisNCisgICAgICBhdXRvX2JpdG1hcCBleGl0X2Ji czsNCisgICAgICAvKiBQZXJmb3JtIGxvY2FsIENTRSwgdGhpcyBlc3AuIGhl 
bHBzIGJlY2F1c2Ugd2UgZW1pdCBjb2RlIGZvcg0KKwkgcHJlZGljYXRlcyB0 aGF0IG5lZWQgdG8gYmUgc2hhcmVkIGZvciBvcHRpbWFsIHByZWRpY2F0ZSB1 c2FnZS4NCisJIEhvd2V2ZXIgcmVhc3NvYyB3aWxsIHJlLW9yZGVyIHRoZW0g YW5kIHByZXZlbnQgQ1NFIGZyb20gd29ya2luZw0KKwkgYXMgaXQgc2hvdWxk LiAgQ1NFIG9ubHkgdGhlIGxvb3AgYm9keSwgbm90IHRoZSBlbnRyeS4gICov DQorICAgICAgYml0bWFwX3NldF9iaXQgKGV4aXRfYmJzLCBzaW5nbGVfZXhp dCAobG9vcCktPmRlc3QtPmluZGV4KTsNCisNCisgICAgICBlZGdlIGVudHJ5 ID0gRURHRV9QUkVEIChsb29wX3ByZWhlYWRlcl9lZGdlIChsb29wKS0+c3Jj LCAwKTsNCisgICAgICBkb19ycG9fdm4gKGNmdW4sIGVudHJ5LCBleGl0X2Ji cyk7DQorDQorICAgICAgbG9vcC0+YXV4ID0gTlVMTDsNCisgICAgfQ0KKw0K KyAgdmVjdF9zbHBfZmluaSAoKTsNCisNCiAgIHJldHVybiByZXQ7DQogfQ0K IA0K ---1463801166-1829396552-1636528678=:8346--