Date: Mon, 26 Feb 2024 15:39:44 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: tamar.christina@arm.com
Subject: [PATCH] tree-optimization/114081 - dominator update for prologue peeling
Message-ID: <20240226143944.FaDZqE23WtwHUIoXbJBynl4x6uZq1P0vG6dHctvU2DQ@z>

The following implements a manual dominator update for multi-exit
loop prologue peeling during vectorization.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

I think the amount of coverage for prologue peeling with early
exits is very low, so my testing success might not mean much.

Richard.

	PR tree-optimization/114081
	* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
	Perform manual dominator update for prologue peeling.
	(vect_do_peeling): Properly update dominators after adding the
	prologue-around guard.

	* gcc.dg/vect/vect-early-break_121-pr114081.c: New testcase.
---
 .../vect/vect-early-break_121-pr114081.c      | 39 ++++++++++
 gcc/tree-vect-loop-manip.cc                   | 78 +++++++++++++------
 2 files changed, 95 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c

diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
new file mode 100644
index 00000000000..423ff0b566b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_121-pr114081.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3" } */
+/* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+typedef struct filter_list_entry {
+  const char *name;
+  int id;
+  void (*function)();
+} filter_list_entry;
+
+static const filter_list_entry filter_list[9] = {0};
+
+void php_zval_filter(int filter, int id1) {
+  filter_list_entry filter_func;
+
+  int size = 9;
+  for (int i = 0; i < size; ++i) {
+    if (filter_list[i].id == filter) {
+      filter_func = filter_list[i];
+      goto done;
+    }
+  }
+
+#pragma GCC novector
+  for (int i = 0; i < size; ++i) {
+    if (filter_list[i].id == 0x0204) {
+      filter_func = filter_list[i];
+      goto done;
+    }
+  }
+done:
+  if (!filter_func.id)
+    filter_func.function();
+}
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 137b053ac35..f72da915103 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1594,7 +1594,6 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
   auto loop_exits = get_loop_exit_edges (loop);
   bool multiple_exits_p = loop_exits.length () > 1;
   auto_vec doms;
-  class loop *update_loop = NULL;
 
   if (at_exit) /* Add the loop copy at exit.  */
     {
@@ -1856,11 +1855,33 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
          correct.  */
       if (multiple_exits_p)
         {
-          update_loop = new_loop;
+          class loop *update_loop = new_loop;
           doms = get_all_dominated_blocks (CDI_DOMINATORS, loop->header);
           for (unsigned i = 0; i < doms.length (); ++i)
             if (flow_bb_inside_loop_p (loop, doms[i]))
               doms.unordered_remove (i);
+
+          for (edge e : get_loop_exit_edges (update_loop))
+            {
+              edge ex;
+              edge_iterator ei;
+              FOR_EACH_EDGE (ex, ei, e->dest->succs)
+                {
+                  /* Find the first non-fallthrough block as fall-throughs can't
+                     dominate other blocks.  */
+                  if (single_succ_p (ex->dest))
+                    {
+                      doms.safe_push (ex->dest);
+                      ex = single_succ_edge (ex->dest);
+                    }
+                  doms.safe_push (ex->dest);
+                }
+              doms.safe_push (e->dest);
+            }
+
+          iterate_fix_dominators (CDI_DOMINATORS, doms, false);
+          if (updated_doms)
+            updated_doms->safe_splice (doms);
         }
     }
   else /* Add the copy at entry.  */
@@ -1910,33 +1931,28 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
           set_immediate_dominator (CDI_DOMINATORS, new_loop->header,
                                    loop_preheader_edge (new_loop)->src);
 
+      /* Update dominators for multiple exits.  */
       if (multiple_exits_p)
-        update_loop = loop;
-    }
-
-  if (multiple_exits_p)
-    {
-      for (edge e : get_loop_exit_edges (update_loop))
         {
-          edge ex;
-          edge_iterator ei;
-          FOR_EACH_EDGE (ex, ei, e->dest->succs)
+          for (edge alt_e : loop_exits)
             {
-              /* Find the first non-fallthrough block as fall-throughs can't
-                 dominate other blocks.  */
-              if (single_succ_p (ex->dest))
+              if (alt_e == loop_exit)
+                continue;
+              basic_block old_dom
+                = get_immediate_dominator (CDI_DOMINATORS, alt_e->dest);
+              if (flow_bb_inside_loop_p (loop, old_dom))
                 {
-                  doms.safe_push (ex->dest);
-                  ex = single_succ_edge (ex->dest);
+                  auto_vec queue;
+                  for (auto son = first_dom_son (CDI_DOMINATORS, old_dom);
+                       son; son = next_dom_son (CDI_DOMINATORS, son))
+                    if (!flow_bb_inside_loop_p (loop, son))
+                      queue.safe_push (son);
+                  for (auto son : queue)
+                    set_immediate_dominator (CDI_DOMINATORS,
+                                             son, get_bb_copy (old_dom));
                 }
-              doms.safe_push (ex->dest);
             }
-          doms.safe_push (e->dest);
         }
-
-      iterate_fix_dominators (CDI_DOMINATORS, doms, false);
-      if (updated_doms)
-        updated_doms->safe_splice (doms);
     }
 
   free (new_bbs);
@@ -3368,6 +3384,24 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
                                            guard_to, guard_bb,
                                            prob_prolog.invert (),
                                            irred_flag);
+          for (edge alt_e : get_loop_exit_edges (prolog))
+            {
+              if (alt_e == prolog_e)
+                continue;
+              basic_block old_dom
+                = get_immediate_dominator (CDI_DOMINATORS, alt_e->dest);
+              if (flow_bb_inside_loop_p (prolog, old_dom))
+                {
+                  auto_vec queue;
+                  for (auto son = first_dom_son (CDI_DOMINATORS, old_dom);
+                       son; son = next_dom_son (CDI_DOMINATORS, son))
+                    if (!flow_bb_inside_loop_p (prolog, son))
+                      queue.safe_push (son);
+                  for (auto son : queue)
+                    set_immediate_dominator (CDI_DOMINATORS, son, guard_bb);
+                }
+            }
+
           e = EDGE_PRED (guard_to, 0);
           e = (e != guard_e ? e : EDGE_PRED (guard_to, 1));
           slpeel_update_phi_nodes_for_guard1 (prolog, loop, guard_e, e);
-- 
2.35.3
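
For readers less familiar with the dominance API, the re-parenting step
that the patch adds in both slpeel_tree_duplicate_loop_to_edge_cfg and
vect_do_peeling can be sketched outside of GCC.  This is only a toy model
under stated assumptions: DomTree, sons and redirect_outside_sons are
hypothetical names made up for illustration, blocks are plain integers,
and the dominator tree is a flat idom array rather than GCC's
CDI_DOMINATORS data.  It follows the same two-phase shape as the patch:
first collect the dominator-tree sons of old_dom that are not inside the
loop, then give each of them a new immediate dominator.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy dominator tree: idom[b] is the immediate dominator of block b
// (-1 for the entry block).  Hypothetical stand-in, not GCC's dominance API.
struct DomTree {
  std::vector<int> idom;

  // Return the dominator-tree children of block bb.
  std::vector<int> sons(int bb) const {
    std::vector<int> out;
    for (int b = 0; b < static_cast<int>(idom.size()); ++b)
      if (idom[b] == bb)
        out.push_back(b);
    return out;
  }
};

// Re-parent every dominator-tree son of old_dom that lies outside
// loop_blocks to new_dom.  Returns the number of blocks re-parented.
int redirect_outside_sons(DomTree &dt, const std::set<int> &loop_blocks,
                          int old_dom, int new_dom) {
  assert(old_dom >= 0 && old_dom < static_cast<int>(dt.idom.size()));

  // Phase 1: collect the sons that are not part of the loop.
  std::vector<int> queue;
  for (int son : dt.sons(old_dom))
    if (!loop_blocks.count(son))
      queue.push_back(son);

  // Phase 2: update the immediate dominator of each collected block.
  for (int son : queue)
    dt.idom[son] = new_dom;

  return static_cast<int>(queue.size());
}
```

With idom = {-1, 0, 1, 1, 2} and loop blocks {1, 2}, calling
redirect_outside_sons (dt, {1, 2}, 1, 0) moves block 3 under block 0 and
leaves block 2 alone, which roughly corresponds to the guard_bb case in
vect_do_peeling.  In this toy, sons() returns a copy, so the separate
queue is not strictly needed; it is kept only to mirror the structure of
the patch.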