From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 30 Jan 2024 14:04:35 +0100 (CET)
From: Richard Biener
To: Tamar Christina
Cc: gcc-patches@gcc.gnu.org, nd, jlaw@ventanamicro.com
Subject: RE: [PATCH]middle-end: check memory accesses in the destination block [PR113588].
References: <1096n2p8-s430-5p01-orn1-ro09s5162o8n@fhfr.qr>
Content-Type: text/plain; charset=US-ASCII

On Tue, 30 Jan 2024, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener
> > Sent: Tuesday, January 30, 2024 9:51 AM
> > To: Tamar Christina
> > Cc: gcc-patches@gcc.gnu.org; nd; jlaw@ventanamicro.com
> > Subject: Re: [PATCH]middle-end: check memory accesses in the destination block [PR113588].
> >
> > On Mon, 29 Jan 2024, Tamar Christina wrote:
> >
> > > Hi All,
> > >
> > > When analyzing loads for early break it was always the intention that
> > > for the exit where things get moved to we only check the loads that can
> > > be reached from the condition.
> >
> > Looking at the code I'm a bit confused that we always move to
> > single_pred (loop->latch) - IIRC that was different at some point?
> >
> > Shouldn't we move stores after the last early exit condition instead?
>
> Yes, it was changed during another PR fix.  The rationale at that time
> didn't take into account the peeled case.  It used to be that we would
> "search" for the exit to place it in.
>
> At that time the rationale was: well, it doesn't make sense; it has to
> go in the block that is the last to be executed.  With the non-peeled
> case that's always the one before the latch.
>
> Or put differently, I think the destination should be the main IV block.
> I am not quite sure I'm following why you want to put the peeled cases
> inside the latch block.
>
> Ah, is it because the latch block is only ever executed when you make a
> full iteration?  That makes sense, but then I think we should also
> analyze the stores in all blocks (which your change maybe already does,
> let me check) since we're also lifting past the final block and need to
> update the vuses there too.
>
> If the above is correct then I think I understand what you're saying and
> will update the patch and do some checks.

Yes, I think that's what I wanted to say.
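
To make the block naming concrete, here is a minimal early-break loop (an
illustrative sketch, not the loop from the PR; the block annotations are
mine):

  int f (int *a, int n)
  {
    for (int i = 0; i < n; i++)   /* main IV condition, exit 1  */
      {
        if (a[i] == 42)           /* early exit condition, exit 2  */
          return i;
        a[i] = 0;                 /* store that may need to be moved  */
      }                           /* latch: reached only on a full iteration  */
    return -1;
  }

For the non-peeled case the block with the IV condition is the last one
executed before the latch, so sinking stores there is fine; for the peeled
case the early-exit test itself sits in single_pred (loop->latch), so only
the latch is guaranteed to be reached on a full iteration - presumably why
the peeled case wants dest_bb = loop->latch.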
Richard.

> Thanks,
> Tamar
>
> > In particular for the peeled case single_pred (loop->latch) is the
> > block with the actual early exit condition?  So for that case we'd
> > need to move to the latch itself instead?  For non-peeled we move
> > to the block with the IV condition which looks OK.
> >
> > > However the main loop checks all loads and we skip the destination BB.
> > > As such we never actually check the loads reachable from the COND in
> > > the last BB unless this BB was also the exit chosen by the vectorizer.
> > >
> > > This leads us to incorrectly vectorize the loop in the PR and in doing
> > > so access out of bounds.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> >
> > The patch ends up with a worklist and another confusing comment
> >
> > +  /* For the destination BB we need to only analyze loads reachable from the early
> > +     break statement itself.  */
> >
> > But I think it's a downstream issue from the issue above.  That said,
> > even for the non-peeled case we need to check ref_within_array_bound,
> > no?
> >
> > So what about re-doing that initial loop like the following instead
> > (and also fix dest_bb, but I'd like clarification here).  Basically
> > walk all blocks, do the ref_within_array_bound check first and only
> > after we've seen 'dest_bb' do the checks required for moving
> > stores for all upstream BBs.
> >
> > And dest_bb should be
> >
> >   /* Move side-effects to the in-loop destination of the last early
> >      exit.  */
> >   if (LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo))
> >     dest_bb = loop->latch;
> >   else
> >     dest_bb = single_pred (loop->latch);
> >
> > diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> > index f592aeb8028..d6c8910dd6c 100644
> > --- a/gcc/tree-vect-data-refs.cc
> > +++ b/gcc/tree-vect-data-refs.cc
> > @@ -668,7 +668,6 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> >    auto_vec<data_reference *> bases;
> >    basic_block dest_bb = NULL;
> >  
> > -  hash_set visited;
> >    class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> >    class loop *loop_nest = loop_outer (loop);
> >  
> > @@ -681,15 +680,11 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> >       side-effects to is always the latch connected exit.  When we support
> >       general control flow we can do better but for now this is fine.  */
> >    dest_bb = single_pred (loop->latch);
> > -  basic_block bb = dest_bb;
> > +  basic_block bb = loop->latch;
> > +  bool check_deps = false;
> >  
> >    do
> >      {
> > -      /* If the destination block is also the header then we have nothing to do.  */
> > -      if (!single_pred_p (bb))
> > -        continue;
> > -
> > -      bb = single_pred (bb);
> >        gimple_stmt_iterator gsi = gsi_last_bb (bb);
> >  
> >        /* Now analyze all the remaining statements and try to determine which
> > @@ -707,6 +702,25 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> >            if (!dr_ref)
> >              continue;
> >  
> > +          /* Check if vector accesses to the object will be within bounds.
> > +             must be a constant or assume loop will be versioned or niters
> > +             bounded by VF so accesses are within range.  */
> > +          if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
> > +            {
> > +              if (dump_enabled_p ())
> > +                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > +                                 "early breaks not supported: vectorization "
> > +                                 "would %s beyond size of obj.",
> > +                                 DR_IS_READ (dr_ref) ? "read" : "write");
> > +              return opt_result::failure_at (stmt,
> > +                                 "can't safely apply code motion to "
> > +                                 "dependencies of %G to vectorize "
> > +                                 "the early exit.\n", stmt);
> > +            }
> > +
> > +          if (!check_deps)
> > +            continue;
> > +
> >            /* We currently only support statically allocated objects due to
> >               not having first-faulting loads support or peeling for
> >               alignment support.  Compute the size of the referenced object
> > @@ -739,22 +753,6 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> >                                               "the early exit.\n", stmt);
> >              }
> >  
> > -          /* Check if vector accesses to the object will be within bounds.
> > -             must be a constant or assume loop will be versioned or niters
> > -             bounded by VF so accesses are within range.  */
> > -          if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
> > -            {
> > -              if (dump_enabled_p ())
> > -                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > -                                 "early breaks not supported: vectorization "
> > -                                 "would %s beyond size of obj.",
> > -                                 DR_IS_READ (dr_ref) ? "read" : "write");
> > -              return opt_result::failure_at (stmt,
> > -                                 "can't safely apply code motion to "
> > -                                 "dependencies of %G to vectorize "
> > -                                 "the early exit.\n", stmt);
> > -            }
> > -
> >            if (DR_IS_READ (dr_ref))
> >              bases.safe_push (dr_ref);
> >            else if (DR_IS_WRITE (dr_ref))
> > @@ -814,8 +812,16 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> >                   "marked statement for vUSE update: %G", stmt);
> >              }
> >          }
> > +      if (!single_pred_p (bb))
> > +        {
> > +          gcc_assert (bb == loop->header);
> > +          break;
> > +        }
> > +      if (bb == dest_bb)
> > +        check_deps = true;
> > +      bb = single_pred (bb);
> >      }
> > -  while (bb != loop->header);
> > +  while (1);
> >  
> >    /* We don't allow outer -> inner loop transitions which should have been
> >       trapped already during loop form analysis.  */
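
In toy form, the shape of that suggested walk looks as follows (a sketch
with stand-in types, not the GCC data structures; it only illustrates the
phasing of the two kinds of checks):

  #include <vector>

  /* Stand-in types: 'block' mimics a basic block with its data refs and a
     single predecessor link.  */
  struct ref_info { bool within_bounds; };
  struct block { std::vector<ref_info> refs; block *pred; };

  static bool
  analyze_blocks (block *latch, block *dest_bb, block *header)
  {
    bool check_deps = false;
    for (block *bb = latch; ; bb = bb->pred)
      {
        for (const ref_info &r : bb->refs)
          {
            if (!r.within_bounds)
              return false;     /* bounds check applies to every block  */
            if (!check_deps)
              continue;
            /* ... legality checks for moving stores would go here ...  */
          }
        if (bb == header)
          break;
        if (bb == dest_bb)
          check_deps = true;    /* store-motion checks only above dest_bb  */
      }
    return true;
  }

The point being that the ref_within_array_bound-style check runs for every
block walked from the latch to the header, while the store-motion checks
only start once dest_bb has been passed.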
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >         PR tree-optimization/113588
> > >         * tree-vect-data-refs.cc (vect_analyze_early_break_dependences_1): New.
> > >         (vect_analyze_data_ref_dependence): Use it.
> > >         (vect_analyze_early_break_dependences): Update comments.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         PR tree-optimization/113588
> > >         * gcc.dg/vect/vect-early-break_108-pr113588.c: New test.
> > >         * gcc.dg/vect/vect-early-break_109-pr113588.c: New test.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..e488619c9aac41fafbcf479818392a6bb7c6924f
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c
> > > @@ -0,0 +1,15 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-add-options vect_early_break } */
> > > +/* { dg-require-effective-target vect_early_break } */
> > > +/* { dg-require-effective-target vect_int } */
> > > +
> > > +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
> > > +
> > > +int foo (const char *s, unsigned long n)
> > > +{
> > > +  unsigned long len = 0;
> > > +  while (*s++ && n--)
> > > +    ++len;
> > > +  return len;
> > > +}
> > > +
> > > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c b/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..488c19d3ede809631d1a7ede0e7f7bcdc7a1ae43
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c
> > > @@ -0,0 +1,44 @@
> > > +/* { dg-add-options vect_early_break } */
> > > +/* { dg-require-effective-target vect_early_break } */
> > > +/* { dg-require-effective-target vect_int } */
> > > +/* { dg-require-effective-target mmap } */
> > > +
> > > +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
> > > +
> > > +#include <sys/mman.h>
> > > +#include <unistd.h>
> > > +
> > > +#include "tree-vect.h"
> > > +
> > > +__attribute__((noipa))
> > > +int foo (const char *s, unsigned long n)
> > > +{
> > > +  unsigned long len = 0;
> > > +  while (*s++ && n--)
> > > +    ++len;
> > > +  return len;
> > > +}
> > > +
> > > +int main()
> > > +{
> > > +
> > > +  check_vect ();
> > > +
> > > +  long pgsz = sysconf (_SC_PAGESIZE);
> > > +  void *p = mmap (NULL, pgsz * 3, PROT_READ|PROT_WRITE,
> > > +                  MAP_ANONYMOUS|MAP_PRIVATE, 0, 0);
> > > +  if (p == MAP_FAILED)
> > > +    return 0;
> > > +  mprotect (p, pgsz, PROT_NONE);
> > > +  mprotect (p+2*pgsz, pgsz, PROT_NONE);
> > > +  char *p1 = p + pgsz;
> > > +  p1[0] = 1;
> > > +  p1[1] = 0;
> > > +  foo (p1, 1000);
> > > +  p1 = p + 2*pgsz - 2;
> > > +  p1[0] = 1;
> > > +  p1[1] = 0;
> > > +  foo (p1, 1000);
> > > +  return 0;
> > > +}
> > > +
> > > diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> > > index f592aeb8028afd4fd70e2175104efab2a2c0d82e..52cef242a7ce5d0e525bff639fa1dc2f0a6f30b9 100644
> > > --- a/gcc/tree-vect-data-refs.cc
> > > +++ b/gcc/tree-vect-data-refs.cc
> > > @@ -619,10 +619,69 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
> > >    return opt_result::success ();
> > >  }
> > >  
> > > -/* Funcion vect_analyze_early_break_dependences.
> > > +/* Function vect_analyze_early_break_dependences_1
> > >  
> > > -   Examime all the data references in the loop and make sure that if we have
> > > -   mulitple exits that we are able to safely move stores such that they become
> > > +   Helper function of vect_analyze_early_break_dependences which performs safety
> > > +   analysis for load operations in an early break.  */
> > > +
> > > +static opt_result
> > > +vect_analyze_early_break_dependences_1 (data_reference *dr_ref, gimple *stmt)
> > > +{
> > > +  /* We currently only support statically allocated objects due to
> > > +     not having first-faulting loads support or peeling for
> > > +     alignment support.  Compute the size of the referenced object
> > > +     (it could be dynamically allocated).  */
> > > +  tree obj = DR_BASE_ADDRESS (dr_ref);
> > > +  if (!obj || TREE_CODE (obj) != ADDR_EXPR)
> > > +    {
> > > +      if (dump_enabled_p ())
> > > +        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > +                         "early breaks only supported on statically"
> > > +                         " allocated objects.\n");
> > > +      return opt_result::failure_at (stmt,
> > > +                                     "can't safely apply code motion to "
> > > +                                     "dependencies of %G to vectorize "
> > > +                                     "the early exit.\n", stmt);
> > > +    }
> > > +
> > > +  tree refop = TREE_OPERAND (obj, 0);
> > > +  tree refbase = get_base_address (refop);
> > > +  if (!refbase || !DECL_P (refbase) || !DECL_SIZE (refbase)
> > > +      || TREE_CODE (DECL_SIZE (refbase)) != INTEGER_CST)
> > > +    {
> > > +      if (dump_enabled_p ())
> > > +        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > +                         "early breaks only supported on"
> > > +                         " statically allocated objects.\n");
> > > +      return opt_result::failure_at (stmt,
> > > +                                     "can't safely apply code motion to "
> > > +                                     "dependencies of %G to vectorize "
> > > +                                     "the early exit.\n", stmt);
> > > +    }
> > > +
> > > +  /* Check if vector accesses to the object will be within bounds.
> > > +     must be a constant or assume loop will be versioned or niters
> > > +     bounded by VF so accesses are within range.  */
> > > +  if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
> > > +    {
> > > +      if (dump_enabled_p ())
> > > +        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > +                         "early breaks not supported: vectorization "
> > > +                         "would %s beyond size of obj.",
> > > +                         DR_IS_READ (dr_ref) ? "read" : "write");
> > > +      return opt_result::failure_at (stmt,
> > > +                                     "can't safely apply code motion to "
> > > +                                     "dependencies of %G to vectorize "
> > > +                                     "the early exit.\n", stmt);
> > > +    }
> > > +
> > > +  return opt_result::success ();
> > > +}
> > > +
> > > +/* Function vect_analyze_early_break_dependences.
> > > +
> > > +   Examine all the data references in the loop and make sure that if we have
> > > +   multiple exits that we are able to safely move stores such that they become
> > >     safe for vectorization.  The function also calculates the place where to move
> > >     the instructions to and computes what the new vUSE chain should be.
> > >  
> > > @@ -639,7 +698,7 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
> > >      - Multiple loads are allowed as long as they don't alias.
> > >  
> > >     NOTE:
> > > -     This implemementation is very conservative. Any overlappig loads/stores
> > > +     This implementation is very conservative. Any overlapping loads/stores
> > >       that take place before the early break statement gets rejected aside from
> > >       WAR dependencies.
> > >  
> > > @@ -668,7 +727,6 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> > >    auto_vec<data_reference *> bases;
> > >    basic_block dest_bb = NULL;
> > >  
> > > -  hash_set visited;
> > >    class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> > >    class loop *loop_nest = loop_outer (loop);
> > >  
> > > @@ -683,6 +741,7 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> > >    dest_bb = single_pred (loop->latch);
> > >    basic_block bb = dest_bb;
> > >  
> > > +  /* First analyse all blocks leading to dest_bb excluding dest_bb itself.  */
> > >    do
> > >      {
> > >        /* If the destination block is also the header then we have nothing to do.  */
> > > @@ -707,53 +766,11 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> > >            if (!dr_ref)
> > >              continue;
> > >  
> > > -          /* We currently only support statically allocated objects due to
> > > -             not having first-faulting loads support or peeling for
> > > -             alignment support.  Compute the size of the referenced object
> > > -             (it could be dynamically allocated).  */
> > > -          tree obj = DR_BASE_ADDRESS (dr_ref);
> > > -          if (!obj || TREE_CODE (obj) != ADDR_EXPR)
> > > -            {
> > > -              if (dump_enabled_p ())
> > > -                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > -                                 "early breaks only supported on statically"
> > > -                                 " allocated objects.\n");
> > > -              return opt_result::failure_at (stmt,
> > > -                                             "can't safely apply code motion to "
> > > -                                             "dependencies of %G to vectorize "
> > > -                                             "the early exit.\n", stmt);
> > > -            }
> > > -
> > > -          tree refop = TREE_OPERAND (obj, 0);
> > > -          tree refbase = get_base_address (refop);
> > > -          if (!refbase || !DECL_P (refbase) || !DECL_SIZE (refbase)
> > > -              || TREE_CODE (DECL_SIZE (refbase)) != INTEGER_CST)
> > > -            {
> > > -              if (dump_enabled_p ())
> > > -                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > -                                 "early breaks only supported on"
> > > -                                 " statically allocated objects.\n");
> > > -              return opt_result::failure_at (stmt,
> > > -                                             "can't safely apply code motion to "
> > > -                                             "dependencies of %G to vectorize "
> > > -                                             "the early exit.\n", stmt);
> > > -            }
> > > -
> > > -          /* Check if vector accesses to the object will be within bounds.
> > > -             must be a constant or assume loop will be versioned or niters
> > > -             bounded by VF so accesses are within range.  */
> > > -          if (!ref_within_array_bound (stmt, DR_REF (dr_ref)))
> > > -            {
> > > -              if (dump_enabled_p ())
> > > -                dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
> > > -                                 "early breaks not supported: vectorization "
> > > -                                 "would %s beyond size of obj.",
> > > -                                 DR_IS_READ (dr_ref) ? "read" : "write");
> > > -              return opt_result::failure_at (stmt,
> > > -                                             "can't safely apply code motion to "
> > > -                                             "dependencies of %G to vectorize "
> > > -                                             "the early exit.\n", stmt);
> > > -            }
> > > +          /* Check if the operation is one we can safely do.  */
> > > +          opt_result res
> > > +            = vect_analyze_early_break_dependences_1 (dr_ref, stmt);
> > > +          if (!res)
> > > +            return res;
> > >  
> > >            if (DR_IS_READ (dr_ref))
> > >              bases.safe_push (dr_ref);
> > > @@ -817,6 +834,51 @@ vect_analyze_early_break_dependences (loop_vec_info loop_vinfo)
> > >      }
> > >    while (bb != loop->header);
> > >  
> > > +  /* For the destination BB we need to only analyze loads reachable from the early
> > > +     break statement itself.  */
> > > +  auto_vec<tree> workset;
> > > +  hash_set<tree> visited;
> > > +  gimple *last_stmt = gsi_stmt (gsi_last_bb (dest_bb));
> > > +  gcond *last_cond = dyn_cast <gcond *> (last_stmt);
> > > +  /* If the cast fails we have a different control flow statement in the latch.  Most
> > > +     commonly this is a switch.  */
> > > +  if (!last_cond)
> > > +    return opt_result::failure_at (last_stmt,
> > > +                         "can't safely apply code motion to dependencies"
> > > +                         " to vectorize the early exit, unknown control flow"
> > > +                         " in stmt %G", last_stmt);
> > > +  workset.safe_push (gimple_cond_lhs (last_cond));
> > > +  workset.safe_push (gimple_cond_rhs (last_cond));
> > > +
> > > +  imm_use_iterator imm_iter;
> > > +  use_operand_p use_p;
> > > +  tree lhs;
> > > +  do
> > > +    {
> > > +      tree op = workset.pop ();
> > > +      if (visited.add (op))
> > > +        continue;
> > > +      stmt_vec_info stmt_vinfo = loop_vinfo->lookup_def (op);
> > > +
> > > +      /* Not defined in loop, don't care.  */
> > > +      if (!stmt_vinfo)
> > > +        continue;
> > > +      gimple *stmt = STMT_VINFO_STMT (stmt_vinfo);
> > > +      auto dr_ref = STMT_VINFO_DATA_REF (stmt_vinfo);
> > > +      if (dr_ref)
> > > +        {
> > > +          opt_result res
> > > +            = vect_analyze_early_break_dependences_1 (dr_ref, stmt);
> > > +          if (!res)
> > > +            return res;
> > > +        }
> > > +      else
> > > +        FOR_EACH_IMM_USE_FAST (use_p, imm_iter, op)
> > > +          if ((lhs = gimple_get_lhs (USE_STMT (use_p))))
> > > +            workset.safe_push (lhs);
> > > +    }
> > > +  while (!workset.is_empty ());
> > > +
> > >    /* We don't allow outer -> inner loop transitions which should have been
> > >       trapped already during loop form analysis.  */
> > >    gcc_assert (dest_bb->loop_father == loop);
> > >
> > >
> >
> > --
> > Richard Biener
> > SUSE Software Solutions Germany GmbH,
> > Frankenstrasse 146, 90461 Nuernberg, Germany;
> > GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
>

--
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)