From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) by sourceware.org (Postfix) with ESMTPS id 841B03858D33 for ; Fri, 2 Feb 2024 10:22:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 841B03858D33 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 841B03858D33 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1706869329; cv=none; b=al3plnJzcjX33bRvk2IPYHxS5zxAm9hWB4PFY40wMTBgG/a8y+EVp9L+kVj/kgEJXIc41tDbQdO0eW1reZgH5T3p1AQMyt3ORohZAeuD1hYiLTlJJTtrqOd3nft7Z2tNvBfYTFy/0agRSAsln1huuxQ1hVoBAHEwJXAIQy927tQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1706869329; c=relaxed/simple; bh=QkwTqIF1PS3DBdoRHulprB4LUzCx7PusffrV/ek39Ww=; h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:From: Mime-Version:Subject:Date:Message-Id:To; b=TYbn2fHFU1OrAf6mStkqvLsPHpxvBcY/0SAkdccSYHekmtokYaofLUUgCfL83BpQnTuwOJ7yIFDITIpsILp7J+GYewo0q7M72maWewIR6E98JBxKrzTBgYwSUpDrTdgwlPEax4gLIopXKgn1iworlQvgV9ad7nOqeT93RCyUVnk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 9814222082; Fri, 2 Feb 2024 10:22:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706869324; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VLy3Umr3fEcyf0vFnK32MoU2ugwYZacb/qCdIJ0RFDM=; b=bBRBKddJm7KP4c7Xq0D2rbLYgWZLSjYFyNFa+FJVH03LLJSzevqsRoyECC+nsTFCbOXlVp w8p5ty4CyPXjGcEw0vngt5QICa6rfh7CLNXa95ykVZwuxn8Yb3zeBpokHYHZ9leWMdH3DI iRuR89KhM6kZkPlcxb1TiISpTCeCZRY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706869324; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VLy3Umr3fEcyf0vFnK32MoU2ugwYZacb/qCdIJ0RFDM=; b=j3BT59t54kYkNF7xnN51SveFSuqDLSebBBnVGgjwZT5Kce7BjuggRQGwKEi+HKMemyHpX+ WoP8sowOenAhJcAw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1706869324; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VLy3Umr3fEcyf0vFnK32MoU2ugwYZacb/qCdIJ0RFDM=; b=bBRBKddJm7KP4c7Xq0D2rbLYgWZLSjYFyNFa+FJVH03LLJSzevqsRoyECC+nsTFCbOXlVp w8p5ty4CyPXjGcEw0vngt5QICa6rfh7CLNXa95ykVZwuxn8Yb3zeBpokHYHZ9leWMdH3DI iRuR89KhM6kZkPlcxb1TiISpTCeCZRY= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1706869324; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VLy3Umr3fEcyf0vFnK32MoU2ugwYZacb/qCdIJ0RFDM=; b=j3BT59t54kYkNF7xnN51SveFSuqDLSebBBnVGgjwZT5Kce7BjuggRQGwKEi+HKMemyHpX+ WoP8sowOenAhJcAw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 768D213A58; Fri, 2 Feb 2024 10:22:04 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id xV3hHEzCvGUqXQAAD6G6ig (envelope-from ); Fri, 02 Feb 2024 10:22:04 +0000 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable From: Richard Biener Mime-Version: 1.0 (1.0) Subject: Re: [PATCH]middle-end: check memory accesses in the destination block [PR113588]. Date: Fri, 2 Feb 2024 11:21:53 +0100 Message-Id: <7A28DB3D-4AFC-4597-AA00-CECF36675E4F@suse.de> References: Cc: gcc-patches@gcc.gnu.org, nd , jlaw@ventanamicro.com In-Reply-To: To: Tamar Christina X-Mailer: iPhone Mail (21D50) Authentication-Results: smtp-out1.suse.de; none X-Spamd-Result: default: False [-2.60 / 50.00]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; BAYES_HAM(-3.00)[100.00%]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; MV_CASE(0.50)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; MIME_GOOD(-0.10)[text/plain]; TO_DN_SOME(0.00)[]; RCVD_COUNT_THREE(0.00)[3]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+]; RCVD_TLS_ALL(0.00)[]; MID_RHS_MATCH_FROM(0.00)[] X-Spam-Level: X-Spam-Score: -2.60 X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,KAM_LOTSOFHASH,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: > Am 01.02.2024 um 22:34 schrieb Tamar Christina : >=20 > =EF=BB=BF >>=20 >>>=20 >>> If the above is correct then I think I understand what you're saying and= >>> will update the patch and do some Checks. >>=20 >> Yes, I think that's what I wanted to say. >>=20 >=20 > As discussed: >=20 > Bootstrapped Regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu n= o issues. > Also checked both with --enable-lto --with-build-config=3D'bootstrap-O3 bo= otstrap-lto' --enable-multilib > and --enable-lto --with-build-config=3Dbootstrap-O3 --enable-checking=3Dre= lease,yes,rtl,extra; > and checked the libcrypt testsuite as reported on PR113467. >=20 > Ok for master? Ok Richard=20 > Thanks, > Tamar >=20 > gcc/ChangeLog: >=20 > PR tree-optimization/113588 > PR tree-optimization/113467 > (vect_analyze_data_ref_dependence): Choose correct dest and fix checks= . > (vect_analyze_early_break_dependences): Update comments. >=20 > gcc/testsuite/ChangeLog: >=20 > PR tree-optimization/113588 > PR tree-optimization/113467 > * gcc.dg/vect/vect-early-break_108-pr113588.c: New test. > * gcc.dg/vect/vect-early-break_109-pr113588.c: New test. >=20 > --- inline copy of patch --- >=20 > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c b/g= cc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c > new file mode 100644 > index 0000000000000000000000000000000000000000..e488619c9aac41fafbcf479818= 392a6bb7c6924f > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_108-pr113588.c > @@ -0,0 +1,15 @@ > +/* { dg-do compile } */ > +/* { dg-add-options vect_early_break } */ > +/* { dg-require-effective-target vect_early_break } */ > +/* { dg-require-effective-target vect_int } */ > + > +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */ > + > +int foo (const char *s, unsigned long n) > +{ > + unsigned long len =3D 0; > + while (*s++ && n--) > + ++len; > + return len; > +} > + > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c b/g= cc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c > new file mode 100644 > index 0000000000000000000000000000000000000000..488c19d3ede809631d1a7ede0e= 7f7bcdc7a1ae43 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_109-pr113588.c > @@ -0,0 +1,44 @@ > +/* { dg-add-options vect_early_break } */ > +/* { dg-require-effective-target vect_early_break } */ > +/* { dg-require-effective-target vect_int } */ > +/* { dg-require-effective-target mmap } */ > + > +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */ > + > +#include > +#include > + > +#include "tree-vect.h" > + > +__attribute__((noipa)) > +int foo (const char *s, unsigned long n) > +{ > + unsigned long len =3D 0; > + while (*s++ && n--) > + ++len; > + return len; > +} > + > +int main() > +{ > + > + check_vect (); > + > + long pgsz =3D sysconf (_SC_PAGESIZE); > + void *p =3D mmap (NULL, pgsz * 3, PROT_READ|PROT_WRITE, > + MAP_ANONYMOUS|MAP_PRIVATE, 0, 0); > + if (p =3D=3D MAP_FAILED) > + return 0; > + mprotect (p, pgsz, PROT_NONE); > + mprotect (p+2*pgsz, pgsz, PROT_NONE); > + char *p1 =3D p + pgsz; > + p1[0] =3D 1; > + p1[1] =3D 0; > + foo (p1, 1000); > + p1 =3D p + 2*pgsz - 2; > + p1[0] =3D 1; > + p1[1] =3D 0; > + foo (p1, 1000); > + return 0; > +} > + > diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc > index f592aeb8028afd4fd70e2175104efab2a2c0d82e..53fdfc25d7dc2deb7788176252= 697d2e455555fc 100644 > --- a/gcc/tree-vect-data-refs.cc > +++ b/gcc/tree-vect-data-refs.cc > @@ -619,10 +619,10 @@ vect_analyze_data_ref_dependence (struct data_depend= ence_relation *ddr, > return opt_result::success (); > } >=20 > -/* Funcion vect_analyze_early_break_dependences. > +/* Function vect_analyze_early_break_dependences. >=20 > - Examime all the data references in the loop and make sure that if we h= ave > - mulitple exits that we are able to safely move stores such that they b= ecome > + Examine all the data references in the loop and make sure that if we h= ave > + multiple exits that we are able to safely move stores such that they b= ecome > safe for vectorization. The function also calculates the place where t= o move > the instructions to and computes what the new vUSE chain should be. >=20 > @@ -639,7 +639,7 @@ vect_analyze_data_ref_dependence (struct data_dependen= ce_relation *ddr, > - Multiple loads are allowed as long as they don't alias. >=20 > NOTE: > - This implemementation is very conservative. Any overlappig loads/sto= res > + This implementation is very conservative. Any overlapping loads/stor= es > that take place before the early break statement gets rejected aside f= rom > WAR dependencies. >=20 > @@ -668,7 +668,6 @@ vect_analyze_early_break_dependences (loop_vec_info lo= op_vinfo) > auto_vec bases; > basic_block dest_bb =3D NULL; >=20 > - hash_set visited; > class loop *loop =3D LOOP_VINFO_LOOP (loop_vinfo); > class loop *loop_nest =3D loop_outer (loop); >=20 > @@ -677,19 +676,33 @@ vect_analyze_early_break_dependences (loop_vec_info l= oop_vinfo) > "loop contains multiple exits, analyzing" > " statement dependencies.\n"); >=20 > + if (LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo)) > + if (dump_enabled_p ()) > + dump_printf_loc (MSG_NOTE, vect_location, > + "alternate exit has been chosen as main exit.\n"); > + > /* Since we don't support general control flow, the location we'll move t= he > side-effects to is always the latch connected exit. When we support > - general control flow we can do better but for now this is fine. */ > - dest_bb =3D single_pred (loop->latch); > + general control flow we can do better but for now this is fine. Mov= e > + side-effects to the in-loop destination of the last early exit. For= the PEELED > + case we move the side-effects to the latch block as this is guarante= ed to be the > + last block to be executed when a vector iteration finished. */ > + if (LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo)) > + dest_bb =3D loop->latch; > + else > + dest_bb =3D single_pred (loop->latch); > + > + /* We start looking from dest_bb, for the non-PEELED case we don't want= to > + move any stores already present, but we do want to read and validate= the > + loads. */ > basic_block bb =3D dest_bb; >=20 > + /* In the peeled case we need to check all the loads in the loop since t= o move the > + the stores we lift the stores over all loads into the latch. */ > + bool check_deps =3D LOOP_VINFO_EARLY_BREAKS_VECT_PEELED (loop_vinfo); > + > do > { > - /* If the destination block is also the header then we have nothing= to do. */ > - if (!single_pred_p (bb)) > - continue; > - > - bb =3D single_pred (bb); > gimple_stmt_iterator gsi =3D gsi_last_bb (bb); >=20 > /* Now analyze all the remaining statements and try to determine whi= ch > @@ -707,42 +720,13 @@ vect_analyze_early_break_dependences (loop_vec_info l= oop_vinfo) > if (!dr_ref) > continue; >=20 > - /* We currently only support statically allocated objects due to > - not having first-faulting loads support or peeling for > - alignment support. Compute the size of the referenced object > - (it could be dynamically allocated). */ > - tree obj =3D DR_BASE_ADDRESS (dr_ref); > - if (!obj || TREE_CODE (obj) !=3D ADDR_EXPR) > - { > - if (dump_enabled_p ()) > - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > - "early breaks only supported on statically" > - " allocated objects.\n"); > - return opt_result::failure_at (stmt, > - "can't safely apply code motion to " > - "dependencies of %G to vectorize " > - "the early exit.\n", stmt); > - } > - > - tree refop =3D TREE_OPERAND (obj, 0); > - tree refbase =3D get_base_address (refop); > - if (!refbase || !DECL_P (refbase) || !DECL_SIZE (refbase) > - || TREE_CODE (DECL_SIZE (refbase)) !=3D INTEGER_CST) > - { > - if (dump_enabled_p ()) > - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > - "early breaks only supported on" > - " statically allocated objects.\n"); > - return opt_result::failure_at (stmt, > - "can't safely apply code motion to " > - "dependencies of %G to vectorize " > - "the early exit.\n", stmt); > - } > - > /* Check if vector accesses to the object will be within bounds. > must be a constant or assume loop will be versioned or niters > - bounded by VF so accesses are within range. */ > - if (!ref_within_array_bound (stmt, DR_REF (dr_ref))) > + bounded by VF so accesses are within range. We only need to che= ck the > + reads since writes are moved to a safe place where if we get the= re we > + know they are safe to perform. */ > + if (DR_IS_READ (dr_ref) > + && !ref_within_array_bound (stmt, DR_REF (dr_ref))) > { > if (dump_enabled_p ()) > dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > @@ -755,6 +739,9 @@ vect_analyze_early_break_dependences (loop_vec_info lo= op_vinfo) > "the early exit.\n", stmt); > } >=20 > + if (!check_deps) > + continue; > + > if (DR_IS_READ (dr_ref)) > bases.safe_push (dr_ref); > else if (DR_IS_WRITE (dr_ref)) > @@ -814,8 +801,19 @@ vect_analyze_early_break_dependences (loop_vec_info l= oop_vinfo) > "marked statement for vUSE update: %G", stmt); > } > } > + > + if (!single_pred_p (bb)) > + { > + gcc_assert (bb =3D=3D loop->header); > + break; > + } > + > + /* For the non-PEELED case we don't want to check the loads in the I= V exit block > + for dependencies with the stores, but any block preceeding it we do.= */ > + check_deps =3D true; > + bb =3D single_pred (bb); > } > - while (bb !=3D loop->header); > + while (1); >=20 > /* We don't allow outer -> inner loop transitions which should have been= > trapped already during loop form analysis. */ >