Date: Fri, 1 Mar 2024 10:48:27 +0100 (CET)
From: Richard Biener
To: "Andre Vieira (lists)"
cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] tree-optimization/110221 - SLP and loop mask/len
In-Reply-To: <6846f165-1cfb-415c-9a47-e620c784dc96@arm.com>
Message-ID: <86190n96-6srr-1n99-p71q-59431ror308r@fhfr.qr>
References: <20231110131658.09A5D13398@imap2.suse-dmz.suse.de> <6846f165-1cfb-415c-9a47-e620c784dc96@arm.com>

On Fri, 1 Mar 2024, Andre Vieira (lists) wrote:

> Hi,
>
> Bootstrapped and tested the gcc-13 backport of this on gcc-12 for
> aarch64-unknown-linux-gnu and x86_64-pc-linux-gnu and no regressions.
>
> OK to push to gcc-12 branch?

OK.

Thanks,
Richard.

> Kind regards,
> Andre Vieira
>
> On 10/11/2023 13:16, Richard Biener wrote:
> > The following fixes the issue that when SLP stmts are internal defs
> > but appear invariant because they end up only using invariant defs
> > then they get scheduled outside of the loop.  This nice optimization
> > breaks down when loop masks or lens are applied since those are not
> > explicitly tracked as dependences.  The following makes sure to never
> > schedule internal defs outside of the vectorized loop when the
> > loop uses masks/lens.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
> >
> > 	PR tree-optimization/110221
> > 	* tree-vect-slp.cc (vect_schedule_slp_node): When loop
> > 	masking / len is applied make sure to not schedule
> > 	internal defs outside of the loop.
> >
> > 	* gfortran.dg/pr110221.f: New testcase.
> > ---
> >  gcc/testsuite/gfortran.dg/pr110221.f | 17 +++++++++++++++++
> >  gcc/tree-vect-slp.cc                 | 10 ++++++++++
> >  2 files changed, 27 insertions(+)
> >  create mode 100644 gcc/testsuite/gfortran.dg/pr110221.f
> >
> > diff --git a/gcc/testsuite/gfortran.dg/pr110221.f b/gcc/testsuite/gfortran.dg/pr110221.f
> > new file mode 100644
> > index 00000000000..8b57384313a
> > --- /dev/null
> > +++ b/gcc/testsuite/gfortran.dg/pr110221.f
> > @@ -0,0 +1,17 @@
> > +C PR middle-end/68146
> > +C { dg-do compile }
> > +C { dg-options "-O2 -w" }
> > +C { dg-additional-options "-mavx512f --param vect-partial-vector-usage=2" { target avx512f } }
> > +      SUBROUTINE CJYVB(V,Z,V0,CBJ,CDJ,CBY,CYY)
> > +      IMPLICIT DOUBLE PRECISION (A,B,G,O-Y)
> > +      IMPLICIT COMPLEX*16 (C,Z)
> > +      DIMENSION CBJ(0:*),CDJ(0:*),CBY(0:*)
> > +      N=INT(V)
> > +      CALL GAMMA2(VG,GA)
> > +      DO 65 K=1,N
> > +        CBY(K)=CYY
> > +65    CONTINUE
> > +      CDJ(0)=V0/Z*CBJ(0)-CBJ(1)
> > +      DO 70 K=1,N
> > +70       CDJ(K)=-(K+V0)/Z*CBJ(K)+CBJ(K-1)
> > +      END
> > diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> > index 3e5814c3a31..80e279d8f50 100644
> > --- a/gcc/tree-vect-slp.cc
> > +++ b/gcc/tree-vect-slp.cc
> > @@ -9081,6 +9081,16 @@ vect_schedule_slp_node (vec_info *vinfo,
> >        /* Emit other stmts after the children vectorized defs which is
> >           earliest possible.  */
> >        gimple *last_stmt = NULL;
> > +      if (auto loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
> > +        if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
> > +            || LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
> > +          {
> > +            /* But avoid scheduling internal defs outside of the loop when
> > +               we might have only implicitly tracked loop mask/len defs.  */
> > +            gimple_stmt_iterator si
> > +              = gsi_after_labels (LOOP_VINFO_LOOP (loop_vinfo)->header);
> > +            last_stmt = *si;
> > +          }
> >        bool seen_vector_def = false;
> >        FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
> >          if (SLP_TREE_DEF_TYPE (child) == vect_internal_def)
>

-- 
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
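
The scheduling rule described in the quoted patch can be illustrated with a
toy model.  This is only a sketch and not GCC code: statement positions are
plain integers, a position below the loop header means "before the loop",
and the names ToyLoop, schedule_after and kNoConstraint are hypothetical.
Real GCC picks the insertion point by comparing statements for dominance,
not by integer position.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the loop information the vectorizer tracks.
struct ToyLoop {
  int header_pos;         // position of the first stmt in the loop header
  bool uses_mask_or_len;  // loop is fully masked or uses lengths
};

// Sentinel meaning "no def constrains the insertion point".
constexpr int kNoConstraint = -1;

// Return the position after which a new vector stmt would be emitted:
// the latest of its children's vector defs.  With apply_fix set and a
// masked/len loop, the search is seeded with the loop header so the
// result can never fall before the loop.
int schedule_after(const std::vector<int>& child_def_positions,
                   const ToyLoop& loop, bool apply_fix) {
  assert(loop.header_pos >= 0);
  int last = kNoConstraint;
  if (apply_fix && loop.uses_mask_or_len)
    last = loop.header_pos;  // models seeding last_stmt with the header
  for (int pos : child_def_positions)
    if (pos > last)
      last = pos;
  return last;
}
```

With a loop header at position 10 and children defined at positions 2 and 5,
the unseeded search returns 5, which is before the loop; the seeded search
returns 10.  Children already defined inside the loop, and loops without
masks or lengths, are unaffected by the seeding.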