From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2a07:de40:b251:101:10:150:64:1]) by sourceware.org (Postfix) with ESMTPS id 13A393858C24 for ; Fri, 5 Apr 2024 12:13:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 13A393858C24 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.de ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 13A393858C24 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a07:de40:b251:101:10:150:64:1 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712319185; cv=none; b=XK15nR/8I5jeO4/294lIBiN32QaQJGGU60ZOHEpcvSPjTkFlCmE97Th7sFLzt1Ek26gRF/LzjA9BhSBHeL5y/jYLaRs0hjcD50BIW/k1AUvLYI0XbRN4YuwYoKugGpEEsXQHU5Wg7OGOjQAhgF1c+TFxXAen7qms9iOSkhf+FyU= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1712319185; c=relaxed/simple; bh=xXFOad3cRwpkF3jsSHFkeJyM98RtXqXWG3zgL2xbfak=; h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date: From:To:Subject:Message-ID:MIME-Version; b=NLpK4Qz+kNd0fR19TKZuowuKpFKZ57zViGOc30JUZTzQfLQNJkiWxuJ2QyFJSba5i7Ho/G4I4iYl3vDjGE+XYUkPPKgNjDpaWXUPAg44JEeaCZO32VcuO1/F3bOKbRbnyLyf3yNwi9TAL0A81F0doLPX4IKLJG3ofzf3A8tlMlw= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from [10.168.5.241] (unknown [10.168.5.241]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id D4B3F21A38; Fri, 5 Apr 2024 12:13:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1712319181; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=k9va+IeLRRocFk3XPF8yf+Y71N8vVccMuFPRJpCU3Jk=; b=UsOHRnN+qooAVTLcjZMPPX5YR3xNyEu/WIKB5aPOIbOwEK7367NGJHFm61cL3/Nnrq1fmD 1w2OyEEuqeLcmrJ0sOmDAFbv980n7QswogQ+MbhlK6LWTm65nHu0RGylvlokKfR3Lh2pHa Tr1LdgjO+iN3woeYsAnhcGIcKbMz5s4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1712319181; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=k9va+IeLRRocFk3XPF8yf+Y71N8vVccMuFPRJpCU3Jk=; b=YJv0C8exy21jkiVwx4vWPqhrodk/7qHvd6S1M+I83BCRpZyO9bqovVioFAQOjzKk2xAB1t NAiY7UBiB6OyCgAQ== Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1712319180; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=k9va+IeLRRocFk3XPF8yf+Y71N8vVccMuFPRJpCU3Jk=; b=Y3x/bTtvpQKb1aQP2bgqSFHkIfnBPFVjzp4NXrDegLfI+XO5J4gOEw7tq0NTGR1Nvnqa8n A5qzeK8cHnRRs6Jq3tumZCzO9bWW4HrcOrtaVCbvudaYraNVTWng/xBWxwAEvg/7m4R6gh /FNQcGab3tBwSkhmWZD/2WLQgfckC4o= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1712319180; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=k9va+IeLRRocFk3XPF8yf+Y71N8vVccMuFPRJpCU3Jk=; b=/l/wLfnNROPRhenfDlXszlOkhP+D72WD+LSjp9cFUiKODtz7pw7yMKZlnyYY+WNJHGBn2C 0ZpWFktamXF7AFCw== Date: Fri, 5 Apr 2024 14:13:00 +0200 (CEST) From: Richard Biener To: Jakub Jelinek cc: Andre Vieira , gcc-patches@gcc.gnu.org Subject: Re: [PATCH] vect: Don't clear base_misaligned in update_epilogue_loop_vinfo [PR114566] In-Reply-To: Message-ID: <2257rp3q-7nq1-2n13-7n07-95r1o6379302@fhfr.qr> References: MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Level: X-Spamd-Result: default: False [-4.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-0.998]; MIME_GOOD(-0.10)[text/plain]; FUZZY_BLOCKED(0.00)[rspamd.com]; FROM_HAS_DN(0.00)[]; ARC_NA(0.00)[]; MIME_TRACE(0.00)[0:+]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCPT_COUNT_THREE(0.00)[3]; MISSING_XM_UA(0.00)[]; TO_DN_SOME(0.00)[]; FROM_EQ_ENVFROM(0.00)[]; RCVD_COUNT_ZERO(0.00)[0]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:email] X-Spam-Score: -4.30 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,KAM_SHORT,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, 5 Apr 2024, Jakub Jelinek wrote: > Hi! > > The following testcase is miscompiled, because in the vectorized > epilogue the vectorizer assumes it can use aligned loads/stores > (if the base decl gets alignment increased), but it actually doesn't > increase that. > This is because r10-4203-g97c1460367 added the hunk following > patch removes. The explanation feels reasonable, but actually it > is not true as the testcase proves. > The thing is, we vectorize the main loop with 64-byte vectors > and the corresponding data refs have base_alignment 16 (the > a array has DECL_ALIGN 128) and offset_alignment 32. Now, because > of the offset_alignment 32 rather than 64, we need to use unaligned > loads/stores in the main loop (and ditto in the first load/store > in vectorized epilogue). But the second load/store in the vectorized > epilogue uses only 32-byte vectors and because it is a multiple > of offset_alignment, it checks if we could increase alignment of the > a VAR_DECL, the function returns true, sets base_misaligned = true > and says the access is then aligned. > But when update_epilogue_loop_vinfo clears base_misaligned with the > assumption that the var had to have the alignment increased already, > the update of DECL_ALIGN doesn't happen anymore. > > Now, I'd think this base_alignment = false was needed before > r10-4030-gd2db7f7901 change was committed where it incorrectly > overwrote DECL_ALIGN even if it was already larger, rather than > just always increasing it. But with that change in, it doesn't > make sense to me anymore. I think it was incorrect before - basically it assumed that the main loop would never use misaligned accesses and the epilogue analysis would align. But for this we'd need to make sure to never request such during epilogue analysis. > Note, the testcase is latent on the trunk, but reproduces on the 13 > branch. > > Bootstrapped/regtested on x86_64-linux and i686-linux on the trunk, > plus tested with the testcase on 13 branch with -m32/-m64 without/with > the tree-vect-loop.cc patch (where it FAILed before and now PASSes). > Ok for trunk? OK. Thanks, Richard. > 2024-04-05 Jakub Jelinek > > PR tree-optimization/114566 > * tree-vect-loop.cc (update_epilogue_loop_vinfo): Don't clear > base_misaligned. > > * gcc.target/i386/avx512f-pr114566.c: New test. > > --- gcc/tree-vect-loop.cc.jj 2024-04-04 00:48:05.932072711 +0200 > +++ gcc/tree-vect-loop.cc 2024-04-05 00:59:33.743101468 +0200 > @@ -11590,9 +11590,7 @@ find_in_mapping (tree t, void *context) > corresponding dr_vec_info need to be reconnected to the EPILOGUE's > stmt_vec_infos, their statements need to point to their corresponding copy, > if they are gather loads or scatter stores then their reference needs to be > - updated to point to its corresponding copy and finally we set > - 'base_misaligned' to false as we have already peeled for alignment in the > - prologue of the main loop. */ > + updated to point to its corresponding copy. */ > > static void > update_epilogue_loop_vinfo (class loop *epilogue, tree advance) > @@ -11736,10 +11734,6 @@ update_epilogue_loop_vinfo (class loop * > } > DR_STMT (dr) = STMT_VINFO_STMT (stmt_vinfo); > stmt_vinfo->dr_aux.stmt = stmt_vinfo; > - /* The vector size of the epilogue is smaller than that of the main loop > - so the alignment is either the same or lower. This means the dr will > - thus by definition be aligned. */ > - STMT_VINFO_DR_INFO (stmt_vinfo)->base_misaligned = false; > } > > epilogue_vinfo->shared->datarefs_copy.release (); > --- gcc/testsuite/gcc.target/i386/avx512f-pr114566.c.jj 2024-04-05 11:21:04.282639386 +0200 > +++ gcc/testsuite/gcc.target/i386/avx512f-pr114566.c 2024-04-05 11:21:04.282639386 +0200 > @@ -0,0 +1,34 @@ > +/* PR tree-optimization/114566 */ > +/* { dg-do run } */ > +/* { dg-options "-O3 -mavx512f" } */ > +/* { dg-additional-options "-fstack-protector-strong" { target fstack_protector } } */ > +/* { dg-require-effective-target avx512f } */ > + > +#define AVX512F > +#include "avx512f-helper.h" > + > +__attribute__((noipa)) int > +foo (float x, float y) > +{ > + float a[8][56]; > + __builtin_memset (a, 0, sizeof (a)); > + > + for (int j = 0; j < 8; j++) > + for (int k = 0; k < 56; k++) > + { > + float b = k * y; > + if (b < 0.) > + b = 0.; > + if (b > 0.) > + b = 0.; > + a[j][k] += b; > + } > + > + return __builtin_log (x); > +} > + > +void > +TEST (void) > +{ > + foo (86.25f, 0.625f); > +} > > Jakub > > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)