From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 22 Dec 2022 12:20:19 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/107451 - SLP load vectorization issue
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Message-Id: <20221222112019.9549E138FD@imap2.suse-dmz.suse.de>

When vectorizing SLP loads with permutations we can access excess
elements when the load vector type is bigger than the group size
and the vectorization factor covers fewer groups than necessary
to fill it.  Since we know the code will only access up to
group_size * VF elements in the unpermuted vector we can simply
fill the rest of the vector with whatever we want.  For simplicity
this patch chooses to repeat the last group.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

	PR tree-optimization/107451
	* tree-vect-stmts.cc (vectorizable_load): Avoid loading
	SLP group members from group numbers in excess of the
	vectorization factor.

	* gcc.dg/torture/pr107451.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr107451.c | 27 +++++++++++++++++++++++++
 gcc/tree-vect-stmts.cc                  | 20 ++++++++++++------
 2 files changed, 41 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr107451.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr107451.c b/gcc/testsuite/gcc.dg/torture/pr107451.c
new file mode 100644
index 00000000000..a17574c6896
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr107451.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-vectorize -fno-vect-cost-model" } */
+/* { dg-additional-options "-mavx2" { target avx2_runtime } } */
+
+double getdot(int n, const double *x, int inc_x, const double *y)
+{
+  int i, ix = 0;
+  double dot[4] = { 0.0, 0.0, 0.0, 0.0 } ;
+
+  for(i = 0; i < n; i++) {
+      dot[0] += x[ix] * y[ix] ;
+      dot[1] += x[ix+1] * y[ix+1] ;
+      dot[2] += x[ix] * y[ix+1] ;
+      dot[3] += x[ix+1] * y[ix] ;
+      ix += inc_x ;
+  }
+
+  return dot[0] + dot[1] + dot[2] + dot[3];
+}
+
+int main()
+{
+  double x[2] = {0, 0}, y[2] = {0, 0};
+  if (getdot(1, x, 4096*4096, y) != 0.)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 5485da58b38..8f8deaf82bc 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9235,6 +9235,7 @@ vectorizable_load (vec_info *vinfo,
       unsigned int group_el = 0;
       unsigned HOST_WIDE_INT
         elsz = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
+      unsigned int n_groups = 0;
       for (j = 0; j < ncopies; j++)
         {
           if (nloads > 1)
@@ -9256,12 +9257,19 @@ vectorizable_load (vec_info *vinfo,
               if (! slp
                   || group_el == group_size)
                 {
-                  tree newoff = copy_ssa_name (running_off);
-                  gimple *incr = gimple_build_assign (newoff, POINTER_PLUS_EXPR,
-                                                      running_off, stride_step);
-                  vect_finish_stmt_generation (vinfo, stmt_info, incr, gsi);
-
-                  running_off = newoff;
+                  n_groups++;
+                  /* When doing SLP make sure to not load elements from
+                     the next vector iteration, those will not be accessed
+                     so just use the last element again.  See PR107451.  */
+                  if (!slp || known_lt (n_groups, vf))
+                    {
+                      tree newoff = copy_ssa_name (running_off);
+                      gimple *incr
+                        = gimple_build_assign (newoff, POINTER_PLUS_EXPR,
+                                               running_off, stride_step);
+                      vect_finish_stmt_generation (vinfo, stmt_info, incr, gsi);
+                      running_off = newoff;
+                    }
                   group_el = 0;
                 }
             }
-- 
2.35.3
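
For illustration only, the effect of the n_groups check can be modelled
outside of GCC.  The sketch below is not vectorizer code: model_offsets
and its parameters (lanes, with_fix, offs) are hypothetical names, and
it assumes offsets are counted in elements rather than bytes.  It
records which element offset each lane of one load vector reads, with
and without the guard on advancing running_off.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not GCC code.  Fill OFFS[0..LANES-1] with the
   element offset each lane of one load vector would read.  GROUP_SIZE
   is the SLP group size, VF the number of groups one vector iteration
   covers and STRIDE the distance between groups, in elements.  When
   WITH_FIX is nonzero the running offset stops advancing once VF
   groups were started, so the last group is repeated.  */
static void
model_offsets (size_t lanes, size_t group_size, size_t vf,
	       long stride, int with_fix, long *offs)
{
  long running_off = 0;
  size_t group_el = 0;
  size_t n_groups = 0;

  assert (group_size > 0);
  for (size_t i = 0; i < lanes; i++)
    {
      offs[i] = running_off + (long) group_el;
      if (++group_el == group_size)
	{
	  n_groups++;
	  /* Models the guard added by the patch.  */
	  if (!with_fix || n_groups < vf)
	    running_off += stride;
	  group_el = 0;
	}
    }
}
```

With lanes = 4, group_size = 2 and vf = 1, roughly the shape of the
testcase when built with AVX2, the unguarded variant reads offsets
0, 1, stride, stride + 1, while the guarded one reads 0, 1, 0, 1 and
stays inside the two-element arrays.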