From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1666)
	id 18D1038582AB; Wed, 15 Mar 2023 09:47:47 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 18D1038582AB
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1678873667;
	bh=jmK0TJzoBaOYYUkjr2kARRlz3v/2AFmuEl1pgKNWRvE=;
	h=From:To:Subject:Date:From;
	b=Yy2/URkvfT86iChremWH6FXJcEVHumlVcf8uo2w63dpdGI13ZMkXswLYsnF9Mzbhv
	 zVd1IY5IUcg0VM+1TPDGRzkV3nGo03JOjoMyspdbAn/SO2GGRZAICcvxDfkTtBNUfx
	 fQhtAQQWV6XHBMxOtQokgn4t5jnF/cGacAyB0yA0=
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
From: Richard Biener
To: gcc-cvs@gcc.gnu.org
Subject: [gcc r12-9255] tree-optimization/107451 - SLP load vectorization issue
X-Act-Checkin: gcc
X-Git-Author: Richard Biener
X-Git-Refname: refs/heads/releases/gcc-12
X-Git-Oldrev: 97d599e09b0fd389a7cbac8867e56977ec97900f
X-Git-Newrev: c722c6b061a5e909267eae53ffe5910fbe0a7d5e
Message-Id: <20230315094747.18D1038582AB@sourceware.org>
Date: Wed, 15 Mar 2023 09:47:47 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:c722c6b061a5e909267eae53ffe5910fbe0a7d5e

commit r12-9255-gc722c6b061a5e909267eae53ffe5910fbe0a7d5e
Author: Richard Biener
Date:   Thu Dec 22 09:36:17 2022 +0100

    tree-optimization/107451 - SLP load vectorization issue

    When vectorizing SLP loads with permutations we can access excess
    elements when the load vector type is bigger than the group size
    and the vectorization factor covers less groups than necessary
    to fill it.  Since we know the code will only access up to
    group_size * VF elements in the unpermuted vector we can simply
    fill the rest of the vector with whatever we want.  For simplicity
    this patch chooses to repeat the last group.

            PR tree-optimization/107451
            * tree-vect-stmts.cc (vectorizable_load): Avoid loading
            SLP group members from group numbers in excess of the
            vectorization factor.

            * gcc.dg/torture/pr107451.c: New testcase.

    (cherry picked from commit 7b2cf5041460859ca4f58e5da1308b7ef9129d8b)

Diff:
---
 gcc/testsuite/gcc.dg/torture/pr107451.c | 27 +++++++++++++++++++++++++++
 gcc/tree-vect-stmts.cc                  | 20 ++++++++++++++------
 2 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr107451.c b/gcc/testsuite/gcc.dg/torture/pr107451.c
new file mode 100644
index 00000000000..a17574c6896
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr107451.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-additional-options "-ftree-vectorize -fno-vect-cost-model" } */
+/* { dg-additional-options "-mavx2" { target avx2_runtime } } */
+
+double getdot(int n, const double *x, int inc_x, const double *y)
+{
+  int i, ix = 0;
+  double dot[4] = { 0.0, 0.0, 0.0, 0.0 } ;
+
+  for(i = 0; i < n; i++) {
+      dot[0] += x[ix] * y[ix] ;
+      dot[1] += x[ix+1] * y[ix+1] ;
+      dot[2] += x[ix] * y[ix+1] ;
+      dot[3] += x[ix+1] * y[ix] ;
+      ix += inc_x ;
+  }
+
+  return dot[0] + dot[1] + dot[2] + dot[3];
+}
+
+int main()
+{
+  double x[2] = {0, 0}, y[2] = {0, 0};
+  if (getdot(1, x, 4096*4096, y) != 0.)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 4c5d20a0e2c..2498948fad2 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -9202,6 +9202,7 @@ vectorizable_load (vec_info *vinfo,
       unsigned int group_el = 0;
       unsigned HOST_WIDE_INT
         elsz = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
+      unsigned int n_groups = 0;
       for (j = 0; j < ncopies; j++)
         {
           if (nloads > 1)
@@ -9223,12 +9224,19 @@ vectorizable_load (vec_info *vinfo,
               if (! slp
                   || group_el == group_size)
                 {
-                  tree newoff = copy_ssa_name (running_off);
-                  gimple *incr = gimple_build_assign (newoff, POINTER_PLUS_EXPR,
-                                                      running_off, stride_step);
-                  vect_finish_stmt_generation (vinfo, stmt_info, incr, gsi);
-
-                  running_off = newoff;
+                  n_groups++;
+                  /* When doing SLP make sure to not load elements from
+                     the next vector iteration, those will not be accessed
+                     so just use the last element again.  See PR107451.  */
+                  if (!slp || known_lt (n_groups, vf))
+                    {
+                      tree newoff = copy_ssa_name (running_off);
+                      gimple *incr
+                        = gimple_build_assign (newoff, POINTER_PLUS_EXPR,
+                                               running_off, stride_step);
+                      vect_finish_stmt_generation (vinfo, stmt_info, incr, gsi);
+                      running_off = newoff;
+                    }
                   group_el = 0;
                 }
             }
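
The offset bookkeeping that the tree-vect-stmts.cc hunk changes can be
illustrated outside of GCC. The following is only a rough sketch, not
the vectorizer's code: group_offsets, lanes, clamp and out are
hypothetical names invented for this illustration, and plain size_t
arithmetic stands in for the POINTER_PLUS_EXPR statements the
vectorizer emits. It records the base element offset of each group
loaded to fill one vector, with and without the n_groups < vf check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the strided SLP load bookkeeping (not GCC code).
   Fills one vector of `lanes` elements from groups of `group_size`
   elements that are `stride` elements apart, and stores the base offset
   of each group that gets loaded into `out`.  Returns the group count.
   With clamp != 0 the offset only advances while n_groups < vf, which
   mirrors the check added by the patch.  */
static size_t group_offsets(size_t lanes, size_t group_size, size_t vf,
                            size_t stride, int clamp, size_t *out)
{
    size_t running_off = 0; /* base offset of the current group */
    size_t n_groups = 0;    /* groups completed so far */
    size_t group_el = 0;    /* element index inside the current group */
    size_t count = 0;

    assert(group_size > 0 && lanes % group_size == 0);

    for (size_t i = 0; i < lanes; i++) {
        if (group_el == 0)
            out[count++] = running_off;
        group_el++;
        if (group_el == group_size) {
            n_groups++;
            /* Without the clamp we always step to the next group, even
               when that group belongs to a later vector iteration.  */
            if (!clamp || n_groups < vf)
                running_off += stride;
            group_el = 0;
        }
    }
    return count;
}
```

Assuming the testcase's shape (groups of two doubles, a four-element
vector, a vectorization factor of 1 and a stride of 4096*4096), calling
group_offsets(4, 2, 1, 16777216, 0, out) yields the offsets 0 and
16777216, so the second group would be read far outside the two-element
arrays. With clamp set to 1 the same call yields 0 and 0, i.e. the last
valid group is repeated, which matches what the commit message describes.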