public inbox for gcc-bugs@sourceware.org
From: "linkw at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/102789] [12 regression] libgomp.c++/simd-3.C fails after r12-4340 for 32 bits
Date: Wed, 20 Oct 2021 06:24:34 +0000
Message-ID: <bug-102789-4-FiCLOdMygF@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-102789-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102789

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bergner at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org,
                   |                            |wschmidt at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #5 from Kewen Lin <linkw at gcc dot gnu.org> ---
As Jakub noted, r12-4340 only exposed a latent bug: even without r12-4340, the
issue still exists with -fvect-cost-model=dynamic. The key is whether the
vectorizer peels for alignment in the prologue.

          unsigned max_allowed_peel
            = param_vect_max_peeling_for_alignment;
          if (flag_vect_cost_model <= VECT_COST_MODEL_CHEAP)
            max_allowed_peel = 0;

With --param vect-max-peeling-for-alignment=14, the peeling is disabled and the
test passes.

I think this is a bug in the vectorizer. I reduced the culprit loop to the
following (also moving the first loop out of the function):

  for (i = n; i < o; i++)
    {
      k += m + 1;
      t = k + p[i];
      s2 += t;
      c[i]++;
    }

We have some temporary storage for the omp clauses, such as:

  int D.3802[16];  // for k
  int D.3800[16];  // for s2
  int D.3799[16];  // for t

After peeling (one prologue), the addresses of k, s2 and t become:

  _187 = prolog_loop_niters.27_88 * 4;
  vectp.37_186 = &D.3802 + _187;
  _213 = prolog_loop_niters.27_88 * 4;
  vectp.46_212 = &D.3799 + _213;
  _222 = prolog_loop_niters.27_88 * 4;
  vectp.48_221 = &D.3800 + _222;

The main vectorized loop body then acts on the biased addresses, which is wrong:

  vect__61.49_223 = MEM <vector(4) int> [(int *)vectp.48_221];
  vectp.48_224 = vectp.48_221 + 16;
  vect__61.50_225 = MEM <vector(4) int> [(int *)vectp.48_224];
  vectp.48_226 = vectp.48_221 + 32;
  vect__61.51_227 = MEM <vector(4) int> [(int *)vectp.48_226];
  vectp.48_228 = vectp.48_221 + 48;
  vect__61.52_229 = MEM <vector(4) int> [(int *)vectp.48_228];
  _61 = D.3800[_56];

  vect__62.53_230 = vect__59.44_208 + vect__61.49_223;
  vect__62.53_231 = vect__59.44_209 + vect__61.50_225;
  vect__62.53_232 = vect__59.44_210 + vect__61.51_227;
  vect__62.53_233 = vect__59.44_211 + vect__61.52_229;
  _62 = _59 + _61;

  MEM <vector(4) int> [(int *)vectp.55_234] = vect__62.53_230;
  vectp.55_237 = vectp.55_234 + 16;
  MEM <vector(4) int> [(int *)vectp.55_237] = vect__62.53_231;
  vectp.55_239 = vectp.55_234 + 32;
  MEM <vector(4) int> [(int *)vectp.55_239] = vect__62.53_232;
  vectp.55_241 = vectp.55_234 + 48;
  MEM <vector(4) int> [(int *)vectp.55_241] = vect__62.53_233;


A possible fix is to avoid biasing the address for this kind of DR, which refers
to omp-clause-specific storage. These DRs are mainly used in the main loop
(lanes?). In this case the storage is for a reduction: the prologue uses
element 0, and the epilogue uses either the last element or applies reduc_op to
all elements, depending on the type. The small fix below makes the test pass:

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 4988c93fdb6..a447f457f93 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -1820,7 +1820,7 @@ vect_update_inits_of_drs (loop_vec_info loop_vinfo, tree niters,
   FOR_EACH_VEC_ELT (datarefs, i, dr)
     {
       dr_vec_info *dr_info = loop_vinfo->lookup_dr (dr);
-      if (!STMT_VINFO_GATHER_SCATTER_P (dr_info->stmt))
+      if (!STMT_VINFO_GATHER_SCATTER_P (dr_info->stmt) && !STMT_VINFO_SIMD_LANE_ACCESS_P (dr_info->stmt))
        vect_update_init_of_dr (dr_info, niters, code);
     }
 }

I have not yet looked into the meaning of the different values (1, 2, 3, 4) of
STMT_VINFO_SIMD_LANE_ACCESS_P (stmt_info); they seem to correspond to different
omp clauses? The fix above assumes that whenever
STMT_VINFO_SIMD_LANE_ACCESS_P > 0, the related DR is used mainly in the
vectorized loop body, so it needs no update for the prologue. I'm going to run
broader testing to see whether more restrictions are needed.
