public inbox for gcc-bugs@sourceware.org help / color / mirror / Atom feed
From: "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org> To: gcc-bugs@gcc.gnu.org Subject: [Bug target/106346] [11/12/13 Regression] Potential regression on vectorization of left shift with constants since r11-5160-g9fc9573f9a5e94 Date: Wed, 27 Jul 2022 08:50:00 +0000 [thread overview] Message-ID: <bug-106346-4-LsgLU75Dwn@http.gcc.gnu.org/bugzilla/> (raw) In-Reply-To: <bug-106346-4@http.gcc.gnu.org/bugzilla/> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106346 --- Comment #7 from rguenther at suse dot de <rguenther at suse dot de> --- On Wed, 27 Jul 2022, tnfchris at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106346 > > --- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> --- > (In reply to Richard Biener from comment #5) > > (In reply to Tamar Christina from comment #4) > > > I believe the problem is actually g:27842e2a1eb26a7eae80b8efd98fb8c8bd74a68e > > > > > > We added an optab for the widening left shift pattern there however the > > > operation requires a uniform shift constant to work. See > > > https://godbolt.org/z/4hqKc69Ke > > > > > > The existing pattern that deals with this is vect_recog_widen_shift_pattern > > > which is a scalar pattern. during build_slp it validates that constants are > > > the same and when they're not it aborts SLP. This is why we lose > > > vectorization. Eventually we hit V4HI for which we have no widening shift > > > optab for and it vectorizes using that low VF. > > > > > > This example shows a number of things wrong: > > > > > > 1. The generic costing seems off, this sequence shouldn't have been > > > generated, as a vector sequence it's more inefficient than the scalar > > > sequence. Using -mcpu=neover-n1 or any other costing structure correctly > > > only gives scalar. > > > > > > 2. vect_recog_widen_shift_pattern is implemented in the wrong place. It > > > predates the existence of the SLP pattern matcher. Because of the uniform > > > requirements it's better to use the SLP pattern matcher where we have access > > > to all the constants to decide whether the pattern is a match or not. That > > > way we don't abort SLP. Are you ok with this as a fix Richi? > > > > patterns are difficult beasts - I think vect_recog_widen_shift_pattern is > > at the correct place but instead what is lacking is SLP discovery support > > for scrapping it - that is, ideally the vectorizer would take patterns as > > a hint and ignore them when they are not helpful. > > Hmm, yes but the problem is that we've consumed additional related statements > which now need to be handled by build_slp as well. I suppose you could do an > in-place build_slp on the pattern stmt seq iterator. Though that seems like > undoing a mistake. Yes. But it would be actually undoing a mistake, so ... ;) The issue is of course that we have many ways to vectorize things and patterns just add to that - and SLP discovery needs to somehow pick one and we stick to it (and cost based selection is only done on the vector mode choice). I'm not sure what's the best thing to do in the very end - for patterns one could think of adding alternate (sub-)SLP-graph overlays instead of forking the whole SLP graph for every combination of pattern/non-pattern which would surely explode. At analysis time one could choose the locally cheapest variant assuming the local choice does not influence analysis of the rest of the graph (which some widening patterns do after all, at least in the loop case by influencing the VF). > > > > Now - in theory, for SLP vectorization, all patterns could be handled > > as SLP patterns and scalar patterns disabled. But that isn't easy to > > do either. > > As long as we still have the non-SLP loop vect it's probably not a good idea > no? > since we then lose all patterns for it. The widening shift was already > sufficiently > limited that it wouldn't really regress here. Well, we'd need both obviously. But yes, if we get to the point (sorry for it taking forever) that we have SLP data structures for everything we could shelve the stmt based patterns. > > > > I fear to fight this regression the easiest route is to pretend the > > ISA can do widen shift by vector and fixup in the expander ... > > I can do this, but we're hiding the cost then. Or did you want me to fudge the > numbers in the costing hooks? Yes. > > > > > 3. The epilogue costing seems off.. > > > > > > This example https://godbolt.org/z/YoPcWv6Td ends up generating an > > > exceptionally high epilogue cost and so thinks vectorization at the higher > > > VF is not profitable. > > > > > > *src1_18(D) 1 times vec_to_scalar costs 2 in epilogue > > > MEM[(uint16_t *)src1_18(D) + 2B] 1 times vec_to_scalar costs 2 in epilogue > > > MEM[(uint16_t *)src1_18(D) + 4B] 1 times vec_to_scalar costs 2 in epilogue > > > MEM[(uint16_t *)src1_18(D) + 6B] 1 times vec_to_scalar costs 2 in epilogue > > > MEM[(uint16_t *)src1_18(D) + 8B] 1 times vec_to_scalar costs 2 in epilogue > > > MEM[(uint16_t *)src1_18(D) + 10B] 1 times vec_to_scalar costs 2 in epilogue > > > MEM[(uint16_t *)src1_18(D) + 12B] 1 times vec_to_scalar costs 2 in epilogue > > > MEM[(uint16_t *)src1_18(D) + 14B] 1 times vec_to_scalar costs 2 in epilogue > > > /app/example.c:16:12: note: Cost model analysis for part in loop 0: > > > Vector cost: 23 > > > Scalar cost: 17 > > > > I don't see any epilogue cost - the example doesn't have a loop. With BB > > vect you could see no epilogue costs? > > That was my expectation, but see e.g. https://godbolt.org/z/MGEMYEe86 the SLP > shows the > above output. I don't understand where the vec_to_scalar costs come from. > > > > > > For some reason it thinks it needs a scalar epilogue? using > > > -fno-vect-cost-model gives the desired codegen. > >
next prev parent reply other threads:[~2022-07-27 8:50 UTC|newest] Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-07-18 18:28 [Bug tree-optimization/106346] New: Potential regression on vectorization of left shift with constants manolis.tsamis at vrull dot eu 2022-07-19 6:52 ` [Bug target/106346] " rguenth at gcc dot gnu.org 2022-07-19 13:09 ` manolis.tsamis at vrull dot eu 2022-07-22 10:11 ` [Bug target/106346] Potential regression on vectorization of left shift with constants since r11-5160-g9fc9573f9a5e94 marxin at gcc dot gnu.org 2022-07-22 11:00 ` tnfchris at gcc dot gnu.org 2022-07-27 7:27 ` [Bug target/106346] [11/12/13 Regression] " tnfchris at gcc dot gnu.org 2022-07-27 7:48 ` rguenth at gcc dot gnu.org 2022-07-27 8:26 ` tnfchris at gcc dot gnu.org 2022-07-27 8:50 ` rguenther at suse dot de [this message] 2023-07-31 15:30 ` [Bug target/106346] [11/12/13/14 " tnfchris at gcc dot gnu.org 2023-08-04 12:50 ` cvs-commit at gcc dot gnu.org 2023-08-04 13:58 ` tnfchris at gcc dot gnu.org
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=bug-106346-4-LsgLU75Dwn@http.gcc.gnu.org/bugzilla/ \ --to=gcc-bugzilla@gcc.gnu.org \ --cc=gcc-bugs@gcc.gnu.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).