From: Richard Sandiford <richard.sandiford@arm.com>
To: Jan Hubicka <hubicka@ucw.cz>
Cc: Tamar Christina <tamar.christina@arm.com>,
gcc-patches@gcc.gnu.org, nd@arm.com, Richard.Earnshaw@arm.com,
Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com,
rguenther@suse.de, kubanek0ondrej@gmail.com
Subject: Re: [PATCH]AArch64 fix regexp for live_1.c sve test
Date: Thu, 20 Jul 2023 10:16:17 +0100 [thread overview]
Message-ID: <mptv8efj63i.fsf@arm.com> (raw)
In-Reply-To: <ZLjvmAWOe5Bnk5HX@kam.mff.cuni.cz> (Jan Hubicka's message of "Thu, 20 Jul 2023 10:26:00 +0200")

Jan Hubicka <hubicka@ucw.cz> writes:
>> Tamar Christina <tamar.christina@arm.com> writes:
>> > Hi All,
>> >
>> > The resulting predicate register of a whilelo is not
>> > restricted to the lower half of the predicate register file.
>> >
>> > As such these tests started failing after recent changes
>> > because the whilelo outside the loop is getting assigned p15.
>>
>> It's the whilelo in the loop for me. We go from:
>>
>> .L3:
>>         ld1b    z31.b, p7/z, [x4, x3]
>>         movprfx z30, z31
>>         mul     z30.b, p5/m, z30.b, z29.b
>>         st1b    z30.b, p7, [x4, x3]
>>         mov     p6.b, p7.b
>>         add     x3, x3, x0
>>         whilelo p7.b, w3, w1
>>         b.any   .L3
>>
>> to:
>>
>> .L3:
>>         ld1b    z31.b, p7/z, [x3, x2]
>>         movprfx z29, z31
>>         mul     z29.b, p6/m, z29.b, z30.b
>>         st1b    z29.b, p7, [x3, x2]
>>         add     x2, x2, x0
>>         whilelo p15.b, w2, w1
>>         b.any   .L4
>> [...]
>>         .p2align 2,,3
>> .L4:
>>         mov     p7.b, p15.b
>>         b       .L3
>>
>> This adds an extra (admittedly unconditional) branch to every non-final
>> vector iteration, which seems unfortunate. I don't think we'd see
>> p8-p15 otherwise, since the result of the whilelo is used as a
>> governing predicate by the next iteration of the loop.
>>
>> This happens because the scalar loop is given an 89% chance of iterating.
>> Previously we gave the vector loop an 83.33% chance of iterating, whereas
>> after 061f74c06735e1fa35b910ae we give it a 12% chance. 0.89^16 == 15.50%,
>> so the new probabilities definitely preserve the original probabilities
>> more closely. But for purely heuristic probabilities like these, I'm
>> not sure we should lean so heavily into the idea that the vector
>> latch is unlikely.
>>
>> Honza, Richi, any thoughts? Just wanted to double-check that this
>> was operating as expected before making the tests accept the (arguably)
>> less efficient code. It looks like the commit was more aimed at fixing
>> the profile counts for the epilogues, rather than the main loop.
>
> You are right that we should not scale down static profiles in case they
> are artificially flat.  It is nice to have an actual testcase.
> Old code used to test:
>
> /* Without profile feedback, loops for which we do not know a better estimate
>    are assumed to roll 10 times.  When we unroll such loop, it appears to
>    roll too little, and it may even seem to be cold.  To avoid this, we
>    ensure that the created loop appears to roll at least 5 times (but at
>    most as many times as before unrolling).  Don't do adjustment if profile
>    feedback is present.  */
> if (new_est_niter < 5 && !profile_p)
>   {
>     if (est_niter < 5)
>       new_est_niter = est_niter;
>     else
>       new_est_niter = 5;
>   }
>
> This is not right when profile feedback is around, and also when we
> managed to determine a precise # of iterations at branch prediction time
> and did not cap.
>
> So I replaced it with the test that the adjusted header count is not
> smaller than the preheader edge count.  However this will happily let the
> loop iteration count get close to 0.
>
> It is a bit hard to figure out if a profile is realistic:
>
> Sometimes we test
>   profile_status_for_fn (cfun) != PROFILE_READ
> I am trying to get rid of this test: with LTO, or when a comdat profile
> is lost, we inline together functions with and without a profile.
>
> We can test whether the quality of the loop header count is precise or
> adjusted.  However, by the time the vectorizer is modifying the loop
> profile we have already adjusted it for the initial conditional for the
> profitability threshold and dropped it to GUESSED.  Even with profile
> feedback we do not know the outcome probability of that one (Ondrej
> Kubanek's histograms will help here).
Ah, yeah, hadn't thought about that.
> So I think we want to check whether we have a loop iteration estimate
> recorded (that should be true both for profile feedback and for loops
> with a known trip count), and if so compare it with what the profile
> says; if they more or less match, consider the profile realistic.  This
> needs to be done before the vectorizer starts tampering with the loop.
>
> I will try to make a patch for that.
Thanks!
Richard
Thread overview: 8+ messages
2023-07-18 14:43 Tamar Christina
2023-07-20 5:44 ` Richard Sandiford
2023-07-20 7:20 ` Richard Biener
2023-07-20 9:14 ` Richard Sandiford
2023-07-20 8:26 ` Jan Hubicka
2023-07-20 9:16 ` Richard Sandiford [this message]
2023-07-21 17:10 ` Jan Hubicka
2023-07-21 17:19 ` Richard Sandiford