Re: Redundant constants in coremark crc8 for RISCV/aarch64 (no-if-conversion)

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

From: Jeff Law <jeffreyalaw@gmail.com>
To: Vineet Gupta <vineetg@rivosinc.com>, gcc@gcc.gnu.org
Cc: Kito Cheng <kito.cheng@sifive.com>,
	Philipp Tomsich <philipp.tomsich@vrull.eu>
Subject: Re: Redundant constants in coremark crc8 for RISCV/aarch64 (no-if-conversion)
Date: Tue, 18 Oct 2022 17:36:00 -0600	[thread overview]
Message-ID: <53dcbef4-7aef-5f63-9bd8-e11c614b0be8@gmail.com> (raw)
In-Reply-To: <1e118c0c-5d9a-4fca-9fe9-12e2baa34019@rivosinc.com>

On 10/18/22 15:51, Vineet Gupta wrote:
>
>
>
>> Where BB4 corresponds to .L2 and BB6 corresponds to .L3. Evaluation 
>> of the constants occurs in BB3 and BB5.
>
> And Evaluation here means use of the constant (vs. definition ?).

In this case, use of the constant.

>
>> PRE/GCSE is better suited for this scenario, but it has a critical 
>> constraint.  In particular our PRE formulation is never allowed to 
>> put an evaluation of an expression on a path that didn't have one 
>> before. So
>> while there clearly a redundancy on the path 2->3->4->5 (BB3 and 
>> BB5), there is nowhere we could put an evaluation that would reduce 
>> the number of evaluation on that path without introducing an 
>> evaluation on paths that didn't have one.  So consider 2->4->6.  On 
>> that path there are zero evaluations.  So we can't place an eval in 
>> BB2 because that will cause evaluations on 2->4->6 which didn't have 
>> any evaluations.
>
> OK. How does PRE calculate all possible paths to consider: say your 
> example 2-3-4-5 and 2-4-6 ? Is that just indicative or would actually 
> be the one PRE calculates for this case. Would there be more ?

PRE has a series of dataflow equations it solves which gives it the 
answer to that question.  The one that computes this property is usually 
called anticipated.  Given some block B in a graph G. An expression is 
anticipated at B when the expression is guaranteed to be computed if we 
reach B.  That doesn't mean the evaluation must happen in B, just that 
evaluation at some point is guaranteed if we reach B.

If an expression is not anticipated in a block, then PRE will not insert 
in that block since doing so would add evaluations on paths where they 
did not previously have any.

>
>> There isn't a great place in GCC to handle this right now.  If the 
>> constraints were relaxed in PRE, then we'd have a chance, but getting 
>> the cost model right is going to be tough.
>
> It would have been better (for this specific case) if loop unrolling 
> was not being done so early. The tree pass cunroll is flattening it 
> out and leaving for rest of the all tree/rtl passes to pick up the 
> pieces and remove any redundancies, if at all. It obviously needs to 
> be early if we are injecting 7x more instructions, but seems like a 
> lot to unravel.

Yup.  If that loop gets unrolled, it's going to be a mess.  It will 
almost certainly make this problem worse as each iteration is going to 
have a pair of constants loaded and no good way to remove them.

>
> FWIW -fno-unroll-loops only seems to work at -O2. At -O3 it always 
> unrolls. Is that expected ?

The only case I'm immediately aware of where this wouldn't work would be 
if -O3 came after -fno-unroll-oops.

>
> If this seems worthwhile and you have ideas to do this any better, I'd 
> be happy to work on this with some guidance.

I don't see  a great solution here.    Something like Cliff Click's work 
might help, but it's far from a guarantee.  Click's work essentially 
throws away the PRE constraint about never inserting an expression 
evaluation on a path where it didn't exit, along with all kinds of other 
things.  Essentially it's a total reformulation of redundancy elimination.

I did an implementation eons ago in gimple, but never was able to 
convince myself the implementation was correct or that integrating it 
was a good thing.   It's almost certainly going to cause performance 
regressions elsewhere so it may end up doing more harm than good.  I 
don't really know.

https://courses.cs.washington.edu/courses/cse501/06wi/reading/click-pldi95.pdf

Jeff

next prev parent reply	other threads:[~2022-10-18 23:36 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-14 15:56 Vineet Gupta
2022-10-14 16:54 ` Jeff Law
2022-10-18 21:51   ` Vineet Gupta
2022-10-18 23:36     ` Jeff Law [this message]
2022-10-19  2:09       ` Vineet Gupta
2022-10-19  3:42         ` Jeff Law
2022-10-19  7:46           ` Richard Biener
2022-10-19 13:30             ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53dcbef4-7aef-5f63-9bd8-e11c614b0be8@gmail.com \
    --to=jeffreyalaw@gmail.com \
    --cc=gcc@gcc.gnu.org \
    --cc=kito.cheng@sifive.com \
    --cc=philipp.tomsich@vrull.eu \
    --cc=vineetg@rivosinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).