From: Jeff Law <jeffreyalaw@gmail.com>
To: Vineet Gupta <vineetg@rivosinc.com>, gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, Palmer Dabbelt <palmer@rivosinc.com>,
gnu-toolchain@rivosinc.com, Robin Dapp <rdapp.gcc@gmail.com>
Subject: Re: [gcc-15 1/3] RISC-V: avoid LUI based const materialization ... [part of PR/106265]
Date: Fri, 22 Mar 2024 23:59:50 -0600 [thread overview]
Message-ID: <db0caf50-0f94-40a4-bbfb-b021b31fbebd@gmail.com> (raw)
In-Reply-To: <63222471-d4a7-4c37-bc7a-75d83f13ddde@rivosinc.com>
On 3/18/24 6:07 PM, Vineet Gupta wrote:
>
> Naive question: why is define_split more natural vs. define_insn_and_split.
> Is it because of the syntax/implementation or that one is more Combine
> "centric" and other is more of a Split* Pass thing (roughly speaking of
> course) or something else altogether ?
There are parts of combine that cost based on the number of insns. This
is primarily to keep combine from taking an insane amount of time.
So when we have a little white lie like mvconst_internal we muck up that
costing aspect of the combiner. That in turn stops the combiner from
doing certain combinations.
As a concrete example consider this:
unsigned long long w32mem(unsigned long long w32)
{
return w32 & ~(1U << 0);
}
Right now we use a bseti+addi to construct the constant, then do the
logical and. Prior to the combiner it looks like this:
> (insn 7 3 8 2 (set (reg:DI 137)
> (const_int 4294967294 [0xfffffffe])) "j.c":3:16 206 {*mvconst_internal}
> (nil))
> (insn 8 7 13 2 (set (reg:DI 136 [ _2 ])
> (and:DI (reg/v:DI 135 [ w32 ])
> (reg:DI 137))) "j.c":3:16 102 {*anddi3}
> (expr_list:REG_DEAD (reg:DI 137)
> (expr_list:REG_DEAD (reg/v:DI 135 [ w32 ])
> (expr_list:REG_EQUAL (and:DI (reg/v:DI 135 [ w32 ])
> (const_int 4294967294 [0xfffffffe]))
> (nil)))))
The combiner will refuse to match a splitter where the number of
incoming insns matches the number of resulting insns. So to match this
we'd have to write another define_insn_and_split.
If we didn't have mvconst_internal, then we'd see something like this:
> (insn 6 3 7 2 (set (reg:DI 138)
> (const_int 4294967296 [0x100000000])) "j.c":3:16 -1
> (nil))
> (insn 7 6 8 2 (set (reg:DI 137)
> (plus:DI (reg:DI 138)
> (const_int -2 [0xfffffffffffffffe]))) "j.c":3:16 -1
> (expr_list:REG_EQUAL (const_int 4294967294 [0xfffffffe])
> (nil)))
> (insn 8 7 9 2 (set (reg:DI 136 [ _2 ])
> (and:DI (reg/v:DI 135 [ w32 ])
> (reg:DI 137))) "j.c":3:16 -1
> (nil))
Which we could match with a define_split which would generate RTL for
bclri+zext.w.
Note that I suspect there's a handful of these kinds of sequences for
logical ops where the immediate doesn't fit a simm12.
Of course the point of the white lie is to expose complex constant
synthesis in away that the combiner can use to simplify things. It's
definitely a tradeoff either way. What's been rattling around a little
bit would be to narrow the set of constants allowed by mvconst_internal
to those which require 3 or more to synthesize. THe idea being cases
like you're looking where we can use addi+add rather than lui+addi+add
would be rejected by mvconst_internal, but the more complex constants
that push combine over the 4 insn limit would be allowed by
mvconst_internal.
>
> Would we have to revisit the new splitter (and perhaps audit the
> existing define_insn_and_split patterns) if we were to go ahead with
> this revert ?
I don't recall follow-ups which used the result of mvconst_internal in
subsequent combiner patterns, but it should be fairly simple to search
for them.
We'd need to look back at the original bug which resulted in creating
the mvconst_internal pattern. My recollection is it had a fairly
complex constant and we were unable to see enough insns to do the
necessary simplification to fix that case.
I bet whatever goes on inside perlbench, mcf and x264 (guessing based
on instruction counts in your message) + the original bug report will
provide reasonable testcases for evaluating reversion + adding patches
to avoid the regressions.
>> So why use "P" for your mode iterator? I would have expected GPR since
>> I think this works for both SI and DI mode as-is.
>
> My intent at least was to have this work for either of rv64/rv32 and in
> either of those environments, work for both SI or DI - certainly
> rv64+{SI,DI}.
> To that end I debated between GPR, X and P.
> It seems GPR only supports DI if TARGET_64BIT.
> But I could well be wrong about above requirements or the subtle
> differences in these.
I wouldn't worry about GPR only support DI for TARGET_64BIT. I don't
think we generally expose DImode patterns for TARGET_32BIT.
>
>> For the output template "#" for alternative 0 and the add instruction
>> for alternative 1 is probably better. That way it's clear to everyone
>> that alternative 0 is always meant to be split.
>
> OK.
>
>> Don't you need some kind of condition on the split to avoid splitting
>> when you've got a register for operands[2]?
>
> Won't the predicate "match_code" guarantee that ?
>
> (define_predicate "const_two_s12"
> (match_code "const_int")
> {
> return SUM_OF_TWO_S12 (INTVAL (op), false);
> })
You're right. Missed the match_code. Sorry for the bad steer.
>
> Do I need to build gcc in a certain way to expose such errors - I wasn't
> able to trigger such an error for the entire testsuite or SPEC build.
There's a distinct RTL checking mode. It's fairly expensive from a
compile-time standpoint though.
Jeff
next prev parent reply other threads:[~2024-03-23 5:59 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-16 17:35 [gcc-15 0/3] RISC-V improve stack/array access by constant mat tweak Vineet Gupta
2024-03-16 17:35 ` [gcc-15 1/3] RISC-V: avoid LUI based const materialization ... [part of PR/106265] Vineet Gupta
2024-03-16 20:28 ` Jeff Law
2024-03-19 0:07 ` Vineet Gupta
2024-03-23 5:59 ` Jeff Law [this message]
2024-03-16 17:35 ` [gcc-15 2/3] RISC-V: avoid LUI based const mat: keep stack offsets aligned Vineet Gupta
2024-03-16 20:21 ` Jeff Law
2024-03-19 0:27 ` Vineet Gupta
2024-03-19 6:48 ` Andrew Waterman
2024-03-19 13:10 ` Jeff Law
2024-03-19 20:05 ` Vineet Gupta
2024-03-19 20:58 ` Andrew Waterman
2024-03-19 21:17 ` Palmer Dabbelt
2024-03-20 18:57 ` Jeff Law
2024-03-23 6:05 ` Jeff Law
2024-03-16 17:35 ` [gcc-15 3/3] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733] Vineet Gupta
2024-03-16 20:27 ` Jeff Law
2024-03-19 4:41 ` [gcc-15 0/3] RISC-V improve stack/array access by constant mat tweak Jeff Law
2024-03-21 0:45 ` Vineet Gupta
2024-03-21 14:36 ` scheduler queue flush (was Re: [gcc-15 0/3] RISC-V improve stack/array access by constant mat tweak) Vineet Gupta
2024-03-21 14:45 ` Jeff Law
2024-03-21 17:19 ` Vineet Gupta
2024-03-21 19:56 ` Jeff Law
2024-03-22 0:34 ` scheduler queue flush Vineet Gupta
2024-03-22 8:47 ` scheduler queue flush (was Re: [gcc-15 0/3] RISC-V improve stack/array access by constant mat tweak) Richard Biener
2024-03-22 12:29 ` Jeff Law
2024-03-22 16:56 ` Vineet Gupta
2024-03-25 3:05 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=db0caf50-0f94-40a4-bbfb-b021b31fbebd@gmail.com \
--to=jeffreyalaw@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=gnu-toolchain@rivosinc.com \
--cc=kito.cheng@gmail.com \
--cc=palmer@rivosinc.com \
--cc=rdapp.gcc@gmail.com \
--cc=vineetg@rivosinc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).