From: Jeff Law <jlaw@ventanamicro.com>
To: Xi Ruoyao <xry111@xry111.site>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Jivan Hakobyan <jivanhakobyan9@gmail.com>
Subject: Re: [RFA] New pass for sign/zero extension elimination
Date: Sun, 19 Nov 2023 19:52:29 -0700 [thread overview]
Message-ID: <4c3588b1-f05c-4e7f-ad7a-e7050cf45859@ventanamicro.com> (raw)
In-Reply-To: <ea241b35f33d2383a7892e6462c9d042200cb487.camel@xry111.site>
On 11/19/23 19:23, Xi Ruoyao wrote:
> On Sun, 2023-11-19 at 17:47 -0700, Jeff Law wrote:
>> This is work originally started by Joern @ Embecosm.
>>
>> There's been a long standing sense that we're generating too many
>> sign/zero extensions on the RISC-V port. REE is useful, but it's really
>> focused on a relatively narrow part of the extension problem.
>>
>> What Joern's patch does is introduce a new pass which tracks liveness of
>> chunks of pseudo regs. Specifically it tracks bits 0..7, 8..15, 16..31
>> and 32..63.
>>
>> If it encounters a sign/zero extend that sets bits that are never read,
>> then it replaces the sign/zero extension with a narrowing subreg. The
>> narrowing subreg usually gets eliminated by subsequent passes (it's just
>> a copy after all).
>>
>> Jivan has done some analysis and found that it eliminates roughly 1% of
>> the dynamic instruction stream for x264 as well as some redundant
>> extensions in the coremark benchmark (both on rv64). In my own testing
>> as I worked through issues on other architectures I clearly saw it
>> helping in various places within GCC itself or in the testsuite.
>>
>> The basic structure is to first do a fairly standard liveness analysis
>> on the chunks, seeding original state with the liveness data from DF.
>> Once that's stable, we do a final pass to identify the useless
>> extensions and transform them into narrowing subregs.
>>
>> A few key points to remember.
>>
>> For destination processing it is always safe to ignore a destination.
>> Ignoring a destination merely means that whatever was live after the
>> given insn will continue to be live before the insn. What is not safe
>> is to clear a bit in the LIVENOW bitmap for a destination chunk that is
>> not set. This comes into play with things like STRICT_LOW_PART.
>>
>> For source processing the safe thing to do is to set all the chunks in a
>> register as live. It is never safe to fail to process a source operand.
>>
>> When a destination object is not fully live, we try to transfer that
>> limited liveness to the source operands. So for example if bits 16..63
>> are dead in a destination of a PLUS, we need not mark bits 16..63 as
>> live for the source operands. We have to be careful -- consider a shift
>> count on a target without SHIFT_COUNT_TRUNCATED set. So we have both a
>> list of RTL codes where we can transfer liveness and a few codes where
>> one of the operands may need to be fully live (ex, a shift count) while
>> the other input may not need to be fully live (value left shifted).
>>
>> Locally we have had this enabled at -O1 and above to encourage testing,
>> but I'm thinking that for the trunk enabling at -O2 and above is the
>> right thing to do.
>>
>> This has (of course) been tested on rv64. It's also been bootstrapped
>> and regression tested on x86. Bootstrap and regression tested (C only)
>> for m68k, sh4, sh4eb, alpha. Earlier versions were also bootstrapped
>> and regression tested on ppc, hppa and s390x (C only for those as well).
>> It's also been tested on the various crosses in my tester. So we've
>> got reasonable coverage of 16, 32 and 64 bit targets, big and little
>> endian, with and without SHIFT_COUNT_TRUNCATED and all kinds of other
>> oddities.
>>
>> The included tests are for RISC-V only because not all targets are going
>> to have extraneous extensions. There's tests from coremark, x264 and
>> GCC's bz database. It probably wouldn't be hard to add aarch64
>> testscases. The BZs listed are improved by this patch for aarch64.
>>
>> Given the amount of work Jivan and I have done, I'm not comfortable
>> self-approving at this time. I'd much rather have another set of eyes
>> on the code. Hopefully the code is documented well enough for that to
>> be useful exercise.
>>
>> So, no need to work from Pago Pago for this patch. I may make another
>> attempt at the eswin conditional move work while working virtually in
>> Pago Pago though.
>>
>> Thoughts, comments, recommendations?
>
> Unfortunately, I get some ICE building stage 1 libgcc with this patch on
> loongarch64-linux-gnu:
>
> during RTL pass: ext_dce
> ../../../gcc/libgcc/libgcc2.c: In function ‘__absvdi2’:
> ../../../gcc/libgcc/libgcc2.c:224:1: internal compiler error: Segmentation fault
> 224 | }
> | ^
> 0x120baa477 crash_signal
> ../../gcc/gcc/toplev.cc:316
> 0x1216aeeb4 ext_dce_process_sets
> ../../gcc/gcc/ext-dce.cc:128
> 0x1216afbaf ext_dce_process_bb
> ../../gcc/gcc/ext-dce.cc:647
> 0x1216afbaf ext_dce
> ../../gcc/gcc/ext-dce.cc:802
> 0x1216afbaf execute
> ../../gcc/gcc/ext-dce.cc:868
> Please submit a full bug report, with preprocessed source (by using -freport-bug).
> Please include the complete backtrace with any bug report.
> See <https://gcc.gnu.org/bugs/> for instructions.
I think I know what's going on here.
jeff
next prev parent reply other threads:[~2023-11-20 2:52 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-20 0:47 Jeff Law
2023-11-20 1:22 ` Oleg Endo
2023-11-20 2:51 ` Jeff Law
2023-11-20 2:57 ` Oleg Endo
2023-11-20 2:23 ` Xi Ruoyao
2023-11-20 2:46 ` Jeff Law
2023-11-20 2:52 ` Jeff Law [this message]
2023-11-20 3:32 ` Xi Ruoyao
2023-11-20 3:48 ` Jeff Law
2023-11-20 18:26 ` Richard Sandiford
2023-11-22 17:59 ` Jeff Law
2023-11-27 20:15 ` Richard Sandiford
2023-11-20 18:56 ` Dimitar Dimitrov
2023-11-22 22:23 ` Jeff Law
2023-11-26 16:42 ` rep.dot.nop
2023-11-27 16:14 ` Jeff Law
2023-11-27 11:30 ` Andrew Stubbs
2023-11-27 16:16 ` Jeff Law
2023-12-01 1:08 ` Hans-Peter Nilsson
2023-12-01 15:09 ` Jeff Law
2023-12-01 16:17 ` Hans-Peter Nilsson
2023-11-27 17:36 Joern Rennecke
2023-11-27 17:57 ` Joern Rennecke
2023-11-27 20:03 ` Richard Sandiford
2023-11-27 20:18 ` Jeff Law
2023-11-28 13:36 ` Joern Rennecke
2023-11-28 14:09 ` Joern Rennecke
2023-11-30 17:33 ` Jeff Law
2023-11-28 13:13 ` Joern Rennecke
2023-11-28 5:50 ` Jeff Law
2023-11-27 18:19 Joern Rennecke
2023-11-28 5:51 ` Jeff Law
2023-11-29 17:37 Joern Rennecke
2023-11-29 19:13 ` Jivan Hakobyan
2023-11-30 15:37 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4c3588b1-f05c-4e7f-ad7a-e7050cf45859@ventanamicro.com \
--to=jlaw@ventanamicro.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=jivanhakobyan9@gmail.com \
--cc=xry111@xry111.site \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).