From: Xi Ruoyao <xry111@xry111.site>
To: Jeff Law <jlaw@ventanamicro.com>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Jivan Hakobyan <jivanhakobyan9@gmail.com>
Subject: Re: [RFA] New pass for sign/zero extension elimination
Date: Mon, 20 Nov 2023 10:23:46 +0800
Message-ID: <ea241b35f33d2383a7892e6462c9d042200cb487.camel@xry111.site>
In-Reply-To: <6d5f8ba7-0c60-4789-87ae-68617ce6ac2c@ventanamicro.com>

On Sun, 2023-11-19 at 17:47 -0700, Jeff Law wrote:
> This is work originally started by Joern @ Embecosm.
> 
> There's been a long-standing sense that we're generating too many 
> sign/zero extensions on the RISC-V port.  REE is useful, but it's really 
> focused on a relatively narrow part of the extension problem.
> 
> What Joern's patch does is introduce a new pass which tracks liveness of 
> chunks of pseudo regs.  Specifically it tracks bits 0..7, 8..15, 16..31 
> and 32..63.
> 
> If it encounters a sign/zero extend that sets bits that are never read, 
> then it replaces the sign/zero extension with a narrowing subreg.  The
> narrowing subreg usually gets eliminated by subsequent passes (it's just 
> a copy after all).
> 
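
To make the chunk bookkeeping described above concrete, here is a minimal standalone sketch. It is not code from ext-dce.cc: the 4-bit mask representation and the names `chunks_overlapping` and `extension_is_removable` are hypothetical, chosen only to illustrate the idea that an extension is useless when every chunk it writes is dead afterwards.

```cpp
#include <cassert>
#include <cstdint>

// Chunk layout as described in the mail (the mask encoding is an assumption):
// chunk 0 = bits 0..7, chunk 1 = bits 8..15,
// chunk 2 = bits 16..31, chunk 3 = bits 32..63.
static const unsigned chunk_first_bit[4] = {0, 8, 16, 32};
static const unsigned chunk_last_bit[4] = {7, 15, 31, 63};

// Return a 4-bit mask of the chunks that overlap bits lo..hi (inclusive).
uint8_t chunks_overlapping(unsigned lo, unsigned hi)
{
  uint8_t mask = 0;
  for (int i = 0; i < 4; i++)
    if (chunk_first_bit[i] <= hi && chunk_last_bit[i] >= lo)
      mask |= static_cast<uint8_t>(1u << i);
  return mask;
}

// A sign/zero extension from a src_bits-wide value (8, 16 or 32) only
// determines bits src_bits..63 of a 64-bit destination.  If none of the
// chunks covering those bits is live after the insn, the extension could
// be replaced by a plain narrowing copy.
bool extension_is_removable(uint8_t live_after, unsigned src_bits)
{
  assert(src_bits == 8 || src_bits == 16 || src_bits == 32);
  return (live_after & chunks_overlapping(src_bits, 63)) == 0;
}
```

For instance, with only bits 0..15 live after the insn (mask 0x3), an extension from 16 bits is removable under this model, while an extension from 8 bits is not, because chunk 1 (bits 8..15) is still read.
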
> Jivan has done some analysis and found that it eliminates roughly 1% of 
> the dynamic instruction stream for x264 as well as some redundant 
> extensions in the coremark benchmark (both on rv64).  In my own testing 
> as I worked through issues on other architectures I clearly saw it 
> helping in various places within GCC itself or in the testsuite.
> 
> The basic structure is to first do a fairly standard liveness analysis
> on the chunks, seeding original state with the liveness data from DF. 
> Once that's stable, we do a final pass to identify the useless 
> extensions and transform them into narrowing subregs.
> 
> A few key points to remember.
> 
> For destination processing it is always safe to ignore a destination. 
> Ignoring a destination merely means that whatever was live after the 
> given insn will continue to be live before the insn.  What is not safe
> is to clear a bit in the LIVENOW bitmap for a destination chunk that is 
> not set.  This comes into play with things like STRICT_LOW_PART.
> 
> For source processing the safe thing to do is to set all the chunks in a 
> register as live.  It is never safe to fail to process a source operand.
> 
> When a destination object is not fully live, we try to transfer that 
> limited liveness to the source operands.  So for example if bits 16..63 
> are dead in a destination of a PLUS, we need not mark bits 16..63 as 
> live for the source operands.  We have to be careful -- consider a shift 
> count on a target without SHIFT_COUNT_TRUNCATED set.  So we have both a 
> list of RTL codes where we can transfer liveness and a few codes where
> one of the operands may need to be fully live (e.g., a shift count) while 
> the other input may not need to be fully live (value left shifted).
> 
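
The liveness transfer described above can be sketched as a small per-operation function. This is an illustration under stated assumptions, not the pass's actual logic: the `Op` enum, `SourceLiveness`, `up_to_highest_live` and `transfer` are hypothetical names, and the rule used for PLUS (keep every chunk at or below the highest live destination chunk, since carries flow upward) is my own conservative choice. Unknown cases fall back to "everything live", which the mail describes as the safe default for sources.

```cpp
#include <cstdint>

// Hypothetical subset of operations, for illustration only.
enum class Op { Plus, And, ShiftLeft };

// Live-chunk masks for the two source operands (bit i = chunk i is live).
struct SourceLiveness { uint8_t op0; uint8_t op1; };

static const uint8_t ALL_CHUNKS = 0xF;

// All chunks at or below the highest live chunk in mask (0 if none is live).
uint8_t up_to_highest_live(uint8_t mask)
{
  for (int i = 3; i >= 0; i--)
    if (mask & (1u << i))
      return static_cast<uint8_t>((1u << (i + 1)) - 1);
  return 0;
}

// Given the live chunks of the destination, compute which source chunks
// must be treated as live.
SourceLiveness transfer(Op op, uint8_t dest_live)
{
  switch (op)
    {
    case Op::And:
      // Bitwise: each result bit depends only on the same source bit.
      return {dest_live, dest_live};
    case Op::Plus:
      {
        // Carries propagate from low bits to high bits, never downward.
        uint8_t m = up_to_highest_live(dest_live);
        return {m, m};
      }
    case Op::ShiftLeft:
      // The shifted value only contributes bits at or below the highest
      // live result bit; the shift count is kept fully live.
      return {up_to_highest_live(dest_live), ALL_CHUNKS};
    }
  // Safe fallback: treat every source chunk as live.
  return {ALL_CHUNKS, ALL_CHUNKS};
}
```

With bits 16..63 dead in the destination of a PLUS (mask 0x3), both sources come out as 0x3 here, matching the example in the quoted paragraph.
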
> Locally we have had this enabled at -O1 and above to encourage testing, 
> but I'm thinking that for the trunk enabling at -O2 and above is the 
> right thing to do.
> 
> This has (of course) been tested on rv64.  It's also been bootstrapped
> and regression tested on x86.  Bootstrap and regression tested (C only) 
> for m68k, sh4, sh4eb, alpha.  Earlier versions were also bootstrapped 
> and regression tested on ppc, hppa and s390x (C only for those as well). 
> It's also been tested on the various crosses in my tester.  So we've
> got reasonable coverage of 16, 32 and 64 bit targets, big and little 
> endian, with and without SHIFT_COUNT_TRUNCATED and all kinds of other 
> oddities.
> 
> The included tests are for RISC-V only because not all targets are going 
> to have extraneous extensions.  There are tests from coremark, x264 and
> GCC's bz database.  It probably wouldn't be hard to add aarch64 
> testcases.  The BZs listed are improved by this patch for aarch64.
> 
> Given the amount of work Jivan and I have done, I'm not comfortable 
> self-approving at this time.  I'd much rather have another set of eyes
> on the code.  Hopefully the code is documented well enough for that to
> be a useful exercise.
> 
> So, no need to work from Pago Pago for this patch.  I may make another
> attempt at the eswin conditional move work while working virtually in 
> Pago Pago though.
> 
> Thoughts, comments, recommendations?

Unfortunately, I get an ICE building stage 1 libgcc with this patch on
loongarch64-linux-gnu:

during RTL pass: ext_dce
../../../gcc/libgcc/libgcc2.c: In function ‘__absvdi2’:
../../../gcc/libgcc/libgcc2.c:224:1: internal compiler error: Segmentation fault
  224 | }
      | ^
0x120baa477 crash_signal
	../../gcc/gcc/toplev.cc:316
0x1216aeeb4 ext_dce_process_sets
	../../gcc/gcc/ext-dce.cc:128
0x1216afbaf ext_dce_process_bb
	../../gcc/gcc/ext-dce.cc:647
0x1216afbaf ext_dce
	../../gcc/gcc/ext-dce.cc:802
0x1216afbaf execute
	../../gcc/gcc/ext-dce.cc:868
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University
