From: "Roger Sayle" <roger@nextmovesoftware.com>
To: <gcc-patches@gcc.gnu.org>
Cc: "'Jeff Law'" <jeffreyalaw@gmail.com>, "'YunQiang Su'" <syq@gcc.gnu.org>
Subject: [PATCH] Improved RTL expansion of field assignments into promoted registers.
Date: Thu, 28 Dec 2023 14:59:40 -0000
Message-ID: <005901da399e$7d13b330$773b1990$@nextmovesoftware.com>
This patch fixes PR rtl-optimization/104914 by improving the way that
fields are written into a pseudo register that needs to be kept
sign-extended.
The motivating example from the bugzilla PR is:
extern void ext(int);
void foo(const unsigned char *buf) {
  int val;
  ((unsigned char*)&val)[0] = *buf++;
  ((unsigned char*)&val)[1] = *buf++;
  ((unsigned char*)&val)[2] = *buf++;
  ((unsigned char*)&val)[3] = *buf++;
  if(val > 0)
    ext(1);
  else
    ext(0);
}
which at the end of the tree optimization passes looks like:
void foo (const unsigned char * buf)
{
  int val;
  unsigned char _1;
  unsigned char _2;
  unsigned char _3;
  unsigned char _4;
  int val.5_5;

  <bb 2> [local count: 1073741824]:
  _1 = *buf_7(D);
  MEM[(unsigned char *)&val] = _1;
  _2 = MEM[(const unsigned char *)buf_7(D) + 1B];
  MEM[(unsigned char *)&val + 1B] = _2;
  _3 = MEM[(const unsigned char *)buf_7(D) + 2B];
  MEM[(unsigned char *)&val + 2B] = _3;
  _4 = MEM[(const unsigned char *)buf_7(D) + 3B];
  MEM[(unsigned char *)&val + 3B] = _4;
  val.5_5 = val;
  if (val.5_5 > 0)
    goto <bb 3>; [59.00%]
  else
    goto <bb 4>; [41.00%]

  <bb 3> [local count: 633507681]:
  ext (1);
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 440234144]:
  ext (0);

  <bb 5> [local count: 1073741824]:
  val ={v} {CLOBBER(eol)};
  return;
}
Here four bytes are being sequentially written into the SImode value
val. On some platforms, such as MIPS64, this SImode value is kept in
a 64-bit register, suitably sign-extended. The function expand_assignment
contains logic to handle this via SUBREG_PROMOTED_VAR_P (around line 6264
in expr.cc) which outputs an explicit extension operation after each
store_field (typically insv) to such promoted/extended pseudos.
The first observation is that there's no need to perform sign extension
after each byte in the example above; the extension is only required
after changes to the most significant byte (i.e. to a field that overlaps
the most significant bit).
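
As a rough illustration of this observation (a sketch only, not code from
the patch), the test for "does this field reach the most significant bit"
can be modelled in plain C. The helper names field_reaches_msb and
count_extensions_needed are hypothetical, and the sketch assumes the
little-endian bit numbering of the example, where byte 3 occupies bits
24..31 of the SImode value.

```c
#include <stdbool.h>

/* Hypothetical helper, not a GCC function: true when the bit-field
   [bitpos, bitpos + bitsize) ends exactly at the top bit of a value
   that is mode_bits wide.  */
static bool
field_reaches_msb (unsigned bitpos, unsigned bitsize, unsigned mode_bits)
{
  return bitpos + bitsize == mode_bits;
}

/* Count how many of the four byte stores into a 32-bit value would
   need a following extension under this rule.  */
static unsigned
count_extensions_needed (void)
{
  unsigned n = 0;
  for (unsigned byte = 0; byte < 4; byte++)
    if (field_reaches_msb (byte * 8, 8, 32))
      n++;
  return n;
}
```

Under these assumptions only the store to byte 3 qualifies, so one
extension is needed where four are emitted today.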
The actual bug fix is a bit more subtle: at this point during code
expansion it's not safe to use a SUBREG when sign-extending this
field. Currently, GCC generates (sign_extend:DI (subreg:SI (reg:DI) 0)),
but combine (and other RTL optimizers) later realize that, because SImode
values are always sign-extended in their 64-bit hard registers, this is
a no-op and eliminate it. The trouble is that it's unsafe to refer to
the SImode lowpart of a 64-bit register using SUBREG at those critical
points where the value is temporarily not correctly sign-extended and
the usual backend invariants don't hold. At these critical points,
the middle-end needs to use an explicit TRUNCATE rtx (as this isn't a
TRULY_NOOP_TRUNCATION), so that the explicit sign-extension looks like
(sign_extend:DI (truncate:SI (reg:DI))), which avoids the problem.
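
To make the broken invariant concrete, here is a small C model of the
64-bit register (again only a sketch under stated assumptions, not GCC
code). The function names insert_byte, truncate_then_sign_extend and
is_sign_extended are hypothetical; the first stands in for an insv-style
store and the second for (sign_extend:DI (truncate:SI (reg:DI))).

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of an insv-like store: replace bits [bitpos, bitpos + 8) of
   the 64-bit register and leave every other bit untouched.  */
static uint64_t
insert_byte (uint64_t reg, unsigned bitpos, uint8_t byte)
{
  uint64_t mask = UINT64_C (0xff) << bitpos;
  return (reg & ~mask) | ((uint64_t) byte << bitpos);
}

/* Model of truncating to 32 bits and then sign-extending to 64 bits:
   keep the low 32 bits and replicate bit 31 into bits 32..63.  */
static uint64_t
truncate_then_sign_extend (uint64_t reg)
{
  uint64_t low = reg & UINT64_C (0xffffffff);
  return (low ^ UINT64_C (0x80000000)) - UINT64_C (0x80000000);
}

/* True when the register already holds a correctly sign-extended
   32-bit value.  */
static bool
is_sign_extended (uint64_t reg)
{
  return reg == truncate_then_sign_extend (reg);
}
```

Starting from a register holding 0 and storing 0x80 into bits 24..31
gives 0x0000000080000000 in this model, which is not a valid
sign-extended SImode value; an optimizer that assumes the invariant
already holds would wrongly treat the following extension as a no-op.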
Note that MODE_REP_EXTENDED (NARROW, WIDE) != UNKNOWN implies (or should
imply) !TRULY_NOOP_TRUNCATION (NARROW, WIDE). I have another (independent)
patch for that, which I'll post in a few minutes.
This middle-end patch has been tested on x86_64-pc-linux-gnu with
make bootstrap and make -k check, both with and without
--target_board=unix{-m32} with no new failures. The cc1 from a
cross-compiler to mips64 appears to generate much better code for
the above test case. Ok for mainline?
2023-12-28 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR rtl-optimization/104914
* expr.cc (expand_assignment): When target is SUBREG_PROMOTED_VAR_P
a sign or zero extension is only required if the modified field
overlaps the SUBREG's most significant bit. On MODE_REP_EXTENDED
targets, don't refer to the temporarily incorrectly extended value
using a SUBREG, but instead generate an explicit TRUNCATE rtx.
Thanks in advance,
Roger
--
[-- Attachment #2: patchyq.txt --]
[-- Type: text/plain, Size: 1756 bytes --]
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 9fef2bf6585..1a34b48e38f 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6272,19 +6272,32 @@ expand_assignment (tree to, tree from, bool nontemporal)
&& known_eq (bitpos, 0)
&& known_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (to_rtx))))
result = store_expr (from, to_rtx, 0, nontemporal, false);
- else
+ /* Check if the field overlaps the MSB, requiring extension. */
+ else if (known_eq (bitpos + bitsize,
+ GET_MODE_BITSIZE (GET_MODE (to_rtx))))
{
- rtx to_rtx1
- = lowpart_subreg (subreg_unpromoted_mode (to_rtx),
- SUBREG_REG (to_rtx),
- subreg_promoted_mode (to_rtx));
+ scalar_int_mode imode = subreg_unpromoted_mode (to_rtx);
+ scalar_int_mode omode = subreg_promoted_mode (to_rtx);
+ rtx to_rtx1 = lowpart_subreg (imode, SUBREG_REG (to_rtx),
+ omode);
result = store_field (to_rtx1, bitsize, bitpos,
bitregion_start, bitregion_end,
mode1, from, get_alias_set (to),
nontemporal, reversep);
+ /* If the target usually keeps IMODE appropriately
+ extended in OMODE it's unsafe to refer to it using
+ a SUBREG whilst this invariant doesn't hold. */
+ if (targetm.mode_rep_extended (imode, omode) != UNKNOWN)
+ to_rtx1 = simplify_gen_unary (TRUNCATE, imode,
+ SUBREG_REG (to_rtx), omode);
convert_move (SUBREG_REG (to_rtx), to_rtx1,
SUBREG_PROMOTED_SIGN (to_rtx));
}
+ else
+ result = store_field (to_rtx, bitsize, bitpos,
+ bitregion_start, bitregion_end,
+ mode1, from, get_alias_set (to),
+ nontemporal, reversep);
}
else
result = store_field (to_rtx, bitsize, bitpos,