From: Richard Earnshaw <Richard.Earnshaw@foss.arm.com>
To: Richard Biener <richard.guenther@gmail.com>,
Richard Earnshaw <rearnsha@arm.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v2 1/3] rtl: properly handle subreg (mem) in gen_highpart [PR102125]
Date: Thu, 9 Sep 2021 15:39:00 +0100
Message-ID: <dc3c622b-da20-16cd-797d-9f700ad8e9e3@foss.arm.com>
In-Reply-To: <CAFiYyc02XAipxY3XNagXamTw=JBZmNgDajTbx8K2HL3w+zMonQ@mail.gmail.com>
On 09/09/2021 13:23, Richard Biener via Gcc-patches wrote:
> On Thu, Sep 9, 2021 at 1:09 PM Richard Earnshaw <rearnsha@arm.com> wrote:
>>
>>
>> gen_lowpart_general handles forming a SUBREG of a MEM by using
>> adjust_address to rework and validate a new version of the MEM.
>> However, gen_highpart does not attempt this and simply returns (SUBREG
>> (MEM)) if the change is not 'obviously' safe. Improve on that by
>> using a similar approach so that gen_lowpart and gen_highpart are
>> mostly symmetrical in this regard.
>
> When I decipher gen_lowpart correctly then it doesn't generate the
> subreg of the mem in the first place so doing it like that in gen_highpart
> would _not_ invoke simplify_gen_subreg on a MEM_P but instead
> do what you now do directly?
>
> I also wonder why gen_lowpart_general uses byte_lowpart_offset
> while you use subreg_highpart_offset where subreg_lowpart_offset
> is also available ... huh - and there's also
> subreg_size_{lowpart,highpart}_offset.
> So it looks like your case wouldn't handle the paradoxical highpart
> (which better shouldn't be accessed?).
>
Surely the highpart of a paradoxical subreg is meaningless... what's the
highpart when the outer mode is wider than the inner one?

And that's why there is subreg_lowpart_offset, subreg_highpart_offset
and byte_lowpart_offset, but not byte_highpart_offset: byte_lowpart_offset
is there to handle paradoxical cases, and decays to subreg_lowpart_offset
for a non-paradoxical subreg.
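To make the offsets concrete, here is a rough toy model. It is not the
GCC implementation: the toy_* names are made up for illustration, sizes
are plain byte counts rather than machine modes or poly_int64 values,
and it assumes byte and word endianness agree (the mixed-endian case is
left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not GCC code: byte offset of the most significant
   OUTER_BYTES of an INNER_BYTES-wide value, assuming byte and word
   endianness agree.  */
static size_t
toy_highpart_offset (size_t outer_bytes, size_t inner_bytes, bool big_endian)
{
  /* A paradoxical request (outer wider than inner) has no high part.  */
  assert (outer_bytes <= inner_bytes);
  return big_endian ? 0 : inner_bytes - outer_bytes;
}

/* Toy model, not GCC code: byte offset of the least significant
   OUTER_BYTES, for the non-paradoxical case only.  */
static size_t
toy_lowpart_offset (size_t outer_bytes, size_t inner_bytes, bool big_endian)
{
  assert (outer_bytes <= inner_bytes);
  return big_endian ? inner_bytes - outer_bytes : 0;
}
```

For a 4-byte part of an 8-byte value, the high part sits at byte 4 on a
little-endian layout and at byte 0 on a big-endian one; the low part is
the mirror image. For a MEM that offset is what would be handed to
adjust_address.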
> So like
>
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 77ea8948ee8..c3dae7d8075 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1585,6 +1585,13 @@ gen_highpart (machine_mode mode, rtx x)
>    gcc_assert (known_le (msize, (unsigned int) UNITS_PER_WORD)
>               || known_eq (msize, GET_MODE_UNIT_SIZE (GET_MODE (x))));
>
> +  /* Offset MEMs. */
> +  if (MEM_P (x))
> +    {
> +      poly_int64 offset = subreg_highpart_offset (mode, GET_MODE (x));
> +      return adjust_address (x, mode, offset);
> +    }
> +
>    result = simplify_gen_subreg (mode, x, GET_MODE (x),
>                                 subreg_highpart_offset (mode, GET_MODE (x)));
>    gcc_assert (result);
>
In which case, I'm pretty certain the subsequent MEM_P (result) test can
be removed, as I can't see how simplify_gen_subreg would return a MEM
with such a change.
> Testing
>
> +  else if (GET_CODE (result) == SUBREG && MEM_P (SUBREG_REG (result))
> +           && MEM_P (x))
>
> looks a bit odd to me.
>
> I'll note it leaves gen_highpart_mode "unfixed", some refactoring should
> instead commonize the worker for both interfaces, making gen_highpart
> invoke gen_highpart_mode or so.
>
gen_highpart_mode invokes gen_highpart if the inner mode is not
VOIDmode. Perhaps the logic is somewhat backwards, or perhaps it's just
a bit more efficient that way.

I'll try your suggested change.

R.
>> gcc/ChangeLog:
>>
>> PR target/102125
>> * emit-rtl.c (gen_highpart): If simplify_gen_subreg returns
>> SUBREG (MEM) for a MEM, use adjust_address to produce a new
>> MEM.
>> ---
>> gcc/emit-rtl.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>