From: Jiong Wang <jiong.wang@arm.com>
To: Jeff Law <law@redhat.com>
Cc: "gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [Patch/rtl-expand] Take tree range info into account to improve LSHIFT_EXP expanding
Date: Mon, 27 Apr 2015 20:58:00 -0000 [thread overview]
Message-ID: <n99pp6pwaob.fsf@arm.com> (raw)
In-Reply-To: <5539A922.7060708@redhat.com>
Jeff Law writes:
> On 04/16/2015 05:04 AM, Jiong Wang wrote:
>>
>> This is a rework of
>>
>> https://gcc.gnu.org/ml/gcc-patches/2014-07/msg01998.html
>>
>> After second thinking, I feel it's better to fix this in earlier stage
>> during RTL expand which is more generic, and we also avoid making the
>> already complex combine pass complexer.
>>
>> Currently gcc expand wide mode left shift to some generic complex
>> instruction sequences, while if we have known the high part of wide mode
>> all comes from sign extension, the expand logic could be simplifed.
>>
>> Given the following example,
>>
>> T A = (T) B << const_imm_shift
>>
>> We know the high part of A are all comes from sign extension, if
>>
>> * T is the next wider type of word_mode.
>>
>> For example, for aarch64, if type T is 128int (TImode), and B is with
>> type SImode or DImode, then tree analyzer know that the high part of
>> TImode result all comes from sign extension, and kept them in range info.
>>
>> |< T >|
>> | high | low |
>> |<- sizel ->|
>>
>> For above example, we could simplify the expand logic into
>> 1. low = low << const_imm_shift;
>> 2. high = low >> (sizel - const_imm_shift) */
>>
>> We can utilize the arithmetic right shift to do the sign
>> extension. Those reduntant instructions will be optimized out later.
>>
>> For actual .s improvement,
>>
>> AArch64
>> =======
>>
>> __int128_t
>> foo (int data)
>> {
>> return (__int128_t) data << 50;
>> }
>>
>> old:
>> sxtw x2, w0
>> asr x1, x2, 63
>> lsl x0, x2, 50
>> lsl x1, x1, 50
>> orr x1, x1, x2, lsr 14
>>
>> new:
>> sxtw x1, w0
>> lsl x0, x1, 50
>> asr x1, x1, 14
>>
>>
>> ARM (.fpu softvfp)
>> ===========
>>
>> long long
>> shift (int data)
>> {
>> return (long long) data << 20;
>> }
>>
>> old:
>> stmfd sp!, {r4, r5}
>> mov r5, r0, asr #31
>> mov r3, r0
>> mov r0, r0, asl #20
>> mov r1, r5, asl #20
>> orr r1, r1, r3, lsr #12
>> ldmfd sp!, {r4, r5}
>> bx lr
>>
>> new:
>> mov r1, r0
>> mov r0, r0, asl #20
>> mov r1, r1, asr #12
>> bx lr
>>
>> Test
>> ====
>>
>> x86 bootstrap OK, regression test OK.
>> AArch64 bootstrap OK, regression test on board OK.
>>
>> Regards,
>> Jiong
>>
>> 2015-04-116 Jiong.Wang <jiong.wang@arm.com>
>>
>> gcc/
>> * expr.c (expand_expr_real_2): Take tree range info into account when
>> expanding LSHIFT_EXPR.
>>
>> gcc/testsuite
>> * gcc.dg/wide_shift_64_1.c: New testcase.
>> * gcc.dg/wide_shift_128_1.c: Ditto.
>> * gcc.target/aarch64/ashlti3_1.c: Ditto.
>> * gcc.target/arm/ashldisi_1.c: Ditto.
> Funny, I find myself wanting this transformation in both places :-)
> Expansion time so that we generate efficient code from the start and
> combine to catch those cases which are too complex to see at expansion,
> but due to other optimizations become visible in the combiner.
>
> Sadly, it's been fairly common practice for targets to define
> double-word shift patterns which catch a variety of special cases.
> Ports will have to choose between using those patterns and exploiting
> your work. I'd be tempted to generate a double-word shift by the given
> constant and check its cost vs the single word shifts.
>
> What happens if there's an overlap between t_low and low? Won't the
> lshift clobber low and thus we get the right value for the rshift in
> that case?
Jeff,
Sorry, I can't understand the meaning of "overlap between t_low and low",
assume "right" in "right value" means the opposite of "left" not
"correct".
So what you mean is t_low and low share the same pseudo regiser?
or you mean if we are shifting a value across the word boundary? like the following.
|< double word >|
| t_high | t_low |
|<- low ->|
for above situation, the simplified two instruction sequence do works.
"t_low = low << const_imm_shift ; t_high = low >> (sizel - const_imm_shift)"
I attached the expand result for a simple testcase below. I appreicate
if you could comment on the RTL.
Thanks.
__int128_t
foo (int data)
{
return (__int128_t) data << 50;
}
foo.c.188t.optimized
===
foo (int data)
{
__int128 _2;
__int128 _3;
<bb 2>:
_2 = (__int128) data_1(D);
_3 = _2 << 50;
return _3;
}
foo.c.189r.expand
===
(insn 2 4 3 2 (set (reg/v:SI 76 [ data ])
(reg:SI 0 x0 [ data ])) foo.c:3 -1
(nil))
(insn 6 3 7 2 (set (reg:DI 79)
(sign_extend:DI (reg/v:SI 76 [ data ]))) foo.c:4 -1
(nil))
(insn 7 6 8 2 (set (subreg:DI (reg:TI 78 [ D.2677 ]) 0)
(reg:DI 79)) foo.c:4 -1
(nil))
(insn 8 7 9 2 (set (reg:DI 80)
(ashiftrt:DI (reg:DI 79) (const_int 63 [0x3f]))) foo.c:4 -1
(nil))
(insn 9 8 10 2 (set (subreg:DI (reg:TI 78 [ D.2677 ]) 8)
(reg:DI 80)) foo.c:4 -1
(nil))
^
~~~~~~~~~ sign extend SImode "data" into TImode "_2" (r78)
(insn 10 9 11 2 (set (subreg:DI (reg:TI 77 [D.2677 ]) 0)
(ashift:DI (subreg:DI (reg:TI 78 [ D.2677 ]) 0)
(const_int 50 [0x32]))) foo.c:4 -1
(nil))
^
~~~~~~~~~~ t_low = low << const_imm_shift, target be r77
(insn 11 10 12 2 (set (subreg:DI (reg:TI 77 [ D.2677 ]) 8)
(ashiftrt:DI (subreg:DI (reg:TI 78 [ D.2677 ]) 0)
(const_int 14 [0xe]))) foo.c:4 -1
(nil))
^
~~~~~~~~~~ t_high = low >> (sizel - const_imm_shift)
(insn 12 11 16 2 (set (reg:TI 75 [ <retval> ])
(reg:TI 77 [ D.2677 ])) foo.c:4 -1
(nil))
(insn 16 12 17 2 (set (reg/i:TI 0 x0)
(reg:TI 75 [ <retval> ])) foo.c:5 -1
(nil))
> Note that expand_variable_shift may not honor your request for putting
> the result in the TARGET target parameter you pass in.
Thanks, agree, it's better to add those extra move.
I noticed the comments at the start of the function:
"Store the result in the rtx TARGET, if that is convenient."
Although I still don't understand in which case it's inconveninent.
--
Regards,
Jiong
next prev parent reply other threads:[~2015-04-27 20:58 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-16 11:04 Jiong Wang
2015-04-24 2:23 ` Jeff Law
2015-04-27 20:58 ` Jiong Wang [this message]
2015-04-29 3:53 ` Jeff Law
2015-04-29 22:14 ` Jiong Wang
2015-04-29 22:55 ` Jeff Law
2015-08-14 17:55 ` Jiong Wang
2015-08-14 20:30 ` Jeff Law
2015-08-14 22:24 ` Jiong Wang
2015-08-18 13:22 ` Jiong Wang
2015-08-18 17:47 ` Jeff Law
2015-08-19 23:05 ` Jiong Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=n99pp6pwaob.fsf@arm.com \
--to=jiong.wang@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=law@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).