* [PATCH] Optimize bfi by remove redundant zero_extend
@ 2015-02-13 16:26 Renlin Li
2015-02-13 17:05 ` Richard Henderson
0 siblings, 1 reply; 3+ messages in thread
From: Renlin Li @ 2015-02-13 16:26 UTC (permalink / raw)
To: gcc-patches; +Cc: marcus Shawcroft, ramana Radhakrishnan, Richard Earnshaw
[-- Attachment #1: Type: text/plain, Size: 1176 bytes --]
Hi all,
The following rtx pattern will be transformed into the second one,
which makes the compiler unable to remove the redundant zero_extend
operation.
(set (zero_extract:SI (reg/i:SI 0 x0)
(const_int 8 [0x8])
(const_int 0 [0]))
(and:SI (reg:SI 1 x1 [ y ])
(const_int 255 [0xff])))
------------------->>>>>>>>>>>>>>>>
(set (zero_extract:SI (reg/i:SI 0 x0)
(const_int 8 [0x8])
(const_int 0 [0]))
(zero_extend:SI (reg:QI 1 x1 [ y ])))
This patch add the code to handle the second case as well.
for example, for the following short code excerpt
int f5(int x, int y)
{
return (x & ~0xff) | (y & 0xff);
}
original code-gen is:
uxtb w1, w1
bfi w0, w1, 0, 8
ret
new code-gen is:
bfi w0, w1, 0, 8
ret
x86-64 bootstraps Okay, aarch64-none-elf regression test Okay.
Okay for trunk?
Regards,
Renlin Li
gcc/ChangeLog:
2015-02-13 Renlin Li <renlin.li@arm.com>
* combine.c (make_field_assignment): Add code to remove redundant
ZERO_EXTEND operation.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: new-3.diff --]
[-- Type: text/x-patch; name=new-3.diff, Size: 1369 bytes --]
diff --git a/gcc/combine.c b/gcc/combine.c
index f779117..759e07c 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -9311,6 +9311,31 @@ make_field_assignment (rtx x)
}
}
+ /* If DEST is already a field assignment, i.e. ZERO_EXTRACT, and the
+ SRC is an ZERO_EXTEND with all bits of that field set, then we can discard
+ the ZERO_EXTEND. */
+ if (GET_CODE (dest) == ZERO_EXTRACT
+ && CONST_INT_P (XEXP (dest, 1))
+ && CONST_INT_P (XEXP (dest, 2))
+ && GET_CODE (src) == ZERO_EXTEND
+ && REG_P (XEXP (src, 0)))
+ {
+ unsigned int regno = REGNO (XEXP (src, 0));
+ unsigned int inner_size = GET_MODE_BITSIZE (GET_MODE (XEXP (src, 0)));
+ mode = GET_MODE (src);
+
+ HOST_WIDE_INT width = INTVAL (XEXP (dest, 1));
+ HOST_WIDE_INT lsb = INTVAL (XEXP (dest, 2));
+ /* Complete overlap. We can remove the source ZERO_EXTEND. */
+ if (width == inner_size
+ && (regno < FIRST_PSEUDO_REGISTER)
+ && HARD_REGNO_MODE_OK (regno, mode))
+ {
+ rtx reg = gen_rtx_REG (mode, regno);
+ return gen_rtx_SET (VOIDmode, dest, reg);
+ }
+ }
+
/* The other case we handle is assignments into a constant-position
field. They look like (ior/xor (and DEST C1) OTHER). If C1 represents
a mask that has all one bits except for a group of zero bits and
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Optimize bfi by remove redundant zero_extend
2015-02-13 16:26 [PATCH] Optimize bfi by remove redundant zero_extend Renlin Li
@ 2015-02-13 17:05 ` Richard Henderson
2015-02-25 10:42 ` Renlin Li
0 siblings, 1 reply; 3+ messages in thread
From: Richard Henderson @ 2015-02-13 17:05 UTC (permalink / raw)
To: Renlin Li, gcc-patches
Cc: marcus Shawcroft, ramana Radhakrishnan, Richard Earnshaw
On 02/13/2015 08:26 AM, Renlin Li wrote:
> + /* Complete overlap. We can remove the source ZERO_EXTEND. */
> + if (width == inner_size
> + && (regno < FIRST_PSEUDO_REGISTER)
> + && HARD_REGNO_MODE_OK (regno, mode))
> + {
> + rtx reg = gen_rtx_REG (mode, regno);
> + return gen_rtx_SET (VOIDmode, dest, reg);
What in the world are you doing here with the hard registers?
r~
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] Optimize bfi by remove redundant zero_extend
2015-02-13 17:05 ` Richard Henderson
@ 2015-02-25 10:42 ` Renlin Li
0 siblings, 0 replies; 3+ messages in thread
From: Renlin Li @ 2015-02-25 10:42 UTC (permalink / raw)
To: Richard Henderson, gcc-patches
On 13/02/15 17:04, Richard Henderson wrote:
> On 02/13/2015 08:26 AM, Renlin Li wrote:
>> + /* Complete overlap. We can remove the source ZERO_EXTEND. */
>> + if (width == inner_size
>> + && (regno < FIRST_PSEUDO_REGISTER)
>> + && HARD_REGNO_MODE_OK (regno, mode))
>> + {
>> + rtx reg = gen_rtx_REG (mode, regno);
>> + return gen_rtx_SET (VOIDmode, dest, reg);
> What in the world are you doing here with the hard registers?
>
>
> r~
>
Ah, So we don't have the convention to play with hard register in this pass?
This AND rtx here will be transformed into a zero_extend rtx from
sub_reg expression, and further into the form I have mentioned above, as
the hard register support this.
(and:SI (reg:SI 1 x1 [ y ])
(const_int 255 [0xff])))
I have modified my patch a little bit, Is the following method Okay?
if (GET_CODE (dest) == ZERO_EXTRACT
&& CONST_INT_P (XEXP (dest, 1))
&& CONST_INT_P (XEXP (dest, 2))
&& GET_CODE (src) == ZERO_EXTEND)
{
HOST_WIDE_INT width = INTVAL (XEXP (dest, 1));
mode = GET_MODE (dest);
/* handle sub_reg rtx. */
if (GET_CODE (XEXP (src, 0)) == SUBREG && REG_P (XEXP (XEXP (src,
0), 0)))
{
rtx reg = XEXP (XEXP (src, 0), 0);
unsigned int inner_size = GET_MODE_BITSIZE (GET_MODE (XEXP (src,
0)));
/* Complete overlap. We can remove the source ZERO_EXTEND. */
if (width == inner_size
&& mode == GET_MODE (reg))
return gen_rtx_SET (VOIDmode, dest, reg);
}
/* handle hard register case as sub_reg further is simplied. */
else if (REG_P (XEXP (src, 0)))
{
rtx tmp = XEXP (src, 0);
unsigned int inner_size = GET_MODE_BITSIZE (GET_MODE (tmp));
/* Complete overlap. We can remove the source ZERO_EXTEND. */
if (width == inner_size && can_change_dest_mode (tmp, 0, mode))
{
rtx reg = gen_rtx_REG (mode, REGNO (tmp));
return gen_rtx_SET (VOIDmode, dest, reg);
}
}
}
Is there a better way to do this here or somewhere else? or it's good
to implement it as a specific rtx insn?
A define split won't work in this case. This should handled in combine pass.
Thank you for any suggestions!
Regards,
Renlin Li
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2015-02-25 10:27 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-13 16:26 [PATCH] Optimize bfi by remove redundant zero_extend Renlin Li
2015-02-13 17:05 ` Richard Henderson
2015-02-25 10:42 ` Renlin Li
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).