public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Optimize bfi by remove redundant zero_extend
@ 2015-02-13 16:26 Renlin Li
  2015-02-13 17:05 ` Richard Henderson
  0 siblings, 1 reply; 3+ messages in thread
From: Renlin Li @ 2015-02-13 16:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: marcus Shawcroft, ramana Radhakrishnan, Richard Earnshaw

[-- Attachment #1: Type: text/plain, Size: 1176 bytes --]

Hi all,

The following rtx pattern will be transformed into the second one,
which makes the compiler unable to remove the redundant zero_extend 
operation.


(set (zero_extract:SI (reg/i:SI 0 x0)
                       (const_int 8 [0x8])
                   (const_int 0 [0]))
     (and:SI (reg:SI 1 x1 [ y ])
         (const_int 255 [0xff])))
------------------->>>>>>>>>>>>>>>>
(set (zero_extract:SI (reg/i:SI 0 x0)
                       (const_int 8 [0x8])
                   (const_int 0 [0]))
      (zero_extend:SI (reg:QI 1 x1 [ y ])))

This patch add the code to handle the second case as well.

for example, for the following short code excerpt

int f5(int x, int y)
{
         return (x & ~0xff) | (y & 0xff);
}

original code-gen is:
         uxtb    w1, w1
         bfi     w0, w1, 0, 8
         ret
new code-gen is:
         bfi    w0, w1, 0, 8
         ret

x86-64 bootstraps Okay, aarch64-none-elf regression test Okay.
Okay for trunk?

Regards,
Renlin Li

gcc/ChangeLog:

2015-02-13  Renlin Li  <renlin.li@arm.com>

     * combine.c (make_field_assignment): Add code to remove redundant
     ZERO_EXTEND operation.

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: new-3.diff --]
[-- Type: text/x-patch; name=new-3.diff, Size: 1369 bytes --]

diff --git a/gcc/combine.c b/gcc/combine.c
index f779117..759e07c 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -9311,6 +9311,31 @@ make_field_assignment (rtx x)
 	}
     }
 
+  /* If DEST is already a field assignment, i.e. ZERO_EXTRACT, and the
+     SRC is an ZERO_EXTEND with all bits of that field set, then we can discard
+     the ZERO_EXTEND.  */
+  if (GET_CODE (dest) == ZERO_EXTRACT
+      && CONST_INT_P (XEXP (dest, 1))
+      && CONST_INT_P (XEXP (dest, 2))
+      && GET_CODE (src) == ZERO_EXTEND
+      && REG_P (XEXP (src, 0)))
+    {
+      unsigned int regno = REGNO (XEXP (src, 0));
+      unsigned int inner_size = GET_MODE_BITSIZE (GET_MODE (XEXP (src, 0)));
+      mode = GET_MODE (src);
+
+      HOST_WIDE_INT width = INTVAL (XEXP (dest, 1));
+      HOST_WIDE_INT lsb = INTVAL (XEXP (dest, 2));
+      /* Complete overlap.  We can remove the source ZERO_EXTEND.  */
+      if (width == inner_size
+	  && (regno < FIRST_PSEUDO_REGISTER)
+	  && HARD_REGNO_MODE_OK (regno, mode))
+      {
+	rtx reg = gen_rtx_REG (mode, regno);
+	return gen_rtx_SET (VOIDmode, dest, reg);
+      }
+    }
+
   /* The other case we handle is assignments into a constant-position
      field.  They look like (ior/xor (and DEST C1) OTHER).  If C1 represents
      a mask that has all one bits except for a group of zero bits and

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Optimize bfi by remove redundant zero_extend
  2015-02-13 16:26 [PATCH] Optimize bfi by remove redundant zero_extend Renlin Li
@ 2015-02-13 17:05 ` Richard Henderson
  2015-02-25 10:42   ` Renlin Li
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Henderson @ 2015-02-13 17:05 UTC (permalink / raw)
  To: Renlin Li, gcc-patches
  Cc: marcus Shawcroft, ramana Radhakrishnan, Richard Earnshaw

On 02/13/2015 08:26 AM, Renlin Li wrote:
> +      /* Complete overlap.  We can remove the source ZERO_EXTEND.  */
> +      if (width == inner_size
> +	  && (regno < FIRST_PSEUDO_REGISTER)
> +	  && HARD_REGNO_MODE_OK (regno, mode))
> +      {
> +	rtx reg = gen_rtx_REG (mode, regno);
> +	return gen_rtx_SET (VOIDmode, dest, reg);

What in the world are you doing here with the hard registers?


r~

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] Optimize bfi by remove redundant zero_extend
  2015-02-13 17:05 ` Richard Henderson
@ 2015-02-25 10:42   ` Renlin Li
  0 siblings, 0 replies; 3+ messages in thread
From: Renlin Li @ 2015-02-25 10:42 UTC (permalink / raw)
  To: Richard Henderson, gcc-patches

On 13/02/15 17:04, Richard Henderson wrote:
> On 02/13/2015 08:26 AM, Renlin Li wrote:
>> +      /* Complete overlap.  We can remove the source ZERO_EXTEND.  */
>> +      if (width == inner_size
>> +	  && (regno < FIRST_PSEUDO_REGISTER)
>> +	  && HARD_REGNO_MODE_OK (regno, mode))
>> +      {
>> +	rtx reg = gen_rtx_REG (mode, regno);
>> +	return gen_rtx_SET (VOIDmode, dest, reg);
> What in the world are you doing here with the hard registers?
>
>
> r~
>

Ah, So we don't have the convention to play with hard register in this pass?

This AND rtx here will be transformed into a zero_extend rtx from 
sub_reg expression, and further into the form I have mentioned above, as 
the hard register support this.

(and:SI (reg:SI 1 x1 [ y ])
	    (const_int 255 [0xff])))

I have modified my patch a little bit, Is the following method Okay?

   if (GET_CODE (dest) == ZERO_EXTRACT
       && CONST_INT_P (XEXP (dest, 1))
       && CONST_INT_P (XEXP (dest, 2))
       && GET_CODE (src) == ZERO_EXTEND)
     {
       HOST_WIDE_INT width = INTVAL (XEXP (dest, 1));
       mode = GET_MODE (dest);

      /* handle sub_reg rtx.  */
       if (GET_CODE (XEXP (src, 0)) == SUBREG && REG_P (XEXP (XEXP (src, 
0), 0)))
     {
       rtx reg = XEXP (XEXP (src, 0), 0);
       unsigned int inner_size = GET_MODE_BITSIZE (GET_MODE (XEXP (src, 
0)));

       /* Complete overlap.  We can remove the source ZERO_EXTEND. */
       if (width == inner_size
           && mode == GET_MODE (reg))
         return gen_rtx_SET (VOIDmode, dest, reg);
     }
       /* handle hard register case as sub_reg further is simplied. */
       else if (REG_P (XEXP (src, 0)))
     {
       rtx tmp = XEXP (src, 0);
       unsigned int inner_size = GET_MODE_BITSIZE (GET_MODE (tmp));

       /* Complete overlap.  We can remove the source ZERO_EXTEND. */
       if (width == inner_size && can_change_dest_mode (tmp, 0, mode))
         {
           rtx reg = gen_rtx_REG (mode, REGNO (tmp));
           return gen_rtx_SET (VOIDmode, dest, reg);
         }
     }
     }



Is there a better way to do this here or somewhere else?  or it's good 
to implement it as a specific rtx insn?
A define split won't work in this case. This should handled in combine pass.

Thank you for any suggestions!
Regards,
Renlin Li



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2015-02-25 10:27 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-13 16:26 [PATCH] Optimize bfi by remove redundant zero_extend Renlin Li
2015-02-13 17:05 ` Richard Henderson
2015-02-25 10:42   ` Renlin Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).