public inbox for gcc-patches@gcc.gnu.org
* [PATCH] Fold VEC_[LR]SHIFT_EXPR (PR tree-optimization/57051)
@ 2013-04-26  0:32 Jakub Jelinek
  2013-04-26 11:06 ` Richard Biener
  2013-05-03 13:18 ` Jakub Jelinek
  0 siblings, 2 replies; 6+ messages in thread
From: Jakub Jelinek @ 2013-04-26  0:32 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hi!

This patch adds folding of v>> and v<< (VEC_RSHIFT_EXPR and VEC_LSHIFT_EXPR)
with constant arguments, which helps to optimize the testcase from the PR
back into a constant store after the vectorized loop is unrolled.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-04-25  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/57051
	* fold-const.c (const_binop): Handle VEC_LSHIFT_EXPR
	and VEC_RSHIFT_EXPR if shift count is a multiple of element
	bitsize.
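
For readers without a GCC tree at hand, here is a minimal standalone sketch
of the whole-element folding described above.  It models a VECTOR_CST as a
plain int array; fold_vec_shift_model and its parameters are hypothetical
names for illustration, not GCC APIs, and the direction handling follows
only this first version of the patch.

```c
#include <assert.h>

/* Hypothetical model of the folding, not GCC code.  IN and OUT stand in
   for the elements of a VECTOR_CST.  Returns 1 and fills OUT when the
   shift can be folded, 0 where the patch would return NULL_TREE.  */
static int
fold_vec_shift_model (int is_lshift, const int *in, int *out, int count,
                      unsigned elem_bits, unsigned shift_bits)
{
  assert (count > 0 && elem_bits > 0);

  unsigned total_bits = elem_bits * (unsigned) count;

  /* Only shifts smaller than the vector and by a whole number of
     elements are folded.  */
  if (shift_bits >= total_bits || shift_bits % elem_bits != 0)
    return 0;

  int offset = (int) (shift_bits / elem_bits);
  if (is_lshift)
    offset = -offset;

  for (int i = 0; i < count; i++)
    {
      int src = i + offset;
      /* Elements shifted in from outside the vector become zero.  */
      out[i] = (src < 0 || src >= count) ? 0 : in[src];
    }
  return 1;
}
```

In this model, right-shifting the four 32-bit elements {1, 2, 3, 4} by 64
bits gives {3, 4, 0, 0}, while a shift by 48 bits is rejected because it is
not a multiple of the element size.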

--- gcc/fold-const.c.jj	2013-04-12 10:16:25.000000000 +0200
+++ gcc/fold-const.c	2013-04-24 12:37:11.789122719 +0200
@@ -1380,17 +1380,42 @@ const_binop (enum tree_code code, tree a
       int count = TYPE_VECTOR_SUBPARTS (type), i;
       tree *elts = XALLOCAVEC (tree, count);
 
-      for (i = 0; i < count; i++)
+      if (code == VEC_LSHIFT_EXPR
+	  || code == VEC_RSHIFT_EXPR)
 	{
-	  tree elem1 = VECTOR_CST_ELT (arg1, i);
-
-	  elts[i] = const_binop (code, elem1, arg2);
+	  if (!host_integerp (arg2, 1))
+	    return NULL_TREE;
 
-	  /* It is possible that const_binop cannot handle the given
-	     code and return NULL_TREE */
-	  if (elts[i] == NULL_TREE)
+	  unsigned HOST_WIDE_INT shiftc = tree_low_cst (arg2, 1);
+	  unsigned HOST_WIDE_INT outerc = tree_low_cst (TYPE_SIZE (type), 1);
+	  unsigned HOST_WIDE_INT innerc
+	    = tree_low_cst (TYPE_SIZE (TREE_TYPE (type)), 1);
+	  if (shiftc >= outerc || (shiftc % innerc) != 0)
 	    return NULL_TREE;
+	  int offset = shiftc / innerc;
+	  if (code == VEC_LSHIFT_EXPR)
+	    offset = -offset;
+	  tree zero = build_zero_cst (TREE_TYPE (type));
+	  for (i = 0; i < count; i++)
+	    {
+	      if (i + offset < 0 || i + offset >= count)
+		elts[i] = zero;
+	      else
+		elts[i] = VECTOR_CST_ELT (arg1, i + offset);
+	    }
 	}
+      else
+	for (i = 0; i < count; i++)
+	  {
+	    tree elem1 = VECTOR_CST_ELT (arg1, i);
+
+	    elts[i] = const_binop (code, elem1, arg2);
+
+	    /* It is possible that const_binop cannot handle the given
+	       code and return NULL_TREE */
+	    if (elts[i] == NULL_TREE)
+	      return NULL_TREE;
+	  }
 
       return build_vector (type, elts);
     }

	Jakub

* Re: [PATCH] Fold VEC_[LR]SHIFT_EXPR (PR tree-optimization/57051)
  2013-04-26  0:32 [PATCH] Fold VEC_[LR]SHIFT_EXPR (PR tree-optimization/57051) Jakub Jelinek
@ 2013-04-26 11:06 ` Richard Biener
  2013-05-03 13:18 ` Jakub Jelinek
  1 sibling, 0 replies; 6+ messages in thread
From: Richard Biener @ 2013-04-26 11:06 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Thu, 25 Apr 2013, Jakub Jelinek wrote:

> Hi!
> 
> This patch adds folding of constant arguments v>> and v<<, which helps to
> optimize the testcase from the PR back into constant store after vectorized
> loop is unrolled.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2013-04-25  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/57051
> 	* fold-const.c (const_binop): Handle VEC_LSHIFT_EXPR
> 	and VEC_RSHIFT_EXPR if shift count is a multiple of element
> 	bitsize.
> 
> --- gcc/fold-const.c.jj	2013-04-12 10:16:25.000000000 +0200
> +++ gcc/fold-const.c	2013-04-24 12:37:11.789122719 +0200
> @@ -1380,17 +1380,42 @@ const_binop (enum tree_code code, tree a
>        int count = TYPE_VECTOR_SUBPARTS (type), i;
>        tree *elts = XALLOCAVEC (tree, count);
>  
> -      for (i = 0; i < count; i++)
> +      if (code == VEC_LSHIFT_EXPR
> +	  || code == VEC_RSHIFT_EXPR)
>  	{
> -	  tree elem1 = VECTOR_CST_ELT (arg1, i);
> -
> -	  elts[i] = const_binop (code, elem1, arg2);
> +	  if (!host_integerp (arg2, 1))
> +	    return NULL_TREE;
>  
> -	  /* It is possible that const_binop cannot handle the given
> -	     code and return NULL_TREE */
> -	  if (elts[i] == NULL_TREE)
> +	  unsigned HOST_WIDE_INT shiftc = tree_low_cst (arg2, 1);
> +	  unsigned HOST_WIDE_INT outerc = tree_low_cst (TYPE_SIZE (type), 1);
> +	  unsigned HOST_WIDE_INT innerc
> +	    = tree_low_cst (TYPE_SIZE (TREE_TYPE (type)), 1);
> +	  if (shiftc >= outerc || (shiftc % innerc) != 0)
>  	    return NULL_TREE;
> +	  int offset = shiftc / innerc;
> +	  if (code == VEC_LSHIFT_EXPR)
> +	    offset = -offset;
> +	  tree zero = build_zero_cst (TREE_TYPE (type));
> +	  for (i = 0; i < count; i++)
> +	    {
> +	      if (i + offset < 0 || i + offset >= count)
> +		elts[i] = zero;
> +	      else
> +		elts[i] = VECTOR_CST_ELT (arg1, i + offset);
> +	    }
>  	}
> +      else
> +	for (i = 0; i < count; i++)
> +	  {
> +	    tree elem1 = VECTOR_CST_ELT (arg1, i);
> +
> +	    elts[i] = const_binop (code, elem1, arg2);
> +
> +	    /* It is possible that const_binop cannot handle the given
> +	       code and return NULL_TREE */
> +	    if (elts[i] == NULL_TREE)
> +	      return NULL_TREE;
> +	  }
>  
>        return build_vector (type, elts);
>      }
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend

* Re: [PATCH] Fold VEC_[LR]SHIFT_EXPR (PR tree-optimization/57051)
  2013-04-26  0:32 [PATCH] Fold VEC_[LR]SHIFT_EXPR (PR tree-optimization/57051) Jakub Jelinek
  2013-04-26 11:06 ` Richard Biener
@ 2013-05-03 13:18 ` Jakub Jelinek
  2013-05-16 18:01   ` Mikael Pettersson
  1 sibling, 1 reply; 6+ messages in thread
From: Jakub Jelinek @ 2013-05-03 13:18 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Thu, Apr 25, 2013 at 11:47:02PM +0200, Jakub Jelinek wrote:
> This patch adds folding of constant arguments v>> and v<<, which helps to
> optimize the testcase from the PR back into constant store after vectorized
> loop is unrolled.

As this fixes a regression on the 4.8 branch, I've backported it (and
minimal prerequisite for that) to 4.8 branch too.

As the non-whole-vector shifts of a VECTOR_CST by an INTEGER_CST don't have
any testcase showing a regression, I've left those out (where this backport
has else return NULL_TREE;, trunk has code to handle those).

2013-05-03  Jakub Jelinek  <jakub@redhat.com>

	Backported from mainline
	2013-04-26  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/57051
	* fold-const.c (const_binop): Handle VEC_LSHIFT_EXPR
	and VEC_RSHIFT_EXPR if shift count is a multiple of element
	bitsize.

	2013-04-12  Marc Glisse  <marc.glisse@inria.fr>

	* fold-const.c (fold_binary_loc): Call const_binop also for mixed
	vector-scalar operations.

--- gcc/fold-const.c	(revision 198579)
+++ gcc/fold-const.c	(working copy)
@@ -1366,6 +1366,44 @@ const_binop (enum tree_code code, tree a
 
       return build_vector (type, elts);
     }
+
+  /* Shifts allow a scalar offset for a vector.  */
+  if (TREE_CODE (arg1) == VECTOR_CST
+      && TREE_CODE (arg2) == INTEGER_CST)
+    {
+      tree type = TREE_TYPE (arg1);
+      int count = TYPE_VECTOR_SUBPARTS (type), i;
+      tree *elts = XALLOCAVEC (tree, count);
+
+      if (code == VEC_LSHIFT_EXPR
+	  || code == VEC_RSHIFT_EXPR)
+	{
+	  if (!host_integerp (arg2, 1))
+	    return NULL_TREE;
+
+	  unsigned HOST_WIDE_INT shiftc = tree_low_cst (arg2, 1);
+	  unsigned HOST_WIDE_INT outerc = tree_low_cst (TYPE_SIZE (type), 1);
+	  unsigned HOST_WIDE_INT innerc
+	    = tree_low_cst (TYPE_SIZE (TREE_TYPE (type)), 1);
+	  if (shiftc >= outerc || (shiftc % innerc) != 0)
+	    return NULL_TREE;
+	  int offset = shiftc / innerc;
+	  if (code == VEC_LSHIFT_EXPR)
+	    offset = -offset;
+	  tree zero = build_zero_cst (TREE_TYPE (type));
+	  for (i = 0; i < count; i++)
+	    {
+	      if (i + offset < 0 || i + offset >= count)
+		elts[i] = zero;
+	      else
+		elts[i] = VECTOR_CST_ELT (arg1, i + offset);
+	    }
+	}
+      else
+	return NULL_TREE;
+
+      return build_vector (type, elts);
+    }
   return NULL_TREE;
 }
 
@@ -9862,7 +9900,8 @@ fold_binary_loc (location_t loc,
       || (TREE_CODE (arg0) == FIXED_CST && TREE_CODE (arg1) == FIXED_CST)
       || (TREE_CODE (arg0) == FIXED_CST && TREE_CODE (arg1) == INTEGER_CST)
       || (TREE_CODE (arg0) == COMPLEX_CST && TREE_CODE (arg1) == COMPLEX_CST)
-      || (TREE_CODE (arg0) == VECTOR_CST && TREE_CODE (arg1) == VECTOR_CST))
+      || (TREE_CODE (arg0) == VECTOR_CST && TREE_CODE (arg1) == VECTOR_CST)
+      || (TREE_CODE (arg0) == VECTOR_CST && TREE_CODE (arg1) == INTEGER_CST))
     {
       if (kind == tcc_binary)
 	{


	Jakub

* Re: [PATCH] Fold VEC_[LR]SHIFT_EXPR (PR tree-optimization/57051)
  2013-05-03 13:18 ` Jakub Jelinek
@ 2013-05-16 18:01   ` Mikael Pettersson
  2013-05-17  6:48     ` [PATCH] Fix VEC_[LR]SHIFT_EXPR folding for big-endian " Jakub Jelinek
  0 siblings, 1 reply; 6+ messages in thread
From: Mikael Pettersson @ 2013-05-16 18:01 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

Jakub Jelinek writes:
 > On Thu, Apr 25, 2013 at 11:47:02PM +0200, Jakub Jelinek wrote:
 > > This patch adds folding of constant arguments v>> and v<<, which helps to
 > > optimize the testcase from the PR back into constant store after vectorized
 > > loop is unrolled.
 > 
 > As this fixes a regression on the 4.8 branch, I've backported it (and
 > minimal prerequisite for that) to 4.8 branch too.

Unfortunately this patch makes gcc.dg/vect/no-scevccp-outer-{7,13}.c fail
on powerpc64-linux:

+FAIL: gcc.dg/vect/no-scevccp-outer-13.c execution test
+FAIL: gcc.dg/vect/no-scevccp-outer-7.c execution test

which is a regression from 4.8-20130502.  Reverting r198580 fixes it.

The same FAILs also occur on trunk.

/Mikael

* [PATCH] Fix VEC_[LR]SHIFT_EXPR folding for big-endian (PR tree-optimization/57051)
  2013-05-16 18:01   ` Mikael Pettersson
@ 2013-05-17  6:48     ` Jakub Jelinek
  2013-05-17  8:02       ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Jakub Jelinek @ 2013-05-17  6:48 UTC (permalink / raw)
  To: Richard Biener, Mikael Pettersson; +Cc: gcc-patches

On Thu, May 16, 2013 at 07:59:00PM +0200, Mikael Pettersson wrote:
> Jakub Jelinek writes:
>  > On Thu, Apr 25, 2013 at 11:47:02PM +0200, Jakub Jelinek wrote:
>  > > This patch adds folding of constant arguments v>> and v<<, which helps to
>  > > optimize the testcase from the PR back into constant store after vectorized
>  > > loop is unrolled.
>  > 
>  > As this fixes a regression on the 4.8 branch, I've backported it (and
>  > minimal prerequisite for that) to 4.8 branch too.
> 
> Unfortunately this patch makes gcc.dg/vect/no-scevccp-outer-{7,13}.c fail
> on powerpc64-linux:
> 
> +FAIL: gcc.dg/vect/no-scevccp-outer-13.c execution test
> +FAIL: gcc.dg/vect/no-scevccp-outer-7.c execution test
> 
> which is a regression from 4.8-20130502.  Reverting r198580 fixes it.
> 
> The same FAILs also occur on trunk.

Ah right, I was confused by the fact that VEC_RSHIFT_EXPR is used
not just on little endian targets, but on big endian as well
(VEC_LSHIFT_EXPR is never emitted), but the important spot is
when extracting the scalar result from the vector:

      if (BYTES_BIG_ENDIAN)
        bitpos = size_binop (MULT_EXPR,
                             bitsize_int (TYPE_VECTOR_SUBPARTS (vectype) - 1),
                             TYPE_SIZE (scalar_type));
      else
        bitpos = bitsize_zero_node;

Fixed thusly, ok for trunk/4.8?

2013-05-17  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/57051
	* fold-const.c (const_binop) <case VEC_LSHIFT_EXPR,
	case VEC_RSHIFT_EXPR>: Fix BYTES_BIG_ENDIAN handling.
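
To see why the direction has to depend on endianness, here is a small
standalone model of a shift-and-add reduction.  It is only a sketch: the
names (model_offset, model_shift, model_reduce_sum) are hypothetical, the
vector is a plain int array whose index is assumed to match element order
in memory, and the final element choice imitates the bitpos code quoted
above (element 0 on little endian, the last element on big endian).

```c
#include <assert.h>

/* Hypothetical stand-ins for the two tree codes; not GCC definitions.  */
enum model_code { MODEL_VEC_LSHIFT, MODEL_VEC_RSHIFT };

/* Signed element offset, mirroring the condition in the patch:
   (code == VEC_RSHIFT_EXPR) ^ (!BYTES_BIG_ENDIAN).  */
static int
model_offset (enum model_code code, int big_endian, int elems)
{
  int offset = elems;
  if ((code == MODEL_VEC_RSHIFT) ^ (!big_endian))
    offset = -offset;
  return offset;
}

/* Whole-element shift of IN into OUT, filling vacated slots with zero.  */
static void
model_shift (enum model_code code, int big_endian, int elems,
             const int *in, int *out, int count)
{
  assert (count > 0);
  int offset = model_offset (code, big_endian, elems);
  for (int i = 0; i < count; i++)
    {
      int src = i + offset;
      out[i] = (src < 0 || src >= count) ? 0 : in[src];
    }
}

/* Sum four elements with right shifts by 2 and then 1 element, and read
   the scalar from the element the bitpos computation would select.  */
static int
model_reduce_sum (const int *v, int big_endian)
{
  int acc[4], tmp[4];
  for (int i = 0; i < 4; i++)
    acc[i] = v[i];
  for (int step = 2; step >= 1; step /= 2)
    {
      model_shift (MODEL_VEC_RSHIFT, big_endian, step, acc, tmp, 4);
      for (int i = 0; i < 4; i++)
        acc[i] += tmp[i];
    }
  return acc[big_endian ? 3 : 0];
}
```

With {1, 2, 3, 4} this model returns 10 for both endiannesses.  If the
offset were never negated for big endian, the partial sums would collect in
element 0 while the scalar is read from element 3, which in this model
gives 4 instead of 10.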

--- gcc/fold-const.c.jj	2013-05-16 12:36:28.000000000 +0200
+++ gcc/fold-const.c	2013-05-17 08:38:12.575117676 +0200
@@ -1393,7 +1393,7 @@ const_binop (enum tree_code code, tree a
 	  if (shiftc >= outerc || (shiftc % innerc) != 0)
 	    return NULL_TREE;
 	  int offset = shiftc / innerc;
-	  if (code == VEC_LSHIFT_EXPR)
+	  if ((code == VEC_RSHIFT_EXPR) ^ (!BYTES_BIG_ENDIAN))
 	    offset = -offset;
 	  tree zero = build_zero_cst (TREE_TYPE (type));
 	  for (i = 0; i < count; i++)


	Jakub

* Re: [PATCH] Fix VEC_[LR]SHIFT_EXPR folding for big-endian (PR tree-optimization/57051)
  2013-05-17  6:48     ` [PATCH] Fix VEC_[LR]SHIFT_EXPR folding for big-endian " Jakub Jelinek
@ 2013-05-17  8:02       ` Richard Biener
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Biener @ 2013-05-17  8:02 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Mikael Pettersson, gcc-patches

On Fri, 17 May 2013, Jakub Jelinek wrote:

> On Thu, May 16, 2013 at 07:59:00PM +0200, Mikael Pettersson wrote:
> > Jakub Jelinek writes:
> >  > On Thu, Apr 25, 2013 at 11:47:02PM +0200, Jakub Jelinek wrote:
> >  > > This patch adds folding of constant arguments v>> and v<<, which helps to
> >  > > optimize the testcase from the PR back into constant store after vectorized
> >  > > loop is unrolled.
> >  > 
> >  > As this fixes a regression on the 4.8 branch, I've backported it (and
> >  > minimal prerequisite for that) to 4.8 branch too.
> > 
> > Unfortunately this patch makes gcc.dg/vect/no-scevccp-outer-{7,13}.c fail
> > on powerpc64-linux:
> > 
> > +FAIL: gcc.dg/vect/no-scevccp-outer-13.c execution test
> > +FAIL: gcc.dg/vect/no-scevccp-outer-7.c execution test
> > 
> > which is a regression from 4.8-20130502.  Reverting r198580 fixes it.
> > 
> > The same FAILs also occur on trunk.
> 
> Ah right, I was confused by the fact that VEC_RSHIFT_EXPR is used
> not just on little endian targets, but on big endian as well
> (VEC_LSHIFT_EXPR is never emitted), but the important spot is
> when extracting the scalar result from the vector:
> 
>       if (BYTES_BIG_ENDIAN)
>         bitpos = size_binop (MULT_EXPR,
>                              bitsize_int (TYPE_VECTOR_SUBPARTS (vectype) - 1),
>                              TYPE_SIZE (scalar_type));
>       else
>         bitpos = bitsize_zero_node;
> 
> Fixed thusly, ok for trunk/4.8?

Ok with a comment in front of (code == VEC_RSHIFT_EXPR) ^ 
(!BYTES_BIG_ENDIAN)

Thanks,
Richard.

> 2013-05-17  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/57051
> 	* fold-const.c (const_binop) <case VEC_LSHIFT_EXPR,
> 	case VEC_RSHIFT_EXPR>: Fix BYTES_BIG_ENDIAN handling.
> 
> --- gcc/fold-const.c.jj	2013-05-16 12:36:28.000000000 +0200
> +++ gcc/fold-const.c	2013-05-17 08:38:12.575117676 +0200
> @@ -1393,7 +1393,7 @@ const_binop (enum tree_code code, tree a
>  	  if (shiftc >= outerc || (shiftc % innerc) != 0)
>  	    return NULL_TREE;
>  	  int offset = shiftc / innerc;
> -	  if (code == VEC_LSHIFT_EXPR)
> +	  if ((code == VEC_RSHIFT_EXPR) ^ (!BYTES_BIG_ENDIAN))
>  	    offset = -offset;
>  	  tree zero = build_zero_cst (TREE_TYPE (type));
>  	  for (i = 0; i < count; i++)
> 
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend
