public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Don't check for optab for 16bit bswap
@ 2014-12-29 14:46 Thomas Preud'homme
  2014-12-29 17:53 ` Richard Biener
  2015-01-21 21:30 ` Richard Henderson
  0 siblings, 2 replies; 16+ messages in thread
From: Thomas Preud'homme @ 2014-12-29 14:46 UTC (permalink / raw)
  To: gcc-patches, 'Richard Biener'

Since 16bit byteswap can be done via an 8 bit rotation (and it is the canonical form),
the check for an optab only serves to prevent the bswap optimization for
targets that don't have a 16bit byteswap (but do have a rotation instruction). See
PR63259 (comments 6 and onwards) for more details.

ChangeLog entry is as follows:

2014-12-24  Thomas Preud'homme  thomas.preudhomme@arm.com

PR tree-optimization/63259
* tree-ssa-math-opts.c (pass_optimize_bswap::execute): Stop checking
if optab exists for 16bit byteswap.


diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index 1ed2838..c175e88 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2350,15 +2350,13 @@ unsigned int
 pass_optimize_bswap::execute (function *fun)
 {
   basic_block bb;
-  bool bswap16_p, bswap32_p, bswap64_p;
+  bool bswap32_p, bswap64_p;
   bool changed = false;
-  tree bswap16_type = NULL_TREE, bswap32_type = NULL_TREE, bswap64_type = NULL_TREE;
+  tree bswap32_type = NULL_TREE, bswap64_type = NULL_TREE;
 
   if (BITS_PER_UNIT != 8)
     return 0;
 
-  bswap16_p = (builtin_decl_explicit_p (BUILT_IN_BSWAP16)
-	       && optab_handler (bswap_optab, HImode) != CODE_FOR_nothing);
   bswap32_p = (builtin_decl_explicit_p (BUILT_IN_BSWAP32)
 	       && optab_handler (bswap_optab, SImode) != CODE_FOR_nothing);
   bswap64_p = (builtin_decl_explicit_p (BUILT_IN_BSWAP64)
@@ -2367,12 +2365,6 @@ pass_optimize_bswap::execute (function *fun)
 
   /* Determine the argument type of the builtins.  The code later on
      assumes that the return and argument type are the same.  */
-  if (bswap16_p)
-    {
-      tree fndecl = builtin_decl_explicit (BUILT_IN_BSWAP16);
-      bswap16_type = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
-    }
-
   if (bswap32_p)
     {
       tree fndecl = builtin_decl_explicit (BUILT_IN_BSWAP32);
@@ -2443,11 +2435,6 @@ pass_optimize_bswap::execute (function *fun)
 	      if (code == LROTATE_EXPR || code == RROTATE_EXPR)
 		continue;
 	      load_type = uint16_type_node;
-	      if (bswap16_p)
-		{
-		  fndecl = builtin_decl_explicit (BUILT_IN_BSWAP16);
-		  bswap_type = bswap16_type;
-		}
 	      break;
 	    case 32:
 	      load_type = uint32_type_node;
@@ -2469,7 +2456,7 @@ pass_optimize_bswap::execute (function *fun)
 	      continue;
 	    }
 
-	  if (bswap && !fndecl)
+	  if (bswap && !fndecl && n.range != 16)
 	    continue;
 
 	  if (bswap_replace (cur_stmt, src_stmt, fndecl, bswap_type, load_type,

Bootstrapped on x86_64-linux-gnu and aarch64-none-linux-gnu without any
testsuite regression

Is this ok for trunk?

Best regards,

Thomas



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Don't check for optab for 16bit bswap
  2014-12-29 14:46 [PATCH] Don't check for optab for 16bit bswap Thomas Preud'homme
@ 2014-12-29 17:53 ` Richard Biener
  2014-12-29 18:36   ` Thomas Preud'homme
  2015-01-21 21:30 ` Richard Henderson
  1 sibling, 1 reply; 16+ messages in thread
From: Richard Biener @ 2014-12-29 17:53 UTC (permalink / raw)
  To: Thomas Preud'homme, gcc-patches

On December 29, 2014 3:04:36 PM CET, Thomas Preud'homme <thomas.preudhomme@arm.com> wrote:
>Since 16bit byteswap can be done via an 8 bit rotation (and it is the
>canonical form),
>the check for an optab only serves to prevent the bswap optimization
>for
>targets that don't have a 16bit byteswap (but do have a rotation
>instruction). See
>PR63259 (comments 6 and onwards) for more details.

OK, but what about targets without a rotation optab?  Is the fallback expansion reasonable in all cases?

Richard.

>ChangeLog entry is as follows:
>
>2014-12-24  Thomas Preud'homme  thomas.preudhomme@arm.com
>
>PR tree-optimization/63259
>* tree-ssa-math-opts.c (pass_optimize_bswap::execute): Stop checking
>if optab exists for 16bit byteswap.
>
>
>diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
>index 1ed2838..c175e88 100644
>--- a/gcc/tree-ssa-math-opts.c
>+++ b/gcc/tree-ssa-math-opts.c
>@@ -2350,15 +2350,13 @@ unsigned int
> pass_optimize_bswap::execute (function *fun)
> {
>   basic_block bb;
>-  bool bswap16_p, bswap32_p, bswap64_p;
>+  bool bswap32_p, bswap64_p;
>   bool changed = false;
>-  tree bswap16_type = NULL_TREE, bswap32_type = NULL_TREE,
>bswap64_type = NULL_TREE;
>+  tree bswap32_type = NULL_TREE, bswap64_type = NULL_TREE;
> 
>   if (BITS_PER_UNIT != 8)
>     return 0;
> 
>-  bswap16_p = (builtin_decl_explicit_p (BUILT_IN_BSWAP16)
>-	       && optab_handler (bswap_optab, HImode) != CODE_FOR_nothing);
>   bswap32_p = (builtin_decl_explicit_p (BUILT_IN_BSWAP32)
> 	       && optab_handler (bswap_optab, SImode) != CODE_FOR_nothing);
>   bswap64_p = (builtin_decl_explicit_p (BUILT_IN_BSWAP64)
>@@ -2367,12 +2365,6 @@ pass_optimize_bswap::execute (function *fun)
> 
>   /* Determine the argument type of the builtins.  The code later on
>      assumes that the return and argument type are the same.  */
>-  if (bswap16_p)
>-    {
>-      tree fndecl = builtin_decl_explicit (BUILT_IN_BSWAP16);
>-      bswap16_type = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
>-    }
>-
>   if (bswap32_p)
>     {
>       tree fndecl = builtin_decl_explicit (BUILT_IN_BSWAP32);
>@@ -2443,11 +2435,6 @@ pass_optimize_bswap::execute (function *fun)
> 	      if (code == LROTATE_EXPR || code == RROTATE_EXPR)
> 		continue;
> 	      load_type = uint16_type_node;
>-	      if (bswap16_p)
>-		{
>-		  fndecl = builtin_decl_explicit (BUILT_IN_BSWAP16);
>-		  bswap_type = bswap16_type;
>-		}
> 	      break;
> 	    case 32:
> 	      load_type = uint32_type_node;
>@@ -2469,7 +2456,7 @@ pass_optimize_bswap::execute (function *fun)
> 	      continue;
> 	    }
> 
>-	  if (bswap && !fndecl)
>+	  if (bswap && !fndecl && n.range != 16)
> 	    continue;
> 
>	  if (bswap_replace (cur_stmt, src_stmt, fndecl, bswap_type,
>load_type,
>
>Bootstrapped on x86_64-linux-gnu and aarch64-none-linux-gnu without any
>testsuite regression
>
>Is this ok for trunk?
>
>Best regards,
>
>Thomas


^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] Don't check for optab for 16bit bswap
  2014-12-29 17:53 ` Richard Biener
@ 2014-12-29 18:36   ` Thomas Preud'homme
  2014-12-29 18:44     ` Oleg Endo
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Preud'homme @ 2014-12-29 18:36 UTC (permalink / raw)
  To: 'Richard Biener'; +Cc: gcc-patches

> From: Richard Biener [mailto:rguenther@suse.de]
> Sent: Monday, December 29, 2014 5:09 PM
> 
> OK, but what about targets without a rotation optab?  Is the fallback
> expansion reasonable in all cases?

To be honest I haven't checked. I thought being a treecode means it
can always be expanded, using a sequence of shift and bitwise or if
necessary. Isn't there some language that GCC support with rotate
operators?

Given your question I guess I was wrong assuming this. Is there a list
of gimple construct that are necessary supported? What about a list
of insn pattern that a backend must necessarily provide?

Best regards,

Thomas 




^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] Don't check for optab for 16bit bswap
  2014-12-29 18:36   ` Thomas Preud'homme
@ 2014-12-29 18:44     ` Oleg Endo
  2014-12-30 15:35       ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Oleg Endo @ 2014-12-29 18:44 UTC (permalink / raw)
  To: Thomas Preud'homme; +Cc: 'Richard Biener', gcc-patches

On Mon, 2014-12-29 at 17:53 +0000, Thomas Preud'homme wrote:
> > From: Richard Biener [mailto:rguenther@suse.de]
> > Sent: Monday, December 29, 2014 5:09 PM
> > 
> > OK, but what about targets without a rotation optab?  Is the fallback
> > expansion reasonable in all cases?
> 
> To be honest I haven't checked. I thought being a treecode means it
> can always be expanded, using a sequence of shift and bitwise or if
> necessary. Isn't there some language that GCC support with rotate
> operators?
> 
> Given your question I guess I was wrong assuming this. Is there a list
> of gimple construct that are necessary supported? What about a list
> of insn pattern that a backend must necessarily provide?
> 

__builtin_bswap16 expansion uses the 'rotlhi3' pattern to do a 16 bit
bswap as a fallback when there's no 'bswaphi2' pattern in the backend
(like on SH at the moment.  I haven't added bswaphi2, as
__builtin_bswap16 has been working without it).

I've just tried disabling the 'rotlhi3' pattern and __builtin_bswap16
expands into shift + and + or (as expected).
Thus, I don't think the patch will make something worse (than it already
is) on some backends.  If the bswap detection bails out at the tree
level, the expanded ops will be shift + and + or -- as written in the
original code.  So probably, that will be the same as the fallback
expansion for __builtin_bswap16, and we're back to sqrt (1).

Cheers,
Oleg

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] Don't check for optab for 16bit bswap
  2014-12-29 18:44     ` Oleg Endo
@ 2014-12-30 15:35       ` Richard Biener
  2014-12-30 16:31         ` Oleg Endo
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2014-12-30 15:35 UTC (permalink / raw)
  To: Oleg Endo, Thomas Preud'homme; +Cc: gcc-patches

On December 29, 2014 7:44:13 PM CET, Oleg Endo <oleg.endo@t-online.de> wrote:
>On Mon, 2014-12-29 at 17:53 +0000, Thomas Preud'homme wrote:
>> > From: Richard Biener [mailto:rguenther@suse.de]
>> > Sent: Monday, December 29, 2014 5:09 PM
>> > 
>> > OK, but what about targets without a rotation optab?  Is the
>fallback
>> > expansion reasonable in all cases?
>> 
>> To be honest I haven't checked. I thought being a treecode means it
>> can always be expanded, using a sequence of shift and bitwise or if
>> necessary. Isn't there some language that GCC support with rotate
>> operators?
>> 
>> Given your question I guess I was wrong assuming this. Is there a
>list
>> of gimple construct that are necessary supported? What about a list
>> of insn pattern that a backend must necessarily provide?
>> 
>
>__builtin_bswap16 expansion uses the 'rotlhi3' pattern to do a 16 bit
>bswap as a fallback when there's no 'bswaphi2' pattern in the backend
>(like on SH at the moment.  I haven't added bswaphi2, as
>__builtin_bswap16 has been working without it).
>
>I've just tried disabling the 'rotlhi3' pattern and __builtin_bswap16
>expands into shift + and + or (as expected).
>Thus, I don't think the patch will make something worse (than it
>already
>is) on some backends.  If the bswap detection bails out at the tree
>level, the expanded ops will be shift + and + or -- as written in the
>original code.  So probably, that will be the same as the fallback
>expansion for __builtin_bswap16, and we're back to sqrt (1).

OK, that is what I was asking - are there cases where using rot is worse (like forcing a libcall or so).

Richard.

>Cheers,
>Oleg


^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] Don't check for optab for 16bit bswap
  2014-12-30 15:35       ` Richard Biener
@ 2014-12-30 16:31         ` Oleg Endo
  2015-01-05 14:55           ` Thomas Preud'homme
  0 siblings, 1 reply; 16+ messages in thread
From: Oleg Endo @ 2014-12-30 16:31 UTC (permalink / raw)
  To: Richard Biener; +Cc: Thomas Preud'homme, gcc-patches

On Tue, 2014-12-30 at 16:22 +0100, Richard Biener wrote:
> On December 29, 2014 7:44:13 PM CET, Oleg Endo <oleg.endo@t-online.de> wrote:
> >On Mon, 2014-12-29 at 17:53 +0000, Thomas Preud'homme wrote:
> >> > From: Richard Biener [mailto:rguenther@suse.de]
> >> > Sent: Monday, December 29, 2014 5:09 PM
> >> > 
> >> > OK, but what about targets without a rotation optab?  Is the
> >fallback
> >> > expansion reasonable in all cases?
> >> 
> >> To be honest I haven't checked. I thought being a treecode means it
> >> can always be expanded, using a sequence of shift and bitwise or if
> >> necessary. Isn't there some language that GCC support with rotate
> >> operators?
> >> 
> >> Given your question I guess I was wrong assuming this. Is there a
> >list
> >> of gimple construct that are necessary supported? What about a list
> >> of insn pattern that a backend must necessarily provide?
> >> 
> >
> >__builtin_bswap16 expansion uses the 'rotlhi3' pattern to do a 16 bit
> >bswap as a fallback when there's no 'bswaphi2' pattern in the backend
> >(like on SH at the moment.  I haven't added bswaphi2, as
> >__builtin_bswap16 has been working without it).
> >
> >I've just tried disabling the 'rotlhi3' pattern and __builtin_bswap16
> >expands into shift + and + or (as expected).
> >Thus, I don't think the patch will make something worse (than it
> >already

.L42:

> >is) on some backends.  If the bswap detection bails out at the tree
> >level, the expanded ops will be shift + and + or -- as written in the
> >original code.  So probably, that will be the same as the fallback
> >expansion for __builtin_bswap16, and we're back to sqrt (1).
> 
> OK, that is what I was asking - are there cases where using rot is worse (like forcing a libcall or so).

See also comment in expmed.c (expand_shift_1):
 It is theoretically possible that the target machine might
 not be able to perform either shift and hence we would
 be making two libcalls rather than just the one for the
 shift (similarly if IOR could not be done).  We will allow
 this extremely unlikely lossage to avoid complicating the
 code below.  */

then goto .L42  ;)

Cheers,
Oleg

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] Don't check for optab for 16bit bswap
  2014-12-30 16:31         ` Oleg Endo
@ 2015-01-05 14:55           ` Thomas Preud'homme
  2015-01-05 20:59             ` Oleg Endo
  2015-01-05 21:09             ` Richard Biener
  0 siblings, 2 replies; 16+ messages in thread
From: Thomas Preud'homme @ 2015-01-05 14:55 UTC (permalink / raw)
  To: 'Oleg Endo', Richard Biener; +Cc: gcc-patches

> From: Oleg Endo [mailto:oleg.endo@t-online.de]
> Sent: Tuesday, December 30, 2014 4:25 PM
> 
> I've just tried disabling the 'rotlhi3' pattern and __builtin_bswap16
> expands into shift + and + or (as expected).
> Thus, I don't think the patch will make something worse (than it
> already
> 
> .L42:
> 
> is) on some backends.  If the bswap detection bails out at the tree
> level, the expanded ops will be shift + and + or -- as written in the
> original code.  So probably, that will be the same as the fallback
> expansion for __builtin_bswap16, and we're back to sqrt (1).
> 
> > OK, that is what I was asking - are there cases where using rot is worse
> (like forcing a libcall or so).
> 
> See also comment in expmed.c (expand_shift_1):
>  It is theoretically possible that the target machine might
>  not be able to perform either shift and hence we would
>  be making two libcalls rather than just the one for the
>  shift (similarly if IOR could not be done).  We will allow
>  this extremely unlikely lossage to avoid complicating the
>  code below.  */
> 
> then goto .L42  ;)

Yes and if the fallback expansion for a rot is actually worse than the
original set of stmt it means the fallback expansion should be improved.

Now I realize that said like this it sounds more like stage1 material.

Would it be fine for next stage1?

Best regards,

Thomas 




^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] Don't check for optab for 16bit bswap
  2015-01-05 14:55           ` Thomas Preud'homme
@ 2015-01-05 20:59             ` Oleg Endo
  2015-01-05 21:09             ` Richard Biener
  1 sibling, 0 replies; 16+ messages in thread
From: Oleg Endo @ 2015-01-05 20:59 UTC (permalink / raw)
  To: Thomas Preud'homme; +Cc: Richard Biener, gcc-patches

On Mon, 2015-01-05 at 14:54 +0000, Thomas Preud'homme wrote:
> > From: Oleg Endo [mailto:oleg.endo@t-online.de]
> > Sent: Tuesday, December 30, 2014 4:25 PM
> > 
> > I've just tried disabling the 'rotlhi3' pattern and __builtin_bswap16
> > expands into shift + and + or (as expected).
> > Thus, I don't think the patch will make something worse (than it
> > already
> > 
> > .L42:
> > 
> > is) on some backends.  If the bswap detection bails out at the tree
> > level, the expanded ops will be shift + and + or -- as written in the
> > original code.  So probably, that will be the same as the fallback
> > expansion for __builtin_bswap16, and we're back to sqrt (1).
> > 
> > > OK, that is what I was asking - are there cases where using rot is worse
> > (like forcing a libcall or so).
> > 
> > See also comment in expmed.c (expand_shift_1):
> >  It is theoretically possible that the target machine might
> >  not be able to perform either shift and hence we would
> >  be making two libcalls rather than just the one for the
> >  shift (similarly if IOR could not be done).  We will allow
> >  this extremely unlikely lossage to avoid complicating the
> >  code below.  */
> > 
> > then goto .L42  ;)
> 
> Yes and if the fallback expansion for a rot is actually worse than the
> original set of stmt it means the fallback expansion should be improved.

I might be missing something, but how can the original shift/and/or
statements (which eventually are translated to shift/and/or insns) be
any better than the shift/and/or insns that are expanded by the rot
fallback?  In both cases the insns have to go through the backend
expanders, and if those are absent/unsupported libcalls will be
expanded.

> Now I realize that said like this it sounds more like stage1 material.
> 
> Would it be fine for next stage1?

Or maybe just leave everything as it is and I add a bswaphi expander in
the SH backend ;)

Cheers,
Oleg

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] Don't check for optab for 16bit bswap
  2015-01-05 14:55           ` Thomas Preud'homme
  2015-01-05 20:59             ` Oleg Endo
@ 2015-01-05 21:09             ` Richard Biener
  2015-01-21 20:00               ` H.J. Lu
  1 sibling, 1 reply; 16+ messages in thread
From: Richard Biener @ 2015-01-05 21:09 UTC (permalink / raw)
  To: Thomas Preud'homme, 'Oleg Endo'; +Cc: gcc-patches

On January 5, 2015 3:54:40 PM CET, Thomas Preud'homme <thomas.preudhomme@arm.com> wrote:
>> From: Oleg Endo [mailto:oleg.endo@t-online.de]
>> Sent: Tuesday, December 30, 2014 4:25 PM
>> 
>> I've just tried disabling the 'rotlhi3' pattern and __builtin_bswap16
>> expands into shift + and + or (as expected).
>> Thus, I don't think the patch will make something worse (than it
>> already
>> 
>> .L42:
>> 
>> is) on some backends.  If the bswap detection bails out at the tree
>> level, the expanded ops will be shift + and + or -- as written in the
>> original code.  So probably, that will be the same as the fallback
>> expansion for __builtin_bswap16, and we're back to sqrt (1).
>> 
>> > OK, that is what I was asking - are there cases where using rot is
>worse
>> (like forcing a libcall or so).
>> 
>> See also comment in expmed.c (expand_shift_1):
>>  It is theoretically possible that the target machine might
>>  not be able to perform either shift and hence we would
>>  be making two libcalls rather than just the one for the
>>  shift (similarly if IOR could not be done).  We will allow
>>  this extremely unlikely lossage to avoid complicating the
>>  code below.  */
>> 
>> then goto .L42  ;)
>
>Yes and if the fallback expansion for a rot is actually worse than the
>original set of stmt it means the fallback expansion should be
>improved.
>
>Now I realize that said like this it sounds more like stage1 material.
>
>Would it be fine for next stage1?

No, I think the patch is fine as-is now.

Thanks,
Richard.

>Best regards,
>
>Thomas 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Don't check for optab for 16bit bswap
  2015-01-05 21:09             ` Richard Biener
@ 2015-01-21 20:00               ` H.J. Lu
  0 siblings, 0 replies; 16+ messages in thread
From: H.J. Lu @ 2015-01-21 20:00 UTC (permalink / raw)
  To: Richard Biener; +Cc: Thomas Preud'homme, Oleg Endo, GCC Patches

On Mon, Jan 5, 2015 at 1:09 PM, Richard Biener <rguenther@suse.de> wrote:
> On January 5, 2015 3:54:40 PM CET, Thomas Preud'homme <thomas.preudhomme@arm.com> wrote:
>>> From: Oleg Endo [mailto:oleg.endo@t-online.de]
>>> Sent: Tuesday, December 30, 2014 4:25 PM
>>>
>>> I've just tried disabling the 'rotlhi3' pattern and __builtin_bswap16
>>> expands into shift + and + or (as expected).
>>> Thus, I don't think the patch will make something worse (than it
>>> already
>>>
>>> .L42:
>>>
>>> is) on some backends.  If the bswap detection bails out at the tree
>>> level, the expanded ops will be shift + and + or -- as written in the
>>> original code.  So probably, that will be the same as the fallback
>>> expansion for __builtin_bswap16, and we're back to sqrt (1).
>>>
>>> > OK, that is what I was asking - are there cases where using rot is
>>worse
>>> (like forcing a libcall or so).
>>>
>>> See also comment in expmed.c (expand_shift_1):
>>>  It is theoretically possible that the target machine might
>>>  not be able to perform either shift and hence we would
>>>  be making two libcalls rather than just the one for the
>>>  shift (similarly if IOR could not be done).  We will allow
>>>  this extremely unlikely lossage to avoid complicating the
>>>  code below.  */
>>>
>>> then goto .L42  ;)
>>
>>Yes and if the fallback expansion for a rot is actually worse than the
>>original set of stmt it means the fallback expansion should be
>>improved.
>>
>>Now I realize that said like this it sounds more like stage1 material.
>>
>>Would it be fine for next stage1?
>
> No, I think the patch is fine as-is now.

Does it cause:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64718

-- 
H.J.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Don't check for optab for 16bit bswap
  2014-12-29 14:46 [PATCH] Don't check for optab for 16bit bswap Thomas Preud'homme
  2014-12-29 17:53 ` Richard Biener
@ 2015-01-21 21:30 ` Richard Henderson
  2015-01-22  8:22   ` Thomas Preud'homme
  2015-01-22  8:35   ` Richard Biener
  1 sibling, 2 replies; 16+ messages in thread
From: Richard Henderson @ 2015-01-21 21:30 UTC (permalink / raw)
  To: Thomas Preud'homme, gcc-patches, 'Richard Biener'

On 12/29/2014 06:04 AM, Thomas Preud'homme wrote:
> Since 16bit byteswap can be done via an 8 bit rotation (and it is the canonical form),
> the check for an optab only serves to prevent the bswap optimization for
> targets that don't have a 16bit byteswap (but do have a rotation instruction). See
> PR63259 (comments 6 and onwards) for more details.

I question the choice to have rotate be the canonical form.

Doesn't this make things more complicated for targets that
don't have rotate?  Or, equivalently, no 16-bit rotate?
That would seem to cover 99% of all 32-bit risc machines.


r~

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] Don't check for optab for 16bit bswap
  2015-01-21 21:30 ` Richard Henderson
@ 2015-01-22  8:22   ` Thomas Preud'homme
  2015-01-22  8:35   ` Richard Biener
  1 sibling, 0 replies; 16+ messages in thread
From: Thomas Preud'homme @ 2015-01-22  8:22 UTC (permalink / raw)
  To: 'Richard Henderson', gcc-patches, 'Richard Biener'



> -----Original Message-----
> From: Richard Henderson [mailto:rth@redhat.com]
> Sent: Thursday, January 22, 2015 5:24 AM
> To: Thomas Preud'homme; gcc-patches@gcc.gnu.org; 'Richard Biener'
> Subject: Re: [PATCH] Don't check for optab for 16bit bswap
> 
> On 12/29/2014 06:04 AM, Thomas Preud'homme wrote:
> > Since 16bit byteswap can be done via an 8 bit rotation (and it is the
> canonical form),
> > the check for an optab only serves to prevent the bswap optimization
> for
> > targets that don't have a 16bit byteswap (but do have a rotation
> instruction). See
> > PR63259 (comments 6 and onwards) for more details.
> 
> I question the choice to have rotate be the canonical form.
> 
> Doesn't this make things more complicated for targets that
> don't have rotate?  Or, equivalently, no 16-bit rotate?
> That would seem to cover 99% of all 32-bit risc machines.

Here is the difference between the two (see [1]):

/* Rotation of 16bit values by 8 bits is effectively equivalent to a bswaphi.
Note that this is not the case for bigger values. For instance a rotation
of 0x01020304 by 16 bits gives 0x03040102 which is different from
0x04030201 (bswapsi). */
if (rotate
     && CONST_INT_P (op1)
     && INTVAL (op1) == BITS_PER_UNIT
     && GET_MODE_SIZE (scalar_mode) == 2
     && optab_handler (bswap_optab, HImode) != CODE_FOR_nothing)
  return expand_unop (HImode, bswap_optab, shifted, NULL_RTX,
			unsignedp);

[1] https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/expmed.c?revision=219718&view=markup#l2208

It doesn't look much more complicated. There was also some change in
the bswap pass but some code was both added and removed. As to the
bug raised by H.J. Lu, I just wrote a patch and testing is underway.

Given this, I don't think it's a problem. That being said, I'm happy to
change the canonical representation if there is consensus for that.

Best regards,

Thomas



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Don't check for optab for 16bit bswap
  2015-01-21 21:30 ` Richard Henderson
  2015-01-22  8:22   ` Thomas Preud'homme
@ 2015-01-22  8:35   ` Richard Biener
  2015-01-22 19:23     ` Richard Henderson
  1 sibling, 1 reply; 16+ messages in thread
From: Richard Biener @ 2015-01-22  8:35 UTC (permalink / raw)
  To: Richard Henderson, Thomas Preud'homme, gcc-patches

On January 21, 2015 10:23:56 PM CET, Richard Henderson <rth@redhat.com> wrote:
>On 12/29/2014 06:04 AM, Thomas Preud'homme wrote:
>> Since 16bit byteswap can be done via an 8 bit rotation (and it is the
>canonical form),
>> the check for an optab only serves to prevent the bswap optimization
>for
>> targets that don't have a 16bit byteswap (but do have a rotation
>instruction). See
>> PR63259 (comments 6 and onwards) for more details.
>
>I question the choice to have rotate be the canonical form.
>
>Doesn't this make things more complicated for targets that
>don't have rotate?  Or, equivalently, no 16-bit rotate?
>That would seem to cover 99% of all 32-bit risc machines.

I was asking for the generic expander to consider bswapHI.  Rotates are certainly more likely to get combined with sth else.

Richard.

>
>r~


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Don't check for optab for 16bit bswap
  2015-01-22  8:35   ` Richard Biener
@ 2015-01-22 19:23     ` Richard Henderson
  2015-01-23  4:10       ` Thomas Preud'homme
  2015-01-23 10:00       ` Richard Biener
  0 siblings, 2 replies; 16+ messages in thread
From: Richard Henderson @ 2015-01-22 19:23 UTC (permalink / raw)
  To: Richard Biener, Thomas Preud'homme, gcc-patches

On 01/21/2015 11:52 PM, Richard Biener wrote:
> On January 21, 2015 10:23:56 PM CET, Richard Henderson <rth@redhat.com> wrote:
>> On 12/29/2014 06:04 AM, Thomas Preud'homme wrote:
>>> Since 16bit byteswap can be done via an 8 bit rotation (and it is the
>> canonical form),
>>> the check for an optab only serves to prevent the bswap optimization
>> for
>>> targets that don't have a 16bit byteswap (but do have a rotation
>> instruction). See
>>> PR63259 (comments 6 and onwards) for more details.
>>
>> I question the choice to have rotate be the canonical form.
>>
>> Doesn't this make things more complicated for targets that
>> don't have rotate?  Or, equivalently, no 16-bit rotate?
>> That would seem to cover 99% of all 32-bit risc machines.
> 
> I was asking for the generic expander to consider bswapHI.  Rotates are
> certainly more likely to get combined with sth else.

Maybe.  Alternately, don't we canonicalize byte-swapping memory ops as

 (set (reg:HI) (bswap:HI (mem:HI)))

 (set (mem:HI) (bswap:HI (reg:HI)))

Certainly this is true for powerpc.

(It appears that both sparc and s390 are missing similar patterns,
despite having instructions that can perform this operation...)


r~


r~

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] Don't check for optab for 16bit bswap
  2015-01-22 19:23     ` Richard Henderson
@ 2015-01-23  4:10       ` Thomas Preud'homme
  2015-01-23 10:00       ` Richard Biener
  1 sibling, 0 replies; 16+ messages in thread
From: Thomas Preud'homme @ 2015-01-23  4:10 UTC (permalink / raw)
  To: 'Richard Henderson', Richard Biener, gcc-patches

> From: Richard Henderson [mailto:rth@redhat.com]
> Sent: Friday, January 23, 2015 2:43 AM
> On 01/21/2015 11:52 PM, Richard Biener wrote:
> >
> > I was asking for the generic expander to consider bswapHI.  Rotates are
> > certainly more likely to get combined with sth else.
> 
> Maybe.  Alternately, don't we canonicalize byte-swapping memory ops
> as
> 
>  (set (reg:HI) (bswap:HI (mem:HI)))
> 
>  (set (mem:HI) (bswap:HI (reg:HI)))

Given the expmed.c bit I pasted, I'd say that's how it would be canonicalized
in RTL form yes. I think Richard was referring to combination at GIMPLE level
but that expander should expand to bswapHI to not break expectations.

Best regards,

Thomas 




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] Don't check for optab for 16bit bswap
  2015-01-22 19:23     ` Richard Henderson
  2015-01-23  4:10       ` Thomas Preud'homme
@ 2015-01-23 10:00       ` Richard Biener
  1 sibling, 0 replies; 16+ messages in thread
From: Richard Biener @ 2015-01-23 10:00 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Thomas Preud'homme, gcc-patches

On Thu, 22 Jan 2015, Richard Henderson wrote:

> On 01/21/2015 11:52 PM, Richard Biener wrote:
> > On January 21, 2015 10:23:56 PM CET, Richard Henderson <rth@redhat.com> wrote:
> >> On 12/29/2014 06:04 AM, Thomas Preud'homme wrote:
> >>> Since 16bit byteswap can be done via an 8 bit rotation (and it is the
> >> canonical form),
> >>> the check for an optab only serves to prevent the bswap optimization
> >> for
> >>> targets that don't have a 16bit byteswap (but do have a rotation
> >> instruction). See
> >>> PR63259 (comments 6 and onwards) for more details.
> >>
> >> I question the choice to have rotate be the canonical form.
> >>
> >> Doesn't this make things more complicated for targets that
> >> don't have rotate?  Or, equivalently, no 16-bit rotate?
> >> That would seem to cover 99% of all 32-bit risc machines.
> > 
> > I was asking for the generic expander to consider bswapHI.  Rotates are
> > certainly more likely to get combined with sth else.

Sorry for being GIMPLE centric here - but I meant canonicalization
on GIMPLE.  For RTL the expander will hopefully canonicalize it to
whatever is canonical on RTL.

> Maybe.  Alternately, don't we canonicalize byte-swapping memory ops as
> 
>  (set (reg:HI) (bswap:HI (mem:HI)))
> 
>  (set (mem:HI) (bswap:HI (reg:HI)))
> 
> Certainly this is true for powerpc.
> 
> (It appears that both sparc and s390 are missing similar patterns,
> despite having instructions that can perform this operation...)

That makes sense, it looks easier to match than a form with rotates.
I wonder when CPUs start to make a load-with-byte-shuffle instruction
available...

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-01-23  8:57 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-29 14:46 [PATCH] Don't check for optab for 16bit bswap Thomas Preud'homme
2014-12-29 17:53 ` Richard Biener
2014-12-29 18:36   ` Thomas Preud'homme
2014-12-29 18:44     ` Oleg Endo
2014-12-30 15:35       ` Richard Biener
2014-12-30 16:31         ` Oleg Endo
2015-01-05 14:55           ` Thomas Preud'homme
2015-01-05 20:59             ` Oleg Endo
2015-01-05 21:09             ` Richard Biener
2015-01-21 20:00               ` H.J. Lu
2015-01-21 21:30 ` Richard Henderson
2015-01-22  8:22   ` Thomas Preud'homme
2015-01-22  8:35   ` Richard Biener
2015-01-22 19:23     ` Richard Henderson
2015-01-23  4:10       ` Thomas Preud'homme
2015-01-23 10:00       ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).