public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
@ 2021-04-08  5:25 ` luoxhu at gcc dot gnu.org
  2021-04-08  5:29 ` luoxhu at gcc dot gnu.org
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-04-08  5:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

luoxhu at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luoxhu at gcc dot gnu.org

--- Comment #8 from luoxhu at gcc dot gnu.org ---
Two minor updates for the case mentioned in #c2:

 for VEC_SEL (ARG1, ARG2, ARG3):

   Returns a vector containing the value of either ARG1 or ARG2 depending on
   the value of ARG3.
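
The selection is bitwise: each result bit comes from ARG2 where the matching
bit of ARG3 is 1, and from ARG1 otherwise. A minimal portable sketch of that
behaviour, using the values from the test case in this comment. The v4u32
type and the *_model helpers are hypothetical stand-ins and are not part of
<altivec.h>:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-lane stand-in for "vector unsigned"; not an AltiVec type. */
typedef struct { uint32_t lane[4]; } v4u32;

/* Bitwise select: take each bit from b where c is 1, otherwise from a. */
static v4u32 sel_model(v4u32 a, v4u32 b, v4u32 c)
{
    v4u32 out;
    for (int i = 0; i < 4; i++)
        out.lane[i] = (a.lane[i] & ~c.lane[i]) | (b.lane[i] & c.lane[i]);
    return out;
}

/* Returns 1 when every lane of x equals the matching lane of y. */
static int all_eq_model(v4u32 x, v4u32 y)
{
    for (int i = 0; i < 4; i++)
        if (x.lane[i] != y.lane[i])
            return 0;
    return 1;
}

/* Replays the orig/fill/mask/expected data from the test case. */
static void check_sel_model(void)
{
    v4u32 orig = {{0xebebebebu, 0x34343434u, 0x76767676u, 0x12121212u}};
    v4u32 mask = {{0xffffffffu, 0u, 0xffffffffu, 0u}};
    v4u32 fill = {{0xfefefefeu, 0xaaaaaaaau, 0xbbbbbbbbu, 0xccccccccu}};
    v4u32 expected = {{0xfefefefeu, 0x34343434u, 0xbbbbbbbbu, 0x12121212u}};

    assert(all_eq_model(sel_model(orig, fill, mask), expected));
}
```

Calling check_sel_model() from a main() exercises the same data as the
AltiVec test program, without needing a PowerPC target.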


#include <stdio.h>
#include <altivec.h>
volatile vector unsigned orig = {0xebebebeb, 0x34343434, 0x76767676, 0x12121212};
volatile vector unsigned mask = {0xffffffff, 0, 0xffffffff, 0};
volatile vector unsigned fill = {0xfefefefe, 0xaaaaaaaa, 0xbbbbbbbb, 0xcccccccc};
volatile vector unsigned expected = {0xfefefefe, 0x34343434, 0xbbbbbbbb, 0x12121212};
__attribute__ ((noinline))
vector unsigned without_sel(vector unsigned l, vector unsigned r, vector unsigned mask) {
-    l = l & ~r;
+    l = l & ~mask;
    l |= mask & r;
    return l;
}

__attribute__ ((noinline))
vector unsigned with_sel(vector unsigned l, vector unsigned r, vector unsigned mask) {
-    return vec_sel(l, mask, r);
+    return vec_sel(l, r, mask);
}

int main() {
    vector unsigned res1 = without_sel(orig, fill, mask);
    vector unsigned res2 = with_sel(orig, fill, mask);
    if (!vec_all_eq(res1, expected)) printf ("error1\n");
    if (!vec_all_eq(res2, expected)) printf ("error2\n");
    return 0;
}


And the ASM would be:

without_sel:
        xxlxor 35,34,35
        xxland 35,35,36
        xxlxor 34,34,35
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0
with_sel:
        xxsel 34,34,35,36
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
  2021-04-08  5:25 ` [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel() luoxhu at gcc dot gnu.org
@ 2021-04-08  5:29 ` luoxhu at gcc dot gnu.org
  2021-04-08 20:49 ` segher at gcc dot gnu.org
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-04-08  5:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #9 from luoxhu at gcc dot gnu.org ---
Then we could optimize it in match.pd:

diff --git a/gcc/match.pd b/gcc/match.pd
index 036f92fa959..8944312c153 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3711,6 +3711,17 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (if (integer_all_onesp (@1) && integer_zerop (@2))
     @0))))

+#if GIMPLE
+(simplify
+ (bit_xor @0 (bit_and @2 (bit_xor @0 @1)))
+ (if (optimize_vectors_before_lowering_p () && types_match (@0, @1)
+      && types_match (@0, @2) && VECTOR_TYPE_P (TREE_TYPE (@0))
+      && VECTOR_TYPE_P (TREE_TYPE (@1)) && VECTOR_TYPE_P (TREE_TYPE (@2)))
+ (with { tree itype = truth_type_for (type); }
+ (vec_cond (convert:itype @2) @1 @0))))
+#endif

In pr90323.c.033t.forwprop1, it will be optimized to:

  <bb 2> :
  _1 = ~mask_3(D);
  l_5 = _1 & l_4(D);
  _2 = mask_3(D) & r_6(D);
  _8 = l_4(D) ^ r_6(D);
  _10 = mask_3(D) & _8;
  _11 = (vector(4) <signed-boolean:32>) mask_3(D);
  l_7 = VEC_COND_EXPR <_11, r_6(D), l_4(D)>;
  return l_7;

Then in pr90323.c.243t.isel:

  <bb 2> [local count: 1073741824]:
  _6 = (vector(4) <signed-boolean:32>) mask_1(D);
  l_4 = .VCOND_MASK (_6, r_3(D), l_2(D));
  return l_4;

final ASM:

without_sel:
.LFB11:
        .cfi_startproc
        xxsel 34,34,35,36
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0
        .cfi_endproc
.LFE11:
        .size   without_sel,.-without_sel
        .align 2
        .p2align 4,,15
        .globl with_sel
        .type   with_sel, @function
with_sel:
.LFB12:
        .cfi_startproc
        xxsel 34,34,35,36
        blr


@segher, is this a reasonable fix?


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
  2021-04-08  5:25 ` [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel() luoxhu at gcc dot gnu.org
  2021-04-08  5:29 ` luoxhu at gcc dot gnu.org
@ 2021-04-08 20:49 ` segher at gcc dot gnu.org
  2021-04-09  3:47 ` luoxhu at gcc dot gnu.org
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: segher at gcc dot gnu.org @ 2021-04-08 20:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #10 from Segher Boessenkool <segher at gcc dot gnu.org> ---
You cannot fix a simplify-rtx problem in much earlier passes!  It may be
useful of course (I have no idea, I don't know gimple well enough), but
it is no solution to the problem at all.  The xor/and/xor thing should be
simplified to something proper.

((A^B)&C)^A = (A&~C)^(B&C) = (A&~C)|(B&C)

This should already be done by the expand pass.  At gimple level the logical
complement is counted as an operation, making the contorted xor/and/xor form
the best form to use, but in a system that considers more than just operation
counts (like in RTL) this is not the best form at all.  But, anyway, RTL
simplification should be able to do this.

Similar problems happen all over the place, fwiw -- see the various rl* tests
for rs6000, for example.
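
The last step of the identity above holds because (A&~C) and (B&C) can never
have the same bit set, so XOR and IOR of the two terms coincide. Since all
three forms are bitwise, checking the eight combinations of single-bit
operands is enough. A small sketch of that check, with hypothetical function
names:

```c
#include <assert.h>
#include <stdint.h>

/* The xor/and/xor form that Gimple prefers. */
static uint32_t xor_and_xor(uint32_t a, uint32_t b, uint32_t c)
{
    return ((a ^ b) & c) ^ a;
}

/* Middle form of the identity: the two masked terms combined with XOR. */
static uint32_t andc_xor_and(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) ^ (b & c);
}

/* Final form: the two masked terms combined with IOR. */
static uint32_t andc_ior_and(uint32_t a, uint32_t b, uint32_t c)
{
    return (a & ~c) | (b & c);
}

/* Returns 1 if all three forms agree for every single-bit input. */
static int forms_agree(void)
{
    for (uint32_t a = 0; a <= 1; a++)
        for (uint32_t b = 0; b <= 1; b++)
            for (uint32_t c = 0; c <= 1; c++) {
                uint32_t x = xor_and_xor(a, b, c);
                if (x != andc_xor_and(a, b, c) || x != andc_ior_and(a, b, c))
                    return 0;
            }
    return 1;
}
```

The xor/and/xor form uses three operations, while the and/andc/ior form uses
four when the complement is counted separately, which is the operation-count
trade-off described above.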


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2021-04-08 20:49 ` segher at gcc dot gnu.org
@ 2021-04-09  3:47 ` luoxhu at gcc dot gnu.org
  2021-04-09  7:29 ` luoxhu at gcc dot gnu.org
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-04-09  3:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #11 from luoxhu at gcc dot gnu.org ---
I noticed that you added the optimization below with commit
a62436c0a505155fc8becac07a8c0abe2c265bfe, but it doesn't even handle this case:
the cse1 pass calls simplify_binary_operation_1 with both op0 and op1 as REGs
rather than AND operations. Do you have a test case that covers that piece of
code?

__attribute__ ((noinline))
 long without_sel3( long l,  long r) {
    long tmp = {0x0ff00fff};
    l =  ( (l ^ r) & tmp) ^ l;
    return l;
}


without_sel3:
        xor 4,3,4
        rlwinm 4,4,0,20,11
        rldicl 4,4,0,36
        xor 3,4,3
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0


+2016-11-09  Segher Boessenkool  <segher@kernel.crashing.org>
+
+       * simplify-rtx.c (simplify_binary_operation_1): Simplify
+       (xor (and (xor A B) C) B) to (ior (and A C) (and B ~C)) and
+       (xor (and (xor A B) C) A) to (ior (and A ~C) (and B C)) if C
+       is a const_int.

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 5c3dea1a349..11a2e0267c7 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -2886,6 +2886,37 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
            }
        }

+      /* If we have (xor (and (xor A B) C) A) with C a constant we can instead
+        do (ior (and A ~C) (and B C)) which is a machine instruction on some
+        machines, and also has shorter instruction path length.  */
+      if (GET_CODE (op0) == AND
+         && GET_CODE (XEXP (op0, 0)) == XOR
+         && CONST_INT_P (XEXP (op0, 1))
+         && rtx_equal_p (XEXP (XEXP (op0, 0), 0), trueop1))
+       {
+         rtx a = trueop1;
+         rtx b = XEXP (XEXP (op0, 0), 1);
+         rtx c = XEXP (op0, 1);
+         rtx nc = simplify_gen_unary (NOT, mode, c, mode);
+         rtx a_nc = simplify_gen_binary (AND, mode, a, nc);
+         rtx bc = simplify_gen_binary (AND, mode, b, c);
+         return simplify_gen_binary (IOR, mode, a_nc, bc);
+       }
+      /* Similarly, (xor (and (xor A B) C) B) as (ior (and A C) (and B ~C)) */
+      else if (GET_CODE (op0) == AND
+         && GET_CODE (XEXP (op0, 0)) == XOR
+         && CONST_INT_P (XEXP (op0, 1))
+         && rtx_equal_p (XEXP (XEXP (op0, 0), 1), trueop1))
+       {
+         rtx a = XEXP (XEXP (op0, 0), 0);
+         rtx b = trueop1;
+         rtx c = XEXP (op0, 1);
+         rtx nc = simplify_gen_unary (NOT, mode, c, mode);
+         rtx b_nc = simplify_gen_binary (AND, mode, b, nc);
+         rtx ac = simplify_gen_binary (AND, mode, a, c);
+         return simplify_gen_binary (IOR, mode, ac, b_nc);
+       }


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2021-04-09  3:47 ` luoxhu at gcc dot gnu.org
@ 2021-04-09  7:29 ` luoxhu at gcc dot gnu.org
  2021-04-12 22:32 ` segher at gcc dot gnu.org
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-04-09  7:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #12 from luoxhu at gcc dot gnu.org ---

That code was called by the combine pass, but it failed to match.

pr newpat
(set (reg:DI 125 [ l ])
    (xor:DI (and:DI (xor:DI (reg/v:DI 120 [ l ])
                (reg:DI 127))
            (const_int 267390975 [0xff00fff]))
        (reg/v:DI 120 [ l ])))


Trying 8, 10 -> 11:
    8: r123:DI=r120:DI^r127:DI
      REG_DEAD r127:DI
   10: r118:DI=r123:DI&0xff00fff
      REG_DEAD r123:DI
   11: r125:DI=r118:DI^r120:DI
      REG_DEAD r120:DI
      REG_DEAD r118:DI
Failed to match this instruction:
(set (reg:DI 125 [ l ])
    (ior:DI (and:DI (reg/v:DI 120 [ l ])
            (const_int -267390976 [0xfffffffff00ff000]))
        (and:DI (reg:DI 127)
            (const_int 267390975 [0xff00fff]))))
Successfully matched this instruction:
(set (reg:DI 118 [ _2 ])
    (and:DI (reg:DI 127)
        (const_int 267390975 [0xff00fff])))
Failed to match this instruction:
(set (reg:DI 125 [ l ])
    (ior:DI (and:DI (reg/v:DI 120 [ l ])
            (const_int -267390976 [0xfffffffff00ff000]))
        (reg:DI 118 [ _2 ])))
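
The second constant in the rewritten pattern is the 64-bit complement of the
original mask, which is how the simplification produces the (and A ~C) term.
A small check of that arithmetic and of the equivalence of the two forms,
assuming two's complement 64-bit integers as in DImode. The helper names are
hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Complement of a 64-bit constant, as the rewrite computes ~C. */
static int64_t complement_di(int64_t c)
{
    return ~c;
}

/* The form combine starts from: ((l ^ r) & c) ^ l. */
static uint64_t sel_xor_form(uint64_t l, uint64_t r, uint64_t c)
{
    return ((l ^ r) & c) ^ l;
}

/* The form combine tries to match: (l & ~c) | (r & c), with ~c precomputed. */
static uint64_t sel_ior_form(uint64_t l, uint64_t r, uint64_t c, uint64_t nc)
{
    return (l & nc) | (r & c);
}

/* Checks the constants from the dump and compares both forms on sample data. */
static void check_di_constants(void)
{
    const int64_t c = 267390975;                 /* 0x0ff00fff */
    const int64_t nc = complement_di(c);

    assert(nc == -267390976);
    assert((uint64_t)nc == 0xfffffffff00ff000ULL);

    uint64_t l = 0x0123456789abcdefULL;
    uint64_t r = 0xfedcba9876543210ULL;
    assert(sel_xor_form(l, r, (uint64_t)c)
           == sel_ior_form(l, r, (uint64_t)c, (uint64_t)nc));
}
```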


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2021-04-09  7:29 ` luoxhu at gcc dot gnu.org
@ 2021-04-12 22:32 ` segher at gcc dot gnu.org
  2021-04-12 22:36 ` segher at gcc dot gnu.org
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: segher at gcc dot gnu.org @ 2021-04-12 22:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #13 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to luoxhu from comment #11)
> I noticed that you added the below optimization with commit
> a62436c0a505155fc8becac07a8c0abe2c265bfe. But it doesn't even handle this
> case, cse1 pass will call simplify_binary_operation_1, both op0 and op1 are
> REGs instead of AND operators, do you have a test case to cover that piece
> of code?

This worked at the time.  It broke some time ago in simple testcases,
triggered by the "don't combine hard registers" thing I did.  This is
PR98468.


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2021-04-12 22:32 ` segher at gcc dot gnu.org
@ 2021-04-12 22:36 ` segher at gcc dot gnu.org
  2021-04-13  3:22 ` luoxhu at gcc dot gnu.org
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: segher at gcc dot gnu.org @ 2021-04-12 22:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #14 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to luoxhu from comment #12)
> That code was called by combine pass but fail to match. 

> 
> pr newpat
> (set (reg:DI 125 [ l ])
>     (xor:DI (and:DI (xor:DI (reg/v:DI 120 [ l ])
>                 (reg:DI 127))
>             (const_int 267390975 [0xff00fff]))
>         (reg/v:DI 120 [ l ])))

Note this is 0x0ff00fff, and this is not a valid mask for rlwimi.
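
A minimal sketch of why this constant is rejected, assuming the simplified
rule that a 32-bit rotate-and-mask instruction accepts only a single run of
one bits, possibly wrapping around the ends of the word. The helper names are
hypothetical, and the real backend check may apply further conditions:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if x is non-zero and its one bits form one contiguous run. */
static int is_single_run32(uint32_t x)
{
    if (x == 0)
        return 0;
    /* Adding the lowest set bit carries through the lowest run of ones and
       clears it; any bit of x that survives belongs to a second run. */
    uint32_t lowest = x & (0u - x);
    return ((x + lowest) & x) == 0;
}

/* A wrapping run is one whose complement is a single contiguous run. */
static int is_rotate_mask32(uint32_t x)
{
    return is_single_run32(x) || is_single_run32(~x);
}

/* Checks the masks that appear in this thread. */
static void check_thread_masks(void)
{
    assert(!is_rotate_mask32(0x0ff00fffu)); /* two separate runs of ones */
    assert(is_rotate_mask32(0x01000000u));  /* a single bit */
    assert(is_rotate_mask32(0xfff00fffu));  /* one run that wraps around */
}
```

Under this model 0x0ff00fff has two separate runs of ones (0x0ff00000 and
0x00000fff), so it is neither a plain nor a wrapping mask.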


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2021-04-12 22:36 ` segher at gcc dot gnu.org
@ 2021-04-13  3:22 ` luoxhu at gcc dot gnu.org
  2021-04-30  2:20 ` luoxhu at gcc dot gnu.org
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-04-13  3:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #15 from luoxhu at gcc dot gnu.org ---
(In reply to Segher Boessenkool from comment #14)
> (In reply to luoxhu from comment #12)
> > That code was called by combine pass but fail to match. 
> 
> > 
> > pr newpat
> > (set (reg:DI 125 [ l ])
> >     (xor:DI (and:DI (xor:DI (reg/v:DI 120 [ l ])
> >                 (reg:DI 127))
> >             (const_int 267390975 [0xff00fff]))
> >         (reg/v:DI 120 [ l ])))
> 
> Note this is 0x0ff00fff, and this is not a valid mask for rlwimi.

OK, it also fails to combine for 0x01000000.


        .cfi_startproc
        xor 4,3,4
        rlwinm 4,4,0,7,7
        xor 3,4,3
        blr


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2021-04-13  3:22 ` luoxhu at gcc dot gnu.org
@ 2021-04-30  2:20 ` luoxhu at gcc dot gnu.org
  2021-04-30  6:36 ` luoxhu at gcc dot gnu.org
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-04-30  2:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #16 from luoxhu at gcc dot gnu.org ---

> +2016-11-09  Segher Boessenkool  <segher@kernel.crashing.org>
> +
> +       * simplify-rtx.c (simplify_binary_operation_1): Simplify
> +       (xor (and (xor A B) C) B) to (ior (and A C) (and B ~C)) and
> +       (xor (and (xor A B) C) A) to (ior (and A ~C) (and B C)) if C
> +       is a const_int.


Must C be a constant here? In the PR90323 case, C is not actually a constant.

    l = l & ~mask;
    l |= mask & r;

Trying 8, 9 -> 10:
    8: r127:V4SI=r124:V4SI^r131:V4SI
      REG_DEAD r131:V4SI
    9: r122:V4SI=r127:V4SI&r130:V4SI
      REG_DEAD r130:V4SI
      REG_DEAD r127:V4SI
   10: r128:V4SI=r124:V4SI^r122:V4SI
      REG_DEAD r124:V4SI
      REG_DEAD r122:V4SI


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2021-04-30  2:20 ` luoxhu at gcc dot gnu.org
@ 2021-04-30  6:36 ` luoxhu at gcc dot gnu.org
  2021-12-22 10:49 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: luoxhu at gcc dot gnu.org @ 2021-04-30  6:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #17 from luoxhu at gcc dot gnu.org ---
If the constant limitation is removed, it could be combined successfully with
my new patch for PR94613.

https://gcc.gnu.org/pipermail/gcc-patches/2021-April/569255.html

And what do you mean by "This is not canonical form on RTL, and it's not a
useful form either" in comment #7, please? I do not understand the point.


Trying 11 -> 16:
   11: r124:V4SI=r127:V4SI&r129:V4SI|~r129:V4SI&r128:V4SI
      REG_DEAD r128:V4SI
      REG_DEAD r129:V4SI
      REG_DEAD r127:V4SI
   16: %v2:V4SI=r124:V4SI
      REG_DEAD r124:V4SI
Successfully matched this instruction:
(set (reg/i:V4SI 66 %v2)
    (ior:V4SI (and:V4SI (reg:V4SI 127)
            (reg:V4SI 129))
        (and:V4SI (not:V4SI (reg:V4SI 129))
            (reg:V4SI 128))))
allowing combination of insns 11 and 16
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 11.
modifying insn i3    16: %v2:V4SI=r127:V4SI&r129:V4SI|~r129:V4SI&r128:V4SI
      REG_DEAD r127:V4SI
      REG_DEAD r129:V4SI
      REG_DEAD r128:V4SI
deferring rescan insn with uid = 16.


diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 571e2337e27..701f37eb03e 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -3405,7 +3405,6 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
         machines, and also has shorter instruction path length.  */
       if (GET_CODE (op0) == AND
          && GET_CODE (XEXP (op0, 0)) == XOR
-         && CONST_INT_P (XEXP (op0, 1))
          && rtx_equal_p (XEXP (XEXP (op0, 0), 0), trueop1))
        {
          rtx a = trueop1;
@@ -3419,7 +3418,6 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
       /* Similarly, (xor (and (xor A B) C) B) as (ior (and A C) (and B ~C)) */
       else if (GET_CODE (op0) == AND
          && GET_CODE (XEXP (op0, 0)) == XOR
-         && CONST_INT_P (XEXP (op0, 1))
          && rtx_equal_p (XEXP (XEXP (op0, 0), 1), trueop1))
        {
          rtx a = XEXP (XEXP (op0, 0), 0);


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2021-04-30  6:36 ` luoxhu at gcc dot gnu.org
@ 2021-12-22 10:49 ` pinskia at gcc dot gnu.org
  2021-12-22 11:27 ` segher at gcc dot gnu.org
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-12-22 10:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (10 preceding siblings ...)
  2021-12-22 10:49 ` pinskia at gcc dot gnu.org
@ 2021-12-22 11:27 ` segher at gcc dot gnu.org
  2021-12-22 11:39 ` segher at gcc dot gnu.org
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: segher at gcc dot gnu.org @ 2021-12-22 11:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #18 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to luoxhu from comment #16)
> > +2016-11-09  Segher Boessenkool  <segher@kernel.crashing.org>
> > +
> > +       * simplify-rtx.c (simplify_binary_operation_1): Simplify
> > +       (xor (and (xor A B) C) B) to (ior (and A C) (and B ~C)) and
> > +       (xor (and (xor A B) C) A) to (ior (and A ~C) (and B C)) if C
> > +       is a const_int.
> 
> 
> Is it a MUST that C be const here?

It could be extended to C a reg as well, I think.


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (11 preceding siblings ...)
  2021-12-22 11:27 ` segher at gcc dot gnu.org
@ 2021-12-22 11:39 ` segher at gcc dot gnu.org
  2023-09-04  3:28 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 16+ messages in thread
From: segher at gcc dot gnu.org @ 2021-12-22 11:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #19 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to luoxhu from comment #17)
> And what do you mean"This is not canonical form on RTL, and it's not a
> useful form either" in c#7, please? Not understanding the point...

On Gimple it is canonical to convert (a&c)|(b&~c) to ((a^b)&c)^b, because
all Gimple cares about is number of operations (and it counts unary operations
as well, so this is three instead of four ops).

For RTL we do not have such a simple-minded rule.

> --- a/gcc/simplify-rtx.c
> +++ b/gcc/simplify-rtx.c
> @@ -3405,7 +3405,6 @@ simplify_context::simplify_binary_operation_1
> (rtx_code code,
>          machines, and also has shorter instruction path length.  */
>        if (GET_CODE (op0) == AND
>           && GET_CODE (XEXP (op0, 0)) == XOR
> -         && CONST_INT_P (XEXP (op0, 1))
>           && rtx_equal_p (XEXP (XEXP (op0, 0), 0), trueop1))
>         {
>           rtx a = trueop1;
> @@ -3419,7 +3418,6 @@ simplify_context::simplify_binary_operation_1
> (rtx_code code,
>        /* Similarly, (xor (and (xor A B) C) B) as (ior (and A C) (and B ~C))
> */
>        else if (GET_CODE (op0) == AND
>           && GET_CODE (XEXP (op0, 0)) == XOR
> -         && CONST_INT_P (XEXP (op0, 1))
>           && rtx_equal_p (XEXP (XEXP (op0, 0), 1), trueop1))
>         {
>           rtx a = XEXP (XEXP (op0, 0), 0);

It needs *some* test on it.  It certainly cannot have side effects for
example.  CONST_INT_P || REG_P should catch all useful cases?
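
One way to see why C needs a test: the rewritten form mentions C twice, where
the original mentions it once. The C macros below are only an analogy for the
RTL rewrite, and every name in them is hypothetical. They show the operand
being evaluated twice after the transformation:

```c
#include <assert.h>
#include <stdint.h>

/* Counts how many times the mask operand is evaluated. */
static unsigned calls;

/* A mask operand with a visible side effect (it bumps the counter). */
static uint32_t next_mask(void)
{
    calls++;
    return 0x0000ffffu;
}

/* Original shape: c appears once. */
#define SEL_XOR(a, b, c) ((((a) ^ (b)) & (c)) ^ (a))
/* Rewritten shape: c appears twice. */
#define SEL_IOR(a, b, c) (((a) & ~(c)) | ((b) & (c)))

/* Number of mask evaluations when using the xor/and/xor shape. */
static unsigned mask_evals_xor_form(void)
{
    calls = 0;
    uint32_t r = SEL_XOR(0xebebebebu, 0xfefefefeu, next_mask());
    (void)r;
    return calls;
}

/* Number of mask evaluations when using the and/andc/ior shape. */
static unsigned mask_evals_ior_form(void)
{
    calls = 0;
    uint32_t r = SEL_IOR(0xebebebebu, 0xfefefefeu, next_mask());
    (void)r;
    return calls;
}
```

A constant or a plain register has no such effect, which is consistent with
the CONST_INT_P || REG_P suggestion above.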


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (12 preceding siblings ...)
  2021-12-22 11:39 ` segher at gcc dot gnu.org
@ 2023-09-04  3:28 ` pinskia at gcc dot gnu.org
  2024-05-15  9:05 ` pinskia at gcc dot gnu.org
  2024-05-15 11:13 ` segher at gcc dot gnu.org
  15 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-09-04  3:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64448

--- Comment #20 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The aarch64 backend matches this:
(insn 15 10 16 2 (set (reg/i:V4SI 32 v0)
        (xor:V4SI (and:V4SI (xor:V4SI (reg:V4SI 101)
                    (reg:V4SI 102))
                (reg:V4SI 103))
            (reg:V4SI 101))) "/app/example.cpp":7:1 3103
{aarch64_simd_bslv4si_internal}

for the `bit v0.16b, v1.16b, v2.16b` instruction, which was done in r5-6601 (PR
64448).


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (13 preceding siblings ...)
  2023-09-04  3:28 ` pinskia at gcc dot gnu.org
@ 2024-05-15  9:05 ` pinskia at gcc dot gnu.org
  2024-05-15 11:13 ` segher at gcc dot gnu.org
  15 siblings, 0 replies; 16+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-05-15  9:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #21 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #20)
> The aarch64 backend matches this:
> (insn 15 10 16 2 (set (reg/i:V4SI 32 v0)
>         (xor:V4SI (and:V4SI (xor:V4SI (reg:V4SI 101)
>                     (reg:V4SI 102))
>                 (reg:V4SI 103))
>             (reg:V4SI 101))) "/app/example.cpp":7:1 3103
> {aarch64_simd_bslv4si_internal}
> 
> for the `bit v0.16b, v1.16b, v2.16b` instruction. which was done r5-6601 (PR
> 64448) .

One thing for the middle-end here is that if we have `(xor (and (xor A B) C) B)` we
could try to expand it into `(a&c)|(b&~c)` if there is an optab for the &~ (which
I am aiming to add for other reasons). I am not sure if powerpc vsx has &~
though. I will be doing my development on both x86_64 and aarch64, and it will be
up to the other targets to add the optab pattern if needed.


* [Bug middle-end/90323] powerpc should convert equivalent sequences to vec_sel()
       [not found] <bug-90323-4@http.gcc.gnu.org/bugzilla/>
                   ` (14 preceding siblings ...)
  2024-05-15  9:05 ` pinskia at gcc dot gnu.org
@ 2024-05-15 11:13 ` segher at gcc dot gnu.org
  15 siblings, 0 replies; 16+ messages in thread
From: segher at gcc dot gnu.org @ 2024-05-15 11:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90323

--- Comment #22 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #21)
> I am not sure if powerpc vsx
> has &~ though.

VMX has vandc (since 1999), and VSX has xxlandc (since 2010).

In general, PowerPC has a full complement of logical ops, everywhere.  In some
cases it has the full truth table of the operation as part of the binary opcode
;-)
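
To illustrate the truth-table idea in general terms: any three-input bitwise
operation can be described by an 8-bit table, and the select discussed in this
bug is one of the 256 possibilities. The sketch below uses a hypothetical
table layout (index = A*4 + B*2 + C) chosen for illustration only; it does not
claim to match the encoding of any particular PowerPC instruction:

```c
#include <assert.h>
#include <stdint.h>

/* Evaluates a three-input bitwise function described by an 8-bit truth
   table. Table bit (a*4 + b*2 + c) holds the output for inputs a, b, c. */
static uint32_t ternary_logic(uint8_t table, uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t out = 0;
    for (int bit = 0; bit < 32; bit++) {
        unsigned idx = (((a >> bit) & 1u) << 2)
                     | (((b >> bit) & 1u) << 1)
                     | ((c >> bit) & 1u);
        out |= (uint32_t)((table >> idx) & 1u) << bit;
    }
    return out;
}

/* Builds the table for "take b where c is 1, otherwise a". */
static uint8_t select_table(void)
{
    uint8_t table = 0;
    for (unsigned idx = 0; idx < 8; idx++) {
        unsigned a = (idx >> 2) & 1u;
        unsigned b = (idx >> 1) & 1u;
        unsigned c = idx & 1u;
        unsigned out = c ? b : a;
        table |= (uint8_t)(out << idx);
    }
    return table;
}
```

With this layout, ternary_logic(select_table(), a, b, c) computes the same
result as (a & ~c) | (b & c).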
