public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/107546] New: simd, redundant pcmpeqb and pxor
@ 2022-11-07  6:39 i.nixman at autistici dot org
  2022-11-07  6:51 ` [Bug target/107546] " pinskia at gcc dot gnu.org
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: i.nixman at autistici dot org @ 2022-11-07  6:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

            Bug ID: 107546
           Summary: simd, redundant pcmpeqb and pxor
           Product: gcc
           Version: 12.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: i.nixman at autistici dot org
  Target Milestone: ---

Hello,

this code sample(https://godbolt.org/z/TnGMsfMs6):
```
#include <x86intrin.h>

auto foo(const char *p) {
    const auto substr = _mm_loadu_si128((const __m128i *)p);
    return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
}

```
produces the following ASM code:
```
1: foo(char const*):
2:    movdqu  xmm0, XMMWORD PTR [rdi]
3:    pxor    xmm1, xmm1
4:    pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
5:    pcmpeqb xmm0, xmm1
6:    ret
```
Please look at line 5.
Is there any reason for the `pcmpeqb` and `pxor` instructions?

just for info, clang's output(https://godbolt.org/z/MPnvEMdhr):
```
1: foo(char const*):
2:    movdqu  xmm1, xmmword ptr [rdi]
3:    movdqa  xmm0, xmmword ptr [rip + .LCPI0_0]
4:    pcmpgtb xmm0, xmm1
5:    ret
```

It looks like the issue started with version 9 and persists up to the current trunk.

* [Bug target/107546] simd, redundant pcmpeqb and pxor
From: pinskia at gcc dot gnu.org @ 2022-11-07  6:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-11-07
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Well it is definitely a target issue:
;; _3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47 };

(insn 9 8 10 (set (reg:V16QI 89)
        (mem/u/c:V16QI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S16 A128]))
"/opt/compiler-explorer/gcc-trunk-20221106/lib/gcc/x86_64-linux-gnu/13.0.0/include/emmintrin.h":1353:34
-1
     (expr_list:REG_EQUAL (const_vector:V16QI [
                (const_int 47 [0x2f]) repeated x16
            ])
        (nil)))

(insn 10 9 11 (set (reg:V16QI 90)
        (gt:V16QI (reg:V16QI 83 [ _4 ])
            (reg:V16QI 89)))
"/opt/compiler-explorer/gcc-trunk-20221106/lib/gcc/x86_64-linux-gnu/13.0.0/include/emmintrin.h":1353:34
-1
     (nil))

(insn 11 10 12 (set (reg:V16QI 91)
        (const_vector:V16QI [
                (const_int 0 [0]) repeated x16
            ]))
"/opt/compiler-explorer/gcc-trunk-20221106/lib/gcc/x86_64-linux-gnu/13.0.0/include/emmintrin.h":1353:34
-1
     (nil))

(insn 12 11 13 (set (reg:V16QI 92)
        (eq:V16QI (reg:V16QI 90)
            (reg:V16QI 91)))
"/opt/compiler-explorer/gcc-trunk-20221106/lib/gcc/x86_64-linux-gnu/13.0.0/include/emmintrin.h":1353:34
-1
     (nil))

(insn 13 12 0 (set (reg:V16QI 82 [ _3 ])
        (reg:V16QI 92))
"/opt/compiler-explorer/gcc-trunk-20221106/lib/gcc/x86_64-linux-gnu/13.0.0/include/emmintrin.h":1353:34
-1
     (nil))


Reduced testcase using GNU C++ vector types instead:

#define vector __attribute__((vector_size(16)))
auto foo1(const char *p) {
    vector signed char a = *(vector signed char*)p;
    vector signed char a47 = {47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47};
    return a <= a47;
}

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: pinskia at gcc dot gnu.org @ 2022-11-07  6:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|simd, redundant pcmpeqb and |[10/11/12/13 Regression]
                   |pxor                        |simd, redundant pcmpeqb and
                   |                            |pxor
   Target Milestone|---                         |10.5

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
For GNU C++ vectors, GCC 4.8 until GCC 11 produced:
        movdqa  xmm0, XMMWORD PTR [rdi]
        pcmpeqd xmm1, xmm1
        pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
        pandn   xmm0, xmm1
        ret

GCC 11+ produces:
        movdqa  xmm0, XMMWORD PTR [rdi]
        pxor    xmm1, xmm1
        pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
        pcmpeqb xmm0, xmm1
        ret

But the intrinsics produced the expected thing until GCC 9.

In GCC 8 the intrinsics produce:
  _3 = VEC_COND_EXPR <_4 < { 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48,
48, 48, 48, 48 }, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1 }, { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }>;

even. Notice the < vs <= there.
I suspect the <= expansion part of the x86_64 backend needs to be fixed up to
produce better code.

So this is a regression for the intrinsics and marking it as such.

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: rguenth at gcc dot gnu.org @ 2022-11-07  7:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
             Target|x86_64                      |x86_64-*-* i?86-*-*

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: crazylht at gmail dot com @ 2022-11-07  7:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
Failed to match this instruction:
(set (reg:V16QI 95)
    (eq:V16QI (gt:V16QI (subreg:V16QI (reg:V2DI 89 [ MEM[(const __m128i_u *
{ref-all})p_2(D)] ]) 0)
            (mem/u/c:V16QI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S16
A128]))
        (const_vector:V16QI [
                (const_int 0 [0]) repeated x16
            ])))


I think rtl can simplify vector comparison

(eq (gt op1 op2) const0_rtx) to just (gt op2 op1).

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: crazylht at gmail dot com @ 2022-11-07  7:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---

> even. Notice the < vs <= there.
> I suspect the <= expansion part of the x86_64 backend needs to be fixed up
> to produce better code.

Hmm, we do have an extra pcmpeq to negate the result.

--------------cut from ix86_expand_int_vec_cmp---------------
  rtx cmp = ix86_expand_int_sse_cmp (operands[0], code, operands[2],
                                     operands[3], NULL, NULL, &negate);

  if (!cmp)
    return false;

  if (negate)
    cmp = ix86_expand_int_sse_cmp (operands[0], EQ, cmp,
                                   CONST0_RTX (GET_MODE (cmp)),
                                   NULL, NULL, &negate);
---------------cut end-------------------------------

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: glisse at gcc dot gnu.org @ 2022-11-07  7:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

--- Comment #5 from Marc Glisse <glisse at gcc dot gnu.org> ---
typedef signed char v16qs __attribute__((vector_size(16)));
auto bar(v16qs x) { return x < 48; }

clang does expand it as 48 gt x. Gcc however does its usual change to x <= 47,
which it then tries to expand as ~(x > 47). I guess the expansion for x <= y
could be tweaked in the case where one argument is constant to undo what was
done earlier in the pipeline and expand as 48 > x.
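
A minimal sketch of the equivalence described above, checked on a single signed 8-bit lane rather than a full vector. The function names are made up for illustration and do not come from GCC or clang; the point is only that the four spellings of the comparison give the same answer for every input, so the expander is free to pick the one that needs a single GT.

```cpp
#include <cassert>
#include <cstdint>

// One lane of the comparison, written the four ways discussed above.
// All names here are hypothetical and exist only for this sketch.
inline bool lt_48(std::int8_t x)         { return x < 48; }     // as written in the source
inline bool le_47(std::int8_t x)         { return x <= 47; }    // after canonicalization
inline bool not_gt_47(std::int8_t x)     { return !(x > 47); }  // GT followed by a negation
inline bool swapped_gt_48(std::int8_t x) { return 48 > x; }     // one GT with swapped operands

// Walk every signed 8-bit value and assert that all four forms agree.
inline void check_all_forms_agree() {
    for (int v = -128; v <= 127; ++v) {
        const std::int8_t x = static_cast<std::int8_t>(v);
        assert(lt_48(x) == le_47(x));
        assert(lt_48(x) == not_gt_47(x));
        assert(lt_48(x) == swapped_gt_48(x));
    }
}
```

Calling check_all_forms_agree() from a test driver walks all 256 inputs; the last form is the only one that maps onto a single pcmpgtb.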

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: pinskia at gcc dot gnu.org @ 2022-11-07  7:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Marc Glisse from comment #5)
> typedef signed char v16qs __attribute__((vector_size(16)));
> auto bar(v16qs x) { return x < 48; }
> 
> clang does expand it as 48 gt x. Gcc however does its usual change to x <=
> 47, which it then tries to expand as ~(x > 47). I guess the expansion for x
> <= y could be tweaked in the case where one argument is constant to undo
> what was done earlier in the pipeline and expand as 48 > x.

I was going to suggest that ...
Or maybe that could be done in isel.

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: crazylht at gmail dot com @ 2022-11-07  8:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #3)
> Failed to match this instruction:
> (set (reg:V16QI 95)
>     (eq:V16QI (gt:V16QI (subreg:V16QI (reg:V2DI 89 [ MEM[(const __m128i_u *
> {ref-all})p_2(D)] ]) 0)
>             (mem/u/c:V16QI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S16
> A128]))
>         (const_vector:V16QI [
>                 (const_int 0 [0]) repeated x16
>             ])))
> 
> 
> I think rtl can simplify vector comparison
> 
> (eq (gt op1 op2) const0_rtx) to just (gt op2 op1).

No, it's not, it's (le op1 op2), which is not supported under SSE2, and this is
the exact reason we negate the result in the backend.
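
To illustrate the correction above, here is a small lane-wise model in portable C++. It is only a sketch: the Lanes type uses 4 lanes instead of 16, and cmp_gt / cmp_eq are hypothetical stand-ins that mimic the all-ones/zero masks of pcmpgtb and pcmpeqb, not real intrinsics. Comparing the GT mask against zero yields a <= b, which differs from the swapped GT (b > a) on every lane where the operands are equal.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes = std::array<std::int8_t, 4>;  // 4 lanes instead of 16, for brevity

// Lane-wise signed greater-than: -1 (all bits set) where a > b, else 0.
inline Lanes cmp_gt(const Lanes &a, const Lanes &b) {
    Lanes r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = static_cast<std::int8_t>(a[i] > b[i] ? -1 : 0);
    return r;
}

// Lane-wise equality: -1 where a == b, else 0.
inline Lanes cmp_eq(const Lanes &a, const Lanes &b) {
    Lanes r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = static_cast<std::int8_t>(a[i] == b[i] ? -1 : 0);
    return r;
}

// Models (eq (gt a b) 0): comparing the GT mask with zero flips it,
// which gives a <= b.
inline Lanes cmp_le_via_negate(const Lanes &a, const Lanes &b) {
    return cmp_eq(cmp_gt(a, b), Lanes{});
}

// True if the negated GT and the swapped GT disagree on at least one lane.
inline bool negated_gt_differs_from_swapped_gt(const Lanes &a, const Lanes &b) {
    return cmp_le_via_negate(a, b) != cmp_gt(b, a);
}
```

With inputs that share a value in some lane the two results differ, and with inputs that never tie they coincide, which is why the simplification suggested in comment #3 is not valid in general.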

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: jakub at gcc dot gnu.org @ 2022-11-07 10:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: jakub at gcc dot gnu.org @ 2022-11-07 12:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 53842
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53842&action=edit
gcc13-pr107546.patch

Untested fix.

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: i.nixman at autistici dot org @ 2022-11-07 12:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

--- Comment #9 from niXman <i.nixman at autistici dot org> ---
(In reply to Jakub Jelinek from comment #8)
> Created attachment 53842 [details]
> gcc13-pr107546.patch
> 
> Untested fix.

many thanks!

will test, will report back.

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: i.nixman at autistici dot org @ 2022-11-07 13:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

--- Comment #10 from niXman <i.nixman at autistici dot org> ---
yes, fixed for the `master`.

```
foo:
.LFB6604:
        .cfi_startproc
        movdqu  (%rdi), %xmm1
        movdqa  .LC0(%rip), %xmm0
        pcmpgtb %xmm1, %xmm0
        ret
        .cfi_endproc
.LFE6604:
```

Can it be applied at least to the gcc-12 branch?

* [Bug target/107546] [10/11/12/13 Regression] simd, redundant pcmpeqb and pxor
From: cvs-commit at gcc dot gnu.org @ 2022-11-08 11:24 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

--- Comment #11 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:fa271afb58423014e2feef9f15c1a87428e64ddc

commit r13-3803-gfa271afb58423014e2feef9f15c1a87428e64ddc
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Nov 8 12:21:55 2022 +0100

    i386: Improve vector [GL]E{,U} comparison against vector constants [PR107546]

    For integer vector comparisons without XOP before AVX512{F,VL} we are
    constrained by only GT and EQ being supported in HW.
    For GTU we play tricks to implement it using GT or unsigned saturating
    subtraction, for LT/LTU we swap the operands and thus turn it into
    GT/GTU.  For LE/LEU we handle it by using GT/GTU and negating the
    result and for GE/GEU by using GT/GTU on swapped operands and negating
    the result.
    If the second operand is a CONST_VECTOR, we can usually do better though,
    we can avoid the negation.  For LE/LEU cst by doing LT/LTU cst+1 (and
    then cst+1 GT/GTU x) and for GE/GEU cst by doing GT/GTU cst-1, provided
    there is no wrap-around on those cst+1 or cst-1.
    GIMPLE canonicalizes x < cst to x <= cst-1 etc. (the rule is smaller
    absolute value on constant), but only for scalars or uniform vectors,
    so in some cases this undoes that canonicalization in order to avoid
    the extra negation, but it handles also non-uniform constants.
    E.g. with -mavx2 the testcase assembly difference is:
    -       movl    $47, %eax
    +       movl    $48, %eax
            vmovdqa %xmm0, %xmm1
            vmovd   %eax, %xmm0
            vpbroadcastb    %xmm0, %xmm0
    -       vpminsb %xmm0, %xmm1, %xmm0
    -       vpcmpeqb        %xmm1, %xmm0, %xmm0
    +       vpcmpgtb        %xmm1, %xmm0, %xmm0
    and
    -       vmovdqa %xmm0, %xmm1
    -       vmovdqa .LC1(%rip), %xmm0
    -       vpminsb %xmm1, %xmm0, %xmm1
    -       vpcmpeqb        %xmm1, %xmm0, %xmm0
    +       vpcmpgtb        .LC1(%rip), %xmm0, %xmm0
    while with just SSE2:
    -       pcmpgtb .LC0(%rip), %xmm0
    -       pxor    %xmm1, %xmm1
    -       pcmpeqb %xmm1, %xmm0
    +       movdqa  %xmm0, %xmm1
    +       movdqa  .LC0(%rip), %xmm0
    +       pcmpgtb %xmm1, %xmm0
    and
    -       movdqa  %xmm0, %xmm1
    -       movdqa  .LC1(%rip), %xmm0
    -       pcmpgtb %xmm1, %xmm0
    -       pxor    %xmm1, %xmm1
    -       pcmpeqb %xmm1, %xmm0
    +       pcmpgtb .LC1(%rip), %xmm0

    2022-11-08  Jakub Jelinek  <jakub@redhat.com>

            PR target/107546
            * config/i386/predicates.md (vector_or_const_vector_operand): New
            predicate.
            * config/i386/sse.md (vec_cmp<mode><sseintvecmodelower>,
            vec_cmpv2div2di, vec_cmpu<mode><sseintvecmodelower>,
            vec_cmpuv2div2di): Use nonimmediate_or_const_vector_operand
            predicate instead of nonimmediate_operand and
            vector_or_const_vector_operand instead of vector_operand.
            * config/i386/i386-expand.cc (ix86_expand_int_sse_cmp): For
            LE/LEU or GE/GEU with CONST_VECTOR cop1 try to transform those
            into LT/LTU or GT/GTU with larger or smaller by one cop1 if
            there is no wrap-around.  Force CONST_VECTOR cop0 or cop1 into
            REG.  Formatting fix.

            * gcc.target/i386/pr107546.c: New test.
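
The constant adjustment described in the commit message above can be sketched as follows for the signed LE case. This is not the code from ix86_expand_int_sse_cmp; Lanes, le_to_lt_constant and check_rewrite_is_exact are hypothetical names, and the sketch models only 4 lanes of signed 8-bit elements. It bumps each lane of the constant by one, handles non-uniform constants, and gives up if any lane would wrap around.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

using Lanes = std::array<std::int8_t, 4>;  // 4 lanes instead of 16, for brevity

// Try to turn "x <= cst" into "x < bumped" (i.e. "bumped > x") with
// bumped = cst + 1 in every lane. Returns false if any lane already holds
// the maximum value, since cst + 1 would wrap; bumped is then not meaningful.
inline bool le_to_lt_constant(const Lanes &cst, Lanes &bumped) {
    for (std::size_t i = 0; i < cst.size(); ++i) {
        if (cst[i] == std::numeric_limits<std::int8_t>::max())
            return false;
        bumped[i] = static_cast<std::int8_t>(cst[i] + 1);
    }
    return true;
}

// Assert for every non-maximal constant c and every signed 8-bit x that
// "x <= c" and "(c + 1) > x" agree, so no negation is needed.
inline void check_rewrite_is_exact() {
    for (int c = -128; c < 127; ++c)
        for (int x = -128; x <= 127; ++x)
            assert((x <= c) == ((c + 1) > x));
}
```

The GE case is symmetric (subtract one and refuse when a lane holds the minimum), and the unsigned variants use 0 and the unsigned maximum as the boundaries instead.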

* [Bug target/107546] [10/11/12 Regression] simd, redundant pcmpeqb and pxor
From: jakub at gcc dot gnu.org @ 2022-11-08 11:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[10/11/12/13 Regression]    |[10/11/12 Regression] simd,
                   |simd, redundant pcmpeqb and |redundant pcmpeqb and pxor
                   |pxor                        |

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed on the trunk.
I don't think this is appropriate for release branches.

* [Bug target/107546] [10/11/12 Regression] simd, redundant pcmpeqb and pxor
From: cvs-commit at gcc dot gnu.org @ 2022-11-10 10:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:49de156a6adb50e0624479f22c20141cadef812d

commit r13-3874-g49de156a6adb50e0624479f22c20141cadef812d
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Nov 10 11:46:52 2022 +0100

    i386: Fix up ix86_expand_int_sse_cmp [PR107585]

    The following patch fixes ICE on the testcase.  I've used GEN_INT
    incorrectly thinking the code punts on the problematic boundaries.
    It does, but only for LE and GE, i.e. signed comparisons, for unsigned
    the boundaries are 0 and unsigned maximum, so when say unsigned char
    adds one to 127 or subtracts one from 128 we need to canonicalize it.

    2022-11-10  Jakub Jelinek  <jakub@redhat.com>

            PR target/107585
            PR target/107546
            * config/i386/i386-expand.cc (ix86_expand_int_sse_cmp): Use
            gen_int_mode rather than GEN_INT.

            * gcc.dg/pr107585.c: New test.
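
A rough model of the canonicalization issue mentioned in this commit, under the assumption that an 8-bit element constant is expected in sign-extended form regardless of whether the comparison is signed or unsigned. canonical_8bit is a hypothetical helper written for this sketch; it is not gen_int_mode and only imitates the truncate-and-sign-extend step for one element width.

```cpp
#include <cassert>

// Hypothetical helper: keep the low 8 bits of value and sign-extend them.
// This is the canonical form assumed in this sketch for an 8-bit constant.
inline long long canonical_8bit(long long value) {
    const unsigned low = static_cast<unsigned>(value) & 0xFFu;
    return low >= 0x80u ? static_cast<long long>(low) - 0x100
                        : static_cast<long long>(low);
}

// The two boundary cases named in the commit message, for unsigned char:
// 127 + 1 must become -128, and 128 (stored as -128) minus 1 must become 127.
inline void check_unsigned_char_boundaries() {
    assert(canonical_8bit(127 + 1) == -128);
    assert(canonical_8bit(-128 - 1) == 127);
}
```

Values that already fit, such as 47 or 48, pass through unchanged, so the extra step only matters when the adjusted constant crosses the signed boundary.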

* [Bug target/107546] [11/12 Regression] simd, redundant pcmpeqb and pxor
From: rguenth at gcc dot gnu.org @ 2023-07-07 10:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|10.5                        |11.5

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
GCC 10 branch is being closed.
