public inbox for gcc-help@gcc.gnu.org
* simd, redundant pcmpeqb and pxor
@ 2022-11-06 10:53 i.nixman
  2022-11-07  3:32 ` Hongtao Liu
  0 siblings, 1 reply; 5+ messages in thread
From: i.nixman @ 2022-11-06 10:53 UTC (permalink / raw)
  To: gcc-help


Hello,

Look at this example (https://godbolt.org/z/TnGMsfMs6):
```
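#include <emmintrin.h>  // SSE2 intrinsics (_mm_loadu_si128, _mm_cmplt_epi8, ...)

// Per-byte mask: 0xFF where p[i] < '0' (signed compare), 0x00 elsewhere.
auto foo(const char *p) {
     const auto substr = _mm_loadu_si128((const __m128i *)p);
     return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
}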
```
and at the generated asm:
```
1: foo(char const*):
2:    movdqu  xmm0, XMMWORD PTR [rdi]
3:    pxor    xmm1, xmm1
4:    pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
5:    pcmpeqb xmm0, xmm1
6:    ret
```
Look at line 5.
Is there any reason for the `pcmpeqb` instruction?

Just for info, clang's output (https://godbolt.org/z/MPnvEMdhr):
```
1: foo(char const*):
2:    movdqu  xmm1, xmmword ptr [rdi]
3:    movdqa  xmm0, xmmword ptr [rip + .LCPI0_0]
4:    pcmpgtb xmm0, xmm1
5:    ret
```
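
A minimal sketch of why a single `pcmpgtb` can be enough here: SSE2 has no byte-wise "less than" instruction, so `_mm_cmplt_epi8(a, b)` is normally expressed as `_mm_cmpgt_epi8(b, a)` with the operands swapped, which matches the shape of the clang output above. The helper names `below_zero_mask_cmplt` and `below_zero_mask_cmpgt` are hypothetical, introduced only for this illustration, and `_mm_movemask_epi8` is used just to turn the vector result into an integer that is easy to compare.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Form used in the original post: per-byte signed compare p[i] < '0'.
// Returns a 16-bit mask, bit i set when byte i satisfies the predicate.
inline int below_zero_mask_cmplt(const char *p) {
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(p));
    return _mm_movemask_epi8(_mm_cmplt_epi8(v, _mm_set1_epi8('0')));
}

// Same predicate written with the operands swapped: '0' > p[i].
// This maps directly onto one pcmpgtb against the constant.
inline int below_zero_mask_cmpgt(const char *p) {
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i *>(p));
    return _mm_movemask_epi8(_mm_cmpgt_epi8(_mm_set1_epi8('0'), v));
}
```

Both helpers should return the same mask for any 16-byte input. Whether writing the swapped form by hand changes what a particular GCC version emits is not something this thread establishes.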


best!


* Re: simd, redundant pcmpeqb and pxor
  2022-11-06 10:53 simd, redundant pcmpeqb and pxor i.nixman
@ 2022-11-07  3:32 ` Hongtao Liu
  2022-11-07  6:26   ` i.nixman
  0 siblings, 1 reply; 5+ messages in thread
From: Hongtao Liu @ 2022-11-07  3:32 UTC (permalink / raw)
  To: i.nixman; +Cc: gcc-help

On Sun, Nov 6, 2022 at 6:54 PM i.nixman--- via Gcc-help
<gcc-help@gcc.gnu.org> wrote:
>
>
> Hello,
>
> look at this example(https://godbolt.org/z/TnGMsfMs6):
> ```
> auto foo(const char *p) {
>      const auto substr = _mm_loadu_si128((const __m128i *)p);
>      return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
> }
> ```
> and to the generated asm:
> ```
> 1: foo(char const*):
> 2:    movdqu  xmm0, XMMWORD PTR [rdi]
> 3:    pxor    xmm1, xmm1
> 4:    pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
> 5:    pcmpeqb xmm0, xmm1
> 6:    ret
> ```
> look at line 5.
> is there any reason for `pcmpeqb` instruction?
Looks like a missed optimization from

_4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
_3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47 };
_5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3);  --- this?

Could you open a Bugzilla report for it?
https://gcc.gnu.org/bugzilla/
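
One way to read the GIMPLE above against the generated asm, offered as a sketch and not as a confirmed diagnosis: `x < '0'` has been canonicalized to `x <= 47`, and since SSE2 has no byte-wise "less or equal" compare, the result appears to be computed as `(x > 47) == 0`, that is `pcmpgtb` followed by `pcmpeqb` against a zeroed register (the `pxor`). This assumes `.LC0` holds sixteen bytes of 47, which is consistent with the dump but not shown in the thread. The functions `lt_direct`, `le_via_not_gt` and `forms_agree_for_all_bytes` are hypothetical names used only to check that the two forms compute the same mask.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// x < '0' as a single compare with swapped operands: '0' > x.
inline __m128i lt_direct(__m128i x) {
    return _mm_cmpgt_epi8(_mm_set1_epi8('0'), x);
}

// x <= 47 computed as NOT (x > 47), mirroring the three-instruction shape
// in the GCC output (assumption: .LC0 contains sixteen 47s).
inline __m128i le_via_not_gt(__m128i x) {
    const __m128i gt = _mm_cmpgt_epi8(x, _mm_set1_epi8(47));  // pcmpgtb
    return _mm_cmpeq_epi8(gt, _mm_setzero_si128());           // pxor + pcmpeqb
}

// Exhaustively compare both forms over every signed byte value.
inline bool forms_agree_for_all_bytes() {
    for (int base = -128; base < 128; base += 16) {
        signed char lanes[16];
        for (int i = 0; i < 16; ++i) {
            lanes[i] = static_cast<signed char>(base + i);
        }
        const __m128i x =
            _mm_loadu_si128(reinterpret_cast<const __m128i *>(lanes));
        const __m128i same = _mm_cmpeq_epi8(lt_direct(x), le_via_not_gt(x));
        if (_mm_movemask_epi8(same) != 0xFFFF) {
            return false;  // some lane disagreed
        }
    }
    return true;
}
```

Since both forms agree on all 256 byte values, the extra `pxor` and `pcmpeqb` add nothing to the result, which is the redundancy the original question points at.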

>
> just for info, clang's output(https://godbolt.org/z/MPnvEMdhr):
> ```
> 1: foo(char const*):
> 2:    movdqu  xmm1, xmmword ptr [rdi]
> 3:    movdqa  xmm0, xmmword ptr [rip + .LCPI0_0]
> 4:    pcmpgtb xmm0, xmm1
> 5:    ret
> ```
>
>
> best!



-- 
BR,
Hongtao


* Re: simd, redundant pcmpeqb and pxor
  2022-11-07  3:32 ` Hongtao Liu
@ 2022-11-07  6:26   ` i.nixman
  2022-11-07  6:32     ` Hongtao Liu
  0 siblings, 1 reply; 5+ messages in thread
From: i.nixman @ 2022-11-07  6:26 UTC (permalink / raw)
  To: Hongtao Liu, gcc-help

On 2022-11-07 03:32, Hongtao Liu wrote:

hi,

> Looks like a mis optimization from
> 
> _4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
> _3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47 };
> _5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3);  --- this?
> 
> Could you open a bugzilla for it
> https://gcc.gnu.org/bugzilla/

sure, but for which component?





* Re: simd, redundant pcmpeqb and pxor
  2022-11-07  6:26   ` i.nixman
@ 2022-11-07  6:32     ` Hongtao Liu
  2022-11-07  6:40       ` i.nixman
  0 siblings, 1 reply; 5+ messages in thread
From: Hongtao Liu @ 2022-11-07  6:32 UTC (permalink / raw)
  To: i.nixman; +Cc: gcc-help

On Mon, Nov 7, 2022 at 2:26 PM <i.nixman@autistici.org> wrote:
>
> On 2022-11-07 03:32, Hongtao Liu wrote:
>
> hi,
>
> > Looks like a mis optimization from
> >
> > _4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
> > _3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47 };
> > _5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3);  --- this?
> >
> > Could you open a bugzilla for it
> > https://gcc.gnu.org/bugzilla/
>
> sure, but for which component?
Let's file it under rtl-optimization for now.



-- 
BR,
Hongtao


* Re: simd, redundant pcmpeqb and pxor
  2022-11-07  6:32     ` Hongtao Liu
@ 2022-11-07  6:40       ` i.nixman
  0 siblings, 0 replies; 5+ messages in thread
From: i.nixman @ 2022-11-07  6:40 UTC (permalink / raw)
  To: Hongtao Liu, gcc-help


done: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546



best!
