* simd, redundant pcmpeqb and pxor
@ 2022-11-06 10:53 i.nixman
2022-11-07 3:32 ` Hongtao Liu
0 siblings, 1 reply; 5+ messages in thread
From: i.nixman @ 2022-11-06 10:53 UTC (permalink / raw)
To: gcc-help
Hello,
Please look at this example (https://godbolt.org/z/TnGMsfMs6):
```
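#include <emmintrin.h> // SSE2 intrinsics

auto foo(const char *p) {
    // Unaligned 16-byte load, then a signed per-byte compare: substr < '0'.
    const auto substr = _mm_loadu_si128((const __m128i *)p);
    return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
}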
```
and at the generated asm:
```
1: foo(char const*):
2: movdqu xmm0, XMMWORD PTR [rdi]
3: pxor xmm1, xmm1
4: pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
5: pcmpeqb xmm0, xmm1
6: ret
```
Look at line 5: is there any reason for the `pcmpeqb` instruction (and for the `pxor` on line 3 that feeds it)?
Just for info, clang's output (https://godbolt.org/z/MPnvEMdhr):
```
1: foo(char const*):
2: movdqu xmm1, xmmword ptr [rdi]
3: movdqa xmm0, xmmword ptr [rip + .LCPI0_0]
4: pcmpgtb xmm0, xmm1
5: ret
```
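As a sanity check on why a single `pcmpgtb` is enough, here is a minimal sketch (not taken from either compiler) that compares the original `x < '0'` form with the operand-swapped `'0' > x` form for every byte value. The helper names are made up for illustration, and the sketch says nothing about what code GCC emits for either form.

```cpp
#include <emmintrin.h> // SSE2 intrinsics
#include <cstring>

// Signed per-byte "x < '0'", as written in the example above.
static __m128i less_than_zero_char(__m128i x) {
    return _mm_cmplt_epi8(x, _mm_set1_epi8('0'));
}

// The same predicate with the operands swapped: '0' > x.
static __m128i zero_char_greater_than(__m128i x) {
    return _mm_cmpgt_epi8(_mm_set1_epi8('0'), x);
}

// Hypothetical helper: true if both forms agree for all 256 byte values.
bool forms_agree_for_all_bytes() {
    for (int base = 0; base < 256; base += 16) {
        unsigned char in[16];
        for (int i = 0; i < 16; ++i)
            in[i] = static_cast<unsigned char>(base + i);
        const __m128i x =
            _mm_loadu_si128(reinterpret_cast<const __m128i *>(in));
        const __m128i a = less_than_zero_char(x);
        const __m128i b = zero_char_greater_than(x);
        if (std::memcmp(&a, &b, sizeof a) != 0)
            return false;
    }
    return true;
}
```

Calling `forms_agree_for_all_bytes()` from a small `main` should return `true` on any x86 target with SSE2.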
best!
* Re: simd, redundant pcmpeqb and pxor
2022-11-06 10:53 simd, redundant pcmpeqb and pxor i.nixman
@ 2022-11-07 3:32 ` Hongtao Liu
2022-11-07 6:26 ` i.nixman
0 siblings, 1 reply; 5+ messages in thread
From: Hongtao Liu @ 2022-11-07 3:32 UTC (permalink / raw)
To: i.nixman; +Cc: gcc-help
On Sun, Nov 6, 2022 at 6:54 PM i.nixman--- via Gcc-help
<gcc-help@gcc.gnu.org> wrote:
>
>
> Hello,
>
> look at this example(https://godbolt.org/z/TnGMsfMs6):
> ```
> auto foo(const char *p) {
> const auto substr = _mm_loadu_si128((const __m128i *)p);
> return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
> }
> ```
> and to the generated asm:
> ```
> 1: foo(char const*):
> 2: movdqu xmm0, XMMWORD PTR [rdi]
> 3: pxor xmm1, xmm1
> 4: pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
> 5: pcmpeqb xmm0, xmm1
> 6: ret
> ```
> look at line 5.
> is there any reason for `pcmpeqb` instruction?
Looks like a missed optimization, possibly coming from this:

_4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
_3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47 };
_5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3);

Could you open a Bugzilla report for it?
https://gcc.gnu.org/bugzilla/
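
For illustration only, a rough sketch of what the emitted sequence appears to compute. It assumes `.LC0` holds 47 in every byte (the constant pool is not shown in this thread, but that would match the `<= 47` in the GIMPLE above). SSE2 has no byte "less-or-equal" compare, so `x <= 47` can be formed as NOT(`x > 47`), with the NOT done by comparing the mask for equality against zero; that would account for the `pxor` plus `pcmpeqb` pair. The function names are hypothetical.

```cpp
#include <emmintrin.h> // SSE2 intrinsics
#include <cstring>

// x <= 47 expressed as NOT(x > 47), mirroring pcmpgtb + pxor + pcmpeqb.
static __m128i le47_via_inverted_gt(__m128i x) {
    // Each byte is 0xFF where x > 47 (signed), 0x00 elsewhere.
    const __m128i gt = _mm_cmpgt_epi8(x, _mm_set1_epi8(47));
    // Comparing the mask with zero flips 0x00 <-> 0xFF in every byte.
    return _mm_cmpeq_epi8(gt, _mm_setzero_si128());
}

// The direct form: 48 > x, a single compare as in clang's output.
static __m128i lt48_direct(__m128i x) {
    return _mm_cmpgt_epi8(_mm_set1_epi8(48), x);
}

// Hypothetical helper: true if both forms agree for all 256 byte values.
bool inverted_gt_matches_direct_lt() {
    for (int base = 0; base < 256; base += 16) {
        unsigned char in[16];
        for (int i = 0; i < 16; ++i)
            in[i] = static_cast<unsigned char>(base + i);
        const __m128i x =
            _mm_loadu_si128(reinterpret_cast<const __m128i *>(in));
        const __m128i a = le47_via_inverted_gt(x);
        const __m128i b = lt48_direct(x);
        if (std::memcmp(&a, &b, sizeof a) != 0)
            return false;
    }
    return true;
}
```

Both forms give the same mask for every input byte, which is why the extra two instructions look avoidable.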
--
BR,
Hongtao
* Re: simd, redundant pcmpeqb and pxor
2022-11-07 3:32 ` Hongtao Liu
@ 2022-11-07 6:26 ` i.nixman
2022-11-07 6:32 ` Hongtao Liu
0 siblings, 1 reply; 5+ messages in thread
From: i.nixman @ 2022-11-07 6:26 UTC (permalink / raw)
To: Hongtao Liu, gcc-help
On 2022-11-07 03:32, Hongtao Liu wrote:
hi,
> Looks like a mis optimization from
>
> _4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
> _3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47,
> 47, 47 };
> _5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3); --- this?
>
> Could you open a bugzilla for it
> https://gcc.gnu.org/bugzilla/
Sure, but for which component?
* Re: simd, redundant pcmpeqb and pxor
2022-11-07 6:26 ` i.nixman
@ 2022-11-07 6:32 ` Hongtao Liu
2022-11-07 6:40 ` i.nixman
0 siblings, 1 reply; 5+ messages in thread
From: Hongtao Liu @ 2022-11-07 6:32 UTC (permalink / raw)
To: i.nixman; +Cc: gcc-help
On Mon, Nov 7, 2022 at 2:26 PM <i.nixman@autistici.org> wrote:
>
> sure, but for which component?
Let's file it under rtl-optimization first.
--
BR,
Hongtao
* Re: simd, redundant pcmpeqb and pxor
2022-11-07 6:32 ` Hongtao Liu
@ 2022-11-07 6:40 ` i.nixman
0 siblings, 0 replies; 5+ messages in thread
From: i.nixman @ 2022-11-07 6:40 UTC (permalink / raw)
To: Hongtao Liu, gcc-help
done: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546
best!