* simd, redundant pcmpeqb and pxor
From: i.nixman @ 2022-11-06 10:53 UTC
To: gcc-help

Hello,

look at this example(https://godbolt.org/z/TnGMsfMs6):
```
auto foo(const char *p) {
    const auto substr = _mm_loadu_si128((const __m128i *)p);
    return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
}
```
and to the generated asm:
```
1: foo(char const*):
2:         movdqu  xmm0, XMMWORD PTR [rdi]
3:         pxor    xmm1, xmm1
4:         pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
5:         pcmpeqb xmm0, xmm1
6:         ret
```
look at line 5.
is there any reason for `pcmpeqb` instruction?

just for info, clang's output(https://godbolt.org/z/MPnvEMdhr):
```
1: foo(char const*):
2:         movdqu  xmm1, xmmword ptr [rdi]
3:         movdqa  xmm0, xmmword ptr [rip + .LCPI0_0]
4:         pcmpgtb xmm0, xmm1
5:         ret
```

best!
* Re: simd, redundant pcmpeqb and pxor
From: Hongtao Liu @ 2022-11-07 3:32 UTC
To: i.nixman; +Cc: gcc-help

On Sun, Nov 6, 2022 at 6:54 PM i.nixman--- via Gcc-help <gcc-help@gcc.gnu.org> wrote:
>
>
> Hello,
>
> look at this example(https://godbolt.org/z/TnGMsfMs6):
> ```
> auto foo(const char *p) {
>     const auto substr = _mm_loadu_si128((const __m128i *)p);
>     return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
> }
> ```
> and to the generated asm:
> ```
> 1: foo(char const*):
> 2:         movdqu  xmm0, XMMWORD PTR [rdi]
> 3:         pxor    xmm1, xmm1
> 4:         pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
> 5:         pcmpeqb xmm0, xmm1
> 6:         ret
> ```
> look at line 5.
> is there any reason for `pcmpeqb` instruction?

Looks like a mis optimization from

_4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
_3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47 };
_5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3); --- this?

Could you open a bugzilla for it
https://gcc.gnu.org/bugzilla/

>
> just for info, clang's output(https://godbolt.org/z/MPnvEMdhr):
> ```
> 1: foo(char const*):
> 2:         movdqu  xmm1, xmmword ptr [rdi]
> 3:         movdqa  xmm0, xmmword ptr [rip + .LCPI0_0]
> 4:         pcmpgtb xmm0, xmm1
> 5:         ret
> ```
>
>
> best!

--
BR,
Hongtao
* Re: simd, redundant pcmpeqb and pxor
From: i.nixman @ 2022-11-07 6:26 UTC
To: Hongtao Liu, gcc-help

On 2022-11-07 03:32, Hongtao Liu wrote:
> On Sun, Nov 6, 2022 at 6:54 PM i.nixman--- via Gcc-help
> <gcc-help@gcc.gnu.org> wrote:
>>
>>
>> Hello,
>>
>> look at this example(https://godbolt.org/z/TnGMsfMs6):
>> ```
>> auto foo(const char *p) {
>>     const auto substr = _mm_loadu_si128((const __m128i *)p);
>>     return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
>> }
>> ```
>> and to the generated asm:
>> ```
>> 1: foo(char const*):
>> 2:         movdqu  xmm0, XMMWORD PTR [rdi]
>> 3:         pxor    xmm1, xmm1
>> 4:         pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
>> 5:         pcmpeqb xmm0, xmm1
>> 6:         ret
>> ```
>> look at line 5.
>> is there any reason for `pcmpeqb` instruction?

hi,

> Looks like a mis optimization from
>
> _4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
> _3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47,
> 47, 47 };
> _5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3); --- this?
>
> Could you open a bugzilla for it
> https://gcc.gnu.org/bugzilla/

sure, but for which component?

>>
>> just for info, clang's output(https://godbolt.org/z/MPnvEMdhr):
>> ```
>> 1: foo(char const*):
>> 2:         movdqu  xmm1, xmmword ptr [rdi]
>> 3:         movdqa  xmm0, xmmword ptr [rip + .LCPI0_0]
>> 4:         pcmpgtb xmm0, xmm1
>> 5:         ret
>> ```
>>
>>
>> best!
* Re: simd, redundant pcmpeqb and pxor
From: Hongtao Liu @ 2022-11-07 6:32 UTC
To: i.nixman; +Cc: gcc-help

On Mon, Nov 7, 2022 at 2:26 PM <i.nixman@autistici.org> wrote:
>
> On 2022-11-07 03:32, Hongtao Liu wrote:
> > On Sun, Nov 6, 2022 at 6:54 PM i.nixman--- via Gcc-help
> > <gcc-help@gcc.gnu.org> wrote:
> >>
> >>
> >> Hello,
> >>
> >> look at this example(https://godbolt.org/z/TnGMsfMs6):
> >> ```
> >> auto foo(const char *p) {
> >>     const auto substr = _mm_loadu_si128((const __m128i *)p);
> >>     return _mm_cmplt_epi8(substr, _mm_set1_epi8('0'));
> >> }
> >> ```
> >> and to the generated asm:
> >> ```
> >> 1: foo(char const*):
> >> 2:         movdqu  xmm0, XMMWORD PTR [rdi]
> >> 3:         pxor    xmm1, xmm1
> >> 4:         pcmpgtb xmm0, XMMWORD PTR .LC0[rip]
> >> 5:         pcmpeqb xmm0, xmm1
> >> 6:         ret
> >> ```
> >> look at line 5.
> >> is there any reason for `pcmpeqb` instruction?
>
> hi,
>
> > Looks like a mis optimization from
> >
> > _4 = VIEW_CONVERT_EXPR<__v16qs>(_7);
> > _3 = _4 <= { 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47,
> > 47, 47 };
> > _5 = VIEW_CONVERT_EXPR<vector(16) signed char>(_3); --- this?
> >
> > Could you open a bugzilla for it
> > https://gcc.gnu.org/bugzilla/
>
> sure, but for which component?

Let's put it as rtl-optimization first.

>
> >>
> >> just for info, clang's output(https://godbolt.org/z/MPnvEMdhr):
> >> ```
> >> 1: foo(char const*):
> >> 2:         movdqu  xmm1, xmmword ptr [rdi]
> >> 3:         movdqa  xmm0, xmmword ptr [rip + .LCPI0_0]
> >> 4:         pcmpgtb xmm0, xmm1
> >> 5:         ret
> >> ```
> >>
> >>
> >> best!

--
BR,
Hongtao
* Re: simd, redundant pcmpeqb and pxor
From: i.nixman @ 2022-11-07 6:40 UTC
To: Hongtao Liu, gcc-help

done: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107546

best!