public inbox for gcc-bugs@sourceware.org
* [Bug c/63791] New: use 32-byte version of vpbroadcastb on AVX2 platform
@ 2014-11-09 12:14 marcus.kool at urlfilterdb dot com
  2014-11-09 16:28 ` [Bug c/63791] " jakub at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: marcus.kool at urlfilterdb dot com @ 2014-11-09 12:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63791

            Bug ID: 63791
           Summary: use 32-byte version of vpbroadcastb on AVX2 platform
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marcus.kool at urlfilterdb dot com

Created attachment 33926
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33926&action=edit
code with _mm256_set1_epi8, _mm256_loadu_si256, _mm256_cmpeq_epi8,
_mm256_movemask_epi8

With gcc 4.9.2 and the compile options
  -std=c99 -mavx2 -mbmi -mbmi2 -O3 -fno-tree-vectorize
on an Intel Haswell CPU, the intrinsic function _mm256_set1_epi8() compiles
to 3 instructions, although 2 instructions would be sufficient.

Generated code is either
   vmovd         reg, xmmreg
   vpbroadcastb  xmmreg, xmmreg
   vinserti128   $1, xmmreg, ymmreg, ymmreg
or
   vmovd         reg, xmmreg
   vpbroadcastb  xmmreg, xmmreg
   vperm2i128    $0, ymmreg, ymmreg, ymmreg

It could instead generate this shorter, faster sequence:
   vmovd         reg, xmmreg
   vpbroadcastb  xmmreg, ymmreg

Example C source is in the attachment.
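
The attachment itself is not reproduced in this message. As a rough
illustration only, a minimal sketch that uses the four intrinsics named in
the attachment description might look like the code below. The function
names (first_match32 and its helpers), the scalar fallback, and the runtime
dispatch are assumptions made for this example and are not taken from
attachment 33926.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the attachment): index of the first byte
   equal to c in a 32-byte block, or -1 if no byte matches. */
__attribute__((target("avx2")))
static int first_match32_avx2(const uint8_t *buf, uint8_t c)
{
    /* The broadcast discussed in this report. */
    __m256i needle = _mm256_set1_epi8((char)c);
    /* Unaligned 32-byte load of the block to search. */
    __m256i block = _mm256_loadu_si256((const __m256i *)buf);
    /* Each matching byte becomes 0xFF, every other byte 0x00. */
    __m256i eq = _mm256_cmpeq_epi8(block, needle);
    /* Collect the top bit of each byte into a 32-bit mask. */
    uint32_t mask = (uint32_t)_mm256_movemask_epi8(eq);
    return mask ? __builtin_ctz(mask) : -1;
}

/* Plain C reference, used when the CPU has no AVX2. */
static int first_match32_scalar(const uint8_t *buf, uint8_t c)
{
    for (int i = 0; i < 32; i++)
        if (buf[i] == c)
            return i;
    return -1;
}

/* Assumed dispatcher: picks the AVX2 path only if the CPU supports it. */
int first_match32(const uint8_t *buf, uint8_t c)
{
    assert(buf != NULL);
    if (__builtin_cpu_supports("avx2"))
        return first_match32_avx2(buf, c);
    return first_match32_scalar(buf, c);
}
```

The target("avx2") attribute is only there so the sketch builds without
-mavx2; with the options listed above it is redundant. Compiling with -S and
looking at the instructions emitted for _mm256_set1_epi8() should show which
of the sequences above a given compiler version produces.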



Thread overview: 8+ messages
2014-11-09 12:14 [Bug c/63791] New: use 32-byte version of vpbroadcastb on AVX2 platform marcus.kool at urlfilterdb dot com
2014-11-09 16:28 ` [Bug c/63791] " jakub at gcc dot gnu.org
2015-05-01 13:51 ` [Bug target/63791] use 32-byte version of vpbroadcastb (and register to poulate) on AVX/AVX2 platforms marcus.kool at urlfilterdb dot com
2015-05-01 13:53 ` marcus.kool at urlfilterdb dot com
2015-05-01 14:10 ` marcus.kool at urlfilterdb dot com
2015-05-01 18:43 ` hjl.tools at gmail dot com
2015-05-01 22:40 ` marcus.kool at urlfilterdb dot com
2015-05-01 23:07 ` pinskia at gcc dot gnu.org
