public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc r12-5580] rs6000: Add Power10 optimization for most _mm_movemask*
@ 2021-11-29 15:51 Paul Clarke
0 siblings, 0 replies; only message in thread
From: Paul Clarke @ 2021-11-29 15:51 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:85289ba36c2e62de84cc0232c954d9a74bda708a
commit r12-5580-g85289ba36c2e62de84cc0232c954d9a74bda708a
Author: Paul A. Clarke <pc@us.ibm.com>
Date: Thu Oct 21 11:21:01 2021 -0500
rs6000: Add Power10 optimization for most _mm_movemask*
Power10 ISA added `vextract*` instructions which are realized in the
`vec_extractm` instrinsic.
Use `vec_extractm` for `_mm_movemask_ps`, `_mm_movemask_pd`, and
`_mm_movemask_epi8` compatibility intrinsics, when `_ARCH_PWR10`.
2021-11-29 Paul A. Clarke <pc@us.ibm.com>
gcc
* config/rs6000/xmmintrin.h (_mm_movemask_ps): Use vec_extractm
when _ARCH_PWR10.
* config/rs6000/emmintrin.h (_mm_movemask_pd): Likewise.
(_mm_movemask_epi8): Likewise.
Diff:
---
gcc/config/rs6000/emmintrin.h | 8 ++++++++
gcc/config/rs6000/xmmintrin.h | 4 ++++
2 files changed, 12 insertions(+)
diff --git a/gcc/config/rs6000/emmintrin.h b/gcc/config/rs6000/emmintrin.h
index 4125b12fa09..c4758be0e77 100644
--- a/gcc/config/rs6000/emmintrin.h
+++ b/gcc/config/rs6000/emmintrin.h
@@ -1233,6 +1233,9 @@ _mm_loadl_pd (__m128d __A, double const *__B)
extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_movemask_pd (__m128d __A)
{
+#ifdef _ARCH_PWR10
+ return vec_extractm ((__v2du) __A);
+#else
__vector unsigned long long result;
static const __vector unsigned int perm_mask =
{
@@ -1252,6 +1255,7 @@ _mm_movemask_pd (__m128d __A)
#else
return result[0];
#endif
+#endif /* !_ARCH_PWR10 */
}
#endif /* _ARCH_PWR8 */
@@ -2030,6 +2034,9 @@ _mm_min_epu8 (__m128i __A, __m128i __B)
extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_movemask_epi8 (__m128i __A)
{
+#ifdef _ARCH_PWR10
+ return vec_extractm ((__v16qu) __A);
+#else
__vector unsigned long long result;
static const __vector unsigned char perm_mask =
{
@@ -2046,6 +2053,7 @@ _mm_movemask_epi8 (__m128i __A)
#else
return result[0];
#endif
+#endif /* !_ARCH_PWR10 */
}
#endif /* _ARCH_PWR8 */
diff --git a/gcc/config/rs6000/xmmintrin.h b/gcc/config/rs6000/xmmintrin.h
index ae1a33e8d95..4c093fd1d5a 100644
--- a/gcc/config/rs6000/xmmintrin.h
+++ b/gcc/config/rs6000/xmmintrin.h
@@ -1352,6 +1352,9 @@ _mm_storel_pi (__m64 *__P, __m128 __A)
extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_movemask_ps (__m128 __A)
{
+#ifdef _ARCH_PWR10
+ return vec_extractm ((vector unsigned int) __A);
+#else
__vector unsigned long long result;
static const __vector unsigned int perm_mask =
{
@@ -1371,6 +1374,7 @@ _mm_movemask_ps (__m128 __A)
#else
return result[0];
#endif
+#endif /* !_ARCH_PWR10 */
}
#endif /* _ARCH_PWR8 */
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2021-11-29 15:51 UTC | newest]
Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-29 15:51 [gcc r12-5580] rs6000: Add Power10 optimization for most _mm_movemask* Paul Clarke
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).