public inbox for libstdc++@gcc.gnu.org
* [RFC]  libstc++: Implement gather and scatter
@ 2021-02-18 13:12 yaozhongxiao
  2021-02-19  9:12 ` Matthias Kretz
  0 siblings, 1 reply; 3+ messages in thread
From: yaozhongxiao @ 2021-02-18 13:12 UTC (permalink / raw)
  To: libstdc++

[-- Attachment #1: Type: text/plain, Size: 1417 bytes --]


Memory loads and stores in gather and scatter mode are common in simd code.
I need gather and scatter support for my own workload, so I implemented it.
I am sending this draft to request comments and suggestions before committing
it officially. Please do not hesitate to correct and comment. Thanks.

Dr. Matthias Kretz, thank you very much for your work on simd;
I look forward to your suggestions.

--------------------------------------------------------------
 libstdc++: Implement gather and scatter
    Memory loads and stores in gather and scatter mode are common cases
    with simd. Implement them in simd via calls to the _S_gather and
    _S_scatter methods of its ABI implementation.

    libstdc++-v3/ChangeLog:

            * include/experimental/bits/simd.h: Add simd::gather and 
            simd::scatter as the public methods.
            * include/experimental/bits/simd_builtin.h: Add 
            _SimdImplBuiltin::_S_gather and _SimdImplBuiltin::_S_scatter
            for simd native abi implementation.
            * include/experimental/bits/simd_fixed_size.h: Add 
            _SimdImplFixedSize::_S_gather and _SimdImplFixedSize::_S_scatter
            for simd fixed_size abi implementation;
            _SimdTuple::_M_tuple_at for tuple access.
            * include/experimental/bits/simd_scalar.h: Add
            _SimdImplScalar::_S_gather and _SimdImplScalar::_S_scatter
            for simd scalar abi implementation.
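
As a plain illustration of the gather and scatter semantics described in
the commit message, the following is a minimal scalar sketch. It does not use
the patch's interface: `gather_ref` and `scatter_ref` are hypothetical helper
names, and `std::array` stands in for a simd object. It only mirrors the
element-wise `__mem[__idx[__i]]` access that the generic implementation in
the patch performs.

```cpp
#include <array>
#include <cstddef>

// Hypothetical reference helpers, not part of libstdc++.

// Gather: element i of the result is read from mem[idx[i]].
template <typename T, std::size_t N>
std::array<T, N>
gather_ref(const T* mem, const std::array<int, N>& idx)
{
  std::array<T, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = mem[idx[i]];
  return out;
}

// Scatter: element i of v is written to mem[idx[i]].
// If idx contains duplicates, the last write wins in this sketch.
template <typename T, std::size_t N>
void
scatter_ref(const std::array<T, N>& v, T* mem, const std::array<int, N>& idx)
{
  for (std::size_t i = 0; i < N; ++i)
    mem[idx[i]] = v[i];
}
```

The caller is responsible for every index being within the bounds of `mem`;
neither helper checks this.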

[-- Attachment #2: patch.txt --]
[-- Type: application/octet-stream, Size: 10375 bytes --]

commit 178ded0bbcf126f7b347045992c9ef501a050bd9
Author: zhongxiao.yzx <zhongxiao.yzx@gmail.com>
Date:   Thu Feb 18 20:32:28 2021 +0800

    libstdc++: Implement gather and scatter
    
    Memory loads and stores in gather and scatter mode are common cases
    with simd. Implement them in simd via calls to the _S_gather and
    _S_scatter methods of its ABI implementation.
    
    libstdc++-v3/ChangeLog:
    
            * include/experimental/bits/simd.h: Add simd::gather and
            simd::scatter as the public methods.
            * include/experimental/bits/simd_builtin.h: Add
            _SimdImplBuiltin::_S_gather and _SimdImplBuiltin::_S_scatter
            for simd native abi implementation.
            * include/experimental/bits/simd_fixed_size.h: Add
            _SimdImplFixedSize::_S_gather and _SimdImplFixedSize::_S_scatter
            for simd fixed_size abi implementation;
            _SimdTuple::_M_tuple_at for tuple access.
            * include/experimental/bits/simd_scalar.h: Add
            _SimdImplScalar::_S_gather and _SimdImplScalar::_S_scatter
            for simd scalar abi implementation.

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index c452778832f..603817ff2b0 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -29,6 +29,7 @@
 
 #include "simd_detail.h"
 #include "numeric_traits.h"
+#include <array>
 #include <bit>
 #include <bitset>
 #ifdef _GLIBCXX_DEBUG_UB
@@ -4960,6 +4961,15 @@ template <typename _Tp, typename _Abi>
 	  _Impl::_S_load(_Flags::template _S_apply<simd>(__mem), _S_type_tag))
       {}
 
+    // gather constructor
+    template <typename _Up, typename _Flags>
+      _GLIBCXX_SIMD_ALWAYS_INLINE
+      simd(const _Up* __mem, const __int_for_sizeof_t<_Up>* __idx, _Flags)
+      : _M_data(
+      _Impl::_S_gather(_Flags::template _S_apply<simd>(__mem),
+			   __idx, _S_type_tag))
+      {}
+
     // loads [simd.load]
     template <typename _Up, typename _Flags>
       _GLIBCXX_SIMD_ALWAYS_INLINE void
@@ -4978,6 +4988,47 @@ template <typename _Tp, typename _Abi>
 			_S_type_tag);
       }
 
+    // gather [simd.gather]
+    template <typename _Up, typename _Flags>
+      _GLIBCXX_SIMD_ALWAYS_INLINE void
+      gather(const _Vectorizable<_Up>* __mem,
+	     const std::array<int, size()>& __idx, _Flags)
+      {
+	_M_data = static_cast<decltype(_M_data)>(
+	  _Impl::_S_gather(_Flags::template _S_apply<simd>(__mem), __idx.data(),
+			   _S_type_tag));
+      }
+
+    template <typename _Up, typename _Flags>
+      _GLIBCXX_SIMD_ALWAYS_INLINE void
+      gather(const _Vectorizable<_Up>* __mem,
+	     const __int_for_sizeof_t<_Up>* __idx, _Flags)
+      {
+	_M_data = static_cast<decltype(_M_data)>(
+	  _Impl::_S_gather(_Flags::template _S_apply<simd>(__mem), __idx,
+			   _S_type_tag));
+      }
+
+    // scatter [simd.scatter]
+    template <typename _Up, typename _Flags>
+      _GLIBCXX_SIMD_ALWAYS_INLINE void
+      scatter(_Vectorizable<_Up>* __mem, const std::array<int, size()>& __idx,
+	      _Flags) const
+      {
+	_Impl::_S_scatter(_M_data, _Flags::template _S_apply<simd>(__mem),
+			  __idx.data(), _S_type_tag);
+      }
+
+    // scatter [simd.scatter]
+    template <typename _Up, typename _Flags>
+      _GLIBCXX_SIMD_ALWAYS_INLINE void
+      scatter(_Vectorizable<_Up>* __mem, const __int_for_sizeof_t<_Up>* __idx,
+	      _Flags) const
+      {
+	_Impl::_S_scatter(_M_data, _Flags::template _S_apply<simd>(__mem),
+			  __idx, _S_type_tag);
+      }
+
     // scalar access
     _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR reference
     operator[](size_t __i)
diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h
index 7f728a10488..69594e1006c 100644
--- a/libstdc++-v3/include/experimental/bits/simd_builtin.h
+++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h
@@ -1428,6 +1428,54 @@ template <typename _Abi>
 	});
       }
 
+    // _S_gather {{{2
+    template <typename _Tp, typename _Up>
+      _GLIBCXX_SIMD_INTRINSIC static _SimdMember<_Tp>
+      _S_gather(const _Up* __mem, const __int_for_sizeof_t<_Up>* __idx,
+		_TypeTag<_Tp>) noexcept
+      {
+	constexpr size_t _Np = _S_size<_Tp>;
+	return __generate_vector<_Tp, _SimdMember<_Tp>::_S_full_size>([&](
+	  auto __i) constexpr {
+	  return static_cast<_Tp>(__i < _Np ? __mem[__idx[__i]] : 0);
+	});
+      }
+    // _S_gather
+    template <typename _Tp, typename _Up>
+      _GLIBCXX_SIMD_INTRINSIC static _SimdMember<_Tp>
+      _S_gather(const _Up* __mem, const _SimdMember<_Tp>& __idx,
+		_TypeTag<_Tp>) noexcept
+      {
+	constexpr size_t _Np = _S_size<_Tp>;
+	return __generate_vector<_Tp, _SimdMember<_Tp>::_S_full_size>([&](
+	  auto __i) constexpr {
+	  return static_cast<_Tp>(__i < _Np ? __mem[__idx[__i]] : 0);
+	});
+      } // }}}
+
+    // _S_scatter {{{2
+    template <typename _Tp, typename _Up>
+      _GLIBCXX_SIMD_INTRINSIC static void
+      _S_scatter(_SimdMember<_Tp> __v, _Up* __mem,
+		 const __int_for_sizeof_t<_Up>* __idx, _TypeTag<_Tp>) noexcept
+      {
+	constexpr size_t _Np = _S_size<_Tp>;
+	__execute_n_times<_Np>([&](auto __i) constexpr {
+	  __mem[__idx[__i]] = static_cast<_Up>(__v[__i]);
+	});
+      }
+    // _S_scatter
+    template <typename _Tp, typename _Up>
+      _GLIBCXX_SIMD_INTRINSIC static void
+      _S_scatter(_SimdMember<_Tp> __v, _Up* __mem,
+		 const _SimdMember<_Tp>& __idx, _TypeTag<_Tp>) noexcept
+      {
+	constexpr size_t _Np = _S_size<_Tp>;
+	__execute_n_times<_Np>([&](auto __i) constexpr {
+	  __mem[__idx[__i]] = static_cast<_Up>(__v[__i]);
+	});
+      } // }}}
+
     // _S_load {{{2
     template <typename _Tp, typename _Up>
       _GLIBCXX_SIMD_INTRINSIC static _SimdMember<_Tp>
@@ -2813,7 +2861,7 @@ template <typename _Abi>
 
     // smart_reference access {{{2
     template <typename _Tp, size_t _Np>
-      static constexpr void _S_set(_SimdWrapper<_Tp, _Np>& __k, int __i,
+      static constexpr void _S_set(_SimdWrapper<_Tp, _Np>& __k, size_t __i,
 				   bool __x) noexcept
       {
 	if constexpr (is_same_v<_Tp, bool>)
diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
index 2722055c899..befa32547cc 100644
--- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
+++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
@@ -392,6 +392,15 @@ template <typename _Tp, typename _Abi0, typename... _Abis>
 	  return second.template _M_simd_at<_Np - 1>();
       }
 
+    template <size_t _Offset>
+      _GLIBCXX_SIMD_INTRINSIC constexpr auto _M_tuple_at() const
+      {
+	if constexpr (_Offset == 0)
+	  return first;
+	else
+	  return second.template _M_tuple_at<_Offset - simd_size_v<_Tp, _Abi0>>();
+      }
+
     template <size_t _Offset = 0, typename _Fp>
       _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdTuple
       _S_generate(_Fp&& __gen, _SizeConstant<_Offset> = {})
@@ -1328,6 +1337,56 @@ template <int _Np>
 	});
       }
 
+    // _S_gather {{{2
+    template <typename _Tp, typename _Up>
+      static inline _SimdMember<_Tp>
+      _S_gather(const _Up* __mem, const __int_for_sizeof_t<_Up>* __idx,
+		_TypeTag<_Tp>) noexcept
+      {
+	return _SimdMember<_Tp>::_S_generate([&](auto __meta) {
+	  return __meta._S_gather(__mem, &__idx[__meta._S_offset],
+				  _TypeTag<_Tp>());
+	});
+      }
+
+    // _S_gather {{{2
+    template <typename _Tp, typename _Up>
+      static inline _SimdMember<_Tp>
+      _S_gather(const _Up* __mem, const _SimdMember<_Tp>& __idx,
+		_TypeTag<_Tp>) noexcept
+      {
+	return _SimdMember<_Tp>::_S_generate([&](auto __meta) {
+	  return __meta._S_gather(
+	    __mem, __idx.template _M_tuple_at<__meta._S_offset>(),
+	    _TypeTag<_Tp>());
+	});
+      }
+
+    // _S_scatter {{{2
+    template <typename _Tp, typename _Up>
+      static inline void
+      _S_scatter(const _SimdMember<_Tp>& __v, _Up* __mem,
+		 const __int_for_sizeof_t<_Up>* __idx, _TypeTag<_Tp>) noexcept
+      {
+	__for_each(__v, [&](auto __meta, auto __native) {
+	  __meta._S_scatter(__native, __mem, &__idx[__meta._S_offset],
+			    _TypeTag<_Tp>());
+	});
+      }
+
+    // _S_scatter {{{2
+    template <typename _Tp, typename _Up>
+      static inline void
+      _S_scatter(const _SimdMember<_Tp>& __v, _Up* __mem,
+		 const _SimdMember<_Tp>& __idx, _TypeTag<_Tp>) noexcept
+      {
+	__for_each(__v, __idx,
+		   [&](auto __meta, auto __v_tuple, auto __idx_tuple) {
+		     __meta._S_scatter(__v_tuple, __mem, __idx_tuple,
+				       _TypeTag<_Tp>());
+		   });
+      }
+
     // _S_load {{{2
     template <typename _Tp, typename _Up>
       static inline _SimdMember<_Tp> _S_load(const _Up* __mem,
diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h
index 48e13f6c719..243672dda39 100644
--- a/libstdc++-v3/include/experimental/bits/simd_scalar.h
+++ b/libstdc++-v3/include/experimental/bits/simd_scalar.h
@@ -147,6 +147,42 @@ struct _SimdImplScalar
 							      _TypeTag<_Tp>)
     { return __gen(_SizeConstant<0>()); }
 
+  // _S_gather {{{2
+  template <typename _Tp, typename _Up>
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_gather(const _Up* __mem, const __int_for_sizeof_t<_Up>* __idx,
+	      _TypeTag<_Tp>) noexcept
+    {
+      return static_cast<_Tp>(__mem[__idx[0]]);
+    }
+
+  // _S_gather
+  template <typename _Tp, typename _Up>
+    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _S_gather(const _Up* __mem, const _Tp& __idx, _TypeTag<_Tp>) noexcept
+    {
+      return static_cast<_Tp>(__mem[__idx]);
+    } // }}}
+
+  // _S_scatter {{{2
+  template <typename _Tp, typename _Up>
+    _GLIBCXX_SIMD_INTRINSIC static void
+    _S_scatter(const _Tp& __v, _Up* __mem,
+	       const __int_for_sizeof_t<_Up>* __idx,
+	       _TypeTag<_Tp>) noexcept
+    {
+      __mem[__idx[0]] = static_cast<_Up>(__v);
+    }
+
+  // _S_scatter
+  template <typename _Tp, typename _Up>
+    _GLIBCXX_SIMD_INTRINSIC static void
+    _S_scatter(const _Tp& __v, _Up* __mem, const _Tp& __idx,
+	       _TypeTag<_Tp>) noexcept
+    {
+      __mem[__idx] = static_cast<_Up>(__v);
+    } // }}}
+
   // _S_load {{{2
   template <typename _Tp, typename _Up>
     _GLIBCXX_SIMD_INTRINSIC static _Tp _S_load(const _Up* __mem,

* Re: [RFC]  libstc++: Implement gather and scatter
  2021-02-18 13:12 [RFC] libstc++: Implement gather and scatter yaozhongxiao
@ 2021-02-19  9:12 ` Matthias Kretz
  2021-02-19 12:56   ` yao zhongxiao
  0 siblings, 1 reply; 3+ messages in thread
From: Matthias Kretz @ 2021-02-19  9:12 UTC (permalink / raw)
  To: libstdc++; +Cc: yaozhongxiao

Thank you! But please just call me Matthias :)

I already gave feedback at https://github.com/VcDevel/std-simd/pull/27. But
I'd like to raise the issue of
not-yet-proposed-but-we-want-to-write-a-paper-soon features.

My understanding: If we define interfaces without using reserved names we're 
risking a source incompatible change (i.e. user code doesn't compile anymore 
after libstdc++ was modified to match the WG21 paper). I've seen non-standard 
extensions in libstdc++ use e.g. __member_function(...). I.e. without the _M_ 
or _S_ prefix. Is this the way to go? What about non-standard constructor 
overloads? We can't introduce a reserved name except via a tag type. 
Alternatively we could use a static function.
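
The two alternatives named above, a tag type for a non-standard constructor
overload and a static function with a reserved name, might look roughly as
follows. This is only a sketch under stated assumptions: `sketch::vec`,
`__gather_tag`, and `__gather` are hypothetical names and not libstdc++ API.
The double-underscore names are used only because the sketch plays the role
of the library; user code must not declare such names.

```cpp
#include <array>
#include <cstddef>

namespace sketch
{
  // Hypothetical reserved tag type. A constructor has no name of its own,
  // so the tag is the only place a reserved name can appear.
  struct __gather_tag {};

  // Hypothetical stand-in for a simd-like class.
  template <typename T, std::size_t N>
  class vec
  {
    std::array<T, N> data_{};

  public:
    vec() = default;

    // Option 1: extension constructor, selectable only via the reserved tag.
    vec(__gather_tag, const T* mem, const std::array<int, N>& idx)
    {
      for (std::size_t i = 0; i < N; ++i)
        data_[i] = mem[idx[i]];
    }

    // Option 2: static function with a reserved name.
    static vec
    __gather(const T* mem, const std::array<int, N>& idx)
    { return vec(__gather_tag{}, mem, idx); }

    T operator[](std::size_t i) const { return data_[i]; }
  };
}
```

With either option, a later standardized `gather` interface could be added
without colliding with names that user code already relies on.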

Cheers,
  Matthias

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 std::experimental::simd              https://github.com/VcDevel/std-simd
──────────────────────────────────────────────────────────────────────────

* Re: [RFC] libstc++: Implement gather and scatter
  2021-02-19  9:12 ` Matthias Kretz
@ 2021-02-19 12:56   ` yao zhongxiao
  0 siblings, 0 replies; 3+ messages in thread
From: yao zhongxiao @ 2021-02-19 12:56 UTC (permalink / raw)
  To: Matthias Kretz; +Cc: libstdc++, yaozhongxiao

Hi Matthias,
I have read your feedback at https://github.com/VcDevel/std-simd/pull/27
and fully agree with your suggestions.
Writing the paper and proposing the feature beforehand is the best way.
I will therefore work through your comments and then discuss them with you
personally via IRC or email.

Best regards,
Thanks

-- 
Name: zhongxiao yao
Email: zhongxiao.yzx@gmail.com
