public inbox for libc-alpha@sourceware.org
* [PATCH v3 0/1] Add vector math function tan/tanf to libmvec
@ 2021-12-30  0:03 Sunil K Pandey
  2021-12-30  0:03 ` [PATCH v3 1/1] x86-64: Add vector tan/tanf implementation " Sunil K Pandey
  0 siblings, 1 reply; 3+ messages in thread
From: Sunil K Pandey @ 2021-12-30  0:03 UTC (permalink / raw)
  To: libc-alpha; +Cc: hjl.tools, andrey.kolesov, marius.cornea

This patch may look big, but 74% of it is data tables.

Changes from v2:
-  Replace big negative rip offset with Table Lookup Bias.
-  Remove more unused data table fields.
-  Include LOE (live on exit) register info.
-  Apply more peephole optimization.
-  Optimize load of all bits set into ZMM register.
-  Replace 3 kmovw + andl with kandw instruction.
-  Restructure data table and remove unused fields.
-  Fix data table and field alignment according to ISA.
-  Fix data offset according to ISA.
-  Remove exit call dead code.
-  Remove unnecessary save/restore.
-  Keep cfi_escape for callee saved registers only.
-  Add DW_CFA_expression comments corresponding to each cfi_escape.
-  Define macro corresponding to each numeric data table offset.
-  Replace numeric data table offset with macro name.
-  Add data table structure definition as comments.
-  Restructure data table and add comments to each data field value.
-  Rename numeric sequential labels with meaningful label name.
-  Add more comments to labels as well as on call sites.
-  Replace internal special value processing paths with calls to standard
   scalar math functions, which makes the code more compact and aligns it
   with previous libmvec submissions.
  
Changes from v1:
-  Add ISA specific sections for all libmvec functions.
-  Add libmvec functions to math-vector-fortran.h.
-  Change label to sequential.
-  Fix function name in GNU header plate.

This patch implements the tan/tanf vector math functions for libmvec,
with SSE, AVX, AVX2 and AVX512 versions as per the vector ABI.
It also contains accuracy and ABI tests with regenerated ulps.

Sunil K Pandey (1):
  x86-64: Add vector tan/tanf implementation to libmvec

 bits/libm-simd-decl-stubs.h                   |   11 +
 math/bits/mathcalls.h                         |    2 +-
 .../unix/sysv/linux/x86_64/libmvec.abilist    |    8 +
 sysdeps/x86/fpu/bits/math-vector.h            |    4 +
 .../x86/fpu/finclude/math-vector-fortran.h    |    4 +
 sysdeps/x86_64/fpu/Makeconfig                 |    1 +
 sysdeps/x86_64/fpu/Versions                   |    2 +
 sysdeps/x86_64/fpu/libm-test-ulps             |   20 +
 .../fpu/multiarch/svml_d_tan2_core-sse2.S     |   20 +
 .../x86_64/fpu/multiarch/svml_d_tan2_core.c   |   27 +
 .../fpu/multiarch/svml_d_tan2_core_sse4.S     | 6259 +++++++++++++++++
 .../fpu/multiarch/svml_d_tan4_core-sse.S      |   20 +
 .../x86_64/fpu/multiarch/svml_d_tan4_core.c   |   27 +
 .../fpu/multiarch/svml_d_tan4_core_avx2.S     | 6227 ++++++++++++++++
 .../fpu/multiarch/svml_d_tan8_core-avx2.S     |   20 +
 .../x86_64/fpu/multiarch/svml_d_tan8_core.c   |   27 +
 .../fpu/multiarch/svml_d_tan8_core_avx512.S   | 2733 +++++++
 .../fpu/multiarch/svml_s_tanf16_core-avx2.S   |   20 +
 .../x86_64/fpu/multiarch/svml_s_tanf16_core.c |   28 +
 .../fpu/multiarch/svml_s_tanf16_core_avx512.S |  927 +++
 .../fpu/multiarch/svml_s_tanf4_core-sse2.S    |   20 +
 .../x86_64/fpu/multiarch/svml_s_tanf4_core.c  |   28 +
 .../fpu/multiarch/svml_s_tanf4_core_sse4.S    | 2600 +++++++
 .../fpu/multiarch/svml_s_tanf8_core-sse.S     |   20 +
 .../x86_64/fpu/multiarch/svml_s_tanf8_core.c  |   28 +
 .../fpu/multiarch/svml_s_tanf8_core_avx2.S    | 2595 +++++++
 sysdeps/x86_64/fpu/svml_d_tan2_core.S         |   29 +
 sysdeps/x86_64/fpu/svml_d_tan4_core.S         |   29 +
 sysdeps/x86_64/fpu/svml_d_tan4_core_avx.S     |   25 +
 sysdeps/x86_64/fpu/svml_d_tan8_core.S         |   25 +
 sysdeps/x86_64/fpu/svml_s_tanf16_core.S       |   25 +
 sysdeps/x86_64/fpu/svml_s_tanf4_core.S        |   29 +
 sysdeps/x86_64/fpu/svml_s_tanf8_core.S        |   29 +
 sysdeps/x86_64/fpu/svml_s_tanf8_core_avx.S    |   25 +
 .../x86_64/fpu/test-double-libmvec-tan-avx.c  |    1 +
 .../x86_64/fpu/test-double-libmvec-tan-avx2.c |    1 +
 .../fpu/test-double-libmvec-tan-avx512f.c     |    1 +
 sysdeps/x86_64/fpu/test-double-libmvec-tan.c  |    3 +
 .../x86_64/fpu/test-double-vlen2-wrappers.c   |    1 +
 .../fpu/test-double-vlen4-avx2-wrappers.c     |    1 +
 .../x86_64/fpu/test-double-vlen4-wrappers.c   |    1 +
 .../x86_64/fpu/test-double-vlen8-wrappers.c   |    1 +
 .../x86_64/fpu/test-float-libmvec-tanf-avx.c  |    1 +
 .../x86_64/fpu/test-float-libmvec-tanf-avx2.c |    1 +
 .../fpu/test-float-libmvec-tanf-avx512f.c     |    1 +
 sysdeps/x86_64/fpu/test-float-libmvec-tanf.c  |    3 +
 .../x86_64/fpu/test-float-vlen16-wrappers.c   |    1 +
 .../x86_64/fpu/test-float-vlen4-wrappers.c    |    1 +
 .../fpu/test-float-vlen8-avx2-wrappers.c      |    1 +
 .../x86_64/fpu/test-float-vlen8-wrappers.c    |    1 +
 50 files changed, 21913 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan2_core-sse2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan2_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan2_core_sse4.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan4_core-sse.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan4_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan4_core_avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan8_core-avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan8_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_tan8_core_avx512.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf16_core-avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf16_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf16_core_avx512.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf4_core-sse2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf4_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf4_core_sse4.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf8_core-sse.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf8_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_tanf8_core_avx2.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_tan2_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_tan4_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_tan4_core_avx.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_tan8_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_tanf16_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_tanf4_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_tanf8_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_tanf8_core_avx.S
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-tan-avx.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-tan-avx2.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-tan-avx512f.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-tan.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-tanf-avx.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-tanf-avx2.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-tanf-avx512f.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-tanf.c

-- 
2.31.1


Thread overview: 3+ messages
2021-12-30  0:03 [PATCH v3 0/1] Add vector math function tan/tanf to libmvec Sunil K Pandey
2021-12-30  0:03 ` [PATCH v3 1/1] x86-64: Add vector tan/tanf implementation " Sunil K Pandey
2021-12-30 19:47   ` H.J. Lu
