public inbox for gcc-patches@gcc.gnu.org
* [PATCHv4 00/34] Replace the Power target-specific builtin machinery
@ 2021-07-29 13:30 Bill Schmidt
  2021-07-29 13:30 ` [PATCH 01/34] rs6000: Incorporate new builtins code into the build machinery Bill Schmidt
                   ` (33 more replies)
  0 siblings, 34 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

Hi!

Original patch series here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568840.html

V2 patch series here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572231.html

V3 patch series here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573020.html

Thanks for all the reviews so far!  I've now committed all of the
rs6000-gen-builtins.c and rbtree.[ch] patches, along with the patch
to generic code to handle build-time GC roots in gengtype.  These
constituted the first 22 patches of the V3 series.

In this version of the series, I've made some changes in response to
reviews from Segher and Will Schmidt, and incorporated some upstream
changes since the V3 posting.  Mapping from V4 patches to V3 patches:

   V4   =>   V3
  ----      ----
  0001      0023
   ..        ..
  0020      0042
  0021               (new)
  0022      0043
   ..        ..
  0034      0055

The new patch 0021 handles the MMA changes, which required rethinking
how I handle "internal" MMA builtins.

Thanks again for the ongoing reviews!

Bill

Bill Schmidt (34):
  rs6000: Incorporate new builtins code into the build machinery
  rs6000: Add gengtype handling to the build machinery
  rs6000: Add the rest of the [altivec] stanza to the builtins file
  rs6000: Add VSX builtins
  rs6000: Add available-everywhere and ancient builtins
  rs6000: Add power7 and power7-64 builtins
  rs6000: Add power8-vector builtins
  rs6000: Add Power9 builtins
  rs6000: Add more type nodes to support builtin processing
  rs6000: Add Power10 builtins
  rs6000: Add MMA builtins
  rs6000: Add miscellaneous builtins
  rs6000: Add Cell builtins
  rs6000: Add remaining overloads
  rs6000: Execute the automatic built-in initialization code
  rs6000: Darwin builtin support
  rs6000: Add sanity to V2DI_type_node definitions
  rs6000: Always initialize vector_pair and vector_quad nodes
  rs6000: Handle overloads during program parsing
  rs6000: Handle gimple folding of target built-ins
  rs6000: Handle some recent MMA builtin changes
  rs6000: Support for vectorizing built-in functions
  rs6000: Builtin expansion, part 1
  rs6000: Builtin expansion, part 2
  rs6000: Builtin expansion, part 3
  rs6000: Builtin expansion, part 4
  rs6000: Builtin expansion, part 5
  rs6000: Builtin expansion, part 6
  rs6000: Update rs6000_builtin_decl
  rs6000: Miscellaneous uses of rs6000_builtins_decl_x
  rs6000: Debug support
  rs6000: Update altivec.h for automated interfaces
  rs6000: Test case adjustments
  rs6000: Enable the new builtin support

 gcc/config.gcc                                |    2 +
 gcc/config/rs6000/altivec.h                   |  519 +-
 gcc/config/rs6000/darwin.h                    |    8 +-
 gcc/config/rs6000/rs6000-builtin-new.def      | 3806 ++++++++++
 gcc/config/rs6000/rs6000-c.c                  | 1083 +++
 gcc/config/rs6000/rs6000-call.c               | 3437 +++++++++-
 gcc/config/rs6000/rs6000-gen-builtins.c       |   44 +-
 gcc/config/rs6000/rs6000-overload.def         | 6104 +++++++++++++++++
 gcc/config/rs6000/rs6000.c                    |  219 +-
 gcc/config/rs6000/rs6000.h                    |   84 +
 gcc/config/rs6000/t-rs6000                    |   47 +-
 .../powerpc/bfp/scalar-extract-exp-2.c        |    2 +-
 .../powerpc/bfp/scalar-extract-sig-2.c        |    2 +-
 .../powerpc/bfp/scalar-insert-exp-2.c         |    2 +-
 .../powerpc/bfp/scalar-insert-exp-5.c         |    2 +-
 .../powerpc/bfp/scalar-insert-exp-8.c         |    2 +-
 .../powerpc/bfp/scalar-test-neg-2.c           |    2 +-
 .../powerpc/bfp/scalar-test-neg-3.c           |    2 +-
 .../powerpc/bfp/scalar-test-neg-5.c           |    2 +-
 .../gcc.target/powerpc/byte-in-set-2.c        |    2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb-2.c     |    2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb32-2.c   |    2 +-
 .../gcc.target/powerpc/crypto-builtin-2.c     |   14 +-
 .../powerpc/fold-vec-splat-floatdouble.c      |    4 +-
 .../powerpc/fold-vec-splat-longlong.c         |   10 +-
 .../powerpc/fold-vec-splat-misc-invalid.c     |    8 +-
 .../gcc.target/powerpc/int_128bit-runnable.c  |    6 +-
 .../gcc.target/powerpc/p8vector-builtin-8.c   |    1 +
 gcc/testsuite/gcc.target/powerpc/pr80315-1.c  |    2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-2.c  |    2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-3.c  |    2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-4.c  |    2 +-
 gcc/testsuite/gcc.target/powerpc/pr88100.c    |   12 +-
 .../gcc.target/powerpc/pragma_misc9.c         |    2 +-
 .../gcc.target/powerpc/pragma_power8.c        |    2 +
 .../gcc.target/powerpc/pragma_power9.c        |    3 +
 .../powerpc/test_fpscr_drn_builtin_error.c    |    4 +-
 .../powerpc/test_fpscr_rn_builtin_error.c     |   12 +-
 gcc/testsuite/gcc.target/powerpc/test_mffsl.c |    3 +-
 gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c  |    2 +-
 .../gcc.target/powerpc/vsu/vec-all-nez-7.c    |    2 +-
 .../gcc.target/powerpc/vsu/vec-any-eqz-7.c    |    2 +-
 .../gcc.target/powerpc/vsu/vec-cmpnez-7.c     |    2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c |    2 +-
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c |    2 +-
 .../gcc.target/powerpc/vsu/vec-xl-len-13.c    |    2 +-
 .../gcc.target/powerpc/vsu/vec-xst-len-12.c   |    2 +-
 47 files changed, 14638 insertions(+), 842 deletions(-)

-- 
2.27.0



* [PATCH 01/34] rs6000: Incorporate new builtins code into the build machinery
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-04 22:29   ` Segher Boessenkool
  2021-07-29 13:30 ` [PATCH 02/34] rs6000: Add gengtype handling to " Bill Schmidt
                   ` (32 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

Differences from the previous version:
 - Removed the change to add rs6000-c.o to extra_objs (unnecessary)
 - Avoided a race condition among the generated files, and documented
   how this works

2021-07-27  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config.gcc (powerpc*-*-*): Add rs6000-builtins.o to extra_objs.
	* config/rs6000/rs6000-gen-builtins.c (main): Close init_file
	last.
	* config/rs6000/t-rs6000 (rs6000-gen-builtins.o): New target.
	(rbtree.o): Likewise.
	(rs6000-gen-builtins): Likewise.
	(rs6000-builtins.c): Likewise.
	(rs6000-builtins.h): Likewise.
	(rs6000.o): Add dependency.
	(EXTRA_HEADERS): Add rs6000-vecdefines.h.
	(rs6000-vecdefines.h): New target.
	(rs6000-builtins.o): Likewise.
	(rs6000-call.o): Add rs6000-builtins.h as a dependency.
	(rs6000-c.o): Likewise.
---
 gcc/config.gcc                          |  1 +
 gcc/config/rs6000/rs6000-gen-builtins.c |  4 ++-
 gcc/config/rs6000/t-rs6000              | 46 ++++++++++++++++++++++---
 3 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 93e2b3219b9..fe2205b4bc2 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -476,6 +476,7 @@ powerpc*-*-*)
 	cpu_type=rs6000
 	extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
 	extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
+	extra_objs="${extra_objs} rs6000-builtins.o"
 	extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
 	extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h"
 	extra_headers="${extra_headers} xmmintrin.h mm_malloc.h emmintrin.h"
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index e5d3b71b622..c401a44e104 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -2979,9 +2979,11 @@ main (int argc, const char **argv)
       exit (1);
     }
 
+  /* Always close init_file last.  This avoids race conditions in the
+     build machinery.  See comments in t-rs6000.  */
   fclose (header_file);
-  fclose (init_file);
   fclose (defines_file);
+  fclose (init_file);
 
   return 0;
 }
diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index 44f7ffb35fe..e0e8ab8d828 100644
--- a/gcc/config/rs6000/t-rs6000
+++ b/gcc/config/rs6000/t-rs6000
@@ -27,10 +27,6 @@ rs6000-pcrel-opt.o: $(srcdir)/config/rs6000/rs6000-pcrel-opt.c
 	$(COMPILE) $<
 	$(POSTCOMPILE)
 
-rs6000-c.o: $(srcdir)/config/rs6000/rs6000-c.c
-	$(COMPILE) $<
-	$(POSTCOMPILE)
-
 rs6000-string.o: $(srcdir)/config/rs6000/rs6000-string.c
 	$(COMPILE) $<
 	$(POSTCOMPILE)
@@ -47,7 +43,47 @@ rs6000-logue.o: $(srcdir)/config/rs6000/rs6000-logue.c
 	$(COMPILE) $<
 	$(POSTCOMPILE)
 
-rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c
+rs6000-gen-builtins.o: $(srcdir)/config/rs6000/rs6000-gen-builtins.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
+rbtree.o: $(srcdir)/config/rs6000/rbtree.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
+rs6000-gen-builtins: rs6000-gen-builtins.o rbtree.o
+	$(LINKER_FOR_BUILD) $(BUILD_LINKERFLAGS) $(BUILD_LDFLAGS) -o $@ \
+	    $(filter-out $(BUILD_LIBDEPS), $^) $(BUILD_LIBS)
+
+# TODO: Whenever GNU make 4.3 is the minimum required, we should use
+# grouped targets on this:
+#    rs6000-builtins.c rs6000-builtins.h rs6000-vecdefines.h &: <deps>
+#       <recipe>
+# For now, the header files depend on rs6000-builtins.c, which avoids
+# races because the .c file is closed last in rs6000-gen-builtins.c.
+rs6000-builtins.c: rs6000-gen-builtins \
+		   $(srcdir)/config/rs6000/rs6000-builtin-new.def \
+		   $(srcdir)/config/rs6000/rs6000-overload.def
+	./rs6000-gen-builtins $(srcdir)/config/rs6000/rs6000-builtin-new.def \
+		$(srcdir)/config/rs6000/rs6000-overload.def rs6000-builtins.h \
+		rs6000-builtins.c rs6000-vecdefines.h
+
+rs6000-builtins.h: rs6000-builtins.c
+
+rs6000.o: rs6000-builtins.h
+
+EXTRA_HEADERS += rs6000-vecdefines.h
+rs6000-vecdefines.h: rs6000-builtins.c
+
+rs6000-builtins.o: rs6000-builtins.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
+rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call.c rs6000-builtins.h
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
+rs6000-c.o: $(srcdir)/config/rs6000/rs6000-c.c rs6000-builtins.h
 	$(COMPILE) $<
 	$(POSTCOMPILE)
 
-- 
2.27.0
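
A note on the dependency trick above: rs6000-gen-builtins writes three
output files in one invocation, and GNU make older than 4.3 cannot
express "one recipe, several outputs" directly.  The workaround is to
give only rs6000-builtins.c a real recipe and let the two headers
depend on it; since the generator closes the .c file last, the headers
are complete by the time the .c file appears.  A minimal standalone
sketch of the same pattern (the tool and file names here are invented
for illustration, not the real t-rs6000 rules):

# One recipe produces three files; only the .c is a real target.
# The generator must close gen-out.c last.
gen-out.c: generator input.def
	./generator input.def gen-out.h gen-out.c gen-vecdefines.h

# No recipes here: the headers are considered up to date once the
# .c file is, which serializes all consumers behind the generator.
gen-out.h: gen-out.c
gen-vecdefines.h: gen-out.c

# Anything that includes a generated header names it as a dependency,
# exactly as rs6000-call.o and rs6000-c.o do above.
consumer.o: consumer.c gen-out.h
	$(CC) -c consumer.c -o $@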



* [PATCH 02/34] rs6000: Add gengtype handling to the build machinery
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
  2021-07-29 13:30 ` [PATCH 01/34] rs6000: Incorporate new builtins code into the build machinery Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-04 22:52   ` Segher Boessenkool
  2021-07-29 13:30 ` [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file Bill Schmidt
                   ` (31 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-07  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config.gcc (target_gtfiles): Add ./rs6000-builtins.h.
	* config/rs6000/t-rs6000 (EXTRA_GTYPE_DEPS): Set.
---
 gcc/config.gcc             | 1 +
 gcc/config/rs6000/t-rs6000 | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index fe2205b4bc2..a880823e562 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -492,6 +492,7 @@ powerpc*-*-*)
 	extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
 	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.c \$(srcdir)/config/rs6000/rs6000-call.c"
 	target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel-opt.c"
+	target_gtfiles="$target_gtfiles ./rs6000-builtins.h"
 	;;
 pru-*-*)
 	cpu_type=pru
diff --git a/gcc/config/rs6000/t-rs6000 b/gcc/config/rs6000/t-rs6000
index e0e8ab8d828..92766d8ea25 100644
--- a/gcc/config/rs6000/t-rs6000
+++ b/gcc/config/rs6000/t-rs6000
@@ -22,6 +22,7 @@ TM_H += $(srcdir)/config/rs6000/rs6000-builtin.def
 TM_H += $(srcdir)/config/rs6000/rs6000-cpus.def
 TM_H += $(srcdir)/config/rs6000/rs6000-modes.h
 PASSES_EXTRA += $(srcdir)/config/rs6000/rs6000-passes.def
+EXTRA_GTYPE_DEPS += $(srcdir)/config/rs6000/rs6000-builtin-new.def
 
 rs6000-pcrel-opt.o: $(srcdir)/config/rs6000/rs6000-pcrel-opt.c
 	$(COMPILE) $<
-- 
2.27.0
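
For background: gengtype scans every file listed in target_gtfiles for
GTY(()) markers and emits garbage-collection root tables for them, so
adding the generated ./rs6000-builtins.h to that list lets its
declarations participate in GC.  The kind of declaration this enables
looks like the existing ones in rs6000.h (a representative sketch, not
the generated file's contents):

/* A GTY(()) marker on a global tree tells gengtype to emit a GC root
   for it, so the objects it points to survive collection.  */
extern GTY(()) tree rs6000_builtin_types[RS6000_BTI_MAX];
extern GTY(()) tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];

EXTRA_GTYPE_DEPS makes the gengtype step itself depend on
rs6000-builtin-new.def, so the root tables are refreshed whenever the
builtin descriptions change.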



* [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
  2021-07-29 13:30 ` [PATCH 01/34] rs6000: Incorporate new builtins code into the build machinery Bill Schmidt
  2021-07-29 13:30 ` [PATCH 02/34] rs6000: Add gengtype handling to " Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-07  0:01   ` Segher Boessenkool
  2021-07-29 13:30 ` [PATCH 04/34] rs6000: Add VSX builtins Bill Schmidt
                   ` (30 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-10  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def: Finish altivec stanza.
	* config/rs6000/rs6000-call.c (rs6000_init_builtins): Move
	initialization of pcvoid_type_node here...
	(altivec_init_builtins): ...from here.
	* config/rs6000/rs6000.h (rs6000_builtin_type_index): Add
	RS6000_BTI_const_ptr_void.
	(pcvoid_type_node): New macro.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 831 +++++++++++++++++++++++
 gcc/config/rs6000/rs6000-call.c          |   7 +-
 gcc/config/rs6000/rs6000.h               |   2 +
 3 files changed, 836 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index a84a3def2d5..f1aa5529cdd 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -197,3 +197,834 @@
 
   const vss __builtin_altivec_abs_v8hi (vss);
     ABS_V8HI absv8hi2 {}
+
+  const vsc __builtin_altivec_abss_v16qi (vsc);
+    ABSS_V16QI altivec_abss_v16qi {}
+
+  const vsi __builtin_altivec_abss_v4si (vsi);
+    ABSS_V4SI altivec_abss_v4si {}
+
+  const vss __builtin_altivec_abss_v8hi (vss);
+    ABSS_V8HI altivec_abss_v8hi {}
+
+  const vf __builtin_altivec_copysignfp (vf, vf);
+    COPYSIGN_V4SF vector_copysignv4sf3 {}
+
+  void __builtin_altivec_dss (const int<2>);
+    DSS altivec_dss {}
+
+  void __builtin_altivec_dssall ();
+    DSSALL altivec_dssall {}
+
+  void __builtin_altivec_dst (void *, const int, const int<2>);
+    DST altivec_dst {}
+
+  void __builtin_altivec_dstst (void *, const int, const int<2>);
+    DSTST altivec_dstst {}
+
+  void __builtin_altivec_dststt (void *, const int, const int<2>);
+    DSTSTT altivec_dststt {}
+
+  void __builtin_altivec_dstt (void *, const int, const int<2>);
+    DSTT altivec_dstt {}
+
+  fpmath vsi __builtin_altivec_fix_sfsi (vf);
+    FIX_V4SF_V4SI fix_truncv4sfv4si2 {}
+
+  fpmath vui __builtin_altivec_fixuns_sfsi (vf);
+    FIXUNS_V4SF_V4SI fixuns_truncv4sfv4si2 {}
+
+  fpmath vf __builtin_altivec_float_sisf (vsi);
+    FLOAT_V4SI_V4SF floatv4siv4sf2 {}
+
+  pure vsc __builtin_altivec_lvebx (signed long, const void *);
+    LVEBX altivec_lvebx {ldvec}
+
+  pure vss __builtin_altivec_lvehx (signed long, const void *);
+    LVEHX altivec_lvehx {ldvec}
+
+  pure vsi __builtin_altivec_lvewx (signed long, const void *);
+    LVEWX altivec_lvewx {ldvec}
+
+  pure vuc __builtin_altivec_lvsl (signed long, const void *);
+    LVSL altivec_lvsl {ldvec}
+
+  pure vuc __builtin_altivec_lvsr (signed long, const void *);
+    LVSR altivec_lvsr {ldvec}
+
+  pure vsi __builtin_altivec_lvx (signed long, const void *);
+    LVX altivec_lvx_v4si {ldvec}
+
+  pure vsq __builtin_altivec_lvx_v1ti (signed long, const void *);
+    LVX_V1TI altivec_lvx_v1ti {ldvec}
+
+  pure vsc __builtin_altivec_lvx_v16qi (signed long, const void *);
+    LVX_V16QI altivec_lvx_v16qi {ldvec}
+
+  pure vf __builtin_altivec_lvx_v4sf (signed long, const void *);
+    LVX_V4SF altivec_lvx_v4sf {ldvec}
+
+  pure vsi __builtin_altivec_lvx_v4si (signed long, const void *);
+    LVX_V4SI altivec_lvx_v4si {ldvec}
+
+  pure vss __builtin_altivec_lvx_v8hi (signed long, const void *);
+    LVX_V8HI altivec_lvx_v8hi {ldvec}
+
+  pure vsi __builtin_altivec_lvxl (signed long, const void *);
+    LVXL altivec_lvxl_v4si {ldvec}
+
+  pure vsc __builtin_altivec_lvxl_v16qi (signed long, const void *);
+    LVXL_V16QI altivec_lvxl_v16qi {ldvec}
+
+  pure vf __builtin_altivec_lvxl_v4sf (signed long, const void *);
+    LVXL_V4SF altivec_lvxl_v4sf {ldvec}
+
+  pure vsi __builtin_altivec_lvxl_v4si (signed long, const void *);
+    LVXL_V4SI altivec_lvxl_v4si {ldvec}
+
+  pure vss __builtin_altivec_lvxl_v8hi (signed long, const void *);
+    LVXL_V8HI altivec_lvxl_v8hi {ldvec}
+
+  const vsc __builtin_altivec_mask_for_load (const void *);
+    MASK_FOR_LOAD altivec_lvsr_direct {ldstmask}
+
+  vss __builtin_altivec_mfvscr ();
+    MFVSCR altivec_mfvscr {}
+
+  void __builtin_altivec_mtvscr (vsi);
+    MTVSCR altivec_mtvscr {}
+
+  const vsll __builtin_altivec_vmulesw (vsi, vsi);
+    VMULESW vec_widen_smult_even_v4si {}
+
+  const vull __builtin_altivec_vmuleuw (vui, vui);
+    VMULEUW vec_widen_umult_even_v4si {}
+
+  const vsll __builtin_altivec_vmulosw (vsi, vsi);
+    VMULOSW vec_widen_smult_odd_v4si {}
+
+  const vull __builtin_altivec_vmulouw (vui, vui);
+    VMULOUW vec_widen_umult_odd_v4si {}
+
+  const vsc __builtin_altivec_nabs_v16qi (vsc);
+    NABS_V16QI nabsv16qi2 {}
+
+  const vf __builtin_altivec_nabs_v4sf (vf);
+    NABS_V4SF vsx_nabsv4sf2 {}
+
+  const vsi __builtin_altivec_nabs_v4si (vsi);
+    NABS_V4SI nabsv4si2 {}
+
+  const vss __builtin_altivec_nabs_v8hi (vss);
+    NABS_V8HI nabsv8hi2 {}
+
+  void __builtin_altivec_stvebx (vsc, signed long, void *);
+    STVEBX altivec_stvebx {stvec}
+
+  void __builtin_altivec_stvehx (vss, signed long, void *);
+    STVEHX altivec_stvehx {stvec}
+
+  void __builtin_altivec_stvewx (vsi, signed long, void *);
+    STVEWX altivec_stvewx {stvec}
+
+  void __builtin_altivec_stvx (vsi, signed long, void *);
+    STVX altivec_stvx_v4si {stvec}
+
+  void __builtin_altivec_stvx_v16qi (vsc, signed long, void *);
+    STVX_V16QI altivec_stvx_v16qi {stvec}
+
+  void __builtin_altivec_stvx_v4sf (vf, signed long, void *);
+    STVX_V4SF altivec_stvx_v4sf {stvec}
+
+  void __builtin_altivec_stvx_v4si (vsi, signed long, void *);
+    STVX_V4SI altivec_stvx_v4si {stvec}
+
+  void __builtin_altivec_stvx_v8hi (vss, signed long, void *);
+    STVX_V8HI altivec_stvx_v8hi {stvec}
+
+  void __builtin_altivec_stvxl (vsi, signed long, void *);
+    STVXL altivec_stvxl_v4si {stvec}
+
+  void __builtin_altivec_stvxl_v16qi (vsc, signed long, void *);
+    STVXL_V16QI altivec_stvxl_v16qi {stvec}
+
+  void __builtin_altivec_stvxl_v4sf (vf, signed long, void *);
+    STVXL_V4SF altivec_stvxl_v4sf {stvec}
+
+  void __builtin_altivec_stvxl_v4si (vsi, signed long, void *);
+    STVXL_V4SI altivec_stvxl_v4si {stvec}
+
+  void __builtin_altivec_stvxl_v8hi (vss, signed long, void *);
+    STVXL_V8HI altivec_stvxl_v8hi {stvec}
+
+  fpmath vf __builtin_altivec_uns_float_sisf (vui);
+    UNSFLOAT_V4SI_V4SF floatunsv4siv4sf2 {}
+
+  const vui __builtin_altivec_vaddcuw (vui, vui);
+    VADDCUW altivec_vaddcuw {}
+
+  const vf __builtin_altivec_vaddfp (vf, vf);
+    VADDFP addv4sf3 {}
+
+  const vsc __builtin_altivec_vaddsbs (vsc, vsc);
+    VADDSBS altivec_vaddsbs {}
+
+  const vss __builtin_altivec_vaddshs (vss, vss);
+    VADDSHS altivec_vaddshs {}
+
+  const vsi __builtin_altivec_vaddsws (vsi, vsi);
+    VADDSWS altivec_vaddsws {}
+
+  const vuc __builtin_altivec_vaddubm (vuc, vuc);
+    VADDUBM addv16qi3 {}
+
+  const vuc __builtin_altivec_vaddubs (vuc, vuc);
+    VADDUBS altivec_vaddubs {}
+
+  const vus __builtin_altivec_vadduhm (vus, vus);
+    VADDUHM addv8hi3 {}
+
+  const vus __builtin_altivec_vadduhs (vus, vus);
+    VADDUHS altivec_vadduhs {}
+
+  const vsi __builtin_altivec_vadduwm (vsi, vsi);
+    VADDUWM addv4si3 {}
+
+  const vui __builtin_altivec_vadduws (vui, vui);
+    VADDUWS altivec_vadduws {}
+
+  const vsc __builtin_altivec_vand_v16qi (vsc, vsc);
+    VAND_V16QI andv16qi3 {}
+
+  const vuc __builtin_altivec_vand_v16qi_uns (vuc, vuc);
+    VAND_V16QI_UNS andv16qi3 {}
+
+  const vf __builtin_altivec_vand_v4sf (vf, vf);
+    VAND_V4SF andv4sf3 {}
+
+  const vsi __builtin_altivec_vand_v4si (vsi, vsi);
+    VAND_V4SI andv4si3 {}
+
+  const vui __builtin_altivec_vand_v4si_uns (vui, vui);
+    VAND_V4SI_UNS andv4si3 {}
+
+  const vss __builtin_altivec_vand_v8hi (vss, vss);
+    VAND_V8HI andv8hi3 {}
+
+  const vus __builtin_altivec_vand_v8hi_uns (vus, vus);
+    VAND_V8HI_UNS andv8hi3 {}
+
+  const vsc __builtin_altivec_vandc_v16qi (vsc, vsc);
+    VANDC_V16QI andcv16qi3 {}
+
+  const vuc __builtin_altivec_vandc_v16qi_uns (vuc, vuc);
+    VANDC_V16QI_UNS andcv16qi3 {}
+
+  const vf __builtin_altivec_vandc_v4sf (vf, vf);
+    VANDC_V4SF andcv4sf3 {}
+
+  const vsi __builtin_altivec_vandc_v4si (vsi, vsi);
+    VANDC_V4SI andcv4si3 {}
+
+  const vui __builtin_altivec_vandc_v4si_uns (vui, vui);
+    VANDC_V4SI_UNS andcv4si3 {}
+
+  const vss __builtin_altivec_vandc_v8hi (vss, vss);
+    VANDC_V8HI andcv8hi3 {}
+
+  const vus __builtin_altivec_vandc_v8hi_uns (vus, vus);
+    VANDC_V8HI_UNS andcv8hi3 {}
+
+  const vsc __builtin_altivec_vavgsb (vsc, vsc);
+    VAVGSB avgv16qi3_ceil {}
+
+  const vss __builtin_altivec_vavgsh (vss, vss);
+    VAVGSH avgv8hi3_ceil {}
+
+  const vsi __builtin_altivec_vavgsw (vsi, vsi);
+    VAVGSW avgv4si3_ceil {}
+
+  const vuc __builtin_altivec_vavgub (vuc, vuc);
+    VAVGUB uavgv16qi3_ceil {}
+
+  const vus __builtin_altivec_vavguh (vus, vus);
+    VAVGUH uavgv8hi3_ceil {}
+
+  const vui __builtin_altivec_vavguw (vui, vui);
+    VAVGUW uavgv4si3_ceil {}
+
+  const vf __builtin_altivec_vcfsx (vsi, const int<5>);
+    VCFSX altivec_vcfsx {}
+
+  const vf __builtin_altivec_vcfux (vui, const int<5>);
+    VCFUX altivec_vcfux {}
+
+  const vsi __builtin_altivec_vcmpbfp (vf, vf);
+    VCMPBFP altivec_vcmpbfp {}
+
+  const int __builtin_altivec_vcmpbfp_p (int, vf, vf);
+    VCMPBFP_P altivec_vcmpbfp_p {pred}
+
+  const vf __builtin_altivec_vcmpeqfp (vf, vf);
+    VCMPEQFP vector_eqv4sf {}
+
+  const int __builtin_altivec_vcmpeqfp_p (int, vf, vf);
+    VCMPEQFP_P vector_eq_v4sf_p {pred}
+
+  const vsc __builtin_altivec_vcmpequb (vuc, vuc);
+    VCMPEQUB vector_eqv16qi {}
+
+  const int __builtin_altivec_vcmpequb_p (int, vsc, vsc);
+    VCMPEQUB_P vector_eq_v16qi_p {pred}
+
+  const vss __builtin_altivec_vcmpequh (vus, vus);
+    VCMPEQUH vector_eqv8hi {}
+
+  const int __builtin_altivec_vcmpequh_p (int, vss, vss);
+    VCMPEQUH_P vector_eq_v8hi_p {pred}
+
+  const vsi __builtin_altivec_vcmpequw (vui, vui);
+    VCMPEQUW vector_eqv4si {}
+
+  const int __builtin_altivec_vcmpequw_p (int, vsi, vsi);
+    VCMPEQUW_P vector_eq_v4si_p {pred}
+
+  const vf __builtin_altivec_vcmpgefp (vf, vf);
+    VCMPGEFP vector_gev4sf {}
+
+  const int __builtin_altivec_vcmpgefp_p (int, vf, vf);
+    VCMPGEFP_P vector_ge_v4sf_p {pred}
+
+  const vf __builtin_altivec_vcmpgtfp (vf, vf);
+    VCMPGTFP vector_gtv4sf {}
+
+  const int __builtin_altivec_vcmpgtfp_p (int, vf, vf);
+    VCMPGTFP_P vector_gt_v4sf_p {pred}
+
+  const vsc __builtin_altivec_vcmpgtsb (vsc, vsc);
+    VCMPGTSB vector_gtv16qi {}
+
+  const int __builtin_altivec_vcmpgtsb_p (int, vsc, vsc);
+    VCMPGTSB_P vector_gt_v16qi_p {pred}
+
+  const vss __builtin_altivec_vcmpgtsh (vss, vss);
+    VCMPGTSH vector_gtv8hi {}
+
+  const int __builtin_altivec_vcmpgtsh_p (int, vss, vss);
+    VCMPGTSH_P vector_gt_v8hi_p {pred}
+
+  const vsi __builtin_altivec_vcmpgtsw (vsi, vsi);
+    VCMPGTSW vector_gtv4si {}
+
+  const int __builtin_altivec_vcmpgtsw_p (int, vsi, vsi);
+    VCMPGTSW_P vector_gt_v4si_p {pred}
+
+  const vsc __builtin_altivec_vcmpgtub (vuc, vuc);
+    VCMPGTUB vector_gtuv16qi {}
+
+  const int __builtin_altivec_vcmpgtub_p (int, vsc, vsc);
+    VCMPGTUB_P vector_gtu_v16qi_p {pred}
+
+  const vss __builtin_altivec_vcmpgtuh (vus, vus);
+    VCMPGTUH vector_gtuv8hi {}
+
+  const int __builtin_altivec_vcmpgtuh_p (int, vss, vss);
+    VCMPGTUH_P vector_gtu_v8hi_p {pred}
+
+  const vsi __builtin_altivec_vcmpgtuw (vui, vui);
+    VCMPGTUW vector_gtuv4si {}
+
+  const int __builtin_altivec_vcmpgtuw_p (int, vsi, vsi);
+    VCMPGTUW_P vector_gtu_v4si_p {pred}
+
+  const vsi __builtin_altivec_vctsxs (vf, const int<5>);
+    VCTSXS altivec_vctsxs {}
+
+  const vui __builtin_altivec_vctuxs (vf, const int<5>);
+    VCTUXS altivec_vctuxs {}
+
+  fpmath vf __builtin_altivec_vexptefp (vf);
+    VEXPTEFP altivec_vexptefp {}
+
+  fpmath vf __builtin_altivec_vlogefp (vf);
+    VLOGEFP altivec_vlogefp {}
+
+  fpmath vf __builtin_altivec_vmaddfp (vf, vf, vf);
+    VMADDFP fmav4sf4 {}
+
+  const vf __builtin_altivec_vmaxfp (vf, vf);
+    VMAXFP smaxv4sf3 {}
+
+  const vsc __builtin_altivec_vmaxsb (vsc, vsc);
+    VMAXSB smaxv16qi3 {}
+
+  const vuc __builtin_altivec_vmaxub (vuc, vuc);
+    VMAXUB umaxv16qi3 {}
+
+  const vss __builtin_altivec_vmaxsh (vss, vss);
+    VMAXSH smaxv8hi3 {}
+
+  const vsi __builtin_altivec_vmaxsw (vsi, vsi);
+    VMAXSW smaxv4si3 {}
+
+  const vus __builtin_altivec_vmaxuh (vus, vus);
+    VMAXUH umaxv8hi3 {}
+
+  const vui __builtin_altivec_vmaxuw (vui, vui);
+    VMAXUW umaxv4si3 {}
+
+  vss __builtin_altivec_vmhaddshs (vss, vss, vss);
+    VMHADDSHS altivec_vmhaddshs {}
+
+  vss __builtin_altivec_vmhraddshs (vss, vss, vss);
+    VMHRADDSHS altivec_vmhraddshs {}
+
+  const vf __builtin_altivec_vminfp (vf, vf);
+    VMINFP sminv4sf3 {}
+
+  const vsc __builtin_altivec_vminsb (vsc, vsc);
+    VMINSB sminv16qi3 {}
+
+  const vss __builtin_altivec_vminsh (vss, vss);
+    VMINSH sminv8hi3 {}
+
+  const vsi __builtin_altivec_vminsw (vsi, vsi);
+    VMINSW sminv4si3 {}
+
+  const vuc __builtin_altivec_vminub (vuc, vuc);
+    VMINUB uminv16qi3 {}
+
+  const vus __builtin_altivec_vminuh (vus, vus);
+    VMINUH uminv8hi3 {}
+
+  const vui __builtin_altivec_vminuw (vui, vui);
+    VMINUW uminv4si3 {}
+
+  const vss __builtin_altivec_vmladduhm (vss, vss, vss);
+    VMLADDUHM fmav8hi4 {}
+
+  const vsc __builtin_altivec_vmrghb (vsc, vsc);
+    VMRGHB altivec_vmrghb {}
+
+  const vss __builtin_altivec_vmrghh (vss, vss);
+    VMRGHH altivec_vmrghh {}
+
+  const vsi __builtin_altivec_vmrghw (vsi, vsi);
+    VMRGHW altivec_vmrghw {}
+
+  const vsc __builtin_altivec_vmrglb (vsc, vsc);
+    VMRGLB altivec_vmrglb {}
+
+  const vss __builtin_altivec_vmrglh (vss, vss);
+    VMRGLH altivec_vmrglh {}
+
+  const vsi __builtin_altivec_vmrglw (vsi, vsi);
+    VMRGLW altivec_vmrglw {}
+
+  const vsi __builtin_altivec_vmsummbm (vsc, vuc, vsi);
+    VMSUMMBM altivec_vmsummbm {}
+
+  const vsi __builtin_altivec_vmsumshm (vss, vss, vsi);
+    VMSUMSHM altivec_vmsumshm {}
+
+  vsi __builtin_altivec_vmsumshs (vss, vss, vsi);
+    VMSUMSHS altivec_vmsumshs {}
+
+  const vui __builtin_altivec_vmsumubm (vuc, vuc, vui);
+    VMSUMUBM altivec_vmsumubm {}
+
+  const vui __builtin_altivec_vmsumuhm (vus, vus, vui);
+    VMSUMUHM altivec_vmsumuhm {}
+
+  vui __builtin_altivec_vmsumuhs (vus, vus, vui);
+    VMSUMUHS altivec_vmsumuhs {}
+
+  const vss __builtin_altivec_vmulesb (vsc, vsc);
+    VMULESB vec_widen_smult_even_v16qi {}
+
+  const vsi __builtin_altivec_vmulesh (vss, vss);
+    VMULESH vec_widen_smult_even_v8hi {}
+
+  const vus __builtin_altivec_vmuleub (vuc, vuc);
+    VMULEUB vec_widen_umult_even_v16qi {}
+
+  const vui __builtin_altivec_vmuleuh (vus, vus);
+    VMULEUH vec_widen_umult_even_v8hi {}
+
+  const vss __builtin_altivec_vmulosb (vsc, vsc);
+    VMULOSB vec_widen_smult_odd_v16qi {}
+
+  const vus __builtin_altivec_vmuloub (vuc, vuc);
+    VMULOUB vec_widen_umult_odd_v16qi {}
+
+  const vsi __builtin_altivec_vmulosh (vss, vss);
+    VMULOSH vec_widen_smult_odd_v8hi {}
+
+  const vui __builtin_altivec_vmulouh (vus, vus);
+    VMULOUH vec_widen_umult_odd_v8hi {}
+
+  fpmath vf __builtin_altivec_vnmsubfp (vf, vf, vf);
+    VNMSUBFP nfmsv4sf4 {}
+
+  const vsc __builtin_altivec_vnor_v16qi (vsc, vsc);
+    VNOR_V16QI norv16qi3 {}
+
+  const vuc __builtin_altivec_vnor_v16qi_uns (vuc, vuc);
+    VNOR_V16QI_UNS norv16qi3 {}
+
+  const vf __builtin_altivec_vnor_v4sf (vf, vf);
+    VNOR_V4SF norv4sf3 {}
+
+  const vsi __builtin_altivec_vnor_v4si (vsi, vsi);
+    VNOR_V4SI norv4si3 {}
+
+  const vui __builtin_altivec_vnor_v4si_uns (vui, vui);
+    VNOR_V4SI_UNS norv4si3 {}
+
+  const vss __builtin_altivec_vnor_v8hi (vss, vss);
+    VNOR_V8HI norv8hi3 {}
+
+  const vus __builtin_altivec_vnor_v8hi_uns (vus, vus);
+    VNOR_V8HI_UNS norv8hi3 {}
+
+  const vsc __builtin_altivec_vor_v16qi (vsc, vsc);
+    VOR_V16QI iorv16qi3 {}
+
+  const vuc __builtin_altivec_vor_v16qi_uns (vuc, vuc);
+    VOR_V16QI_UNS iorv16qi3 {}
+
+  const vf __builtin_altivec_vor_v4sf (vf, vf);
+    VOR_V4SF iorv4sf3 {}
+
+  const vsi __builtin_altivec_vor_v4si (vsi, vsi);
+    VOR_V4SI iorv4si3 {}
+
+  const vui __builtin_altivec_vor_v4si_uns (vui, vui);
+    VOR_V4SI_UNS iorv4si3 {}
+
+  const vss __builtin_altivec_vor_v8hi (vss, vss);
+    VOR_V8HI iorv8hi3 {}
+
+  const vus __builtin_altivec_vor_v8hi_uns (vus, vus);
+    VOR_V8HI_UNS iorv8hi3 {}
+
+  const vsc __builtin_altivec_vperm_16qi (vsc, vsc, vuc);
+    VPERM_16QI altivec_vperm_v16qi {}
+
+  const vuc __builtin_altivec_vperm_16qi_uns (vuc, vuc, vuc);
+    VPERM_16QI_UNS altivec_vperm_v16qi_uns {}
+
+  const vsq __builtin_altivec_vperm_1ti (vsq, vsq, vuc);
+    VPERM_1TI altivec_vperm_v1ti {}
+
+  const vuq __builtin_altivec_vperm_1ti_uns (vuq, vuq, vuc);
+    VPERM_1TI_UNS altivec_vperm_v1ti_uns {}
+
+  const vf __builtin_altivec_vperm_4sf (vf, vf, vuc);
+    VPERM_4SF altivec_vperm_v4sf {}
+
+  const vsi __builtin_altivec_vperm_4si (vsi, vsi, vuc);
+    VPERM_4SI altivec_vperm_v4si {}
+
+  const vui __builtin_altivec_vperm_4si_uns (vui, vui, vuc);
+    VPERM_4SI_UNS altivec_vperm_v4si_uns {}
+
+  const vss __builtin_altivec_vperm_8hi (vss, vss, vuc);
+    VPERM_8HI altivec_vperm_v8hi {}
+
+  const vus __builtin_altivec_vperm_8hi_uns (vus, vus, vuc);
+    VPERM_8HI_UNS altivec_vperm_v8hi_uns {}
+
+  const vp __builtin_altivec_vpkpx (vui, vui);
+    VPKPX altivec_vpkpx {}
+
+  const vsc __builtin_altivec_vpkshss (vss, vss);
+    VPKSHSS altivec_vpkshss {}
+
+  const vuc __builtin_altivec_vpkshus (vss, vss);
+    VPKSHUS altivec_vpkshus {}
+
+  const vss __builtin_altivec_vpkswss (vsi, vsi);
+    VPKSWSS altivec_vpkswss {}
+
+  const vus __builtin_altivec_vpkswus (vsi, vsi);
+    VPKSWUS altivec_vpkswus {}
+
+  const vsc __builtin_altivec_vpkuhum (vss, vss);
+    VPKUHUM altivec_vpkuhum {}
+
+  const vuc __builtin_altivec_vpkuhus (vus, vus);
+    VPKUHUS altivec_vpkuhus {}
+
+  const vss __builtin_altivec_vpkuwum (vsi, vsi);
+    VPKUWUM altivec_vpkuwum {}
+
+  const vus __builtin_altivec_vpkuwus (vui, vui);
+    VPKUWUS altivec_vpkuwus {}
+
+  const vf __builtin_altivec_vrecipdivfp (vf, vf);
+    VRECIPFP recipv4sf3 {}
+
+  fpmath vf __builtin_altivec_vrefp (vf);
+    VREFP rev4sf2 {}
+
+  const vsc __builtin_altivec_vreve_v16qi (vsc);
+    VREVE_V16QI altivec_vrevev16qi2 {}
+
+  const vf __builtin_altivec_vreve_v4sf (vf);
+    VREVE_V4SF altivec_vrevev4sf2 {}
+
+  const vsi __builtin_altivec_vreve_v4si (vsi);
+    VREVE_V4SI altivec_vrevev4si2 {}
+
+  const vss __builtin_altivec_vreve_v8hi (vss);
+    VREVE_V8HI altivec_vrevev8hi2 {}
+
+  fpmath vf __builtin_altivec_vrfim (vf);
+    VRFIM vector_floorv4sf2 {}
+
+  fpmath vf __builtin_altivec_vrfin (vf);
+    VRFIN altivec_vrfin {}
+
+  fpmath vf __builtin_altivec_vrfip (vf);
+    VRFIP vector_ceilv4sf2 {}
+
+  fpmath vf __builtin_altivec_vrfiz (vf);
+    VRFIZ vector_btruncv4sf2 {}
+
+  const vsc __builtin_altivec_vrlb (vsc, vsc);
+    VRLB vrotlv16qi3 {}
+
+  const vss __builtin_altivec_vrlh (vss, vss);
+    VRLH vrotlv8hi3 {}
+
+  const vsi __builtin_altivec_vrlw (vsi, vsi);
+    VRLW vrotlv4si3 {}
+
+  fpmath vf __builtin_altivec_vrsqrtefp (vf);
+    VRSQRTEFP rsqrtev4sf2 {}
+
+  fpmath vf __builtin_altivec_vrsqrtfp (vf);
+    VRSQRTFP rsqrtv4sf2 {}
+
+  const vsc __builtin_altivec_vsel_16qi (vsc, vsc, vuc);
+    VSEL_16QI vector_select_v16qi {}
+
+  const vuc __builtin_altivec_vsel_16qi_uns (vuc, vuc, vuc);
+    VSEL_16QI_UNS vector_select_v16qi_uns {}
+
+  const vsq __builtin_altivec_vsel_1ti (vsq, vsq, vuq);
+    VSEL_1TI vector_select_v1ti {}
+
+  const vuq __builtin_altivec_vsel_1ti_uns (vuq, vuq, vuq);
+    VSEL_1TI_UNS vector_select_v1ti_uns {}
+
+  const vf __builtin_altivec_vsel_4sf (vf, vf, vf);
+    VSEL_4SF vector_select_v4sf {}
+
+  const vsi __builtin_altivec_vsel_4si (vsi, vsi, vui);
+    VSEL_4SI vector_select_v4si {}
+
+  const vui __builtin_altivec_vsel_4si_uns (vui, vui, vui);
+    VSEL_4SI_UNS vector_select_v4si_uns {}
+
+  const vss __builtin_altivec_vsel_8hi (vss, vss, vus);
+    VSEL_8HI vector_select_v8hi {}
+
+  const vus __builtin_altivec_vsel_8hi_uns (vus, vus, vus);
+    VSEL_8HI_UNS vector_select_v8hi_uns {}
+
+  const vsi __builtin_altivec_vsl (vsi, vsi);
+    VSL altivec_vsl {}
+
+  const vsc __builtin_altivec_vslb (vsc, vuc);
+    VSLB vashlv16qi3 {}
+
+  const vsc __builtin_altivec_vsldoi_16qi (vsc, vsc, const int<4>);
+    VSLDOI_16QI altivec_vsldoi_v16qi {}
+
+  const vf __builtin_altivec_vsldoi_4sf (vf, vf, const int<4>);
+    VSLDOI_4SF altivec_vsldoi_v4sf {}
+
+  const vsi __builtin_altivec_vsldoi_4si (vsi, vsi, const int<4>);
+    VSLDOI_4SI altivec_vsldoi_v4si {}
+
+  const vss __builtin_altivec_vsldoi_8hi (vss, vss, const int<4>);
+    VSLDOI_8HI altivec_vsldoi_v8hi {}
+
+  const vss __builtin_altivec_vslh (vss, vus);
+    VSLH vashlv8hi3 {}
+
+  const vsi __builtin_altivec_vslo (vsi, vsi);
+    VSLO altivec_vslo {}
+
+  const vsi __builtin_altivec_vslw (vsi, vui);
+    VSLW vashlv4si3 {}
+
+  const vsc __builtin_altivec_vspltb (vsc, const int<4>);
+    VSPLTB altivec_vspltb {}
+
+  const vss __builtin_altivec_vsplth (vss, const int<3>);
+    VSPLTH altivec_vsplth {}
+
+  const vsc __builtin_altivec_vspltisb (const int<-16,15>);
+    VSPLTISB altivec_vspltisb {}
+
+  const vss __builtin_altivec_vspltish (const int<-16,15>);
+    VSPLTISH altivec_vspltish {}
+
+  const vsi __builtin_altivec_vspltisw (const int<-16,15>);
+    VSPLTISW altivec_vspltisw {}
+
+  const vsi __builtin_altivec_vspltw (vsi, const int<2>);
+    VSPLTW altivec_vspltw {}
+
+  const vsi __builtin_altivec_vsr (vsi, vsi);
+    VSR altivec_vsr {}
+
+  const vsc __builtin_altivec_vsrab (vsc, vuc);
+    VSRAB vashrv16qi3 {}
+
+  const vss __builtin_altivec_vsrah (vss, vus);
+    VSRAH vashrv8hi3 {}
+
+  const vsi __builtin_altivec_vsraw (vsi, vui);
+    VSRAW vashrv4si3 {}
+
+  const vsc __builtin_altivec_vsrb (vsc, vuc);
+    VSRB vlshrv16qi3 {}
+
+  const vss __builtin_altivec_vsrh (vss, vus);
+    VSRH vlshrv8hi3 {}
+
+  const vsi __builtin_altivec_vsro (vsi, vsi);
+    VSRO altivec_vsro {}
+
+  const vsi __builtin_altivec_vsrw (vsi, vui);
+    VSRW vlshrv4si3 {}
+
+  const vsi __builtin_altivec_vsubcuw (vsi, vsi);
+    VSUBCUW altivec_vsubcuw {}
+
+  const vf __builtin_altivec_vsubfp (vf, vf);
+    VSUBFP subv4sf3 {}
+
+  const vsc __builtin_altivec_vsubsbs (vsc, vsc);
+    VSUBSBS altivec_vsubsbs {}
+
+  const vss __builtin_altivec_vsubshs (vss, vss);
+    VSUBSHS altivec_vsubshs {}
+
+  const vsi __builtin_altivec_vsubsws (vsi, vsi);
+    VSUBSWS altivec_vsubsws {}
+
+  const vuc __builtin_altivec_vsububm (vuc, vuc);
+    VSUBUBM subv16qi3 {}
+
+  const vuc __builtin_altivec_vsububs (vuc, vuc);
+    VSUBUBS altivec_vsububs {}
+
+  const vus __builtin_altivec_vsubuhm (vus, vus);
+    VSUBUHM subv8hi3 {}
+
+  const vus __builtin_altivec_vsubuhs (vus, vus);
+    VSUBUHS altivec_vsubuhs {}
+
+  const vui __builtin_altivec_vsubuwm (vui, vui);
+    VSUBUWM subv4si3 {}
+
+  const vui __builtin_altivec_vsubuws (vui, vui);
+    VSUBUWS altivec_vsubuws {}
+
+  const vsi __builtin_altivec_vsum2sws (vsi, vsi);
+    VSUM2SWS altivec_vsum2sws {}
+
+  const vsi __builtin_altivec_vsum4sbs (vsc, vsi);
+    VSUM4SBS altivec_vsum4sbs {}
+
+  const vsi __builtin_altivec_vsum4shs (vss, vsi);
+    VSUM4SHS altivec_vsum4shs {}
+
+  const vui __builtin_altivec_vsum4ubs (vuc, vui);
+    VSUM4UBS altivec_vsum4ubs {}
+
+  const vsi __builtin_altivec_vsumsws (vsi, vsi);
+    VSUMSWS altivec_vsumsws {}
+
+  const vsi __builtin_altivec_vsumsws_be (vsi, vsi);
+    VSUMSWS_BE altivec_vsumsws_direct {}
+
+  const vui __builtin_altivec_vupkhpx (vp);
+    VUPKHPX altivec_vupkhpx {}
+
+  const vss __builtin_altivec_vupkhsb (vsc);
+    VUPKHSB altivec_vupkhsb {}
+
+  const vsi __builtin_altivec_vupkhsh (vss);
+    VUPKHSH altivec_vupkhsh {}
+
+  const vui __builtin_altivec_vupklpx (vp);
+    VUPKLPX altivec_vupklpx {}
+
+  const vss __builtin_altivec_vupklsb (vsc);
+    VUPKLSB altivec_vupklsb {}
+
+  const vsi __builtin_altivec_vupklsh (vss);
+    VUPKLSH altivec_vupklsh {}
+
+  const vsc __builtin_altivec_vxor_v16qi (vsc, vsc);
+    VXOR_V16QI xorv16qi3 {}
+
+  const vuc __builtin_altivec_vxor_v16qi_uns (vuc, vuc);
+    VXOR_V16QI_UNS xorv16qi3 {}
+
+  const vf __builtin_altivec_vxor_v4sf (vf, vf);
+    VXOR_V4SF xorv4sf3 {}
+
+  const vsi __builtin_altivec_vxor_v4si (vsi, vsi);
+    VXOR_V4SI xorv4si3 {}
+
+  const vui __builtin_altivec_vxor_v4si_uns (vui, vui);
+    VXOR_V4SI_UNS xorv4si3 {}
+
+  const vss __builtin_altivec_vxor_v8hi (vss, vss);
+    VXOR_V8HI xorv8hi3 {}
+
+  const vus __builtin_altivec_vxor_v8hi_uns (vus, vus);
+    VXOR_V8HI_UNS xorv8hi3 {}
+
+  const signed char __builtin_vec_ext_v16qi (vsc, signed int);
+    VEC_EXT_V16QI nothing {extract}
+
+  const float __builtin_vec_ext_v4sf (vf, signed int);
+    VEC_EXT_V4SF nothing {extract}
+
+  const signed int __builtin_vec_ext_v4si (vsi, signed int);
+    VEC_EXT_V4SI nothing {extract}
+
+  const signed short __builtin_vec_ext_v8hi (vss, signed int);
+    VEC_EXT_V8HI nothing {extract}
+
+  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char);
+    VEC_INIT_V16QI nothing {init}
+
+  const vf __builtin_vec_init_v4sf (float, float, float, float);
+    VEC_INIT_V4SF nothing {init}
+
+  const vsi __builtin_vec_init_v4si (signed int, signed int, signed int, signed int);
+    VEC_INIT_V4SI nothing {init}
+
+  const vss __builtin_vec_init_v8hi (signed short, signed short, signed short, signed short, signed short, signed short, signed short, signed short);
+    VEC_INIT_V8HI nothing {init}
+
+  const vsc __builtin_vec_set_v16qi (vsc, signed char, const int<4>);
+    VEC_SET_V16QI nothing {set}
+
+  const vf __builtin_vec_set_v4sf (vf, float, const int<2>);
+    VEC_SET_V4SF nothing {set}
+
+  const vsi __builtin_vec_set_v4si (vsi, signed int, const int<2>);
+    VEC_SET_V4SI nothing {set}
+
+  const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
+    VEC_SET_V8HI nothing {set}
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 904e104c058..8b16d65e684 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13493,6 +13493,9 @@ rs6000_init_builtins (void)
 					    intTI_type_node, 1);
   pixel_V8HI_type_node = rs6000_vector_type ("__vector __pixel",
 					     pixel_type_node, 8);
+  pcvoid_type_node
+    = build_pointer_type (build_qualified_type (void_type_node,
+						TYPE_QUAL_CONST));
 
   /* Create Altivec, VSX and MMA builtins on machines with at least the
      general purpose extensions (970 and newer) to allow the use of
@@ -13652,10 +13655,6 @@ altivec_init_builtins (void)
 
   tree pvoid_type_node = build_pointer_type (void_type_node);
 
-  tree pcvoid_type_node
-    = build_pointer_type (build_qualified_type (void_type_node,
-						TYPE_QUAL_CONST));
-
   tree int_ftype_opaque
     = build_function_type_list (integer_type_node,
 				opaque_V4SI_type_node, NULL_TREE);
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 4ca6372435d..c5d20d240f2 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -2460,6 +2460,7 @@ enum rs6000_builtin_type_index
   RS6000_BTI_const_str,		 /* pointer to const char * */
   RS6000_BTI_vector_pair,	 /* unsigned 256-bit types (vector pair).  */
   RS6000_BTI_vector_quad,	 /* unsigned 512-bit types (vector quad).  */
+  RS6000_BTI_const_ptr_void,     /* const pointer to void */
   RS6000_BTI_MAX
 };
 
@@ -2515,6 +2516,7 @@ enum rs6000_builtin_type_index
 #define const_str_type_node		 (rs6000_builtin_types[RS6000_BTI_const_str])
 #define vector_pair_type_node		 (rs6000_builtin_types[RS6000_BTI_vector_pair])
 #define vector_quad_type_node		 (rs6000_builtin_types[RS6000_BTI_vector_quad])
+#define pcvoid_type_node		 (rs6000_builtin_types[RS6000_BTI_const_ptr_void])
 
 extern GTY(()) tree rs6000_builtin_types[RS6000_BTI_MAX];
 extern GTY(()) tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
-- 
2.27.0
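
For readers new to the format: each entry in rs6000-builtin-new.def is
a two-line record, a C-style prototype followed by an expansion line.
An annotated copy of one entry from the diff above (my reading; the
authoritative grammar lives in rs6000-gen-builtins.c):

; <attributes> <return type> <builtin name> (<argument types>);
;   <BIF enum name> <insn pattern to expand to> {<flags>}
  pure vsc __builtin_altivec_lvebx (signed long, const void *);
    LVEBX altivec_lvebx {ldvec}

Attributes such as const, pure, and fpmath describe side-effect
behavior; an argument type like const int<2> requires a small literal
constant; and flags such as {ldvec}, {stvec}, {pred}, {extract}, {set},
and {init} select special-case handling (vector loads and stores,
predicate forms, and the vec_ext/vec_set/vec_init builtins,
respectively).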



* [PATCH 04/34] rs6000: Add VSX builtins
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (2 preceding siblings ...)
  2021-07-29 13:30 ` [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-10 16:14   ` will schmidt
  2021-08-10 17:52   ` Segher Boessenkool
  2021-07-29 13:30 ` [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins Bill Schmidt
                   ` (29 subsequent siblings)
  33 siblings, 2 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-07  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def: Add vsx stanza.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 857 +++++++++++++++++++++++
 1 file changed, 857 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index f1aa5529cdd..974cdc8c37c 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -1028,3 +1028,860 @@
 
   const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
     VEC_SET_V8HI nothing {set}
+
+
+; VSX builtins.
+[vsx]
+  pure vd __builtin_altivec_lvx_v2df (signed long, const void *);
+    LVX_V2DF altivec_lvx_v2df {ldvec}
+
+  pure vsll __builtin_altivec_lvx_v2di (signed long, const void *);
+    LVX_V2DI altivec_lvx_v2di {ldvec}
+
+  pure vd __builtin_altivec_lvxl_v2df (signed long, const void *);
+    LVXL_V2DF altivec_lvxl_v2df {ldvec}
+
+  pure vsll __builtin_altivec_lvxl_v2di (signed long, const void *);
+    LVXL_V2DI altivec_lvxl_v2di {ldvec}
+
+  const vd __builtin_altivec_nabs_v2df (vd);
+    NABS_V2DF vsx_nabsv2df2 {}
+
+  const vsll __builtin_altivec_nabs_v2di (vsll);
+    NABS_V2DI nabsv2di2 {}
+
+  void __builtin_altivec_stvx_v2df (vd, signed long, void *);
+    STVX_V2DF altivec_stvx_v2df {stvec}
+
+  void __builtin_altivec_stvx_v2di (vsll, signed long, void *);
+    STVX_V2DI altivec_stvx_v2di {stvec}
+
+  void __builtin_altivec_stvxl_v2df (vd, signed long, void *);
+    STVXL_V2DF altivec_stvxl_v2df {stvec}
+
+  void __builtin_altivec_stvxl_v2di (vsll, signed long, void *);
+    STVXL_V2DI altivec_stvxl_v2di {stvec}
+
+  const vd __builtin_altivec_vand_v2df (vd, vd);
+    VAND_V2DF andv2df3 {}
+
+  const vsll __builtin_altivec_vand_v2di (vsll, vsll);
+    VAND_V2DI andv2di3 {}
+
+  const vull __builtin_altivec_vand_v2di_uns (vull, vull);
+    VAND_V2DI_UNS andv2di3 {}
+
+  const vd __builtin_altivec_vandc_v2df (vd, vd);
+    VANDC_V2DF andcv2df3 {}
+
+  const vsll __builtin_altivec_vandc_v2di (vsll, vsll);
+    VANDC_V2DI andcv2di3 {}
+
+  const vull __builtin_altivec_vandc_v2di_uns (vull, vull);
+    VANDC_V2DI_UNS andcv2di3 {}
+
+  const vsll __builtin_altivec_vcmpequd (vull, vull);
+    VCMPEQUD vector_eqv2di {}
+
+  const int __builtin_altivec_vcmpequd_p (int, vsll, vsll);
+    VCMPEQUD_P vector_eq_v2di_p {pred}
+
+  const vsll __builtin_altivec_vcmpgtsd (vsll, vsll);
+    VCMPGTSD vector_gtv2di {}
+
+  const int __builtin_altivec_vcmpgtsd_p (int, vsll, vsll);
+    VCMPGTSD_P vector_gt_v2di_p {pred}
+
+  const vsll __builtin_altivec_vcmpgtud (vull, vull);
+    VCMPGTUD vector_gtuv2di {}
+
+  const int __builtin_altivec_vcmpgtud_p (int, vsll, vsll);
+    VCMPGTUD_P vector_gtu_v2di_p {pred}
+
+  const vd __builtin_altivec_vnor_v2df (vd, vd);
+    VNOR_V2DF norv2df3 {}
+
+  const vsll __builtin_altivec_vnor_v2di (vsll, vsll);
+    VNOR_V2DI norv2di3 {}
+
+  const vull __builtin_altivec_vnor_v2di_uns (vull, vull);
+    VNOR_V2DI_UNS norv2di3 {}
+
+  const vd __builtin_altivec_vor_v2df (vd, vd);
+    VOR_V2DF iorv2df3 {}
+
+  const vsll __builtin_altivec_vor_v2di (vsll, vsll);
+    VOR_V2DI iorv2di3 {}
+
+  const vull __builtin_altivec_vor_v2di_uns (vull, vull);
+    VOR_V2DI_UNS iorv2di3 {}
+
+  const vd __builtin_altivec_vperm_2df (vd, vd, vuc);
+    VPERM_2DF altivec_vperm_v2df {}
+
+  const vsll __builtin_altivec_vperm_2di (vsll, vsll, vuc);
+    VPERM_2DI altivec_vperm_v2di {}
+
+  const vull __builtin_altivec_vperm_2di_uns (vull, vull, vuc);
+    VPERM_2DI_UNS altivec_vperm_v2di_uns {}
+
+  const vd __builtin_altivec_vreve_v2df (vd);
+    VREVE_V2DF altivec_vrevev2df2 {}
+
+  const vsll __builtin_altivec_vreve_v2di (vsll);
+    VREVE_V2DI altivec_vrevev2di2 {}
+
+  const vd __builtin_altivec_vsel_2df (vd, vd, vd);
+    VSEL_2DF vector_select_v2df {}
+
+  const vsll __builtin_altivec_vsel_2di (vsll, vsll, vsll);
+    VSEL_2DI_B vector_select_v2di {}
+
+  const vull __builtin_altivec_vsel_2di_uns (vull, vull, vull);
+    VSEL_2DI_UNS vector_select_v2di_uns {}
+
+  const vd __builtin_altivec_vsldoi_2df (vd, vd, const int<4>);
+    VSLDOI_2DF altivec_vsldoi_v2df {}
+
+  const vsll __builtin_altivec_vsldoi_2di (vsll, vsll, const int<4>);
+    VSLDOI_2DI altivec_vsldoi_v2di {}
+
+  const vd __builtin_altivec_vxor_v2df (vd, vd);
+    VXOR_V2DF xorv2df3 {}
+
+  const vsll __builtin_altivec_vxor_v2di (vsll, vsll);
+    VXOR_V2DI xorv2di3 {}
+
+  const vull __builtin_altivec_vxor_v2di_uns (vull, vull);
+    VXOR_V2DI_UNS xorv2di3 {}
+
+  const signed __int128 __builtin_vec_ext_v1ti (vsq, signed int);
+    VEC_EXT_V1TI nothing {extract}
+
+  const double __builtin_vec_ext_v2df (vd, signed int);
+    VEC_EXT_V2DF nothing {extract}
+
+  const signed long long __builtin_vec_ext_v2di (vsll, signed int);
+    VEC_EXT_V2DI nothing {extract}
+
+  const vsq __builtin_vec_init_v1ti (signed __int128);
+    VEC_INIT_V1TI nothing {init}
+
+  const vd __builtin_vec_init_v2df (double, double);
+    VEC_INIT_V2DF nothing {init}
+
+  const vsll __builtin_vec_init_v2di (signed long long, signed long long);
+    VEC_INIT_V2DI nothing {init}
+
+  const vsq __builtin_vec_set_v1ti (vsq, signed __int128, const int<0,0>);
+    VEC_SET_V1TI nothing {set}
+
+  const vd __builtin_vec_set_v2df (vd, double, const int<1>);
+    VEC_SET_V2DF nothing {set}
+
+  const vsll __builtin_vec_set_v2di (vsll, signed long long, const int<1>);
+    VEC_SET_V2DI nothing {set}
+
+  const vsc __builtin_vsx_cmpge_16qi (vsc, vsc);
+    CMPGE_16QI vector_nltv16qi {}
+
+  const vsll __builtin_vsx_cmpge_2di (vsll, vsll);
+    CMPGE_2DI vector_nltv2di {}
+
+  const vsi __builtin_vsx_cmpge_4si (vsi, vsi);
+    CMPGE_4SI vector_nltv4si {}
+
+  const vss __builtin_vsx_cmpge_8hi (vss, vss);
+    CMPGE_8HI vector_nltv8hi {}
+
+  const vsc __builtin_vsx_cmpge_u16qi (vuc, vuc);
+    CMPGE_U16QI vector_nltuv16qi {}
+
+  const vsll __builtin_vsx_cmpge_u2di (vull, vull);
+    CMPGE_U2DI vector_nltuv2di {}
+
+  const vsi __builtin_vsx_cmpge_u4si (vui, vui);
+    CMPGE_U4SI vector_nltuv4si {}
+
+  const vss __builtin_vsx_cmpge_u8hi (vus, vus);
+    CMPGE_U8HI vector_nltuv8hi {}
+
+  const vsc __builtin_vsx_cmple_16qi (vsc, vsc);
+    CMPLE_16QI vector_ngtv16qi {}
+
+  const vsll __builtin_vsx_cmple_2di (vsll, vsll);
+    CMPLE_2DI vector_ngtv2di {}
+
+  const vsi __builtin_vsx_cmple_4si (vsi, vsi);
+    CMPLE_4SI vector_ngtv4si {}
+
+  const vss __builtin_vsx_cmple_8hi (vss, vss);
+    CMPLE_8HI vector_ngtv8hi {}
+
+  const vsc __builtin_vsx_cmple_u16qi (vsc, vsc);
+    CMPLE_U16QI vector_ngtuv16qi {}
+
+  const vsll __builtin_vsx_cmple_u2di (vsll, vsll);
+    CMPLE_U2DI vector_ngtuv2di {}
+
+  const vsi __builtin_vsx_cmple_u4si (vsi, vsi);
+    CMPLE_U4SI vector_ngtuv4si {}
+
+  const vss __builtin_vsx_cmple_u8hi (vss, vss);
+    CMPLE_U8HI vector_ngtuv8hi {}
+
+  const vd __builtin_vsx_concat_2df (double, double);
+    CONCAT_2DF vsx_concat_v2df {}
+
+  const vsll __builtin_vsx_concat_2di (signed long long, signed long long);
+    CONCAT_2DI vsx_concat_v2di {}
+
+  const vd __builtin_vsx_cpsgndp (vd, vd);
+    CPSGNDP vector_copysignv2df3 {}
+
+  const vf __builtin_vsx_cpsgnsp (vf, vf);
+    CPSGNSP vector_copysignv4sf3 {}
+
+  const vsll __builtin_vsx_div_2di (vsll, vsll);
+    DIV_V2DI vsx_div_v2di {}
+
+  const vd __builtin_vsx_doublee_v4sf (vf);
+    DOUBLEE_V4SF doubleev4sf2 {}
+
+  const vd __builtin_vsx_doublee_v4si (vsi);
+    DOUBLEE_V4SI doubleev4si2 {}
+
+  const vd __builtin_vsx_doubleh_v4sf (vf);
+    DOUBLEH_V4SF doublehv4sf2 {}
+
+  const vd __builtin_vsx_doubleh_v4si (vsi);
+    DOUBLEH_V4SI doublehv4si2 {}
+
+  const vd __builtin_vsx_doublel_v4sf (vf);
+    DOUBLEL_V4SF doublelv4sf2 {}
+
+  const vd __builtin_vsx_doublel_v4si (vsi);
+    DOUBLEL_V4SI doublelv4si2 {}
+
+  const vd __builtin_vsx_doubleo_v4sf (vf);
+    DOUBLEO_V4SF doubleov4sf2 {}
+
+  const vd __builtin_vsx_doubleo_v4si (vsi);
+    DOUBLEO_V4SI doubleov4si2 {}
+
+  const vf __builtin_vsx_floate_v2df (vd);
+    FLOATE_V2DF floatev2df {}
+
+  const vf __builtin_vsx_floate_v2di (vsll);
+    FLOATE_V2DI floatev2di {}
+
+  const vf __builtin_vsx_floato_v2df (vd);
+    FLOATO_V2DF floatov2df {}
+
+  const vf __builtin_vsx_floato_v2di (vsll);
+    FLOATO_V2DI floatov2di {}
+
+  pure vsq __builtin_vsx_ld_elemrev_v1ti (signed long, const void *);
+    LD_ELEMREV_V1TI vsx_ld_elemrev_v1ti {ldvec,endian}
+
+  pure vd __builtin_vsx_ld_elemrev_v2df (signed long, const void *);
+    LD_ELEMREV_V2DF vsx_ld_elemrev_v2df {ldvec,endian}
+
+  pure vsll __builtin_vsx_ld_elemrev_v2di (signed long, const void *);
+    LD_ELEMREV_V2DI vsx_ld_elemrev_v2di {ldvec,endian}
+
+  pure vf __builtin_vsx_ld_elemrev_v4sf (signed long, const void *);
+    LD_ELEMREV_V4SF vsx_ld_elemrev_v4sf {ldvec,endian}
+
+  pure vsi __builtin_vsx_ld_elemrev_v4si (signed long, const void *);
+    LD_ELEMREV_V4SI vsx_ld_elemrev_v4si {ldvec,endian}
+
+  pure vss __builtin_vsx_ld_elemrev_v8hi (signed long, const void *);
+    LD_ELEMREV_V8HI vsx_ld_elemrev_v8hi {ldvec,endian}
+
+  pure vsc __builtin_vsx_ld_elemrev_v16qi (signed long, const void *);
+    LD_ELEMREV_V16QI vsx_ld_elemrev_v16qi {ldvec,endian}
+
+; There is apparent intent in rs6000-builtin.def to have RS6000_BTC_SPECIAL
+; processing for LXSDX, LXVDSX, and STXSDX, but there are no def_builtin calls
+; for any of them.  At some point, we may want to add a set of built-ins for
+; whichever vector types make sense for these.
+
+  pure vsq __builtin_vsx_lxvd2x_v1ti (signed long, const void *);
+    LXVD2X_V1TI vsx_load_v1ti {ldvec}
+
+  pure vd __builtin_vsx_lxvd2x_v2df (signed long, const void *);
+    LXVD2X_V2DF vsx_load_v2df {ldvec}
+
+  pure vsll __builtin_vsx_lxvd2x_v2di (signed long, const void *);
+    LXVD2X_V2DI vsx_load_v2di {ldvec}
+
+  pure vsc __builtin_vsx_lxvw4x_v16qi (signed long, const void *);
+    LXVW4X_V16QI vsx_load_v16qi {ldvec}
+
+  pure vf __builtin_vsx_lxvw4x_v4sf (signed long, const void *);
+    LXVW4X_V4SF vsx_load_v4sf {ldvec}
+
+  pure vsi __builtin_vsx_lxvw4x_v4si (signed long, const void *);
+    LXVW4X_V4SI vsx_load_v4si {ldvec}
+
+  pure vss __builtin_vsx_lxvw4x_v8hi (signed long, const void *);
+    LXVW4X_V8HI vsx_load_v8hi {ldvec}
+
+  const vd __builtin_vsx_mergeh_2df (vd, vd);
+    VEC_MERGEH_V2DF vsx_mergeh_v2df {}
+
+  const vsll __builtin_vsx_mergeh_2di (vsll, vsll);
+    VEC_MERGEH_V2DI vsx_mergeh_v2di {}
+
+  const vd __builtin_vsx_mergel_2df (vd, vd);
+    VEC_MERGEL_V2DF vsx_mergel_v2df {}
+
+  const vsll __builtin_vsx_mergel_2di (vsll, vsll);
+    VEC_MERGEL_V2DI vsx_mergel_v2di {}
+
+  const vsll __builtin_vsx_mul_2di (vsll, vsll);
+    MUL_V2DI vsx_mul_v2di {}
+
+  const vsq __builtin_vsx_set_1ti (vsq, signed __int128, const int<0,0>);
+    SET_1TI vsx_set_v1ti {set}
+
+  const vd __builtin_vsx_set_2df (vd, double, const int<0,1>);
+    SET_2DF vsx_set_v2df {set}
+
+  const vsll __builtin_vsx_set_2di (vsll, signed long long, const int<0,1>);
+    SET_2DI vsx_set_v2di {set}
+
+  const vd __builtin_vsx_splat_2df (double);
+    SPLAT_2DF vsx_splat_v2df {}
+
+  const vsll __builtin_vsx_splat_2di (signed long long);
+    SPLAT_2DI vsx_splat_v2di {}
+
+  void __builtin_vsx_st_elemrev_v1ti (vsq, signed long, void *);
+    ST_ELEMREV_V1TI vsx_st_elemrev_v1ti {stvec,endian}
+
+  void __builtin_vsx_st_elemrev_v2df (vd, signed long, void *);
+    ST_ELEMREV_V2DF vsx_st_elemrev_v2df {stvec,endian}
+
+  void __builtin_vsx_st_elemrev_v2di (vsll, signed long, void *);
+    ST_ELEMREV_V2DI vsx_st_elemrev_v2di {stvec,endian}
+
+  void __builtin_vsx_st_elemrev_v4sf (vf, signed long, void *);
+    ST_ELEMREV_V4SF vsx_st_elemrev_v4sf {stvec,endian}
+
+  void __builtin_vsx_st_elemrev_v4si (vsi, signed long, void *);
+    ST_ELEMREV_V4SI vsx_st_elemrev_v4si {stvec,endian}
+
+  void __builtin_vsx_st_elemrev_v8hi (vss, signed long, void *);
+    ST_ELEMREV_V8HI vsx_st_elemrev_v8hi {stvec,endian}
+
+  void __builtin_vsx_st_elemrev_v16qi (vsc, signed long, void *);
+    ST_ELEMREV_V16QI vsx_st_elemrev_v16qi {stvec,endian}
+
+  void __builtin_vsx_stxvd2x_v1ti (vsq, signed long, void *);
+    STXVD2X_V1TI vsx_store_v1ti {stvec}
+
+  void __builtin_vsx_stxvd2x_v2df (vd, signed long, void *);
+    STXVD2X_V2DF vsx_store_v2df {stvec}
+
+  void __builtin_vsx_stxvd2x_v2di (vsll, signed long, void *);
+    STXVD2X_V2DI vsx_store_v2di {stvec}
+
+  void __builtin_vsx_stxvw4x_v4sf (vf, signed long, void *);
+    STXVW4X_V4SF vsx_store_v4sf {stvec}
+
+  void __builtin_vsx_stxvw4x_v4si (vsi, signed long, void *);
+    STXVW4X_V4SI vsx_store_v4si {stvec}
+
+  void __builtin_vsx_stxvw4x_v8hi (vss, signed long, void *);
+    STXVW4X_V8HI vsx_store_v8hi {stvec}
+
+  void __builtin_vsx_stxvw4x_v16qi (vsc, signed long, void *);
+    STXVW4X_V16QI vsx_store_v16qi {stvec}
+
+  const vull __builtin_vsx_udiv_2di (vull, vull);
+    UDIV_V2DI vsx_udiv_v2di {}
+
+  const vd __builtin_vsx_uns_doublee_v4si (vsi);
+    UNS_DOUBLEE_V4SI unsdoubleev4si2 {}
+
+  const vd __builtin_vsx_uns_doubleh_v4si (vsi);
+    UNS_DOUBLEH_V4SI unsdoublehv4si2 {}
+
+  const vd __builtin_vsx_uns_doublel_v4si (vsi);
+    UNS_DOUBLEL_V4SI unsdoublelv4si2 {}
+
+  const vd __builtin_vsx_uns_doubleo_v4si (vsi);
+    UNS_DOUBLEO_V4SI unsdoubleov4si2 {}
+
+  const vf __builtin_vsx_uns_floate_v2di (vsll);
+    UNS_FLOATE_V2DI unsfloatev2di {}
+
+  const vf __builtin_vsx_uns_floato_v2di (vsll);
+    UNS_FLOATO_V2DI unsfloatov2di {}
+
+; I have no idea why we have __builtin_vsx_* duplicates of these when
+; the __builtin_altivec_* counterparts are already present.  Keeping
+; them for compatibility, but...oy.
+  const vsc __builtin_vsx_vperm_16qi (vsc, vsc, vuc);
+    VPERM_16QI_X altivec_vperm_v16qi {}
+
+  const vuc __builtin_vsx_vperm_16qi_uns (vuc, vuc, vuc);
+    VPERM_16QI_UNS_X altivec_vperm_v16qi_uns {}
+
+  const vsq __builtin_vsx_vperm_1ti (vsq, vsq, vsc);
+    VPERM_1TI_X altivec_vperm_v1ti {}
+
+  const vsq __builtin_vsx_vperm_1ti_uns (vsq, vsq, vsc);
+    VPERM_1TI_UNS_X altivec_vperm_v1ti_uns {}
+
+  const vd __builtin_vsx_vperm_2df (vd, vd, vuc);
+    VPERM_2DF_X altivec_vperm_v2df {}
+
+  const vsll __builtin_vsx_vperm_2di (vsll, vsll, vuc);
+    VPERM_2DI_X altivec_vperm_v2di {}
+
+  const vull __builtin_vsx_vperm_2di_uns (vull, vull, vuc);
+    VPERM_2DI_UNS_X altivec_vperm_v2di_uns {}
+
+  const vf __builtin_vsx_vperm_4sf (vf, vf, vuc);
+    VPERM_4SF_X altivec_vperm_v4sf {}
+
+  const vsi __builtin_vsx_vperm_4si (vsi, vsi, vuc);
+    VPERM_4SI_X altivec_vperm_v4si {}
+
+  const vui __builtin_vsx_vperm_4si_uns (vui, vui, vuc);
+    VPERM_4SI_UNS_X altivec_vperm_v4si_uns {}
+
+  const vss __builtin_vsx_vperm_8hi (vss, vss, vuc);
+    VPERM_8HI_X altivec_vperm_v8hi {}
+
+  const vus __builtin_vsx_vperm_8hi_uns (vus, vus, vuc);
+    VPERM_8HI_UNS_X altivec_vperm_v8hi_uns {}
+
+  const vsll __builtin_vsx_vsigned_v2df (vd);
+    VEC_VSIGNED_V2DF vsx_xvcvdpsxds {}
+
+  const vsi __builtin_vsx_vsigned_v4sf (vf);
+    VEC_VSIGNED_V4SF vsx_xvcvspsxws {}
+
+  const vsi __builtin_vsx_vsignede_v2df (vd);
+    VEC_VSIGNEDE_V2DF vsignede_v2df {}
+
+  const vsi __builtin_vsx_vsignedo_v2df (vd);
+    VEC_VSIGNEDO_V2DF vsignedo_v2df {}
+
+  const vsll __builtin_vsx_vunsigned_v2df (vd);
+    VEC_VUNSIGNED_V2DF vsx_xvcvdpsxds {}
+
+  const vsi __builtin_vsx_vunsigned_v4sf (vf);
+    VEC_VUNSIGNED_V4SF vsx_xvcvspsxws {}
+
+  const vsi __builtin_vsx_vunsignede_v2df (vd);
+    VEC_VUNSIGNEDE_V2DF vunsignede_v2df {}
+
+  const vsi __builtin_vsx_vunsignedo_v2df (vd);
+    VEC_VUNSIGNEDO_V2DF vunsignedo_v2df {}
+
+  const vf __builtin_vsx_xscvdpsp (double);
+    XSCVDPSP vsx_xscvdpsp {}
+
+  const double __builtin_vsx_xscvspdp (vf);
+    XSCVSPDP vsx_xscvspdp {}
+
+  const double __builtin_vsx_xsmaxdp (double, double);
+    XSMAXDP smaxdf3 {}
+
+  const double __builtin_vsx_xsmindp (double, double);
+    XSMINDP smindf3 {}
+
+  const double __builtin_vsx_xsrdpi (double);
+    XSRDPI vsx_xsrdpi {}
+
+  const double __builtin_vsx_xsrdpic (double);
+    XSRDPIC vsx_xsrdpic {}
+
+  const double __builtin_vsx_xsrdpim (double);
+    XSRDPIM floordf2 {}
+
+  const double __builtin_vsx_xsrdpip (double);
+    XSRDPIP ceildf2 {}
+
+  const double __builtin_vsx_xsrdpiz (double);
+    XSRDPIZ btruncdf2 {}
+
+  const signed int __builtin_vsx_xstdivdp_fe (double, double);
+    XSTDIVDP_FE vsx_tdivdf3_fe {}
+
+  const signed int __builtin_vsx_xstdivdp_fg (double, double);
+    XSTDIVDP_FG vsx_tdivdf3_fg {}
+
+  const signed int __builtin_vsx_xstsqrtdp_fe (double);
+    XSTSQRTDP_FE vsx_tsqrtdf2_fe {}
+
+  const signed int __builtin_vsx_xstsqrtdp_fg (double);
+    XSTSQRTDP_FG vsx_tsqrtdf2_fg {}
+
+  const vd __builtin_vsx_xvabsdp (vd);
+    XVABSDP absv2df2 {}
+
+  const vf __builtin_vsx_xvabssp (vf);
+    XVABSSP absv4sf2 {}
+
+  fpmath vd __builtin_vsx_xvadddp (vd, vd);
+    XVADDDP addv2df3 {}
+
+  fpmath vf __builtin_vsx_xvaddsp (vf, vf);
+    XVADDSP addv4sf3 {}
+
+  const vd __builtin_vsx_xvcmpeqdp (vd, vd);
+    XVCMPEQDP vector_eqv2df {}
+
+  const signed int __builtin_vsx_xvcmpeqdp_p (signed int, vd, vd);
+    XVCMPEQDP_P vector_eq_v2df_p {pred}
+
+  const vf __builtin_vsx_xvcmpeqsp (vf, vf);
+    XVCMPEQSP vector_eqv4sf {}
+
+  const signed int __builtin_vsx_xvcmpeqsp_p (signed int, vf, vf);
+    XVCMPEQSP_P vector_eq_v4sf_p {pred}
+
+  const vd __builtin_vsx_xvcmpgedp (vd, vd);
+    XVCMPGEDP vector_gev2df {}
+
+  const signed int __builtin_vsx_xvcmpgedp_p (signed int, vd, vd);
+    XVCMPGEDP_P vector_ge_v2df_p {pred}
+
+  const vf __builtin_vsx_xvcmpgesp (vf, vf);
+    XVCMPGESP vector_gev4sf {}
+
+  const signed int __builtin_vsx_xvcmpgesp_p (signed int, vf, vf);
+    XVCMPGESP_P vector_ge_v4sf_p {pred}
+
+  const vd __builtin_vsx_xvcmpgtdp (vd, vd);
+    XVCMPGTDP vector_gtv2df {}
+
+  const signed int __builtin_vsx_xvcmpgtdp_p (signed int, vd, vd);
+    XVCMPGTDP_P vector_gt_v2df_p {pred}
+
+  const vf __builtin_vsx_xvcmpgtsp (vf, vf);
+    XVCMPGTSP vector_gtv4sf {}
+
+  const signed int __builtin_vsx_xvcmpgtsp_p (signed int, vf, vf);
+    XVCMPGTSP_P vector_gt_v4sf_p {pred}
+
+  const vf __builtin_vsx_xvcvdpsp (vd);
+    XVCVDPSP vsx_xvcvdpsp {}
+
+  const vsll __builtin_vsx_xvcvdpsxds (vd);
+    XVCVDPSXDS vsx_fix_truncv2dfv2di2 {}
+
+  const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
+    XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
+
+  const vsi __builtin_vsx_xvcvdpsxws (vd);
+    XVCVDPSXWS vsx_xvcvdpsxws {}
+
+  const vsll __builtin_vsx_xvcvdpuxds (vd);
+    XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
+
+  const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
+    XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
+
+  const vull __builtin_vsx_xvcvdpuxds_uns (vd);
+    XVCVDPUXDS_UNS vsx_fixuns_truncv2dfv2di2 {}
+
+  const vsi __builtin_vsx_xvcvdpuxws (vd);
+    XVCVDPUXWS vsx_xvcvdpuxws {}
+
+  const vd __builtin_vsx_xvcvspdp (vf);
+    XVCVSPDP vsx_xvcvspdp {}
+
+  const vsll __builtin_vsx_xvcvspsxds (vf);
+    XVCVSPSXDS vsx_xvcvspsxds {}
+
+  const vsi __builtin_vsx_xvcvspsxws (vf);
+    XVCVSPSXWS vsx_fix_truncv4sfv4si2 {}
+
+  const vsll __builtin_vsx_xvcvspuxds (vf);
+    XVCVSPUXDS vsx_xvcvspuxds {}
+
+  const vsi __builtin_vsx_xvcvspuxws (vf);
+    XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
+
+  const vd __builtin_vsx_xvcvsxddp (vsll);
+    XVCVSXDDP vsx_floatv2div2df2 {}
+
+  const vd __builtin_vsx_xvcvsxddp_scale (vsll, const int<5>);
+    XVCVSXDDP_SCALE vsx_xvcvsxddp_scale {}
+
+  const vf __builtin_vsx_xvcvsxdsp (vsll);
+    XVCVSXDSP vsx_xvcvsxdsp {}
+
+  const vd __builtin_vsx_xvcvsxwdp (vsi);
+    XVCVSXWDP vsx_xvcvsxwdp {}
+
+  const vf __builtin_vsx_xvcvsxwsp (vsi);
+    XVCVSXWSP vsx_floatv4siv4sf2 {}
+
+  const vd __builtin_vsx_xvcvuxddp (vsll);
+    XVCVUXDDP vsx_floatunsv2div2df2 {}
+
+  const vd __builtin_vsx_xvcvuxddp_scale (vsll, const int<5>);
+    XVCVUXDDP_SCALE vsx_xvcvuxddp_scale {}
+
+  const vd __builtin_vsx_xvcvuxddp_uns (vull);
+    XVCVUXDDP_UNS vsx_floatunsv2div2df2 {}
+
+  const vf __builtin_vsx_xvcvuxdsp (vull);
+    XVCVUXDSP vsx_xvcvuxdsp {}
+
+  const vd __builtin_vsx_xvcvuxwdp (vsi);
+    XVCVUXWDP vsx_xvcvuxwdp {}
+
+  const vf __builtin_vsx_xvcvuxwsp (vsi);
+    XVCVUXWSP vsx_floatunsv4siv4sf2 {}
+
+  fpmath vd __builtin_vsx_xvdivdp (vd, vd);
+    XVDIVDP divv2df3 {}
+
+  fpmath vf __builtin_vsx_xvdivsp (vf, vf);
+    XVDIVSP divv4sf3 {}
+
+  const vd __builtin_vsx_xvmadddp (vd, vd, vd);
+    XVMADDDP fmav2df4 {}
+
+  const vf __builtin_vsx_xvmaddsp (vf, vf, vf);
+    XVMADDSP fmav4sf4 {}
+
+  const vd __builtin_vsx_xvmaxdp (vd, vd);
+    XVMAXDP smaxv2df3 {}
+
+  const vf __builtin_vsx_xvmaxsp (vf, vf);
+    XVMAXSP smaxv4sf3 {}
+
+  const vd __builtin_vsx_xvmindp (vd, vd);
+    XVMINDP sminv2df3 {}
+
+  const vf __builtin_vsx_xvminsp (vf, vf);
+    XVMINSP sminv4sf3 {}
+
+  const vd __builtin_vsx_xvmsubdp (vd, vd, vd);
+    XVMSUBDP fmsv2df4 {}
+
+  const vf __builtin_vsx_xvmsubsp (vf, vf, vf);
+    XVMSUBSP fmsv4sf4 {}
+
+  fpmath vd __builtin_vsx_xvmuldp (vd, vd);
+    XVMULDP mulv2df3 {}
+
+  fpmath vf __builtin_vsx_xvmulsp (vf, vf);
+    XVMULSP mulv4sf3 {}
+
+  const vd __builtin_vsx_xvnabsdp (vd);
+    XVNABSDP vsx_nabsv2df2 {}
+
+  const vf __builtin_vsx_xvnabssp (vf);
+    XVNABSSP vsx_nabsv4sf2 {}
+
+  const vd __builtin_vsx_xvnegdp (vd);
+    XVNEGDP negv2df2 {}
+
+  const vf __builtin_vsx_xvnegsp (vf);
+    XVNEGSP negv4sf2 {}
+
+  const vd __builtin_vsx_xvnmadddp (vd, vd, vd);
+    XVNMADDDP nfmav2df4 {}
+
+  const vf __builtin_vsx_xvnmaddsp (vf, vf, vf);
+    XVNMADDSP nfmav4sf4 {}
+
+  const vd __builtin_vsx_xvnmsubdp (vd, vd, vd);
+    XVNMSUBDP nfmsv2df4 {}
+
+  const vf __builtin_vsx_xvnmsubsp (vf, vf, vf);
+    XVNMSUBSP nfmsv4sf4 {}
+
+  const vd __builtin_vsx_xvrdpi (vd);
+    XVRDPI vsx_xvrdpi {}
+
+  const vd __builtin_vsx_xvrdpic (vd);
+    XVRDPIC vsx_xvrdpic {}
+
+  const vd __builtin_vsx_xvrdpim (vd);
+    XVRDPIM vsx_floorv2df2 {}
+
+  const vd __builtin_vsx_xvrdpip (vd);
+    XVRDPIP vsx_ceilv2df2 {}
+
+  const vd __builtin_vsx_xvrdpiz (vd);
+    XVRDPIZ vsx_btruncv2df2 {}
+
+  fpmath vd __builtin_vsx_xvrecipdivdp (vd, vd);
+    RECIP_V2DF recipv2df3 {}
+
+  fpmath vf __builtin_vsx_xvrecipdivsp (vf, vf);
+    RECIP_V4SF recipv4sf3 {}
+
+  const vd __builtin_vsx_xvredp (vd);
+    XVREDP vsx_frev2df2 {}
+
+  const vf __builtin_vsx_xvresp (vf);
+    XVRESP vsx_frev4sf2 {}
+
+  const vf __builtin_vsx_xvrspi (vf);
+    XVRSPI vsx_xvrspi {}
+
+  const vf __builtin_vsx_xvrspic (vf);
+    XVRSPIC vsx_xvrspic {}
+
+  const vf __builtin_vsx_xvrspim (vf);
+    XVRSPIM vsx_floorv4sf2 {}
+
+  const vf __builtin_vsx_xvrspip (vf);
+    XVRSPIP vsx_ceilv4sf2 {}
+
+  const vf __builtin_vsx_xvrspiz (vf);
+    XVRSPIZ vsx_btruncv4sf2 {}
+
+  const vd __builtin_vsx_xvrsqrtdp (vd);
+    RSQRT_2DF rsqrtv2df2 {}
+
+  const vf __builtin_vsx_xvrsqrtsp (vf);
+    RSQRT_4SF rsqrtv4sf2 {}
+
+  const vd __builtin_vsx_xvrsqrtedp (vd);
+    XVRSQRTEDP rsqrtev2df2 {}
+
+  const vf __builtin_vsx_xvrsqrtesp (vf);
+    XVRSQRTESP rsqrtev4sf2 {}
+
+  const vd __builtin_vsx_xvsqrtdp (vd);
+    XVSQRTDP sqrtv2df2 {}
+
+  const vf __builtin_vsx_xvsqrtsp (vf);
+    XVSQRTSP sqrtv4sf2 {}
+
+  fpmath vd __builtin_vsx_xvsubdp (vd, vd);
+    XVSUBDP subv2df3 {}
+
+  fpmath vf __builtin_vsx_xvsubsp (vf, vf);
+    XVSUBSP subv4sf3 {}
+
+  const signed int __builtin_vsx_xvtdivdp_fe (vd, vd);
+    XVTDIVDP_FE vsx_tdivv2df3_fe {}
+
+  const signed int __builtin_vsx_xvtdivdp_fg (vd, vd);
+    XVTDIVDP_FG vsx_tdivv2df3_fg {}
+
+  const signed int __builtin_vsx_xvtdivsp_fe (vf, vf);
+    XVTDIVSP_FE vsx_tdivv4sf3_fe {}
+
+  const signed int __builtin_vsx_xvtdivsp_fg (vf, vf);
+    XVTDIVSP_FG vsx_tdivv4sf3_fg {}
+
+  const signed int __builtin_vsx_xvtsqrtdp_fe (vd);
+    XVTSQRTDP_FE vsx_tsqrtv2df2_fe {}
+
+  const signed int __builtin_vsx_xvtsqrtdp_fg (vd);
+    XVTSQRTDP_FG vsx_tsqrtv2df2_fg {}
+
+  const signed int __builtin_vsx_xvtsqrtsp_fe (vf);
+    XVTSQRTSP_FE vsx_tsqrtv4sf2_fe {}
+
+  const signed int __builtin_vsx_xvtsqrtsp_fg (vf);
+    XVTSQRTSP_FG vsx_tsqrtv4sf2_fg {}
+
+  const vf __builtin_vsx_xxmrghw (vf, vf);
+    XXMRGHW_4SF vsx_xxmrghw_v4sf {}
+
+  const vsi __builtin_vsx_xxmrghw_4si (vsi, vsi);
+    XXMRGHW_4SI vsx_xxmrghw_v4si {}
+
+  const vf __builtin_vsx_xxmrglw (vf, vf);
+    XXMRGLW_4SF vsx_xxmrglw_v4sf {}
+
+  const vsi __builtin_vsx_xxmrglw_4si (vsi, vsi);
+    XXMRGLW_4SI vsx_xxmrglw_v4si {}
+
+  const vsc __builtin_vsx_xxpermdi_16qi (vsc, vsc, const int<2>);
+    XXPERMDI_16QI vsx_xxpermdi_v16qi {}
+
+  const vsq __builtin_vsx_xxpermdi_1ti (vsq, vsq, const int<2>);
+    XXPERMDI_1TI vsx_xxpermdi_v1ti {}
+
+  const vd __builtin_vsx_xxpermdi_2df (vd, vd, const int<2>);
+    XXPERMDI_2DF vsx_xxpermdi_v2df {}
+
+  const vsll __builtin_vsx_xxpermdi_2di (vsll, vsll, const int<2>);
+    XXPERMDI_2DI vsx_xxpermdi_v2di {}
+
+  const vf __builtin_vsx_xxpermdi_4sf (vf, vf, const int<2>);
+    XXPERMDI_4SF vsx_xxpermdi_v4sf {}
+
+  const vsi __builtin_vsx_xxpermdi_4si (vsi, vsi, const int<2>);
+    XXPERMDI_4SI vsx_xxpermdi_v4si {}
+
+  const vss __builtin_vsx_xxpermdi_8hi (vss, vss, const int<2>);
+    XXPERMDI_8HI vsx_xxpermdi_v8hi {}
+
+  const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
+    XXSEL_16QI vector_select_v16qi {}
+
+  const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
+    XXSEL_16QI_UNS vector_select_v16qi_uns {}
+
+  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
+    XXSEL_1TI vector_select_v1ti {}
+
+  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
+    XXSEL_1TI_UNS vector_select_v1ti_uns {}
+
+  const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
+    XXSEL_2DF vector_select_v2df {}
+
+  const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
+    XXSEL_2DI vector_select_v2di {}
+
+  const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
+    XXSEL_2DI_UNS vector_select_v2di_uns {}
+
+  const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
+    XXSEL_4SF vector_select_v4sf {}
+
+  const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
+    XXSEL_4SI vector_select_v4si {}
+
+  const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
+    XXSEL_4SI_UNS vector_select_v4si_uns {}
+
+  const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
+    XXSEL_8HI vector_select_v8hi {}
+
+  const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
+    XXSEL_8HI_UNS vector_select_v8hi_uns {}
+
+  const vsc __builtin_vsx_xxsldwi_16qi (vsc, vsc, const int<2>);
+    XXSLDWI_16QI vsx_xxsldwi_v16qi {}
+
+  const vd __builtin_vsx_xxsldwi_2df (vd, vd, const int<2>);
+    XXSLDWI_2DF vsx_xxsldwi_v2df {}
+
+  const vsll __builtin_vsx_xxsldwi_2di (vsll, vsll, const int<2>);
+    XXSLDWI_2DI vsx_xxsldwi_v2di {}
+
+  const vf __builtin_vsx_xxsldwi_4sf (vf, vf, const int<2>);
+    XXSLDWI_4SF vsx_xxsldwi_v4sf {}
+
+  const vsi __builtin_vsx_xxsldwi_4si (vsi, vsi, const int<2>);
+    XXSLDWI_4SI vsx_xxsldwi_v4si {}
+
+  const vss __builtin_vsx_xxsldwi_8hi (vss, vss, const int<2>);
+    XXSLDWI_8HI vsx_xxsldwi_v8hi {}
+
+  const vd __builtin_vsx_xxspltd_2df (vd, const int<1>);
+    XXSPLTD_V2DF vsx_xxspltd_v2df {}
+
+  const vsll __builtin_vsx_xxspltd_2di (vsll, const int<1>);
+    XXSPLTD_V2DI vsx_xxspltd_v2di {}
-- 
2.27.0
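
As a usage sketch (not part of the patch): each entry above pairs a C
prototype with the internal enumeration and the insn pattern that
expands it.  To see how the prototypes surface in user code, here is a
minimal example exercising the XVMAXDP and XVMINDP entries; it assumes
a powerpc GCC invoked with -mvsx, and the function name is illustrative:

  #include <altivec.h>

  /* Clamp each lane of X into [0.0, 1.0] using the VSX min/max
     builtins declared above.  */
  vector double
  clamp01 (vector double x)
  {
    x = __builtin_vsx_xvmaxdp (x, vec_splats (0.0));
    return __builtin_vsx_xvmindp (x, vec_splats (1.0));
  }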



* [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (3 preceding siblings ...)
  2021-07-29 13:30 ` [PATCH 04/34] rs6000: Add VSX builtins Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-10 16:17   ` will schmidt
  2021-08-10 18:38   ` Segher Boessenkool
  2021-07-29 13:30 ` [PATCH 06/34] rs6000: Add power7 and power7-64 builtins Bill Schmidt
                   ` (28 subsequent siblings)
  33 siblings, 2 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-07  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def: Add always, power5, and
	power6 stanzas.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 72 ++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 974cdc8c37c..ca694be1ac3 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -184,6 +184,78 @@
 
 
 
+; Builtins that have been around since time immemorial or are just
+; considered available everywhere.
+[always]
+  void __builtin_cpu_init ();
+    CPU_INIT nothing {cpu}
+
+  bool __builtin_cpu_is (string);
+    CPU_IS nothing {cpu}
+
+  bool __builtin_cpu_supports (string);
+    CPU_SUPPORTS nothing {cpu}
+
+  unsigned long long __builtin_ppc_get_timebase ();
+    GET_TB rs6000_get_timebase {}
+
+  double __builtin_mffs ();
+    MFFS rs6000_mffs {}
+
+; This will break for long double == _Float128.  libgcc history.
+  const long double __builtin_pack_longdouble (double, double);
+    PACK_TF packtf {}
+
+  unsigned long __builtin_ppc_mftb ();
+    MFTB rs6000_mftb_di {32bit}
+
+  void __builtin_mtfsb0 (const int<5>);
+    MTFSB0 rs6000_mtfsb0 {}
+
+  void __builtin_mtfsb1 (const int<5>);
+    MTFSB1 rs6000_mtfsb1 {}
+
+  void __builtin_mtfsf (const int<8>, double);
+    MTFSF rs6000_mtfsf {}
+
+  const __ibm128 __builtin_pack_ibm128 (double, double);
+    PACK_IF packif {}
+
+  void __builtin_set_fpscr_rn (const int[0,3]);
+    SET_FPSCR_RN rs6000_set_fpscr_rn {}
+
+  const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
+    UNPACK_IF unpackif {}
+
+; This will break for long double == _Float128.  libgcc history.
+  const double __builtin_unpack_longdouble (long double, const int<1>);
+    UNPACK_TF unpacktf {}
+
+
+; Builtins that have been around just about forever, but not quite.
+[power5]
+  fpmath double __builtin_recipdiv (double, double);
+    RECIP recipdf3 {}
+
+  fpmath float __builtin_recipdivf (float, float);
+    RECIPF recipsf3 {}
+
+  fpmath double __builtin_rsqrt (double);
+    RSQRT rsqrtdf2 {}
+
+  fpmath float __builtin_rsqrtf (float);
+    RSQRTF rsqrtsf2 {}
+
+
+; Power6 builtins.
+[power6]
+  const signed long __builtin_p6_cmpb (signed long, signed long);
+    CMPB cmpbdi3 {}
+
+  const signed int __builtin_p6_cmpb_32 (signed int, signed int);
+    CMPB_32 cmpbsi3 {}
+
+
 ; AltiVec builtins.
 [altivec]
   const vsc __builtin_altivec_abs_v16qi (vsc);
-- 
2.27.0
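
As a usage sketch (not part of the patch), the [always] CPU builtins
compose like this; __builtin_cpu_is and __builtin_cpu_supports assume
a glibc new enough to export the platform and hwcap data:

  #include <stdio.h>

  int
  main (void)
  {
    __builtin_cpu_init ();
    if (__builtin_cpu_supports ("vsx"))
      puts ("VSX is available");
    if (__builtin_cpu_is ("power9"))
      puts ("running on POWER9");
    return 0;
  }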



* [PATCH 06/34] rs6000: Add power7 and power7-64 builtins
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (4 preceding siblings ...)
  2021-07-29 13:30 ` [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-10 16:16   ` will schmidt
  2021-07-29 13:30 ` [PATCH 07/34] rs6000: Add power8-vector builtins Bill Schmidt
                   ` (27 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-04-02  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def: Add power7 and power7-64
	stanzas.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 39 ++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index ca694be1ac3..bffce52ee47 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -1957,3 +1957,42 @@
 
   const vsll __builtin_vsx_xxspltd_2di (vsll, const int<1>);
     XXSPLTD_V2DI vsx_xxspltd_v2di {}
+
+
+; Power7 builtins (ISA 2.06).
+[power7]
+  const unsigned int __builtin_addg6s (unsigned int, unsigned int);
+    ADDG6S addg6s {}
+
+  const signed long __builtin_bpermd (signed long, signed long);
+    BPERMD bpermd_di {}
+
+  const unsigned int __builtin_cbcdtd (unsigned int);
+    CBCDTD cbcdtd {}
+
+  const unsigned int __builtin_cdtbcd (unsigned int);
+    CDTBCD cdtbcd {}
+
+  const signed int __builtin_divwe (signed int, signed int);
+    DIVWE dive_si {}
+
+  const unsigned int __builtin_divweu (unsigned int, unsigned int);
+    DIVWEU diveu_si {}
+
+  const vsq __builtin_pack_vector_int128 (unsigned long long, unsigned long long);
+    PACK_V1TI packv1ti {}
+
+  void __builtin_ppc_speculation_barrier ();
+    SPECBARR speculation_barrier {}
+
+  const unsigned long __builtin_unpack_vector_int128 (vsq, const int<1>);
+    UNPACK_V1TI unpackv1ti {}
+
+
+; Power7 builtins requiring 64-bit GPRs (even with 32-bit addressing).
+[power7-64]
+  const signed long long __builtin_divde (signed long long, signed long long);
+    DIVDE dive_di {}
+
+  const unsigned long long __builtin_divdeu (unsigned long long, unsigned long long);
+    DIVDEU diveu_di {}
-- 
2.27.0
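
As a sketch of the extended-divide entries (not part of the patch):
__builtin_divweu computes the quotient (a << 32) / b, which makes
32.32 fixed-point reciprocals cheap.  -mcpu=power7 or later is
assumed, and the function name is illustrative:

  /* Compute (1 << 32) / d.  Undefined for d < 2, since the quotient
     would then overflow 32 bits (and d == 0 divides by zero).  */
  unsigned int
  fixed_recip (unsigned int d)
  {
    return __builtin_divweu (1, d);
  }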



* [PATCH 07/34] rs6000: Add power8-vector builtins
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (5 preceding siblings ...)
  2021-07-29 13:30 ` [PATCH 06/34] rs6000: Add power7 and power7-64 builtins Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-23 21:28   ` Segher Boessenkool
  2021-07-29 13:30 ` [PATCH 08/34] rs6000: Add Power9 builtins Bill Schmidt
                   ` (26 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-04-01  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def: Add power8-vector stanza.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 438 +++++++++++++++++++++++
 1 file changed, 438 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index bffce52ee47..f13fb13b0ad 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -1996,3 +1996,441 @@
 
   const unsigned long long __builtin_divdeu (unsigned long long, unsigned long long);
     DIVDEU diveu_di {}
+
+
+; Power8 vector built-ins.
+[power8-vector]
+  const vsll __builtin_altivec_abs_v2di (vsll);
+    ABS_V2DI absv2di2 {}
+
+  const vsc __builtin_altivec_bcddiv10_v16qi (vsc);
+    BCDDIV10_V16QI bcddiv10_v16qi {}
+
+  const vsc __builtin_altivec_bcdmul10_v16qi (vsc);
+    BCDMUL10_V16QI bcdmul10_v16qi {}
+
+  const vsc __builtin_altivec_eqv_v16qi (vsc, vsc);
+    EQV_V16QI eqvv16qi3 {}
+
+  const vuc __builtin_altivec_eqv_v16qi_uns (vuc, vuc);
+    EQV_V16QI_UNS eqvv16qi3 {}
+
+  const vsq __builtin_altivec_eqv_v1ti (vsq, vsq);
+    EQV_V1TI eqvv1ti3 {}
+
+  const vuq __builtin_altivec_eqv_v1ti_uns (vuq, vuq);
+    EQV_V1TI_UNS eqvv1ti3 {}
+
+  const vd __builtin_altivec_eqv_v2df (vd, vd);
+    EQV_V2DF eqvv2df3 {}
+
+  const vsll __builtin_altivec_eqv_v2di (vsll, vsll);
+    EQV_V2DI eqvv2di3 {}
+
+  const vull __builtin_altivec_eqv_v2di_uns (vull, vull);
+    EQV_V2DI_UNS eqvv2di3 {}
+
+  const vf __builtin_altivec_eqv_v4sf (vf, vf);
+    EQV_V4SF eqvv4sf3 {}
+
+  const vsi __builtin_altivec_eqv_v4si (vsi, vsi);
+    EQV_V4SI eqvv4si3 {}
+
+  const vui __builtin_altivec_eqv_v4si_uns (vui, vui);
+    EQV_V4SI_UNS eqvv4si3 {}
+
+  const vss __builtin_altivec_eqv_v8hi (vss, vss);
+    EQV_V8HI eqvv8hi3 {}
+
+  const vus __builtin_altivec_eqv_v8hi_uns (vus, vus);
+    EQV_V8HI_UNS eqvv8hi3 {}
+
+  const vsc __builtin_altivec_nand_v16qi (vsc, vsc);
+    NAND_V16QI nandv16qi3 {}
+
+  const vuc __builtin_altivec_nand_v16qi_uns (vuc, vuc);
+    NAND_V16QI_UNS nandv16qi3 {}
+
+  const vsq __builtin_altivec_nand_v1ti (vsq, vsq);
+    NAND_V1TI nandv1ti3 {}
+
+  const vuq __builtin_altivec_nand_v1ti_uns (vuq, vuq);
+    NAND_V1TI_UNS nandv1ti3 {}
+
+  const vd __builtin_altivec_nand_v2df (vd, vd);
+    NAND_V2DF nandv2df3 {}
+
+  const vsll __builtin_altivec_nand_v2di (vsll, vsll);
+    NAND_V2DI nandv2di3 {}
+
+  const vull __builtin_altivec_nand_v2di_uns (vull, vull);
+    NAND_V2DI_UNS nandv2di3 {}
+
+  const vf __builtin_altivec_nand_v4sf (vf, vf);
+    NAND_V4SF nandv4sf3 {}
+
+  const vsi __builtin_altivec_nand_v4si (vsi, vsi);
+    NAND_V4SI nandv4si3 {}
+
+  const vui __builtin_altivec_nand_v4si_uns (vui, vui);
+    NAND_V4SI_UNS nandv4si3 {}
+
+  const vss __builtin_altivec_nand_v8hi (vss, vss);
+    NAND_V8HI nandv8hi3 {}
+
+  const vus __builtin_altivec_nand_v8hi_uns (vus, vus);
+    NAND_V8HI_UNS nandv8hi3 {}
+
+  const vsc __builtin_altivec_neg_v16qi (vsc);
+    NEG_V16QI negv16qi2 {}
+
+  const vd __builtin_altivec_neg_v2df (vd);
+    NEG_V2DF negv2df2 {}
+
+  const vsll __builtin_altivec_neg_v2di (vsll);
+    NEG_V2DI negv2di2 {}
+
+  const vf __builtin_altivec_neg_v4sf (vf);
+    NEG_V4SF negv4sf2 {}
+
+  const vsi __builtin_altivec_neg_v4si (vsi);
+    NEG_V4SI negv4si2 {}
+
+  const vss __builtin_altivec_neg_v8hi (vss);
+    NEG_V8HI negv8hi2 {}
+
+  const vsc __builtin_altivec_orc_v16qi (vsc, vsc);
+    ORC_V16QI orcv16qi3 {}
+
+  const vuc __builtin_altivec_orc_v16qi_uns (vuc, vuc);
+    ORC_V16QI_UNS orcv16qi3 {}
+
+  const vsq __builtin_altivec_orc_v1ti (vsq, vsq);
+    ORC_V1TI orcv1ti3 {}
+
+  const vuq __builtin_altivec_orc_v1ti_uns (vuq, vuq);
+    ORC_V1TI_UNS orcv1ti3 {}
+
+  const vd __builtin_altivec_orc_v2df (vd, vd);
+    ORC_V2DF orcv2df3 {}
+
+  const vsll __builtin_altivec_orc_v2di (vsll, vsll);
+    ORC_V2DI orcv2di3 {}
+
+  const vull __builtin_altivec_orc_v2di_uns (vull, vull);
+    ORC_V2DI_UNS orcv2di3 {}
+
+  const vf __builtin_altivec_orc_v4sf (vf, vf);
+    ORC_V4SF orcv4sf3 {}
+
+  const vsi __builtin_altivec_orc_v4si (vsi, vsi);
+    ORC_V4SI orcv4si3 {}
+
+  const vui __builtin_altivec_orc_v4si_uns (vui, vui);
+    ORC_V4SI_UNS orcv4si3 {}
+
+  const vss __builtin_altivec_orc_v8hi (vss, vss);
+    ORC_V8HI orcv8hi3 {}
+
+  const vus __builtin_altivec_orc_v8hi_uns (vus, vus);
+    ORC_V8HI_UNS orcv8hi3 {}
+
+  const vsc __builtin_altivec_vclzb (vsc);
+    VCLZB clzv16qi2 {}
+
+  const vsll __builtin_altivec_vclzd (vsll);
+    VCLZD clzv2di2 {}
+
+  const vss __builtin_altivec_vclzh (vss);
+    VCLZH clzv8hi2 {}
+
+  const vsi __builtin_altivec_vclzw (vsi);
+    VCLZW clzv4si2 {}
+
+  const vuc __builtin_altivec_vgbbd (vuc);
+    VGBBD p8v_vgbbd {}
+
+  const vsq __builtin_altivec_vaddcuq (vsq, vsq);
+    VADDCUQ altivec_vaddcuq {}
+
+  const vsq __builtin_altivec_vaddecuq (vsq, vsq, vsq);
+    VADDECUQ altivec_vaddecuq {}
+
+  const vsq __builtin_altivec_vaddeuqm (vsq, vsq, vsq);
+    VADDEUQM altivec_vaddeuqm {}
+
+  const vsll __builtin_altivec_vaddudm (vsll, vsll);
+    VADDUDM addv2di3 {}
+
+  const vsq __builtin_altivec_vadduqm (vsq, vsq);
+    VADDUQM altivec_vadduqm {}
+
+  const vsll __builtin_altivec_vbpermq (vsc, vsc);
+    VBPERMQ altivec_vbpermq {}
+
+  const vsc __builtin_altivec_vbpermq2 (vsc, vsc);
+    VBPERMQ2 altivec_vbpermq2 {}
+
+  const vsll __builtin_altivec_vmaxsd (vsll, vsll);
+    VMAXSD smaxv2di3 {}
+
+  const vull __builtin_altivec_vmaxud (vull, vull);
+    VMAXUD umaxv2di3 {}
+
+  const vsll __builtin_altivec_vminsd (vsll, vsll);
+    VMINSD sminv2di3 {}
+
+  const vull __builtin_altivec_vminud (vull, vull);
+    VMINUD uminv2di3 {}
+
+  const vd __builtin_altivec_vmrgew_v2df (vd, vd);
+    VMRGEW_V2DF p8_vmrgew_v2df {}
+
+  const vsll __builtin_altivec_vmrgew_v2di (vsll, vsll);
+    VMRGEW_V2DI p8_vmrgew_v2di {}
+
+  const vf __builtin_altivec_vmrgew_v4sf (vf, vf);
+    VMRGEW_V4SF p8_vmrgew_v4sf {}
+
+  const vsi __builtin_altivec_vmrgew_v4si (vsi, vsi);
+    VMRGEW_V4SI p8_vmrgew_v4si {}
+
+  const vd __builtin_altivec_vmrgow_v2df (vd, vd);
+    VMRGOW_V2DF p8_vmrgow_v2df {}
+
+  const vsll __builtin_altivec_vmrgow_v2di (vsll, vsll);
+    VMRGOW_V2DI p8_vmrgow_v2di {}
+
+  const vf __builtin_altivec_vmrgow_v4sf (vf, vf);
+    VMRGOW_V4SF p8_vmrgow_v4sf {}
+
+  const vsi __builtin_altivec_vmrgow_v4si (vsi, vsi);
+    VMRGOW_V4SI p8_vmrgow_v4si {}
+
+  const vsc __builtin_altivec_vpermxor (vsc, vsc, vsc);
+    VPERMXOR altivec_vpermxor {}
+
+  const vsi __builtin_altivec_vpksdss (vsll, vsll);
+    VPKSDSS altivec_vpksdss {}
+
+  const vsi __builtin_altivec_vpksdus (vsll, vsll);
+    VPKSDUS altivec_vpksdus {}
+
+  const vsi __builtin_altivec_vpkudum (vsll, vsll);
+    VPKUDUM altivec_vpkudum {}
+
+  const vsi __builtin_altivec_vpkudus (vsll, vsll);
+    VPKUDUS altivec_vpkudus {}
+
+  const vsc __builtin_altivec_vpmsumb (vsc, vsc);
+    VPMSUMB_A crypto_vpmsumb {}
+
+  const vsll __builtin_altivec_vpmsumd (vsll, vsll);
+    VPMSUMD_A crypto_vpmsumd {}
+
+  const vss __builtin_altivec_vpmsumh (vss, vss);
+    VPMSUMH_A crypto_vpmsumh {}
+
+  const vsi __builtin_altivec_vpmsumw (vsi, vsi);
+    VPMSUMW_A crypto_vpmsumw {}
+
+  const vsc __builtin_altivec_vpopcntb (vsc);
+    VPOPCNTB popcountv16qi2 {}
+
+  const vsll __builtin_altivec_vpopcntd (vsll);
+    VPOPCNTD popcountv2di2 {}
+
+  const vss __builtin_altivec_vpopcnth (vss);
+    VPOPCNTH popcountv8hi2 {}
+
+  const vsc __builtin_altivec_vpopcntub (vsc);
+    VPOPCNTUB popcountv16qi2 {}
+
+  const vsll __builtin_altivec_vpopcntud (vsll);
+    VPOPCNTUD popcountv2di2 {}
+
+  const vss __builtin_altivec_vpopcntuh (vss);
+    VPOPCNTUH popcountv8hi2 {}
+
+  const vsi __builtin_altivec_vpopcntuw (vsi);
+    VPOPCNTUW popcountv4si2 {}
+
+  const vsi __builtin_altivec_vpopcntw (vsi);
+    VPOPCNTW popcountv4si2 {}
+
+  const vsll __builtin_altivec_vrld (vsll, vsll);
+    VRLD vrotlv2di3 {}
+
+  const vsll __builtin_altivec_vsld (vsll, vsll);
+    VSLD vashlv2di3 {}
+
+  const vsll __builtin_altivec_vsrad (vsll, vsll);
+    VSRAD vashrv2di3 {}
+
+  const vsll __builtin_altivec_vsrd (vsll, vull);
+    VSRD vlshrv2di3 {}
+
+  const vsq __builtin_altivec_vsubcuq (vsq, vsq);
+    VSUBCUQ altivec_vsubcuq {}
+
+  const vsq __builtin_altivec_vsubecuq (vsq, vsq, vsq);
+    VSUBECUQ altivec_vsubecuq {}
+
+  const vsq __builtin_altivec_vsubeuqm (vsq, vsq, vsq);
+    VSUBEUQM altivec_vsubeuqm {}
+
+  const vsll __builtin_altivec_vsubudm (vsll, vsll);
+    VSUBUDM subv2di3 {}
+
+  const vsq __builtin_altivec_vsubuqm (vsq, vsq);
+    VSUBUQM altivec_vsubuqm {}
+
+  const vsll __builtin_altivec_vupkhsw (vsi);
+    VUPKHSW altivec_vupkhsw {}
+
+  const vsll __builtin_altivec_vupklsw (vsi);
+    VUPKLSW altivec_vupklsw {}
+
+  const vsq __builtin_bcdadd_v1ti (vsq, vsq, const int<1>);
+    BCDADD_V1TI bcdadd_v1ti {}
+
+  const vsc __builtin_bcdadd_v16qi (vsc, vsc, const int<1>);
+    BCDADD_V16QI bcdadd_v16qi {}
+
+  const signed int __builtin_bcdadd_eq_v1ti (vsq, vsq, const int<1>);
+    BCDADD_EQ_V1TI bcdadd_eq_v1ti {}
+
+  const signed int __builtin_bcdadd_eq_v16qi (vsc, vsc, const int<1>);
+    BCDADD_EQ_V16QI bcdadd_eq_v16qi {}
+
+  const signed int __builtin_bcdadd_gt_v1ti (vsq, vsq, const int<1>);
+    BCDADD_GT_V1TI bcdadd_gt_v1ti {}
+
+  const signed int __builtin_bcdadd_gt_v16qi (vsc, vsc, const int<1>);
+    BCDADD_GT_V16QI bcdadd_gt_v16qi {}
+
+  const signed int __builtin_bcdadd_lt_v1ti (vsq, vsq, const int<1>);
+    BCDADD_LT_V1TI bcdadd_lt_v1ti {}
+
+  const signed int __builtin_bcdadd_lt_v16qi (vsc, vsc, const int<1>);
+    BCDADD_LT_V16QI bcdadd_lt_v16qi {}
+
+  const signed int __builtin_bcdadd_ov_v1ti (vsq, vsq, const int<1>);
+    BCDADD_OV_V1TI bcdadd_unordered_v1ti {}
+
+  const signed int __builtin_bcdadd_ov_v16qi (vsc, vsc, const int<1>);
+    BCDADD_OV_V16QI bcdadd_unordered_v16qi {}
+
+  const signed int __builtin_bcdinvalid_v1ti (vsq);
+    BCDINVALID_V1TI bcdinvalid_v1ti {}
+
+  const signed int __builtin_bcdinvalid_v16qi (vsc);
+    BCDINVALID_V16QI bcdinvalid_v16qi {}
+
+  const vsq __builtin_bcdsub_v1ti (vsq, vsq, const int<1>);
+    BCDSUB_V1TI bcdsub_v1ti {}
+
+  const vsc __builtin_bcdsub_v16qi (vsc, vsc, const int<1>);
+    BCDSUB_V16QI bcdsub_v16qi {}
+
+  const signed int __builtin_bcdsub_eq_v1ti (vsq, vsq, const int<1>);
+    BCDSUB_EQ_V1TI bcdsub_eq_v1ti {}
+
+  const signed int __builtin_bcdsub_eq_v16qi (vsc, vsc, const int<1>);
+    BCDSUB_EQ_V16QI bcdsub_eq_v16qi {}
+
+  const signed int __builtin_bcdsub_ge_v1ti (vsq, vsq, const int<1>);
+    BCDSUB_GE_V1TI bcdsub_ge_v1ti {}
+
+  const signed int __builtin_bcdsub_ge_v16qi (vsc, vsc, const int<1>);
+    BCDSUB_GE_V16QI bcdsub_ge_v16qi {}
+
+  const signed int __builtin_bcdsub_gt_v1ti (vsq, vsq, const int<1>);
+    BCDSUB_GT_V1TI bcdsub_gt_v1ti {}
+
+  const signed int __builtin_bcdsub_gt_v16qi (vsc, vsc, const int<1>);
+    BCDSUB_GT_V16QI bcdsub_gt_v16qi {}
+
+  const signed int __builtin_bcdsub_le_v1ti (vsq, vsq, const int<1>);
+    BCDSUB_LE_V1TI bcdsub_le_v1ti {}
+
+  const signed int __builtin_bcdsub_le_v16qi (vsc, vsc, const int<1>);
+    BCDSUB_LE_V16QI bcdsub_le_v16qi {}
+
+  const signed int __builtin_bcdsub_lt_v1ti (vsq, vsq, const int<1>);
+    BCDSUB_LT_V1TI bcdsub_lt_v1ti {}
+
+  const signed int __builtin_bcdsub_lt_v16qi (vsc, vsc, const int<1>);
+    BCDSUB_LT_V16QI bcdsub_lt_v16qi {}
+
+  const signed int __builtin_bcdsub_ov_v1ti (vsq, vsq, const int<1>);
+    BCDSUB_OV_V1TI bcdsub_unordered_v1ti {}
+
+  const signed int __builtin_bcdsub_ov_v16qi (vsc, vsc, const int<1>);
+    BCDSUB_OV_V16QI bcdsub_unordered_v16qi {}
+
+  const vuc __builtin_crypto_vpermxor_v16qi (vuc, vuc, vuc);
+    VPERMXOR_V16QI crypto_vpermxor_v16qi {}
+
+  const vull __builtin_crypto_vpermxor_v2di (vull, vull, vull);
+    VPERMXOR_V2DI crypto_vpermxor_v2di {}
+
+  const vui __builtin_crypto_vpermxor_v4si (vui, vui, vui);
+    VPERMXOR_V4SI crypto_vpermxor_v4si {}
+
+  const vus __builtin_crypto_vpermxor_v8hi (vus, vus, vus);
+    VPERMXOR_V8HI crypto_vpermxor_v8hi {}
+
+  const vuc __builtin_crypto_vpmsumb (vuc, vuc);
+    VPMSUMB crypto_vpmsumb {}
+
+  const vull __builtin_crypto_vpmsumd (vull, vull);
+    VPMSUMD crypto_vpmsumd {}
+
+  const vus __builtin_crypto_vpmsumh (vus, vus);
+    VPMSUMH crypto_vpmsumh {}
+
+  const vui __builtin_crypto_vpmsumw (vui, vui);
+    VPMSUMW crypto_vpmsumw {}
+
+  const vf __builtin_vsx_float2_v2df (vd, vd);
+    FLOAT2_V2DF float2_v2df {}
+
+  const vf __builtin_vsx_float2_v2di (vsll, vsll);
+    FLOAT2_V2DI float2_v2di {}
+
+  const vsc __builtin_vsx_revb_v16qi (vsc);
+    REVB_V16QI revb_v16qi {}
+
+  const vsq __builtin_vsx_revb_v1ti (vsq);
+    REVB_V1TI revb_v1ti {}
+
+  const vd __builtin_vsx_revb_v2df (vd);
+    REVB_V2DF revb_v2df {}
+
+  const vsll __builtin_vsx_revb_v2di (vsll);
+    REVB_V2DI revb_v2di {}
+
+  const vf __builtin_vsx_revb_v4sf (vf);
+    REVB_V4SF revb_v4sf {}
+
+  const vsi __builtin_vsx_revb_v4si (vsi);
+    REVB_V4SI revb_v4si {}
+
+  const vss __builtin_vsx_revb_v8hi (vss);
+    REVB_V8HI revb_v8hi {}
+
+  const vf __builtin_vsx_uns_float2_v2di (vsll, vsll);
+    UNS_FLOAT2_V2DI uns_float2_v2di {}
+
+  const vsi __builtin_vsx_vsigned2_v2df (vd, vd);
+    VEC_VSIGNED2_V2DF vsigned2_v2df {}
+
+  const vsi __builtin_vsx_vunsigned2_v2df (vd, vd);
+    VEC_VUNSIGNED2_V2DF vunsigned2_v2df {}
+
+  const vf __builtin_vsx_xscvdpspn (double);
+    XSCVDPSPN vsx_xscvdpspn {}
+
+  const double __builtin_vsx_xscvspdpn (vf);
+    XSCVSPDPN vsx_xscvspdpn {}
-- 
2.27.0
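
Two of the power8-vector entries in use, as a sketch (not part of the
patch; -mcpu=power8 assumed, function names illustrative):

  #include <altivec.h>

  /* Carry-less (polynomial) multiply of the doubleword lanes, per the
     VPMSUMD entry; a building block for CRC and GCM kernels.  */
  vector unsigned long long
  clmul_dw (vector unsigned long long a, vector unsigned long long b)
  {
    return __builtin_crypto_vpmsumd (a, b);
  }

  /* Per-doubleword population count, per the VPOPCNTD entry.  */
  vector signed long long
  popcnt_dw (vector signed long long v)
  {
    return __builtin_altivec_vpopcntd (v);
  }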



* [PATCH 08/34] rs6000: Add Power9 builtins
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (6 preceding siblings ...)
  2021-07-29 13:30 ` [PATCH 07/34] rs6000: Add power8-vector builtins Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-23 21:40   ` Segher Boessenkool
  2021-07-29 13:30 ` [PATCH 09/34] rs6000: Add more type nodes to support builtin processing Bill Schmidt
                   ` (25 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-15  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def: Add power9-vector, power9,
	and power9-64 stanzas.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 375 +++++++++++++++++++++++
 1 file changed, 375 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index f13fb13b0ad..8885df089a6 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -2434,3 +2434,378 @@
 
   const double __builtin_vsx_xscvspdpn (vf);
     XSCVSPDPN vsx_xscvspdpn {}
+
+
+; Power9 vector builtins.
+[power9-vector]
+  const vss __builtin_altivec_convert_4f32_8f16 (vf, vf);
+    CONVERT_4F32_8F16 convert_4f32_8f16 {}
+
+  const vss __builtin_altivec_convert_4f32_8i16 (vf, vf);
+    CONVERT_4F32_8I16 convert_4f32_8i16 {}
+
+  const signed int __builtin_altivec_first_match_index_v16qi (vsc, vsc);
+    VFIRSTMATCHINDEX_V16QI first_match_index_v16qi {}
+
+  const signed int __builtin_altivec_first_match_index_v8hi (vss, vss);
+    VFIRSTMATCHINDEX_V8HI first_match_index_v8hi {}
+
+  const signed int __builtin_altivec_first_match_index_v4si (vsi, vsi);
+    VFIRSTMATCHINDEX_V4SI first_match_index_v4si {}
+
+  const signed int __builtin_altivec_first_match_or_eos_index_v16qi (vsc, vsc);
+    VFIRSTMATCHOREOSINDEX_V16QI first_match_or_eos_index_v16qi {}
+
+  const signed int __builtin_altivec_first_match_or_eos_index_v8hi (vss, vss);
+    VFIRSTMATCHOREOSINDEX_V8HI first_match_or_eos_index_v8hi {}
+
+  const signed int __builtin_altivec_first_match_or_eos_index_v4si (vsi, vsi);
+    VFIRSTMATCHOREOSINDEX_V4SI first_match_or_eos_index_v4si {}
+
+  const signed int __builtin_altivec_first_mismatch_index_v16qi (vsc, vsc);
+    VFIRSTMISMATCHINDEX_V16QI first_mismatch_index_v16qi {}
+
+  const signed int __builtin_altivec_first_mismatch_index_v8hi (vss, vss);
+    VFIRSTMISMATCHINDEX_V8HI first_mismatch_index_v8hi {}
+
+  const signed int __builtin_altivec_first_mismatch_index_v4si (vsi, vsi);
+    VFIRSTMISMATCHINDEX_V4SI first_mismatch_index_v4si {}
+
+  const signed int __builtin_altivec_first_mismatch_or_eos_index_v16qi (vsc, vsc);
+    VFIRSTMISMATCHOREOSINDEX_V16QI first_mismatch_or_eos_index_v16qi {}
+
+  const signed int __builtin_altivec_first_mismatch_or_eos_index_v8hi (vss, vss);
+    VFIRSTMISMATCHOREOSINDEX_V8HI first_mismatch_or_eos_index_v8hi {}
+
+  const signed int __builtin_altivec_first_mismatch_or_eos_index_v4si (vsi, vsi);
+    VFIRSTMISMATCHOREOSINDEX_V4SI first_mismatch_or_eos_index_v4si {}
+
+  const vsc __builtin_altivec_vadub (vsc, vsc);
+    VADUB vaduv16qi3 {}
+
+  const vss __builtin_altivec_vaduh (vss, vss);
+    VADUH vaduv8hi3 {}
+
+  const vsi __builtin_altivec_vaduw (vsi, vsi);
+    VADUW vaduv4si3 {}
+
+  const vsll __builtin_altivec_vbpermd (vsll, vsc);
+    VBPERMD altivec_vbpermd {}
+
+  const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
+    VCLZLSBB_V16QI vclzlsbb_v16qi {}
+
+  const signed int __builtin_altivec_vclzlsbb_v4si (vsi);
+    VCLZLSBB_V4SI vclzlsbb_v4si {}
+
+  const signed int __builtin_altivec_vclzlsbb_v8hi (vss);
+    VCLZLSBB_V8HI vclzlsbb_v8hi {}
+
+  const vsc __builtin_altivec_vctzb (vsc);
+    VCTZB ctzv16qi2 {}
+
+  const vsll __builtin_altivec_vctzd (vsll);
+    VCTZD ctzv2di2 {}
+
+  const vss __builtin_altivec_vctzh (vss);
+    VCTZH ctzv8hi2 {}
+
+  const vsi __builtin_altivec_vctzw (vsi);
+    VCTZW ctzv4si2 {}
+
+  const signed int __builtin_altivec_vctzlsbb_v16qi (vsc);
+    VCTZLSBB_V16QI vctzlsbb_v16qi {}
+
+  const signed int __builtin_altivec_vctzlsbb_v4si (vsi);
+    VCTZLSBB_V4SI vctzlsbb_v4si {}
+
+  const signed int __builtin_altivec_vctzlsbb_v8hi (vss);
+    VCTZLSBB_V8HI vctzlsbb_v8hi {}
+
+  const signed int __builtin_altivec_vcmpaeb_p (vsc, vsc);
+    VCMPAEB_P vector_ae_v16qi_p {}
+
+  const signed int __builtin_altivec_vcmpaed_p (vsll, vsll);
+    VCMPAED_P vector_ae_v2di_p {}
+
+  const signed int __builtin_altivec_vcmpaedp_p (vd, vd);
+    VCMPAEDP_P vector_ae_v2df_p {}
+
+  const signed int __builtin_altivec_vcmpaefp_p (vf, vf);
+    VCMPAEFP_P vector_ae_v4sf_p {}
+
+  const signed int __builtin_altivec_vcmpaeh_p (vss, vss);
+    VCMPAEH_P vector_ae_v8hi_p {}
+
+  const signed int __builtin_altivec_vcmpaew_p (vsi, vsi);
+    VCMPAEW_P vector_ae_v4si_p {}
+
+  const vsc __builtin_altivec_vcmpneb (vsc, vsc);
+    VCMPNEB vcmpneb {}
+
+  const signed int __builtin_altivec_vcmpneb_p (vsc, vsc);
+    VCMPNEB_P vector_ne_v16qi_p {}
+
+  const signed int __builtin_altivec_vcmpned_p (vsll, vsll);
+    VCMPNED_P vector_ne_v2di_p {}
+
+  const signed int __builtin_altivec_vcmpnedp_p (vd, vd);
+    VCMPNEDP_P vector_ne_v2df_p {}
+
+  const signed int __builtin_altivec_vcmpnefp_p (vf, vf);
+    VCMPNEFP_P vector_ne_v4sf_p {}
+
+  const vss __builtin_altivec_vcmpneh (vss, vss);
+    VCMPNEH vcmpneh {}
+
+  const signed int __builtin_altivec_vcmpneh_p (vss, vss);
+    VCMPNEH_P vector_ne_v8hi_p {}
+
+  const vsi __builtin_altivec_vcmpnew (vsi, vsi);
+    VCMPNEW vcmpnew {}
+
+  const signed int __builtin_altivec_vcmpnew_p (vsi, vsi);
+    VCMPNEW_P vector_ne_v4si_p {}
+
+  const vsc __builtin_altivec_vcmpnezb (vsc, vsc);
+    CMPNEZB vcmpnezb {}
+
+  const signed int __builtin_altivec_vcmpnezb_p (signed int, vsc, vsc);
+    VCMPNEZB_P vector_nez_v16qi_p {pred}
+
+  const vss __builtin_altivec_vcmpnezh (vss, vss);
+    CMPNEZH vcmpnezh {}
+
+  const signed int __builtin_altivec_vcmpnezh_p (signed int, vss, vss);
+    VCMPNEZH_P vector_nez_v8hi_p {pred}
+
+  const vsi __builtin_altivec_vcmpnezw (vsi, vsi);
+    CMPNEZW vcmpnezw {}
+
+  const signed int __builtin_altivec_vcmpnezw_p (signed int, vsi, vsi);
+    VCMPNEZW_P vector_nez_v4si_p {pred}
+
+  const signed int __builtin_altivec_vextublx (signed int, vsc);
+    VEXTUBLX vextublx {}
+
+  const signed int __builtin_altivec_vextubrx (signed int, vsc);
+    VEXTUBRX vextubrx {}
+
+  const signed int __builtin_altivec_vextuhlx (signed int, vss);
+    VEXTUHLX vextuhlx {}
+
+  const signed int __builtin_altivec_vextuhrx (signed int, vss);
+    VEXTUHRX vextuhrx {}
+
+  const signed int __builtin_altivec_vextuwlx (signed int, vsi);
+    VEXTUWLX vextuwlx {}
+
+  const signed int __builtin_altivec_vextuwrx (signed int, vsi);
+    VEXTUWRX vextuwrx {}
+
+  const vsq __builtin_altivec_vmsumudm (vsll, vsll, vsq);
+    VMSUMUDM altivec_vmsumudm {}
+
+  const vsll __builtin_altivec_vprtybd (vsll);
+    VPRTYBD parityv2di2 {}
+
+  const vsq __builtin_altivec_vprtybq (vsq);
+    VPRTYBQ parityv1ti2 {}
+
+  const vsi __builtin_altivec_vprtybw (vsi);
+    VPRTYBW parityv4si2 {}
+
+  const vsll __builtin_altivec_vrldmi (vsll, vsll, vsll);
+    VRLDMI altivec_vrldmi {}
+
+  const vsll __builtin_altivec_vrldnm (vsll, vsll);
+    VRLDNM altivec_vrldnm {}
+
+  const vsi __builtin_altivec_vrlwmi (vsi, vsi, vsi);
+    VRLWMI altivec_vrlwmi {}
+
+  const vsi __builtin_altivec_vrlwnm (vsi, vsi);
+    VRLWNM altivec_vrlwnm {}
+
+  const vsll __builtin_altivec_vsignextsb2d (vsc);
+    VSIGNEXTSB2D vsignextend_qi_v2di {}
+
+  const vsi __builtin_altivec_vsignextsb2w (vsc);
+    VSIGNEXTSB2W vsignextend_qi_v4si {}
+
+  const vsll __builtin_altivec_vsignextsh2d (vss);
+    VSIGNEXTSH2D vsignextend_hi_v2di {}
+
+  const vsi __builtin_altivec_vsignextsh2w (vss);
+    VSIGNEXTSH2W vsignextend_hi_v4si {}
+
+  const vsll __builtin_altivec_vsignextsw2d (vsi);
+    VSIGNEXTSW2D vsignextend_si_v2di {}
+
+  const vsc __builtin_altivec_vslv (vsc, vsc);
+    VSLV vslv {}
+
+  const vsc __builtin_altivec_vsrv (vsc, vsc);
+    VSRV vsrv {}
+
+  const signed int __builtin_scalar_byte_in_range (signed int, signed int);
+    CMPRB cmprb {}
+
+  const signed int __builtin_scalar_byte_in_either_range (signed int, signed int);
+    CMPRB2 cmprb2 {}
+
+  const vsll __builtin_vsx_extract4b (vsc, const int[0,12]);
+    EXTRACT4B extract4b {}
+
+  const vd __builtin_vsx_extract_exp_dp (vd);
+    VEEDP xvxexpdp {}
+
+  const vf __builtin_vsx_extract_exp_sp (vf);
+    VEESP xvxexpsp {}
+
+  const vd __builtin_vsx_extract_sig_dp (vd);
+    VESDP xvxsigdp {}
+
+  const vf __builtin_vsx_extract_sig_sp (vf);
+    VESSP xvxsigsp {}
+
+  const vsc __builtin_vsx_insert4b (vsi, vsc, const int[0,12]);
+    INSERT4B insert4b {}
+
+  const vd __builtin_vsx_insert_exp_dp (vd, vd);
+    VIEDP xviexpdp {}
+
+  const vf __builtin_vsx_insert_exp_sp (vf, vf);
+    VIESP xviexpsp {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_dp_eq (double, double);
+    VSCEDPEQ xscmpexpdp_eq {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_dp_gt (double, double);
+    VSCEDPGT xscmpexpdp_gt {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_dp_lt (double, double);
+    VSCEDPLT xscmpexpdp_lt {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_dp_unordered (double, double);
+    VSCEDPUO xscmpexpdp_unordered {}
+
+  const signed int __builtin_vsx_scalar_test_data_class_dp (double, const int<7>);
+    VSTDCDP xststdcdp {}
+
+  const signed int __builtin_vsx_scalar_test_data_class_sp (float, const int<7>);
+    VSTDCSP xststdcsp {}
+
+  const signed int __builtin_vsx_scalar_test_neg_dp (double);
+    VSTDCNDP xststdcnegdp {}
+
+  const signed int __builtin_vsx_scalar_test_neg_sp (float);
+    VSTDCNSP xststdcnegsp {}
+
+  const vsll __builtin_vsx_test_data_class_dp (vd, const int<7>);
+    VTDCDP xvtstdcdp {}
+
+  const vsi __builtin_vsx_test_data_class_sp (vf, const int<7>);
+    VTDCSP xvtstdcsp {}
+
+  const vf __builtin_vsx_vextract_fp_from_shorth (vss);
+    VEXTRACT_FP_FROM_SHORTH vextract_fp_from_shorth {}
+
+  const vf __builtin_vsx_vextract_fp_from_shortl (vss);
+    VEXTRACT_FP_FROM_SHORTL vextract_fp_from_shortl {}
+
+  const vd __builtin_vsx_xxbrd_v2df (vd);
+    XXBRD_V2DF p9_xxbrd_v2df {}
+
+  const vsll __builtin_vsx_xxbrd_v2di (vsll);
+    XXBRD_V2DI p9_xxbrd_v2di {}
+
+  const vss __builtin_vsx_xxbrh_v8hi (vss);
+    XXBRH_V8HI p9_xxbrh_v8hi {}
+
+  const vsc __builtin_vsx_xxbrq_v16qi (vsc);
+    XXBRQ_V16QI p9_xxbrq_v16qi {}
+
+  const vsq __builtin_vsx_xxbrq_v1ti (vsq);
+    XXBRQ_V1TI p9_xxbrq_v1ti {}
+
+  const vf __builtin_vsx_xxbrw_v4sf (vf);
+    XXBRW_V4SF p9_xxbrw_v4sf {}
+
+  const vsi __builtin_vsx_xxbrw_v4si (vsi);
+    XXBRW_V4SI p9_xxbrw_v4si {}
+
+
+; Miscellaneous P9 functions
+[power9]
+  signed long long __builtin_darn ();
+    DARN darn {}
+
+  signed int __builtin_darn_32 ();
+    DARN_32 darn_32 {}
+
+  signed long long __builtin_darn_raw ();
+    DARN_RAW darn_raw {}
+
+  double __builtin_mffsl ();
+    MFFSL rs6000_mffsl {}
+
+  const signed int __builtin_dtstsfi_eq_dd (const int<6>, _Decimal64);
+    TSTSFI_EQ_DD dfptstsfi_eq_dd {}
+
+  const signed int __builtin_dtstsfi_eq_td (const int<6>, _Decimal128);
+    TSTSFI_EQ_TD dfptstsfi_eq_td {}
+
+  const signed int __builtin_dtstsfi_gt_dd (const int<6>, _Decimal64);
+    TSTSFI_GT_DD dfptstsfi_gt_dd {}
+
+  const signed int __builtin_dtstsfi_gt_td (const int<6>, _Decimal128);
+    TSTSFI_GT_TD dfptstsfi_gt_td {}
+
+  const signed int __builtin_dtstsfi_lt_dd (const int<6>, _Decimal64);
+    TSTSFI_LT_DD dfptstsfi_lt_dd {}
+
+  const signed int __builtin_dtstsfi_lt_td (const int<6>, _Decimal128);
+    TSTSFI_LT_TD dfptstsfi_lt_td {}
+
+  const signed int __builtin_dtstsfi_ov_dd (const int<6>, _Decimal64);
+    TSTSFI_OV_DD dfptstsfi_unordered_dd {}
+
+  const signed int __builtin_dtstsfi_ov_td (const int<6>, _Decimal128);
+    TSTSFI_OV_TD dfptstsfi_unordered_td {}
+
+
+; These things need some review to see whether they really require
+; MASK_POWERPC64.  For xsxexpdp, this seems to be fine for 32-bit,
+; because the result will always fit in 32 bits and the return
+; value is SImode; but the pattern currently requires TARGET_64BIT.
+; On the other hand, xsxsigdp has a result that doesn't fit in
+; 32 bits, and the return value is DImode, so it seems that
+; TARGET_64BIT (actually TARGET_POWERPC64) is justified.  TBD. ####
+[power9-64]
+  void __builtin_altivec_xst_len_r (vsc, void *, long);
+    XST_LEN_R xst_len_r {}
+
+  void __builtin_altivec_stxvl (vsc, void *, long);
+    STXVL stxvl {}
+
+  const signed int __builtin_scalar_byte_in_set (signed int, signed long long);
+    CMPEQB cmpeqb {}
+
+  pure vsc __builtin_vsx_lxvl (const void *, signed long);
+    LXVL lxvl {}
+
+  const signed long __builtin_vsx_scalar_extract_exp (double);
+    VSEEDP xsxexpdp {}
+
+  const signed long __builtin_vsx_scalar_extract_sig (double);
+    VSESDP xsxsigdp {}
+
+  const double __builtin_vsx_scalar_insert_exp (unsigned long long, unsigned long long);
+    VSIEDP xsiexpdp {}
+
+  const double __builtin_vsx_scalar_insert_exp_dp (double, unsigned long long);
+    VSIEDPF xsiexpdpf {}
+
+  pure vsc __builtin_vsx_xl_len_r (void *, signed long);
+    XL_LEN_R xl_len_r {}
-- 
2.27.0
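
One usage note on the [power9] DARN entries, with a sketch (not part
of the patch; -mcpu=power9 assumed): the darn instruction can fail
transiently and signals this by returning all ones, so callers should
retry:

  /* Fetch a conditioned 64-bit hardware random number.  Retries on
     the all-ones failure indication (which also discards the rare
     legitimate all-ones value).  */
  unsigned long long
  hw_random (void)
  {
    long long r;
    do
      r = __builtin_darn ();
    while (r == -1LL);
    return (unsigned long long) r;
  }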



* [PATCH 09/34] rs6000: Add more type nodes to support builtin processing
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (7 preceding siblings ...)
  2021-07-29 13:30 ` [PATCH 08/34] rs6000: Add Power9 builtins Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-23 22:15   ` Segher Boessenkool
  2021-07-29 13:30 ` [PATCH 10/34] rs6000: Add Power10 builtins Bill Schmidt
                   ` (24 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-10  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (rs6000_init_builtins): Initialize
	various pointer type nodes.
	* config/rs6000/rs6000.h (rs6000_builtin_type_index): Add enum
	values for various pointer types.
	(ptr_V16QI_type_node): New macro.
	(ptr_V1TI_type_node): New macro.
	(ptr_V2DI_type_node): New macro.
	(ptr_V2DF_type_node): New macro.
	(ptr_V4SI_type_node): New macro.
	(ptr_V4SF_type_node): New macro.
	(ptr_V8HI_type_node): New macro.
	(ptr_unsigned_V16QI_type_node): New macro.
	(ptr_unsigned_V1TI_type_node): New macro.
	(ptr_unsigned_V8HI_type_node): New macro.
	(ptr_unsigned_V4SI_type_node): New macro.
	(ptr_unsigned_V2DI_type_node): New macro.
	(ptr_bool_V16QI_type_node): New macro.
	(ptr_bool_V8HI_type_node): New macro.
	(ptr_bool_V4SI_type_node): New macro.
	(ptr_bool_V2DI_type_node): New macro.
	(ptr_bool_V1TI_type_node): New macro.
	(ptr_pixel_type_node): New macro.
	(ptr_intQI_type_node): New macro.
	(ptr_uintQI_type_node): New macro.
	(ptr_intHI_type_node): New macro.
	(ptr_uintHI_type_node): New macro.
	(ptr_intSI_type_node): New macro.
	(ptr_uintSI_type_node): New macro.
	(ptr_intDI_type_node): New macro.
	(ptr_uintDI_type_node): New macro.
	(ptr_intTI_type_node): New macro.
	(ptr_uintTI_type_node): New macro.
	(ptr_long_integer_type_node): New macro.
	(ptr_long_unsigned_type_node): New macro.
	(ptr_float_type_node): New macro.
	(ptr_double_type_node): New macro.
	(ptr_long_double_type_node): New macro.
	(ptr_dfloat64_type_node): New macro.
	(ptr_dfloat128_type_node): New macro.
	(ptr_ieee128_type_node): New macro.
	(ptr_ibm128_type_node): New macro.
	(ptr_vector_pair_type_node): New macro.
	(ptr_vector_quad_type_node): New macro.
	(ptr_long_long_integer_type_node): New macro.
	(ptr_long_long_unsigned_type_node): New macro.
---
 gcc/config/rs6000/rs6000-call.c | 151 ++++++++++++++++++++++++++++++++
 gcc/config/rs6000/rs6000.h      |  82 +++++++++++++++++
 2 files changed, 233 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 8b16d65e684..b1338191926 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13298,25 +13298,63 @@ rs6000_init_builtins (void)
   V2DI_type_node = rs6000_vector_type (TARGET_POWERPC64 ? "__vector long"
 				       : "__vector long long",
 				       long_long_integer_type_node, 2);
+  ptr_V2DI_type_node
+    = build_pointer_type (build_qualified_type (V2DI_type_node,
+						TYPE_QUAL_CONST));
+
   V2DF_type_node = rs6000_vector_type ("__vector double", double_type_node, 2);
+  ptr_V2DF_type_node
+    = build_pointer_type (build_qualified_type (V2DF_type_node,
+						TYPE_QUAL_CONST));
+
   V4SI_type_node = rs6000_vector_type ("__vector signed int",
 				       intSI_type_node, 4);
+  ptr_V4SI_type_node
+    = build_pointer_type (build_qualified_type (V4SI_type_node,
+						TYPE_QUAL_CONST));
+
   V4SF_type_node = rs6000_vector_type ("__vector float", float_type_node, 4);
+  ptr_V4SF_type_node
+    = build_pointer_type (build_qualified_type (V4SF_type_node,
+						TYPE_QUAL_CONST));
+
   V8HI_type_node = rs6000_vector_type ("__vector signed short",
 				       intHI_type_node, 8);
+  ptr_V8HI_type_node
+    = build_pointer_type (build_qualified_type (V8HI_type_node,
+						TYPE_QUAL_CONST));
+
   V16QI_type_node = rs6000_vector_type ("__vector signed char",
 					intQI_type_node, 16);
+  ptr_V16QI_type_node
+    = build_pointer_type (build_qualified_type (V16QI_type_node,
+						TYPE_QUAL_CONST));
 
   unsigned_V16QI_type_node = rs6000_vector_type ("__vector unsigned char",
 					unsigned_intQI_type_node, 16);
+  ptr_unsigned_V16QI_type_node
+    = build_pointer_type (build_qualified_type (unsigned_V16QI_type_node,
+						TYPE_QUAL_CONST));
+
   unsigned_V8HI_type_node = rs6000_vector_type ("__vector unsigned short",
 				       unsigned_intHI_type_node, 8);
+  ptr_unsigned_V8HI_type_node
+    = build_pointer_type (build_qualified_type (unsigned_V8HI_type_node,
+						TYPE_QUAL_CONST));
+
   unsigned_V4SI_type_node = rs6000_vector_type ("__vector unsigned int",
 				       unsigned_intSI_type_node, 4);
+  ptr_unsigned_V4SI_type_node
+    = build_pointer_type (build_qualified_type (unsigned_V4SI_type_node,
+						TYPE_QUAL_CONST));
+
   unsigned_V2DI_type_node = rs6000_vector_type (TARGET_POWERPC64
 				       ? "__vector unsigned long"
 				       : "__vector unsigned long long",
 				       long_long_unsigned_type_node, 2);
+  ptr_unsigned_V2DI_type_node
+    = build_pointer_type (build_qualified_type (unsigned_V2DI_type_node,
+						TYPE_QUAL_CONST));
 
   opaque_V4SI_type_node = build_opaque_vector_type (intSI_type_node, 4);
 
@@ -13330,9 +13368,15 @@ rs6000_init_builtins (void)
     {
       V1TI_type_node = rs6000_vector_type ("__vector __int128",
 					   intTI_type_node, 1);
+      ptr_V1TI_type_node
+	= build_pointer_type (build_qualified_type (V1TI_type_node,
+						    TYPE_QUAL_CONST));
       unsigned_V1TI_type_node
 	= rs6000_vector_type ("__vector unsigned __int128",
 			      unsigned_intTI_type_node, 1);
+      ptr_unsigned_V1TI_type_node
+	= build_pointer_type (build_qualified_type (unsigned_V1TI_type_node,
+						    TYPE_QUAL_CONST));
     }
 
   /* The 'vector bool ...' types must be kept distinct from 'vector unsigned ...'
@@ -13366,6 +13410,78 @@ rs6000_init_builtins (void)
   dfloat128_type_internal_node = dfloat128_type_node;
   void_type_internal_node = void_type_node;
 
+  ptr_intQI_type_node
+    = build_pointer_type (build_qualified_type (intQI_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_uintQI_type_node
+    = build_pointer_type (build_qualified_type (uintQI_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_intHI_type_node
+    = build_pointer_type (build_qualified_type (intHI_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_uintHI_type_node
+    = build_pointer_type (build_qualified_type (uintHI_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_intSI_type_node
+    = build_pointer_type (build_qualified_type (intSI_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_uintSI_type_node
+    = build_pointer_type (build_qualified_type (uintSI_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_intDI_type_node
+    = build_pointer_type (build_qualified_type (intDI_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_uintDI_type_node
+    = build_pointer_type (build_qualified_type (uintDI_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_intTI_type_node
+    = build_pointer_type (build_qualified_type (intTI_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_uintTI_type_node
+    = build_pointer_type (build_qualified_type (uintTI_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_long_integer_type_node
+    = build_pointer_type
+	(build_qualified_type (long_integer_type_internal_node,
+			       TYPE_QUAL_CONST));
+
+  ptr_long_unsigned_type_node
+    = build_pointer_type
+	(build_qualified_type (long_unsigned_type_internal_node,
+			       TYPE_QUAL_CONST));
+
+  ptr_float_type_node
+    = build_pointer_type (build_qualified_type (float_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_double_type_node
+    = build_pointer_type (build_qualified_type (double_type_internal_node,
+						TYPE_QUAL_CONST));
+  ptr_long_double_type_node
+    = build_pointer_type (build_qualified_type (long_double_type_internal_node,
+						TYPE_QUAL_CONST));
+  if (dfloat64_type_node)
+    ptr_dfloat64_type_node
+      = build_pointer_type (build_qualified_type (dfloat64_type_internal_node,
+						  TYPE_QUAL_CONST));
+  else
+    ptr_dfloat64_type_node = NULL;
+
+  if (dfloat128_type_node)
+    ptr_dfloat128_type_node
+      = build_pointer_type (build_qualified_type (dfloat128_type_internal_node,
+						  TYPE_QUAL_CONST));
+  else
+    ptr_dfloat128_type_node = NULL;
+
+  ptr_long_long_integer_type_node
+    = build_pointer_type
+	(build_qualified_type (long_long_integer_type_internal_node,
+			       TYPE_QUAL_CONST));
+  ptr_long_long_unsigned_type_node
+    = build_pointer_type
+	(build_qualified_type (long_long_unsigned_type_internal_node,
+			       TYPE_QUAL_CONST));
+
   /* 128-bit floating point support.  KFmode is IEEE 128-bit floating point.
      IFmode is the IBM extended 128-bit format that is a pair of doubles.
      TFmode will be either IEEE 128-bit floating point or the IBM double-double
@@ -13393,6 +13509,9 @@ rs6000_init_builtins (void)
 	  SET_TYPE_MODE (ibm128_float_type_node, IFmode);
 	  layout_type (ibm128_float_type_node);
 	}
+      ptr_ibm128_float_type_node
+	= build_pointer_type (build_qualified_type (ibm128_float_type_node,
+						    TYPE_QUAL_CONST));
 
       lang_hooks.types.register_builtin_type (ibm128_float_type_node,
 					      "__ibm128");
@@ -13401,6 +13520,9 @@ rs6000_init_builtins (void)
 	ieee128_float_type_node = long_double_type_node;
       else
 	ieee128_float_type_node = float128_type_node;
+      ptr_ieee128_float_type_node
+	= build_pointer_type (build_qualified_type (ieee128_float_type_node,
+						    TYPE_QUAL_CONST));
 
       lang_hooks.types.register_builtin_type (ieee128_float_type_node,
 					      "__ieee128");
@@ -13421,6 +13543,9 @@ rs6000_init_builtins (void)
       TYPE_USER_ALIGN (vector_pair_type_node) = 0;
       lang_hooks.types.register_builtin_type (vector_pair_type_node,
 					      "__vector_pair");
+      ptr_vector_pair_type_node
+	= build_pointer_type (build_qualified_type (vector_pair_type_node,
+						    TYPE_QUAL_CONST));
 
       vector_quad_type_node = make_node (OPAQUE_TYPE);
       SET_TYPE_MODE (vector_quad_type_node, XOmode);
@@ -13431,6 +13556,9 @@ rs6000_init_builtins (void)
       TYPE_USER_ALIGN (vector_quad_type_node) = 0;
       lang_hooks.types.register_builtin_type (vector_quad_type_node,
 					      "__vector_quad");
+      ptr_vector_quad_type_node
+	= build_pointer_type (build_qualified_type (vector_quad_type_node,
+						    TYPE_QUAL_CONST));
     }
 
   /* Initialize the modes for builtin_function_type, mapping a machine mode to
@@ -13481,18 +13609,41 @@ rs6000_init_builtins (void)
 
   bool_V16QI_type_node = rs6000_vector_type ("__vector __bool char",
 					     bool_char_type_node, 16);
+  ptr_bool_V16QI_type_node
+    = build_pointer_type (build_qualified_type (bool_V16QI_type_node,
+						TYPE_QUAL_CONST));
+
   bool_V8HI_type_node = rs6000_vector_type ("__vector __bool short",
 					    bool_short_type_node, 8);
+  ptr_bool_V8HI_type_node
+    = build_pointer_type (build_qualified_type (bool_V8HI_type_node,
+						TYPE_QUAL_CONST));
+
   bool_V4SI_type_node = rs6000_vector_type ("__vector __bool int",
 					    bool_int_type_node, 4);
+  ptr_bool_V4SI_type_node
+    = build_pointer_type (build_qualified_type (bool_V4SI_type_node,
+						TYPE_QUAL_CONST));
+
   bool_V2DI_type_node = rs6000_vector_type (TARGET_POWERPC64
 					    ? "__vector __bool long"
 					    : "__vector __bool long long",
 					    bool_long_long_type_node, 2);
+  ptr_bool_V2DI_type_node
+    = build_pointer_type (build_qualified_type (bool_V2DI_type_node,
+						TYPE_QUAL_CONST));
+
   bool_V1TI_type_node = rs6000_vector_type ("__vector __bool __int128",
 					    intTI_type_node, 1);
+  ptr_bool_V1TI_type_node
+    = build_pointer_type (build_qualified_type (bool_V1TI_type_node,
+						TYPE_QUAL_CONST));
+
   pixel_V8HI_type_node = rs6000_vector_type ("__vector __pixel",
 					     pixel_type_node, 8);
+  ptr_pixel_V8HI_type_node
+    = build_pointer_type (build_qualified_type (pixel_V8HI_type_node,
+						TYPE_QUAL_CONST));
   pcvoid_type_node
     = build_pointer_type (build_qualified_type (void_type_node,
 						TYPE_QUAL_CONST));
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index c5d20d240f2..3eba1c072cf 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -2461,6 +2461,47 @@ enum rs6000_builtin_type_index
   RS6000_BTI_vector_pair,	 /* unsigned 256-bit types (vector pair).  */
   RS6000_BTI_vector_quad,	 /* unsigned 512-bit types (vector quad).  */
   RS6000_BTI_const_ptr_void,     /* const pointer to void */
+  RS6000_BTI_ptr_V16QI,
+  RS6000_BTI_ptr_V1TI,
+  RS6000_BTI_ptr_V2DI,
+  RS6000_BTI_ptr_V2DF,
+  RS6000_BTI_ptr_V4SI,
+  RS6000_BTI_ptr_V4SF,
+  RS6000_BTI_ptr_V8HI,
+  RS6000_BTI_ptr_unsigned_V16QI,
+  RS6000_BTI_ptr_unsigned_V1TI,
+  RS6000_BTI_ptr_unsigned_V8HI,
+  RS6000_BTI_ptr_unsigned_V4SI,
+  RS6000_BTI_ptr_unsigned_V2DI,
+  RS6000_BTI_ptr_bool_V16QI,
+  RS6000_BTI_ptr_bool_V8HI,
+  RS6000_BTI_ptr_bool_V4SI,
+  RS6000_BTI_ptr_bool_V2DI,
+  RS6000_BTI_ptr_bool_V1TI,
+  RS6000_BTI_ptr_pixel_V8HI,
+  RS6000_BTI_ptr_INTQI,
+  RS6000_BTI_ptr_UINTQI,
+  RS6000_BTI_ptr_INTHI,
+  RS6000_BTI_ptr_UINTHI,
+  RS6000_BTI_ptr_INTSI,
+  RS6000_BTI_ptr_UINTSI,
+  RS6000_BTI_ptr_INTDI,
+  RS6000_BTI_ptr_UINTDI,
+  RS6000_BTI_ptr_INTTI,
+  RS6000_BTI_ptr_UINTTI,
+  RS6000_BTI_ptr_long_integer,
+  RS6000_BTI_ptr_long_unsigned,
+  RS6000_BTI_ptr_float,
+  RS6000_BTI_ptr_double,
+  RS6000_BTI_ptr_long_double,
+  RS6000_BTI_ptr_dfloat64,
+  RS6000_BTI_ptr_dfloat128,
+  RS6000_BTI_ptr_ieee128_float,
+  RS6000_BTI_ptr_ibm128_float,
+  RS6000_BTI_ptr_vector_pair,
+  RS6000_BTI_ptr_vector_quad,
+  RS6000_BTI_ptr_long_long,
+  RS6000_BTI_ptr_long_long_unsigned,
   RS6000_BTI_MAX
 };
 
@@ -2517,6 +2558,47 @@ enum rs6000_builtin_type_index
 #define vector_pair_type_node		 (rs6000_builtin_types[RS6000_BTI_vector_pair])
 #define vector_quad_type_node		 (rs6000_builtin_types[RS6000_BTI_vector_quad])
 #define pcvoid_type_node		 (rs6000_builtin_types[RS6000_BTI_const_ptr_void])
+#define ptr_V16QI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_V16QI])
+#define ptr_V1TI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_V1TI])
+#define ptr_V2DI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_V2DI])
+#define ptr_V2DF_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_V2DF])
+#define ptr_V4SI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_V4SI])
+#define ptr_V4SF_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_V4SF])
+#define ptr_V8HI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_V8HI])
+#define ptr_unsigned_V16QI_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_unsigned_V16QI])
+#define ptr_unsigned_V1TI_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_unsigned_V1TI])
+#define ptr_unsigned_V8HI_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_unsigned_V8HI])
+#define ptr_unsigned_V4SI_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_unsigned_V4SI])
+#define ptr_unsigned_V2DI_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_unsigned_V2DI])
+#define ptr_bool_V16QI_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_bool_V16QI])
+#define ptr_bool_V8HI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_bool_V8HI])
+#define ptr_bool_V4SI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_bool_V4SI])
+#define ptr_bool_V2DI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_bool_V2DI])
+#define ptr_bool_V1TI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_bool_V1TI])
+#define ptr_pixel_V8HI_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_pixel_V8HI])
+#define ptr_intQI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_INTQI])
+#define ptr_uintQI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_UINTQI])
+#define ptr_intHI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_INTHI])
+#define ptr_uintHI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_UINTHI])
+#define ptr_intSI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_INTSI])
+#define ptr_uintSI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_UINTSI])
+#define ptr_intDI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_INTDI])
+#define ptr_uintDI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_UINTDI])
+#define ptr_intTI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_INTTI])
+#define ptr_uintTI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_UINTTI])
+#define ptr_long_integer_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_long_integer])
+#define ptr_long_unsigned_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_long_unsigned])
+#define ptr_float_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_float])
+#define ptr_double_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_double])
+#define ptr_long_double_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_long_double])
+#define ptr_dfloat64_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_dfloat64])
+#define ptr_dfloat128_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_dfloat128])
+#define ptr_ieee128_float_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_ieee128_float])
+#define ptr_ibm128_float_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_ibm128_float])
+#define ptr_vector_pair_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_vector_pair])
+#define ptr_vector_quad_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_vector_quad])
+#define ptr_long_long_integer_type_node	 (rs6000_builtin_types[RS6000_BTI_ptr_long_long])
+#define ptr_long_long_unsigned_type_node (rs6000_builtin_types[RS6000_BTI_ptr_long_long_unsigned])
 
 extern GTY(()) tree rs6000_builtin_types[RS6000_BTI_MAX];
 extern GTY(()) tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
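For reference (a sketch of mine, not part of the patch): every macro above
resolves to a slot in rs6000_builtin_types, filled in by rs6000_init_builtins
using the idiom visible in the rs6000.c hunks.  The helper name here is
illustrative only; the patch open-codes the calls:

    /* Build a "const T *" type node for use in builtin prototypes.
       build_pointer_type, build_qualified_type and TYPE_QUAL_CONST are
       the generic tree APIs used throughout the hunks above.  */
    static tree
    make_const_ptr (tree base)
    {
      return build_pointer_type (build_qualified_type (base,
                                                       TYPE_QUAL_CONST));
    }

Each ptr_*_type_node is effectively make_const_ptr applied to the matching
base type node, stored in its rs6000_builtin_types slot.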
-- 
2.27.0



* [PATCH 10/34] rs6000: Add Power10 builtins
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (8 preceding siblings ...)
  2021-07-29 13:30 ` [PATCH 09/34] rs6000: Add more type nodes to support builtin processing Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-23 23:48   ` Segher Boessenkool
  2021-07-29 13:30 ` [PATCH 11/34] rs6000: Add MMA builtins Bill Schmidt
                   ` (23 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-07-28  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def: Add power10 and power10-64
	stanzas.
---
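A note on reading the DSL entries below (my summary; rs6000-gen-builtins.c
holds the authoritative grammar): each builtin is described by two lines,
a C prototype and then the internal builtin identifier, the insn pattern
that implements it, and an attribute list.  For example,

    const vsll __builtin_altivec_vdivsd (vsll, vsll);
      VDIVSD divv2di3 {}

declares a "const" (no side effects) builtin that expands to the divv2di3
pattern; an empty {} means no special attributes, while markers such as
{pred} or {lxvrse} request special handling during expansion.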
 gcc/config/rs6000/rs6000-builtin-new.def | 523 +++++++++++++++++++++++
 1 file changed, 523 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 8885df089a6..6b7a79549a4 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -2809,3 +2809,526 @@
 
   pure vsc __builtin_vsx_xl_len_r (void *, signed long);
     XL_LEN_R xl_len_r {}
+
+
+[power10]
+  const vbq __builtin_altivec_cmpge_1ti (vsq, vsq);
+    CMPGE_1TI vector_nltv1ti {}
+
+  const vbq __builtin_altivec_cmpge_u1ti (vuq, vuq);
+    CMPGE_U1TI vector_nltuv1ti {}
+
+  const vbq __builtin_altivec_cmple_1ti (vsq, vsq);
+    CMPLE_1TI vector_ngtv1ti {}
+
+  const vbq __builtin_altivec_cmple_u1ti (vuq, vuq);
+    CMPLE_U1TI vector_ngtuv1ti {}
+
+  const unsigned long long __builtin_altivec_cntmbb (vuc, const int<1>);
+    VCNTMBB vec_cntmb_v16qi {}
+
+  const unsigned long long __builtin_altivec_cntmbd (vull, const int<1>);
+    VCNTMBD vec_cntmb_v2di {}
+
+  const unsigned long long __builtin_altivec_cntmbh (vus, const int<1>);
+    VCNTMBH vec_cntmb_v8hi {}
+
+  const unsigned long long __builtin_altivec_cntmbw (vui, const int<1>);
+    VCNTMBW vec_cntmb_v4si {}
+
+  const vsq __builtin_altivec_div_v1ti (vsq, vsq);
+    DIV_V1TI vsx_div_v1ti {}
+
+  const vsq __builtin_altivec_dives (vsq, vsq);
+    DIVES_V1TI vsx_dives_v1ti {}
+
+  const vuq __builtin_altivec_diveu (vuq, vuq);
+    DIVEU_V1TI vsx_diveu_v1ti {}
+
+  const vsq __builtin_altivec_mods (vsq, vsq);
+    MODS_V1TI vsx_mods_v1ti {}
+
+  const vuq __builtin_altivec_modu (vuq, vuq);
+    MODU_V1TI vsx_modu_v1ti {}
+
+  const vuc __builtin_altivec_mtvsrbm (unsigned long long);
+    MTVSRBM vec_mtvsr_v16qi {}
+
+  const vull __builtin_altivec_mtvsrdm (unsigned long long);
+    MTVSRDM vec_mtvsr_v2di {}
+
+  const vus __builtin_altivec_mtvsrhm (unsigned long long);
+    MTVSRHM vec_mtvsr_v8hi {}
+
+  const vuq __builtin_altivec_mtvsrqm (unsigned long long);
+    MTVSRQM vec_mtvsr_v1ti {}
+
+  const vui __builtin_altivec_mtvsrwm (unsigned long long);
+    MTVSRWM vec_mtvsr_v4si {}
+
+  pure signed __int128 __builtin_altivec_se_lxvrbx (signed long, const signed char *);
+    SE_LXVRBX vsx_lxvrbx {lxvrse}
+
+  pure signed __int128 __builtin_altivec_se_lxvrhx (signed long, const signed short *);
+    SE_LXVRHX vsx_lxvrhx {lxvrse}
+
+  pure signed __int128 __builtin_altivec_se_lxvrwx (signed long, const signed int *);
+    SE_LXVRWX vsx_lxvrwx {lxvrse}
+
+  pure signed __int128 __builtin_altivec_se_lxvrdx (signed long, const signed long long *);
+    SE_LXVRDX vsx_lxvrdx {lxvrse}
+
+  void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
+    TR_STXVRBX vsx_stxvrbx {stvec}
+
+  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed short *);
+    TR_STXVRHX vsx_stxvrhx {stvec}
+
+  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed int *);
+    TR_STXVRWX vsx_stxvrwx {stvec}
+
+  void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
+    TR_STXVRDX vsx_stxvrdx {stvec}
+
+  const vuq __builtin_altivec_udiv_v1ti (vuq, vuq);
+    UDIV_V1TI vsx_udiv_v1ti {}
+
+  const vull __builtin_altivec_vcfuged (vull, vull);
+    VCFUGED vcfuged {}
+
+  const vsc __builtin_altivec_vclrlb (vsc, signed int);
+    VCLRLB vclrlb {}
+
+  const vsc __builtin_altivec_vclrrb (vsc, signed int);
+    VCLRRB vclrrb {}
+
+  const signed int __builtin_altivec_vcmpaet_p (vsq, vsq);
+    VCMPAET_P vector_ae_v1ti_p {}
+
+  const vbq __builtin_altivec_vcmpequt (vsq, vsq);
+    VCMPEQUT vector_eqv1ti {}
+
+  const signed int __builtin_altivec_vcmpequt_p (signed int, vsq, vsq);
+    VCMPEQUT_P vector_eq_v1ti_p {pred}
+
+  const vbq __builtin_altivec_vcmpgtst (vsq, vsq);
+    VCMPGTST vector_gtv1ti {}
+
+  const signed int __builtin_altivec_vcmpgtst_p (signed int, vsq, vsq);
+    VCMPGTST_P vector_gt_v1ti_p {pred}
+
+  const vbq __builtin_altivec_vcmpgtut (vuq, vuq);
+    VCMPGTUT vector_gtuv1ti {}
+
+  const signed int __builtin_altivec_vcmpgtut_p (signed int, vuq, vuq);
+    VCMPGTUT_P vector_gtu_v1ti_p {pred}
+
+  const vbq __builtin_altivec_vcmpnet (vsq, vsq);
+    VCMPNET vcmpnet {}
+
+  const signed int __builtin_altivec_vcmpnet_p (vsq, vsq);
+    VCMPNET_P vector_ne_v1ti_p {}
+
+  const vull __builtin_altivec_vclzdm (vull, vull);
+    VCLZDM vclzdm {}
+
+  const vull __builtin_altivec_vctzdm (vull, vull);
+    VCTZDM vctzdm {}
+
+  const vsll __builtin_altivec_vdivesd (vsll, vsll);
+    VDIVESD dives_v2di {}
+
+  const vsi __builtin_altivec_vdivesw (vsi, vsi);
+    VDIVESW dives_v4si {}
+
+  const vull __builtin_altivec_vdiveud (vull, vull);
+    VDIVEUD diveu_v2di {}
+
+  const vui __builtin_altivec_vdiveuw (vui, vui);
+    VDIVEUW diveu_v4si {}
+
+  const vsll __builtin_altivec_vdivsd (vsll, vsll);
+    VDIVSD divv2di3 {}
+
+  const vsi __builtin_altivec_vdivsw (vsi, vsi);
+    VDIVSW divv4si3 {}
+
+  const vull __builtin_altivec_vdivud (vull, vull);
+    VDIVUD udivv2di3 {}
+
+  const vui __builtin_altivec_vdivuw (vui, vui);
+    VDIVUW udivv4si3 {}
+
+  const vuc __builtin_altivec_vexpandmb (vuc);
+    VEXPANDMB vec_expand_v16qi {}
+
+  const vull __builtin_altivec_vexpandmd (vull);
+    VEXPANDMD vec_expand_v2di {}
+
+  const vus __builtin_altivec_vexpandmh (vus);
+    VEXPANDMH vec_expand_v8hi {}
+
+  const vuq __builtin_altivec_vexpandmq (vuq);
+    VEXPANDMQ vec_expand_v1ti {}
+
+  const vui __builtin_altivec_vexpandmw (vui);
+    VEXPANDMW vec_expand_v4si {}
+
+  const vull __builtin_altivec_vextddvhx (vull, vull, unsigned int);
+    VEXTRACTDR vextractrv2di {}
+
+  const vull __builtin_altivec_vextddvlx (vull, vull, unsigned int);
+    VEXTRACTDL vextractlv2di {}
+
+  const vull __builtin_altivec_vextdubvhx (vuc, vuc, unsigned int);
+    VEXTRACTBR vextractrv16qi {}
+
+  const vull __builtin_altivec_vextdubvlx (vuc, vuc, unsigned int);
+    VEXTRACTBL vextractlv16qi {}
+
+  const vull __builtin_altivec_vextduhvhx (vus, vus, unsigned int);
+    VEXTRACTHR vextractrv8hi {}
+
+  const vull __builtin_altivec_vextduhvlx (vus, vus, unsigned int);
+    VEXTRACTHL vextractlv8hi {}
+
+  const vull __builtin_altivec_vextduwvhx (vui, vui, unsigned int);
+    VEXTRACTWR vextractrv4si {}
+
+  const vull __builtin_altivec_vextduwvlx (vui, vui, unsigned int);
+    VEXTRACTWL vextractlv4si {}
+
+  const signed int __builtin_altivec_vextractmb (vsc);
+    VEXTRACTMB vec_extract_v16qi {}
+
+  const signed int __builtin_altivec_vextractmd (vsll);
+    VEXTRACTMD vec_extract_v2di {}
+
+  const signed int __builtin_altivec_vextractmh (vss);
+    VEXTRACTMH vec_extract_v8hi {}
+
+  const signed int __builtin_altivec_vextractmq (vsq);
+    VEXTRACTMQ vec_extract_v1ti {}
+
+  const signed int __builtin_altivec_vextractmw (vsi);
+    VEXTRACTMW vec_extract_v4si {}
+
+  const unsigned long long __builtin_altivec_vgnb (vull, const int<2,7>);
+    VGNB vgnb {}
+
+  const vuc __builtin_altivec_vinsgubvlx (unsigned int, vuc, unsigned int);
+    VINSERTGPRBL vinsertgl_v16qi {}
+
+  const vsc __builtin_altivec_vinsgubvrx (signed int, vsc, signed int);
+    VINSERTGPRBR vinsertgr_v16qi {}
+
+  const vull __builtin_altivec_vinsgudvlx (unsigned int, vull, unsigned int);
+    VINSERTGPRDL vinsertgl_v2di {}
+
+  const vsll __builtin_altivec_vinsgudvrx (signed int, vsll, signed int);
+    VINSERTGPRDR vinsertgr_v2di {}
+
+  const vus __builtin_altivec_vinsguhvlx (unsigned int, vus, unsigned int);
+    VINSERTGPRHL vinsertgl_v8hi {}
+
+  const vss __builtin_altivec_vinsguhvrx (signed int, vss, signed int);
+    VINSERTGPRHR vinsertgr_v8hi {}
+
+  const vui __builtin_altivec_vinsguwvlx (unsigned int, vui, unsigned int);
+    VINSERTGPRWL vinsertgl_v4si {}
+
+  const vsi __builtin_altivec_vinsguwvrx (signed int, vsi, signed int);
+    VINSERTGPRWR vinsertgr_v4si {}
+
+  const vuc __builtin_altivec_vinsvubvlx (vuc, vuc, unsigned int);
+    VINSERTVPRBL vinsertvl_v16qi {}
+
+  const vsc __builtin_altivec_vinsvubvrx (vsc, vsc, signed int);
+    VINSERTVPRBR vinsertvr_v16qi {}
+
+  const vus __builtin_altivec_vinsvuhvlx (vus, vus, unsigned int);
+    VINSERTVPRHL vinsertvl_v8hi {}
+
+  const vss __builtin_altivec_vinsvuhvrx (vss, vss, signed int);
+    VINSERTVPRHR vinsertvr_v8hi {}
+
+  const vui __builtin_altivec_vinsvuwvlx (vui, vui, unsigned int);
+    VINSERTVPRWL vinsertvl_v4si {}
+
+  const vsi __builtin_altivec_vinsvuwvrx (vsi, vsi, signed int);
+    VINSERTVPRWR vinsertvr_v4si {}
+
+  const vsll __builtin_altivec_vmodsd (vsll, vsll);
+    VMODSD modv2di3 {}
+
+  const vsi __builtin_altivec_vmodsw (vsi, vsi);
+    VMODSW modv4si3 {}
+
+  const vull __builtin_altivec_vmodud (vull, vull);
+    VMODUD umodv2di3 {}
+
+  const vui __builtin_altivec_vmoduw (vui, vui);
+    VMODUW umodv4si3 {}
+
+  const vsq __builtin_altivec_vmulesd (vsll, vsll);
+    VMULESD vec_widen_smult_even_v2di {}
+
+  const vuq __builtin_altivec_vmuleud (vull, vull);
+    VMULEUD vec_widen_umult_even_v2di {}
+
+  const vsll __builtin_altivec_vmulhsd (vsll, vsll);
+    VMULHSD smulv2di3_highpart {}
+
+  const vsi __builtin_altivec_vmulhsw (vsi, vsi);
+    VMULHSW smulv4si3_highpart {}
+
+  const vull __builtin_altivec_vmulhud (vull, vull);
+    VMULHUD umulv2di3_highpart {}
+
+  const vui __builtin_altivec_vmulhuw (vui, vui);
+    VMULHUW umulv4si3_highpart {}
+
+  const vsll __builtin_altivec_vmulld (vsll, vsll);
+    VMULLD mulv2di3 {}
+
+  const vsq __builtin_altivec_vmulosd (vsll, vsll);
+    VMULOSD vec_widen_smult_odd_v2di {}
+
+  const vuq __builtin_altivec_vmuloud (vull, vull);
+    VMULOUD vec_widen_umult_odd_v2di {}
+
+  const vsq __builtin_altivec_vnor_v1ti (vsq, vsq);
+    VNOR_V1TI norv1ti3 {}
+
+  const vuq __builtin_altivec_vnor_v1ti_uns (vuq, vuq);
+    VNOR_V1TI_UNS norv1ti3 {}
+
+  const vull __builtin_altivec_vpdepd (vull, vull);
+    VPDEPD vpdepd {}
+
+  const vull __builtin_altivec_vpextd (vull, vull);
+    VPEXTD vpextd {}
+
+  const vull __builtin_altivec_vreplace_un_uv2di (vull, unsigned long long, const int<4>);
+    VREPLACE_UN_UV2DI vreplace_un_v2di {}
+
+  const vui __builtin_altivec_vreplace_un_uv4si (vui, unsigned int, const int<4>);
+    VREPLACE_UN_UV4SI vreplace_un_v4si {}
+
+  const vd __builtin_altivec_vreplace_un_v2df (vd, double, const int<4>);
+    VREPLACE_UN_V2DF vreplace_un_v2df {}
+
+  const vsll __builtin_altivec_vreplace_un_v2di (vsll, signed long long, const int<4>);
+    VREPLACE_UN_V2DI vreplace_un_v2di {}
+
+  const vf __builtin_altivec_vreplace_un_v4sf (vf, float, const int<4>);
+    VREPLACE_UN_V4SF vreplace_un_v4sf {}
+
+  const vsi __builtin_altivec_vreplace_un_v4si (vsi, signed int, const int<4>);
+    VREPLACE_UN_V4SI vreplace_un_v4si {}
+
+  const vull __builtin_altivec_vreplace_uv2di (vull, unsigned long long, const int<1>);
+    VREPLACE_ELT_UV2DI vreplace_elt_v2di {}
+
+  const vui __builtin_altivec_vreplace_uv4si (vui, unsigned int, const int<2>);
+    VREPLACE_ELT_UV4SI vreplace_elt_v4si {}
+
+  const vd __builtin_altivec_vreplace_v2df (vd, double, const int<1>);
+    VREPLACE_ELT_V2DF vreplace_elt_v2df {}
+
+  const vsll __builtin_altivec_vreplace_v2di (vsll, signed long long, const int<1>);
+    VREPLACE_ELT_V2DI vreplace_elt_v2di {}
+
+  const vf __builtin_altivec_vreplace_v4sf (vf, float, const int<2>);
+    VREPLACE_ELT_V4SF vreplace_elt_v4sf {}
+
+  const vsi __builtin_altivec_vreplace_v4si (vsi, signed int, const int<2>);
+    VREPLACE_ELT_V4SI vreplace_elt_v4si {}
+
+  const vsq __builtin_altivec_vrlq (vsq, vuq);
+    VRLQ vrotlv1ti3 {}
+
+  const vsq __builtin_altivec_vrlqmi (vsq, vsq, vuq);
+    VRLQMI altivec_vrlqmi {}
+
+  const vsq __builtin_altivec_vrlqnm (vsq, vuq);
+    VRLQNM altivec_vrlqnm {}
+
+  const vsq __builtin_altivec_vsignext (vsll);
+    VSIGNEXTSD2Q vsignextend_v2di_v1ti {}
+
+  const vsc __builtin_altivec_vsldb_v16qi (vsc, vsc, const int<3>);
+    VSLDB_V16QI vsldb_v16qi {}
+
+  const vsll __builtin_altivec_vsldb_v2di (vsll, vsll, const int<3>);
+    VSLDB_V2DI vsldb_v2di {}
+
+  const vsi __builtin_altivec_vsldb_v4si (vsi, vsi, const int<3>);
+    VSLDB_V4SI vsldb_v4si {}
+
+  const vss __builtin_altivec_vsldb_v8hi (vss, vss, const int<3>);
+    VSLDB_V8HI vsldb_v8hi {}
+
+  const vsq __builtin_altivec_vslq (vsq, vuq);
+    VSLQ vashlv1ti3 {}
+
+  const vsq __builtin_altivec_vsraq (vsq, vuq);
+    VSRAQ vashrv1ti3 {}
+
+  const vsc __builtin_altivec_vsrdb_v16qi (vsc, vsc, const int<3>);
+    VSRDB_V16QI vsrdb_v16qi {}
+
+  const vsll __builtin_altivec_vsrdb_v2di (vsll, vsll, const int<3>);
+    VSRDB_V2DI vsrdb_v2di {}
+
+  const vsi __builtin_altivec_vsrdb_v4si (vsi, vsi, const int<3>);
+    VSRDB_V4SI vsrdb_v4si {}
+
+  const vss __builtin_altivec_vsrdb_v8hi (vss, vss, const int<3>);
+    VSRDB_V8HI vsrdb_v8hi {}
+
+  const vsq __builtin_altivec_vsrq (vsq, vuq);
+    VSRQ vlshrv1ti3 {}
+
+  const vsc __builtin_altivec_vstribl (vsc);
+    VSTRIBL vstril_v16qi {}
+
+  const signed int __builtin_altivec_vstribl_p (vsc);
+    VSTRIBL_P vstril_p_v16qi {}
+
+  const vsc __builtin_altivec_vstribr (vsc);
+    VSTRIBR vstrir_v16qi {}
+
+  const signed int __builtin_altivec_vstribr_p (vsc);
+    VSTRIBR_P vstrir_p_v16qi {}
+
+  const vss __builtin_altivec_vstrihl (vss);
+    VSTRIHL vstril_v8hi {}
+
+  const signed int __builtin_altivec_vstrihl_p (vss);
+    VSTRIHL_P vstril_p_v8hi {}
+
+  const vss __builtin_altivec_vstrihr (vss);
+    VSTRIHR vstrir_v8hi {}
+
+  const signed int __builtin_altivec_vstrihr_p (vss);
+    VSTRIHR_P vstrir_p_v8hi {}
+
+  const signed int __builtin_vsx_xvtlsbb_all_ones (vsc);
+    XVTLSBB_ONES xvtlsbbo {}
+
+  const signed int __builtin_vsx_xvtlsbb_all_zeros (vsc);
+    XVTLSBB_ZEROS xvtlsbbz {}
+
+  const vf __builtin_vsx_vxxsplti32dx_v4sf (vf, const int<1>, float);
+    VXXSPLTI32DX_V4SF xxsplti32dx_v4sf {}
+
+  const vsi __builtin_vsx_vxxsplti32dx_v4si (vsi, const int<1>, signed int);
+    VXXSPLTI32DX_V4SI xxsplti32dx_v4si {}
+
+  const vd __builtin_vsx_vxxspltidp (float);
+    VXXSPLTIDP xxspltidp_v2df {}
+
+  const vf __builtin_vsx_vxxspltiw_v4sf (float);
+    VXXSPLTIW_V4SF xxspltiw_v4sf {}
+
+  const vsi __builtin_vsx_vxxspltiw_v4si (signed int);
+    VXXSPLTIW_V4SI xxspltiw_v4si {}
+
+  const vuc __builtin_vsx_xvcvbf16spn (vuc);
+    XVCVBF16SPN vsx_xvcvbf16spn {}
+
+  const vuc __builtin_vsx_xvcvspbf16 (vuc);
+    XVCVSPBF16 vsx_xvcvspbf16 {}
+
+  const vuc __builtin_vsx_xxblend_v16qi (vuc, vuc, vuc);
+    VXXBLEND_V16QI xxblend_v16qi {}
+
+  const vd __builtin_vsx_xxblend_v2df (vd, vd, vd);
+    VXXBLEND_V2DF xxblend_v2df {}
+
+  const vull __builtin_vsx_xxblend_v2di (vull, vull, vull);
+    VXXBLEND_V2DI xxblend_v2di {}
+
+  const vf __builtin_vsx_xxblend_v4sf (vf, vf, vf);
+    VXXBLEND_V4SF xxblend_v4sf {}
+
+  const vui __builtin_vsx_xxblend_v4si (vui, vui, vui);
+    VXXBLEND_V4SI xxblend_v4si {}
+
+  const vus __builtin_vsx_xxblend_v8hi (vus, vus, vus);
+    VXXBLEND_V8HI xxblend_v8hi {}
+
+  const vull __builtin_vsx_xxeval (vull, vull, vull, const int<8>);
+    XXEVAL xxeval {}
+
+  const vuc __builtin_vsx_xxgenpcvm_v16qi (vuc, const int<2>);
+    XXGENPCVM_V16QI xxgenpcvm_v16qi {}
+
+  const vull __builtin_vsx_xxgenpcvm_v2di (vull, const int<2>);
+    XXGENPCVM_V2DI xxgenpcvm_v2di {}
+
+  const vui __builtin_vsx_xxgenpcvm_v4si (vui, const int<2>);
+    XXGENPCVM_V4SI xxgenpcvm_v4si {}
+
+  const vus __builtin_vsx_xxgenpcvm_v8hi (vus, const int<2>);
+    XXGENPCVM_V8HI xxgenpcvm_v8hi {}
+
+  const vuc __builtin_vsx_xxpermx_uv16qi (vuc, vuc, vuc, const int<3>);
+    XXPERMX_UV16QI xxpermx {}
+
+  const vull __builtin_vsx_xxpermx_uv2di (vull, vull, vuc, const int<3>);
+    XXPERMX_UV2DI xxpermx {}
+
+  const vui __builtin_vsx_xxpermx_uv4si (vui, vui, vuc, const int<3>);
+    XXPERMX_UV4SI xxpermx {}
+
+  const vus __builtin_vsx_xxpermx_uv8hi (vus, vus, vuc, const int<3>);
+    XXPERMX_UV8HI xxpermx {}
+
+  const vsc __builtin_vsx_xxpermx_v16qi (vsc, vsc, vuc, const int<3>);
+    XXPERMX_V16QI xxpermx {}
+
+  const vd __builtin_vsx_xxpermx_v2df (vd, vd, vuc, const int<3>);
+    XXPERMX_V2DF xxpermx {}
+
+  const vsll __builtin_vsx_xxpermx_v2di (vsll, vsll, vuc, const int<3>);
+    XXPERMX_V2DI xxpermx {}
+
+  const vf __builtin_vsx_xxpermx_v4sf (vf, vf, vuc, const int<3>);
+    XXPERMX_V4SF xxpermx {}
+
+  const vsi __builtin_vsx_xxpermx_v4si (vsi, vsi, vuc, const int<3>);
+    XXPERMX_V4SI xxpermx {}
+
+  const vss __builtin_vsx_xxpermx_v8hi (vss, vss, vuc, const int<3>);
+    XXPERMX_V8HI xxpermx {}
+
+  pure unsigned __int128 __builtin_altivec_ze_lxvrbx (signed long, const unsigned char *);
+    ZE_LXVRBX vsx_lxvrbx {lxvrze}
+
+  pure unsigned __int128 __builtin_altivec_ze_lxvrhx (signed long, const unsigned short *);
+    ZE_LXVRHX vsx_lxvrhx {lxvrze}
+
+  pure unsigned __int128 __builtin_altivec_ze_lxvrwx (signed long, const unsigned int *);
+    ZE_LXVRWX vsx_lxvrwx {lxvrze}
+
+  pure unsigned __int128 __builtin_altivec_ze_lxvrdx (signed long, const unsigned long long *);
+    ZE_LXVRDX vsx_lxvrdx {lxvrze}
+
+
+[power10-64]
+  const unsigned long long __builtin_cfuged (unsigned long long, unsigned long long);
+    CFUGED cfuged {}
+
+  const unsigned long long __builtin_cntlzdm (unsigned long long, unsigned long long);
+    CNTLZDM cntlzdm {}
+
+  const unsigned long long __builtin_cnttzdm (unsigned long long, unsigned long long);
+    CNTTZDM cnttzdm {}
+
+  const unsigned long long __builtin_pdepd (unsigned long long, unsigned long long);
+    PDEPD pdepd {}
+
+  const unsigned long long __builtin_pextd (unsigned long long, unsigned long long);
+    PEXTD pextd {}
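
As a usage illustration (mine, not part of the patch), the [power10-64]
entries map straight onto the new 64-bit bit-manipulation instructions,
e.g. when compiled with -mcpu=power10 -m64:

    /* Count leading zero bits of SRC under MASK (cntlzdm).  */
    unsigned long long
    clz_under_mask (unsigned long long src, unsigned long long mask)
    {
      return __builtin_cntlzdm (src, mask);
    }

    /* Gather the bits of SRC selected by MASK into the low-order
       positions of the result (pextd).  */
    unsigned long long
    gather_bits (unsigned long long src, unsigned long long mask)
    {
      return __builtin_pextd (src, mask);
    }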
-- 
2.27.0



* [PATCH 11/34] rs6000: Add MMA builtins
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (9 preceding siblings ...)
  2021-07-29 13:30 ` [PATCH 10/34] rs6000: Add Power10 builtins Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-25 22:56   ` Segher Boessenkool
  2021-07-29 13:30 ` [PATCH 12/34] rs6000: Add miscellaneous builtins Bill Schmidt
                   ` (22 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-16  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def: Add mma stanza.
---
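A note for readers (my gloss on the scheme): each MMA builtin below comes
in two flavors.  The user-facing form takes a pointer to the accumulator
and maps to the "nothing" pattern; the *_internal form passes and returns
the v512 accumulator by value.  During gimple folding, calls to the
user-facing form are rewritten into the internal form so the optimizers
can see the accumulator data flow, roughly:

    /* What the user writes:  */
    __builtin_mma_xvf32gerpp (&acc, a, b);

    /* What it is folded into (sketch):  */
    acc = __builtin_mma_xvf32gerpp_internal (acc, a, b);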
 gcc/config/rs6000/rs6000-builtin-new.def | 416 +++++++++++++++++++++++
 1 file changed, 416 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 6b7a79549a4..4b65d54d913 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -3332,3 +3332,419 @@
 
   const unsigned long long __builtin_pextd (unsigned long long, unsigned long long);
     PEXTD pextd {}
+
+
+[mma]
+  void __builtin_mma_assemble_acc (v512 *, vuc, vuc, vuc, vuc);
+    ASSEMBLE_ACC nothing {mma}
+
+  v512 __builtin_mma_assemble_acc_internal (vuc, vuc, vuc, vuc);
+    ASSEMBLE_ACC_INTERNAL mma_assemble_acc {mma}
+
+  void __builtin_mma_assemble_pair (v256 *, vuc, vuc);
+    ASSEMBLE_PAIR nothing {mma}
+
+  v256 __builtin_mma_assemble_pair_internal (vuc, vuc);
+    ASSEMBLE_PAIR_INTERNAL vsx_assemble_pair {mma}
+
+  void __builtin_mma_build_acc (v512 *, vuc, vuc, vuc, vuc);
+    BUILD_ACC nothing {mma}
+
+  v512 __builtin_mma_build_acc_internal (vuc, vuc, vuc, vuc);
+    BUILD_ACC_INTERNAL mma_assemble_acc {mma}
+
+  void __builtin_mma_disassemble_acc (void *, v512 *);
+    DISASSEMBLE_ACC nothing {mma,quad}
+
+  vuc __builtin_mma_disassemble_acc_internal (v512, const int<2>);
+    DISASSEMBLE_ACC_INTERNAL mma_disassemble_acc {mma}
+
+  void __builtin_mma_disassemble_pair (void *, v256 *);
+    DISASSEMBLE_PAIR nothing {mma,pair}
+
+  vuc __builtin_mma_disassemble_pair_internal (v256, const int<2>);
+    DISASSEMBLE_PAIR_INTERNAL vsx_disassemble_pair {mma}
+
+  void __builtin_mma_pmxvbf16ger2 (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVBF16GER2 nothing {mma}
+
+  v512 __builtin_mma_pmxvbf16ger2_internal (vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVBF16GER2_INTERNAL mma_pmxvbf16ger2 {mma}
+
+  void __builtin_mma_pmxvbf16ger2nn (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVBF16GER2NN nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvbf16ger2nn_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVBF16GER2NN_INTERNAL mma_pmxvbf16ger2nn {mma,quad}
+
+  void __builtin_mma_pmxvbf16ger2np (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVBF16GER2NP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvbf16ger2np_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVBF16GER2NP_INTERNAL mma_pmxvbf16ger2np {mma,quad}
+
+  void __builtin_mma_pmxvbf16ger2pn (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVBF16GER2PN nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvbf16ger2pn_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVBF16GER2PN_INTERNAL mma_pmxvbf16ger2pn {mma,quad}
+
+  void __builtin_mma_pmxvbf16ger2pp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVBF16GER2PP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvbf16ger2pp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVBF16GER2PP_INTERNAL mma_pmxvbf16ger2pp {mma,quad}
+
+  void __builtin_mma_pmxvf16ger2 (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVF16GER2 nothing {mma}
+
+  v512 __builtin_mma_pmxvf16ger2_internal (vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVF16GER2_INTERNAL mma_pmxvf16ger2 {mma}
+
+  void __builtin_mma_pmxvf16ger2nn (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVF16GER2NN nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvf16ger2nn_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVF16GER2NN_INTERNAL mma_pmxvf16ger2nn {mma,quad}
+
+  void __builtin_mma_pmxvf16ger2np (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVF16GER2NP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvf16ger2np_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVF16GER2NP_INTERNAL mma_pmxvf16ger2np {mma,quad}
+
+  void __builtin_mma_pmxvf16ger2pn (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVF16GER2PN nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvf16ger2pn_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVF16GER2PN_INTERNAL mma_pmxvf16ger2pn {mma,quad}
+
+  void __builtin_mma_pmxvf16ger2pp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVF16GER2PP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvf16ger2pp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVF16GER2PP_INTERNAL mma_pmxvf16ger2pp {mma,quad}
+
+  void __builtin_mma_pmxvf32ger (v512 *, vuc, vuc, const int<4>, const int<4>);
+    PMXVF32GER nothing {mma}
+
+  v512 __builtin_mma_pmxvf32ger_internal (vuc, vuc, const int<4>, const int<4>);
+    PMXVF32GER_INTERNAL mma_pmxvf32ger {mma}
+
+  void __builtin_mma_pmxvf32gernn (v512 *, vuc, vuc, const int<4>, const int<4>);
+    PMXVF32GERNN nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvf32gernn_internal (v512, vuc, vuc, const int<4>, const int<4>);
+    PMXVF32GERNN_INTERNAL mma_pmxvf32gernn {mma,quad}
+
+  void __builtin_mma_pmxvf32gernp (v512 *, vuc, vuc, const int<4>, const int<4>);
+    PMXVF32GERNP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvf32gernp_internal (v512, vuc, vuc, const int<4>, const int<4>);
+    PMXVF32GERNP_INTERNAL mma_pmxvf32gernp {mma,quad}
+
+  void __builtin_mma_pmxvf32gerpn (v512 *, vuc, vuc, const int<4>, const int<4>);
+    PMXVF32GERPN nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvf32gerpn_internal (v512, vuc, vuc, const int<4>, const int<4>);
+    PMXVF32GERPN_INTERNAL mma_pmxvf32gerpn {mma,quad}
+
+  void __builtin_mma_pmxvf32gerpp (v512 *, vuc, vuc, const int<4>, const int<4>);
+    PMXVF32GERPP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvf32gerpp_internal (v512, vuc, vuc, const int<4>, const int<4>);
+    PMXVF32GERPP_INTERNAL mma_pmxvf32gerpp {mma,quad}
+
+  void __builtin_mma_pmxvf64ger (v512 *, v256, vuc, const int<4>, const int<2>);
+    PMXVF64GER nothing {mma,pair}
+
+  v512 __builtin_mma_pmxvf64ger_internal (v256, vuc, const int<4>, const int<2>);
+    PMXVF64GER_INTERNAL mma_pmxvf64ger {mma,pair}
+
+  void __builtin_mma_pmxvf64gernn (v512 *, v256, vuc, const int<4>, const int<2>);
+    PMXVF64GERNN nothing {mma,pair,quad}
+
+  v512 __builtin_mma_pmxvf64gernn_internal (v512, v256, vuc, const int<4>, const int<2>);
+    PMXVF64GERNN_INTERNAL mma_pmxvf64gernn {mma,pair,quad}
+
+  void __builtin_mma_pmxvf64gernp (v512 *, v256, vuc, const int<4>, const int<2>);
+    PMXVF64GERNP nothing {mma,pair,quad}
+
+  v512 __builtin_mma_pmxvf64gernp_internal (v512, v256, vuc, const int<4>, const int<2>);
+    PMXVF64GERNP_INTERNAL mma_pmxvf64gernp {mma,pair,quad}
+
+  void __builtin_mma_pmxvf64gerpn (v512 *, v256, vuc, const int<4>, const int<2>);
+    PMXVF64GERPN nothing {mma,pair,quad}
+
+  v512 __builtin_mma_pmxvf64gerpn_internal (v512, v256, vuc, const int<4>, const int<2>);
+    PMXVF64GERPN_INTERNAL mma_pmxvf64gerpn {mma,pair,quad}
+
+  void __builtin_mma_pmxvf64gerpp (v512 *, v256, vuc, const int<4>, const int<2>);
+    PMXVF64GERPP nothing {mma,pair,quad}
+
+  v512 __builtin_mma_pmxvf64gerpp_internal (v512, v256, vuc, const int<4>, const int<2>);
+    PMXVF64GERPP_INTERNAL mma_pmxvf64gerpp {mma,pair,quad}
+
+  void __builtin_mma_pmxvi16ger2 (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVI16GER2 nothing {mma}
+
+  v512 __builtin_mma_pmxvi16ger2_internal (vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVI16GER2_INTERNAL mma_pmxvi16ger2 {mma}
+
+  void __builtin_mma_pmxvi16ger2pp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVI16GER2PP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvi16ger2pp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVI16GER2PP_INTERNAL mma_pmxvi16ger2pp {mma,quad}
+
+  void __builtin_mma_pmxvi16ger2s (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVI16GER2S nothing {mma}
+
+  v512 __builtin_mma_pmxvi16ger2s_internal (vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVI16GER2S_INTERNAL mma_pmxvi16ger2s {mma}
+
+  void __builtin_mma_pmxvi16ger2spp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVI16GER2SPP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvi16ger2spp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
+    PMXVI16GER2SPP_INTERNAL mma_pmxvi16ger2spp {mma,quad}
+
+  void __builtin_mma_pmxvi4ger8 (v512 *, vuc, vuc, const int<4>, const int<4>, const int<8>);
+    PMXVI4GER8 nothing {mma}
+
+  v512 __builtin_mma_pmxvi4ger8_internal (vuc, vuc, const int<4>, const int<4>, const int<8>);
+    PMXVI4GER8_INTERNAL mma_pmxvi4ger8 {mma}
+
+  void __builtin_mma_pmxvi4ger8pp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<8>);
+    PMXVI4GER8PP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvi4ger8pp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<8>);
+    PMXVI4GER8PP_INTERNAL mma_pmxvi4ger8pp {mma,quad}
+
+  void __builtin_mma_pmxvi8ger4 (v512 *, vuc, vuc, const int<4>, const int<4>, const int<4>);
+    PMXVI8GER4 nothing {mma}
+
+  v512 __builtin_mma_pmxvi8ger4_internal (vuc, vuc, const int<4>, const int<4>, const int<4>);
+    PMXVI8GER4_INTERNAL mma_pmxvi8ger4 {mma}
+
+  void __builtin_mma_pmxvi8ger4pp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<4>);
+    PMXVI8GER4PP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvi8ger4pp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<4>);
+    PMXVI8GER4PP_INTERNAL mma_pmxvi8ger4pp {mma,quad}
+
+  void __builtin_mma_pmxvi8ger4spp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<4>);
+    PMXVI8GER4SPP nothing {mma,quad}
+
+  v512 __builtin_mma_pmxvi8ger4spp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<4>);
+    PMXVI8GER4SPP_INTERNAL mma_pmxvi8ger4spp {mma,quad}
+
+  void __builtin_mma_xvbf16ger2 (v512 *, vuc, vuc);
+    XVBF16GER2 nothing {mma}
+
+  v512 __builtin_mma_xvbf16ger2_internal (vuc, vuc);
+    XVBF16GER2_INTERNAL mma_xvbf16ger2 {mma}
+
+  void __builtin_mma_xvbf16ger2nn (v512 *, vuc, vuc);
+    XVBF16GER2NN nothing {mma,quad}
+
+  v512 __builtin_mma_xvbf16ger2nn_internal (v512, vuc, vuc);
+    XVBF16GER2NN_INTERNAL mma_xvbf16ger2nn {mma,quad}
+
+  void __builtin_mma_xvbf16ger2np (v512 *, vuc, vuc);
+    XVBF16GER2NP nothing {mma,quad}
+
+  v512 __builtin_mma_xvbf16ger2np_internal (v512, vuc, vuc);
+    XVBF16GER2NP_INTERNAL mma_xvbf16ger2np {mma,quad}
+
+  void __builtin_mma_xvbf16ger2pn (v512 *, vuc, vuc);
+    XVBF16GER2PN nothing {mma,quad}
+
+  v512 __builtin_mma_xvbf16ger2pn_internal (v512, vuc, vuc);
+    XVBF16GER2PN_INTERNAL mma_xvbf16ger2pn {mma,quad}
+
+  void __builtin_mma_xvbf16ger2pp (v512 *, vuc, vuc);
+    XVBF16GER2PP nothing {mma,quad}
+
+  v512 __builtin_mma_xvbf16ger2pp_internal (v512, vuc, vuc);
+    XVBF16GER2PP_INTERNAL mma_xvbf16ger2pp {mma,quad}
+
+  void __builtin_mma_xvf16ger2 (v512 *, vuc, vuc);
+    XVF16GER2 nothing {mma}
+
+  v512 __builtin_mma_xvf16ger2_internal (vuc, vuc);
+    XVF16GER2_INTERNAL mma_xvf16ger2 {mma}
+
+  void __builtin_mma_xvf16ger2nn (v512 *, vuc, vuc);
+    XVF16GER2NN nothing {mma,quad}
+
+  v512 __builtin_mma_xvf16ger2nn_internal (v512, vuc, vuc);
+    XVF16GER2NN_INTERNAL mma_xvf16ger2nn {mma,quad}
+
+  void __builtin_mma_xvf16ger2np (v512 *, vuc, vuc);
+    XVF16GER2NP nothing {mma,quad}
+
+  v512 __builtin_mma_xvf16ger2np_internal (v512, vuc, vuc);
+    XVF16GER2NP_INTERNAL mma_xvf16ger2np {mma,quad}
+
+  void __builtin_mma_xvf16ger2pn (v512 *, vuc, vuc);
+    XVF16GER2PN nothing {mma,quad}
+
+  v512 __builtin_mma_xvf16ger2pn_internal (v512, vuc, vuc);
+    XVF16GER2PN_INTERNAL mma_xvf16ger2pn {mma,quad}
+
+  void __builtin_mma_xvf16ger2pp (v512 *, vuc, vuc);
+    XVF16GER2PP nothing {mma,quad}
+
+  v512 __builtin_mma_xvf16ger2pp_internal (v512, vuc, vuc);
+    XVF16GER2PP_INTERNAL mma_xvf16ger2pp {mma,quad}
+
+  void __builtin_mma_xvf32ger (v512 *, vuc, vuc);
+    XVF32GER nothing {mma}
+
+  v512 __builtin_mma_xvf32ger_internal (vuc, vuc);
+    XVF32GER_INTERNAL mma_xvf32ger {mma}
+
+  void __builtin_mma_xvf32gernn (v512 *, vuc, vuc);
+    XVF32GERNN nothing {mma,quad}
+
+  v512 __builtin_mma_xvf32gernn_internal (v512, vuc, vuc);
+    XVF32GERNN_INTERNAL mma_xvf32gernn {mma,quad}
+
+  void __builtin_mma_xvf32gernp (v512 *, vuc, vuc);
+    XVF32GERNP nothing {mma,quad}
+
+  v512 __builtin_mma_xvf32gernp_internal (v512, vuc, vuc);
+    XVF32GERNP_INTERNAL mma_xvf32gernp {mma,quad}
+
+  void __builtin_mma_xvf32gerpn (v512 *, vuc, vuc);
+    XVF32GERPN nothing {mma,quad}
+
+  v512 __builtin_mma_xvf32gerpn_internal (v512, vuc, vuc);
+    XVF32GERPN_INTERNAL mma_xvf32gerpn {mma,quad}
+
+  void __builtin_mma_xvf32gerpp (v512 *, vuc, vuc);
+    XVF32GERPP nothing {mma,quad}
+
+  v512 __builtin_mma_xvf32gerpp_internal (v512, vuc, vuc);
+    XVF32GERPP_INTERNAL mma_xvf32gerpp {mma,quad}
+
+  void __builtin_mma_xvf64ger (v512 *, v256, vuc);
+    XVF64GER nothing {mma,pair}
+
+  v512 __builtin_mma_xvf64ger_internal (v256, vuc);
+    XVF64GER_INTERNAL mma_xvf64ger {mma,pair}
+
+  void __builtin_mma_xvf64gernn (v512 *, v256, vuc);
+    XVF64GERNN nothing {mma,pair,quad}
+
+  v512 __builtin_mma_xvf64gernn_internal (v512, v256, vuc);
+    XVF64GERNN_INTERNAL mma_xvf64gernn {mma,pair,quad}
+
+  void __builtin_mma_xvf64gernp (v512 *, v256, vuc);
+    XVF64GERNP nothing {mma,pair,quad}
+
+  v512 __builtin_mma_xvf64gernp_internal (v512, v256, vuc);
+    XVF64GERNP_INTERNAL mma_xvf64gernp {mma,pair,quad}
+
+  void __builtin_mma_xvf64gerpn (v512 *, v256, vuc);
+    XVF64GERPN nothing {mma,pair,quad}
+
+  v512 __builtin_mma_xvf64gerpn_internal (v512, v256, vuc);
+    XVF64GERPN_INTERNAL mma_xvf64gerpn {mma,pair,quad}
+
+  void __builtin_mma_xvf64gerpp (v512 *, v256, vuc);
+    XVF64GERPP nothing {mma,pair,quad}
+
+  v512 __builtin_mma_xvf64gerpp_internal (v512, v256, vuc);
+    XVF64GERPP_INTERNAL mma_xvf64gerpp {mma,pair,quad}
+
+  void __builtin_mma_xvi16ger2 (v512 *, vuc, vuc);
+    XVI16GER2 nothing {mma}
+
+  v512 __builtin_mma_xvi16ger2_internal (vuc, vuc);
+    XVI16GER2_INTERNAL mma_xvi16ger2 {mma}
+
+  void __builtin_mma_xvi16ger2pp (v512 *, vuc, vuc);
+    XVI16GER2PP nothing {mma,quad}
+
+  v512 __builtin_mma_xvi16ger2pp_internal (v512, vuc, vuc);
+    XVI16GER2PP_INTERNAL mma_xvi16ger2pp {mma,quad}
+
+  void __builtin_mma_xvi16ger2s (v512 *, vuc, vuc);
+    XVI16GER2S nothing {mma}
+
+  v512 __builtin_mma_xvi16ger2s_internal (vuc, vuc);
+    XVI16GER2S_INTERNAL mma_xvi16ger2s {mma}
+
+  void __builtin_mma_xvi16ger2spp (v512 *, vuc, vuc);
+    XVI16GER2SPP nothing {mma,quad}
+
+  v512 __builtin_mma_xvi16ger2spp_internal (v512, vuc, vuc);
+    XVI16GER2SPP_INTERNAL mma_xvi16ger2spp {mma,quad}
+
+  void __builtin_mma_xvi4ger8 (v512 *, vuc, vuc);
+    XVI4GER8 nothing {mma}
+
+  v512 __builtin_mma_xvi4ger8_internal (vuc, vuc);
+    XVI4GER8_INTERNAL mma_xvi4ger8 {mma}
+
+  void __builtin_mma_xvi4ger8pp (v512 *, vuc, vuc);
+    XVI4GER8PP nothing {mma,quad}
+
+  v512 __builtin_mma_xvi4ger8pp_internal (v512, vuc, vuc);
+    XVI4GER8PP_INTERNAL mma_xvi4ger8pp {mma,quad}
+
+  void __builtin_mma_xvi8ger4 (v512 *, vuc, vuc);
+    XVI8GER4 nothing {mma}
+
+  v512 __builtin_mma_xvi8ger4_internal (vuc, vuc);
+    XVI8GER4_INTERNAL mma_xvi8ger4 {mma}
+
+  void __builtin_mma_xvi8ger4pp (v512 *, vuc, vuc);
+    XVI8GER4PP nothing {mma,quad}
+
+  v512 __builtin_mma_xvi8ger4pp_internal (v512, vuc, vuc);
+    XVI8GER4PP_INTERNAL mma_xvi8ger4pp {mma,quad}
+
+  void __builtin_mma_xvi8ger4spp (v512 *, vuc, vuc);
+    XVI8GER4SPP nothing {mma,quad}
+
+  v512 __builtin_mma_xvi8ger4spp_internal (v512, vuc, vuc);
+    XVI8GER4SPP_INTERNAL mma_xvi8ger4spp {mma,quad}
+
+  void __builtin_mma_xxmfacc (v512 *);
+    XXMFACC nothing {mma,quad}
+
+  v512 __builtin_mma_xxmfacc_internal (v512);
+    XXMFACC_INTERNAL mma_xxmfacc {mma,quad}
+
+  void __builtin_mma_xxmtacc (v512 *);
+    XXMTACC nothing {mma,quad}
+
+  v512 __builtin_mma_xxmtacc_internal (v512);
+    XXMTACC_INTERNAL mma_xxmtacc {mma,quad}
+
+  void __builtin_mma_xxsetaccz (v512 *);
+    XXSETACCZ nothing {mma}
+
+  v512 __builtin_mma_xxsetaccz_internal ();
+    XXSETACCZ_INTERNAL mma_xxsetaccz {mma}
+
+  void __builtin_vsx_assemble_pair (v256 *, vuc, vuc);
+    ASSEMBLE_PAIR_V nothing {mma}
+
+  v256 __builtin_vsx_assemble_pair_internal (vuc, vuc);
+    ASSEMBLE_PAIR_V_INTERNAL vsx_assemble_pair {mma}
+
+  void __builtin_vsx_build_pair (v256 *, vuc, vuc);
+    BUILD_PAIR nothing {mma}
+
+  v256 __builtin_vsx_build_pair_internal (vuc, vuc);
+    BUILD_PAIR_INTERNAL vsx_assemble_pair {mma}
+
+  void __builtin_vsx_disassemble_pair (void *, v256 *);
+    DISASSEMBLE_PAIR_V nothing {mma,pair}
+
+  vuc __builtin_vsx_disassemble_pair_internal (v256, const int<2>);
+    DISASSEMBLE_PAIR_V_INTERNAL vsx_disassemble_pair {mma}
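
For context, a minimal user-level sketch of the builtins added here
(illustrative only; requires -mcpu=power10):

    #include <altivec.h>

    /* Accumulate a 4x4 float outer product and unpack the result.  */
    void
    f32_ger (vector unsigned char a, vector unsigned char b,
             vector float out[4])
    {
      __vector_quad acc;
      __builtin_mma_xvf32ger (&acc, a, b);     /* acc  = a (x) b  */
      __builtin_mma_xvf32gerpp (&acc, a, b);   /* acc += a (x) b  */
      __builtin_mma_disassemble_acc (out, &acc);
    }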
-- 
2.27.0



* [PATCH 12/34] rs6000: Add miscellaneous builtins
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (10 preceding siblings ...)
  2021-07-29 13:30 ` [PATCH 11/34] rs6000: Add MMA builtins Bill Schmidt
@ 2021-07-29 13:30 ` Bill Schmidt
  2021-08-25 22:58   ` Segher Boessenkool
  2021-07-29 13:31 ` [PATCH 13/34] rs6000: Add Cell builtins Bill Schmidt
                   ` (21 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-15  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def: Add ieee128-hw, dfp,
	crypto, and htm stanzas.
---
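One usage note (illustrative, not from the patch): the round-to-odd
entries in the ieee128-hw stanza round toward odd, a mode useful when
implementing narrowing operations without double rounding.  With
-mfloat128-hardware (power9 or later):

    /* Add two _Float128 values with round-to-odd.  */
    _Float128
    add_odd (_Float128 x, _Float128 y)
    {
      return __builtin_addf128_round_to_odd (x, y);
    }

    /* Truncate to double with round-to-odd.  */
    double
    trunc_odd (_Float128 x)
    {
      return __builtin_truncf128_round_to_odd (x);
    }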
 gcc/config/rs6000/rs6000-builtin-new.def | 215 +++++++++++++++++++++++
 1 file changed, 215 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 4b65d54d913..805bdc87acd 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -2811,6 +2811,221 @@
     XL_LEN_R xl_len_r {}
 
 
+; Builtins requiring hardware support for IEEE-128 floating-point.
+[ieee128-hw]
+  fpmath _Float128 __builtin_addf128_round_to_odd (_Float128, _Float128);
+    ADDF128_ODD addkf3_odd {}
+
+  fpmath _Float128 __builtin_divf128_round_to_odd (_Float128, _Float128);
+    DIVF128_ODD divkf3_odd {}
+
+  fpmath _Float128 __builtin_fmaf128_round_to_odd (_Float128, _Float128, _Float128);
+    FMAF128_ODD fmakf4_odd {}
+
+  fpmath _Float128 __builtin_mulf128_round_to_odd (_Float128, _Float128);
+    MULF128_ODD mulkf3_odd {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_eq (_Float128, _Float128);
+    VSCEQPEQ xscmpexpqp_eq_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_gt (_Float128, _Float128);
+    VSCEQPGT xscmpexpqp_gt_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_lt (_Float128, _Float128);
+    VSCEQPLT xscmpexpqp_lt_kf {}
+
+  const signed int __builtin_vsx_scalar_cmp_exp_qp_unordered (_Float128, _Float128);
+    VSCEQPUO xscmpexpqp_unordered_kf {}
+
+  fpmath _Float128 __builtin_sqrtf128_round_to_odd (_Float128);
+    SQRTF128_ODD sqrtkf2_odd {}
+
+  fpmath _Float128 __builtin_subf128_round_to_odd (_Float128, _Float128);
+    SUBF128_ODD subkf3_odd {}
+
+  fpmath double __builtin_truncf128_round_to_odd (_Float128);
+    TRUNCF128_ODD trunckfdf2_odd {}
+
+  const signed long long __builtin_vsx_scalar_extract_expq (_Float128);
+    VSEEQP xsxexpqp_kf {}
+
+  const signed __int128 __builtin_vsx_scalar_extract_sigq (_Float128);
+    VSESQP xsxsigqp_kf {}
+
+  const _Float128 __builtin_vsx_scalar_insert_exp_q (unsigned __int128, unsigned long long);
+    VSIEQP xsiexpqp_kf {}
+
+  const _Float128 __builtin_vsx_scalar_insert_exp_qp (_Float128, unsigned long long);
+    VSIEQPF xsiexpqpf_kf {}
+
+  const signed int __builtin_vsx_scalar_test_data_class_qp (_Float128, const int<7>);
+    VSTDCQP xststdcqp_kf {}
+
+  const signed int __builtin_vsx_scalar_test_neg_qp (_Float128);
+    VSTDCNQP xststdcnegqp_kf {}
+
+
+; Decimal floating-point builtins.
+[dfp]
+  const _Decimal64 __builtin_ddedpd (const int<2>, _Decimal64);
+    DDEDPD dfp_ddedpd_dd {}
+
+  const _Decimal128 __builtin_ddedpdq (const int<2>, _Decimal128);
+    DDEDPDQ dfp_ddedpd_td {}
+
+  const _Decimal64 __builtin_denbcd (const int<1>, _Decimal64);
+    DENBCD dfp_denbcd_dd {}
+
+  const _Decimal128 __builtin_denbcdq (const int<1>, _Decimal128);
+    DENBCDQ dfp_denbcd_td {}
+
+  const _Decimal128 __builtin_denb2dfp_v16qi (vsc);
+    DENB2DFP_V16QI dfp_denbcd_v16qi {}
+
+  const _Decimal64 __builtin_diex (signed long long, _Decimal64);
+    DIEX dfp_diex_dd {}
+
+  const _Decimal128 __builtin_diexq (signed long long, _Decimal128);
+    DIEXQ dfp_diex_td {}
+
+  const _Decimal64 __builtin_dscli (_Decimal64, const int<6>);
+    DSCLI dfp_dscli_dd {}
+
+  const _Decimal128 __builtin_dscliq (_Decimal128, const int<6>);
+    DSCLIQ dfp_dscli_td {}
+
+  const _Decimal64 __builtin_dscri (_Decimal64, const int<6>);
+    DSCRI dfp_dscri_dd {}
+
+  const _Decimal128 __builtin_dscriq (_Decimal128, const int<6>);
+    DSCRIQ dfp_dscri_td {}
+
+  const signed long long __builtin_dxex (_Decimal64);
+    DXEX dfp_dxex_dd {}
+
+  const signed long long __builtin_dxexq (_Decimal128);
+    DXEXQ dfp_dxex_td {}
+
+  const _Decimal128 __builtin_pack_dec128 (unsigned long long, unsigned long long);
+    PACK_TD packtd {}
+
+  void __builtin_set_fpscr_drn (const int[0,7]);
+    SET_FPSCR_DRN rs6000_set_fpscr_drn {}
+
+  const unsigned long long __builtin_unpack_dec128 (_Decimal128, const int<1>);
+    UNPACK_TD unpacktd {}
+
+
+[crypto]
+  const vull __builtin_crypto_vcipher (vull, vull);
+    VCIPHER crypto_vcipher_v2di {}
+
+  const vuc __builtin_crypto_vcipher_be (vuc, vuc);
+    VCIPHER_BE crypto_vcipher_v16qi {}
+
+  const vull __builtin_crypto_vcipherlast (vull, vull);
+    VCIPHERLAST crypto_vcipherlast_v2di {}
+
+  const vuc __builtin_crypto_vcipherlast_be (vuc, vuc);
+    VCIPHERLAST_BE crypto_vcipherlast_v16qi {}
+
+  const vull __builtin_crypto_vncipher (vull, vull);
+    VNCIPHER crypto_vncipher_v2di {}
+
+  const vuc __builtin_crypto_vncipher_be (vuc, vuc);
+    VNCIPHER_BE crypto_vncipher_v16qi {}
+
+  const vull __builtin_crypto_vncipherlast (vull, vull);
+    VNCIPHERLAST crypto_vncipherlast_v2di {}
+
+  const vuc __builtin_crypto_vncipherlast_be (vuc, vuc);
+    VNCIPHERLAST_BE crypto_vncipherlast_v16qi {}
+
+  const vull __builtin_crypto_vsbox (vull);
+    VSBOX crypto_vsbox_v2di {}
+
+  const vuc __builtin_crypto_vsbox_be (vuc);
+    VSBOX_BE crypto_vsbox_v16qi {}
+
+  const vull __builtin_crypto_vshasigmad (vull, const int<1>, const int<4>);
+    VSHASIGMAD crypto_vshasigmad {}
+
+  const vui __builtin_crypto_vshasigmaw (vui, const int<1>, const int<4>);
+    VSHASIGMAW crypto_vshasigmaw {}
+
+
+[htm]
+  unsigned long long __builtin_get_texasr ();
+    GET_TEXASR nothing {htm,htmspr}
+
+  unsigned long long __builtin_get_texasru ();
+    GET_TEXASRU nothing {htm,htmspr}
+
+  unsigned long long __builtin_get_tfhar ();
+    GET_TFHAR nothing {htm,htmspr}
+
+  unsigned long long __builtin_get_tfiar ();
+    GET_TFIAR nothing {htm,htmspr}
+
+  void __builtin_set_texasr (unsigned long long);
+    SET_TEXASR nothing {htm,htmspr}
+
+  void __builtin_set_texasru (unsigned long long);
+    SET_TEXASRU nothing {htm,htmspr}
+
+  void __builtin_set_tfhar (unsigned long long);
+    SET_TFHAR nothing {htm,htmspr}
+
+  void __builtin_set_tfiar (unsigned long long);
+    SET_TFIAR nothing {htm,htmspr}
+
+  unsigned int __builtin_tabort (unsigned int);
+    TABORT tabort {htm,htmcr}
+
+  unsigned int __builtin_tabortdc (unsigned long long, unsigned long long, unsigned long long);
+    TABORTDC tabortdc {htm,htmcr}
+
+  unsigned int __builtin_tabortdci (unsigned long long, unsigned long long, unsigned long long);
+    TABORTDCI tabortdci {htm,htmcr}
+
+  unsigned int __builtin_tabortwc (unsigned int, unsigned int, unsigned int);
+    TABORTWC tabortwc {htm,htmcr}
+
+  unsigned int __builtin_tabortwci (unsigned int, unsigned int, unsigned int);
+    TABORTWCI tabortwci {htm,htmcr}
+
+  unsigned int __builtin_tbegin (unsigned int);
+    TBEGIN tbegin {htm,htmcr}
+
+  unsigned int __builtin_tcheck ();
+    TCHECK tcheck {htm,htmcr}
+
+  unsigned int __builtin_tend (unsigned int);
+    TEND tend {htm,htmcr}
+
+  unsigned int __builtin_tendall ();
+    TENDALL tend {htm,htmcr}
+
+  unsigned int __builtin_trechkpt ();
+    TRECHKPT trechkpt {htm,htmcr}
+
+  unsigned int __builtin_treclaim (unsigned int);
+    TRECLAIM treclaim {htm,htmcr}
+
+  unsigned int __builtin_tresume ();
+    TRESUME tsr {htm,htmcr}
+
+  unsigned int __builtin_tsr (unsigned int);
+    TSR tsr {htm,htmcr}
+
+  unsigned int __builtin_tsuspend ();
+    TSUSPEND tsr {htm,htmcr}
+
+  unsigned int __builtin_ttest ();
+    TTEST ttest {htm,htmcr}
+
+
 [power10]
   const vbq __builtin_altivec_cmpge_1ti (vsq, vsq);
     CMPGE_1TI vector_nltv1ti {}
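
And a usage sketch for the htm stanza (illustrative; requires -mhtm):

    /* Try an update inside a hardware transaction; return zero if the
       transaction aborted and the caller must take a fallback path.
       __builtin_tbegin returns nonzero when the transaction starts.  */
    int
    atomic_increment (long *p)
    {
      if (__builtin_tbegin (0))
        {
          *p += 1;               /* transactional body */
          __builtin_tend (0);
          return 1;
        }
      return 0;
    }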
-- 
2.27.0



* [PATCH 13/34] rs6000: Add Cell builtins
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (11 preceding siblings ...)
  2021-07-29 13:30 ` [PATCH 12/34] rs6000: Add miscellaneous builtins Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-08-25 22:59   ` Segher Boessenkool
  2021-07-29 13:31 ` [PATCH 14/34] rs6000: Add remaining overloads Bill Schmidt
                   ` (20 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-07  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def: Add cell stanza.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 27 ++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 805bdc87acd..322dbe1f713 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -1102,6 +1102,33 @@
     VEC_SET_V8HI nothing {set}
 
 
+; Cell builtins.
+[cell]
+  pure vsc __builtin_altivec_lvlx (signed long, const void *);
+    LVLX altivec_lvlx {ldvec}
+
+  pure vsc __builtin_altivec_lvlxl (signed long, const void *);
+    LVLXL altivec_lvlxl {ldvec}
+
+  pure vsc __builtin_altivec_lvrx (signed long, const void *);
+    LVRX altivec_lvrx {ldvec}
+
+  pure vsc __builtin_altivec_lvrxl (signed long, const void *);
+    LVRXL altivec_lvrxl {ldvec}
+
+  void __builtin_altivec_stvlx (vsc, signed long, void *);
+    STVLX altivec_stvlx {stvec}
+
+  void __builtin_altivec_stvlxl (vsc, signed long, void *);
+    STVLXL altivec_stvlxl {stvec}
+
+  void __builtin_altivec_stvrx (vsc, signed long, void *);
+    STVRX altivec_stvrx {stvec}
+
+  void __builtin_altivec_stvrxl (vsc, signed long, void *);
+    STVRXL altivec_stvrxl {stvec}
+
+
 ; VSX builtins.
 [vsx]
   pure vd __builtin_altivec_lvx_v2df (signed long, const void *);
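
For the curious, the classic Cell idiom these builtins support (sketch of
mine, not part of the patch): lvlx and lvrx load the two halves of a
misaligned vector, left- and right-justified with zeros, so OR-ing them
reassembles the full vector:

    #include <altivec.h>

    vector signed char
    load_unaligned (const signed char *p)
    {
      vector signed char left  = __builtin_altivec_lvlx (0, p);
      vector signed char right = __builtin_altivec_lvrx (16, p);
      return vec_or (left, right);
    }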
-- 
2.27.0



* [PATCH 14/34] rs6000: Add remaining overloads
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (12 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 13/34] rs6000: Add Cell builtins Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-08-25 23:27   ` Segher Boessenkool
  2021-07-29 13:31 ` [PATCH 15/34] rs6000: Execute the automatic built-in initialization code Bill Schmidt
                   ` (19 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-15  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-overload.def: Add remaining overloads.
---
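As an illustration of what the overload tables provide (example mine):
each [GROUP] header names one overloaded user-level function, and the
signature list under it drives resolution to a specific builtin instance,
so with -maltivec:

    #include <altivec.h>

    vector signed int
    add_si (vector signed int a, vector signed int b)
    {
      return vec_add (a, b);   /* resolves to VADDUWM */
    }

    vector float
    add_f (vector float a, vector float b)
    {
      return vec_add (a, b);   /* resolves to VADDFP */
    }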
 gcc/config/rs6000/rs6000-overload.def | 6104 +++++++++++++++++++++++++
 1 file changed, 6104 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index d8028c94470..d3f054bec39 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -75,8 +75,6112 @@
 ; a semicolon are also treated as blank lines.
 
 
+[BCDADD, __builtin_bcdadd, __builtin_vec_bcdadd]
+  vsq __builtin_vec_bcdadd (vsq, vsq, const int);
+    BCDADD_V1TI
+  vuc __builtin_vec_bcdadd (vuc, vuc, const int);
+    BCDADD_V16QI
+
+[BCDADD_EQ, __builtin_bcdadd_eq, __builtin_vec_bcdadd_eq]
+  signed int __builtin_vec_bcdadd_eq (vsq, vsq, const int);
+    BCDADD_EQ_V1TI
+  signed int __builtin_vec_bcdadd_eq (vuc, vuc, const int);
+    BCDADD_EQ_V16QI
+
+[BCDADD_GT, __builtin_bcdadd_gt, __builtin_vec_bcdadd_gt]
+  signed int __builtin_vec_bcdadd_gt (vsq, vsq, const int);
+    BCDADD_GT_V1TI
+  signed int __builtin_vec_bcdadd_gt (vuc, vuc, const int);
+    BCDADD_GT_V16QI
+
+[BCDADD_LT, __builtin_bcdadd_lt, __builtin_vec_bcdadd_lt]
+  signed int __builtin_vec_bcdadd_lt (vsq, vsq, const int);
+    BCDADD_LT_V1TI
+  signed int __builtin_vec_bcdadd_lt (vuc, vuc, const int);
+    BCDADD_LT_V16QI
+
+[BCDADD_OV, __builtin_bcdadd_ov, __builtin_vec_bcdadd_ov]
+  signed int __builtin_vec_bcdadd_ov (vsq, vsq, const int);
+    BCDADD_OV_V1TI
+  signed int __builtin_vec_bcdadd_ov (vuc, vuc, const int);
+    BCDADD_OV_V16QI
+
+[BCDDIV10, __builtin_bcddiv10, __builtin_vec_bcddiv10]
+  vuc __builtin_vec_bcddiv10 (vuc);
+    BCDDIV10_V16QI
+
+[BCDINVALID, __builtin_bcdinvalid, __builtin_vec_bcdinvalid]
+  signed int __builtin_vec_bcdinvalid (vsq);
+    BCDINVALID_V1TI
+  signed int __builtin_vec_bcdinvalid (vuc);
+    BCDINVALID_V16QI
+
+[BCDMUL10, __builtin_bcdmul10, __builtin_vec_bcdmul10]
+  vuc __builtin_vec_bcdmul10 (vuc);
+    BCDMUL10_V16QI
+
+[BCDSUB, __builtin_bcdsub, __builtin_vec_bcdsub]
+  vsq __builtin_vec_bcdsub (vsq, vsq, const int);
+    BCDSUB_V1TI
+  vuc __builtin_vec_bcdsub (vuc, vuc, const int);
+    BCDSUB_V16QI
+
+[BCDSUB_EQ, __builtin_bcdsub_eq, __builtin_vec_bcdsub_eq]
+  signed int __builtin_vec_bcdsub_eq (vsq, vsq, const int);
+    BCDSUB_EQ_V1TI
+  signed int __builtin_vec_bcdsub_eq (vuc, vuc, const int);
+    BCDSUB_EQ_V16QI
+
+[BCDSUB_GE, __builtin_bcdsub_ge, __builtin_vec_bcdsub_ge]
+  signed int __builtin_vec_bcdsub_ge (vsq, vsq, const int);
+    BCDSUB_GE_V1TI
+  signed int __builtin_vec_bcdsub_ge (vuc, vuc, const int);
+    BCDSUB_GE_V16QI
+
+[BCDSUB_GT, __builtin_bcdsub_gt, __builtin_vec_bcdsub_gt]
+  signed int __builtin_vec_bcdsub_gt (vsq, vsq, const int);
+    BCDSUB_GT_V1TI
+  signed int __builtin_vec_bcdsub_gt (vuc, vuc, const int);
+    BCDSUB_GT_V16QI
+
+[BCDSUB_LE, __builtin_bcdsub_le, __builtin_vec_bcdsub_le]
+  signed int __builtin_vec_bcdsub_le (vsq, vsq, const int);
+    BCDSUB_LE_V1TI
+  signed int __builtin_vec_bcdsub_le (vuc, vuc, const int);
+    BCDSUB_LE_V16QI
+
+[BCDSUB_LT, __builtin_bcdsub_lt, __builtin_vec_bcdsub_lt]
+  signed int __builtin_vec_bcdsub_lt (vsq, vsq, const int);
+    BCDSUB_LT_V1TI
+  signed int __builtin_vec_bcdsub_lt (vuc, vuc, const int);
+    BCDSUB_LT_V16QI
+
+[BCDSUB_OV, __builtin_bcdsub_ov, __builtin_vec_bcdsub_ov]
+  signed int __builtin_vec_bcdsub_ov (vsq, vsq, const int);
+    BCDSUB_OV_V1TI
+  signed int __builtin_vec_bcdsub_ov (vuc, vuc, const int);
+    BCDSUB_OV_V16QI
+
+[BCD2DFP, __builtin_bcd2dfp, __builtin_vec_denb2dfp]
+  _Decimal128 __builtin_vec_denb2dfp (vuc);
+    DENB2DFP_V16QI
+
+[CRYPTO_PERMXOR, SKIP, __builtin_crypto_vpermxor]
+  vuc __builtin_crypto_vpermxor (vuc, vuc, vuc);
+    VPERMXOR_V16QI
+  vus __builtin_crypto_vpermxor (vus, vus, vus);
+    VPERMXOR_V8HI
+  vui __builtin_crypto_vpermxor (vui, vui, vui);
+    VPERMXOR_V4SI
+  vull __builtin_crypto_vpermxor (vull, vull, vull);
+    VPERMXOR_V2DI
+
+[CRYPTO_PMSUM, SKIP, __builtin_crypto_vpmsum]
+  vuc __builtin_crypto_vpmsum (vuc, vuc);
+    VPMSUMB  VPMSUMB_C
+  vus __builtin_crypto_vpmsum (vus, vus);
+    VPMSUMH  VPMSUMH_C
+  vui __builtin_crypto_vpmsum (vui, vui);
+    VPMSUMW  VPMSUMW_C
+  vull __builtin_crypto_vpmsum (vull, vull);
+    VPMSUMD  VPMSUMD_C
+
+[SCAL_CMPB, SKIP, __builtin_cmpb]
+  unsigned int __builtin_cmpb (unsigned int, unsigned int);
+    CMPB_32
+  unsigned long long __builtin_cmpb (unsigned long long, unsigned long long);
+    CMPB
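+; A rough sketch of the semantics: each byte of the result is 0xff
+; where the corresponding bytes of the two operands are equal and
+; 0x00 where they differ, so that (with the 32-bit form)
+;   __builtin_cmpb (0x11223344, 0x11aa3344) == 0xff00ffff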
+
 [VEC_ABS, vec_abs, __builtin_vec_abs]
   vsc __builtin_vec_abs (vsc);
     ABS_V16QI
   vss __builtin_vec_abs (vss);
     ABS_V8HI
+  vsi __builtin_vec_abs (vsi);
+    ABS_V4SI
+  vsll __builtin_vec_abs (vsll);
+    ABS_V2DI
+  vf __builtin_vec_abs (vf);
+    ABS_V4SF
+  vd __builtin_vec_abs (vd);
+    XVABSDP
+
+[VEC_ABSD, vec_absd, __builtin_vec_vadu, _ARCH_PWR9]
+  vuc __builtin_vec_vadu (vuc, vuc);
+    VADUB
+  vus __builtin_vec_vadu (vus, vus);
+    VADUH
+  vui __builtin_vec_vadu (vui, vui);
+    VADUW
+
+[VEC_ABSS, vec_abss, __builtin_vec_abss]
+  vsc __builtin_vec_abss (vsc);
+    ABSS_V16QI
+  vss __builtin_vec_abss (vss);
+    ABSS_V8HI
+  vsi __builtin_vec_abss (vsi);
+    ABSS_V4SI
+
+; XVADDSP{TARGET_VSX};VADDFP
+[VEC_ADD, vec_add, __builtin_vec_add]
+  vsc __builtin_vec_add (vsc, vsc);
+    VADDUBM  VADDUBM_VSC
+  vuc __builtin_vec_add (vuc, vuc);
+    VADDUBM  VADDUBM_VUC
+  vss __builtin_vec_add (vss, vss);
+    VADDUHM  VADDUHM_VSS
+  vus __builtin_vec_add (vus, vus);
+    VADDUHM  VADDUHM_VUS
+  vsi __builtin_vec_add (vsi, vsi);
+    VADDUWM  VADDUWM_VSI
+  vui __builtin_vec_add (vui, vui);
+    VADDUWM  VADDUWM_VUI
+  vsll __builtin_vec_add (vsll, vsll);
+    VADDUDM  VADDUDM_VSLL
+  vull __builtin_vec_add (vull, vull);
+    VADDUDM  VADDUDM_VULL
+  vsq __builtin_vec_add (vsq, vsq);
+    VADDUQM  VADDUQM_VSQ
+  vuq __builtin_vec_add (vuq, vuq);
+    VADDUQM  VADDUQM_VUQ
+  vf __builtin_vec_add (vf, vf);
+    VADDFP
+  vd __builtin_vec_add (vd, vd);
+    XVADDDP
+; The following variants are deprecated.
+  vsc __builtin_vec_add (vbc, vsc);
+    VADDUBM  VADDUBM_VBC_VSC
+  vsc __builtin_vec_add (vsc, vbc);
+    VADDUBM  VADDUBM_VSC_VBC
+  vuc __builtin_vec_add (vbc, vuc);
+    VADDUBM  VADDUBM_VBC_VUC
+  vuc __builtin_vec_add (vuc, vbc);
+    VADDUBM  VADDUBM_VUC_VBC
+  vss __builtin_vec_add (vbs, vss);
+    VADDUHM  VADDUHM_VBS_VSS
+  vss __builtin_vec_add (vss, vbs);
+    VADDUHM  VADDUHM_VSS_VBS
+  vus __builtin_vec_add (vbs, vus);
+    VADDUHM  VADDUHM_VBS_VUS
+  vus __builtin_vec_add (vus, vbs);
+    VADDUHM  VADDUHM_VUS_VBS
+  vsi __builtin_vec_add (vbi, vsi);
+    VADDUWM  VADDUWM_VBI_VSI
+  vsi __builtin_vec_add (vsi, vbi);
+    VADDUWM  VADDUWM_VSI_VBI
+  vui __builtin_vec_add (vbi, vui);
+    VADDUWM  VADDUWM_VBI_VUI
+  vui __builtin_vec_add (vui, vbi);
+    VADDUWM  VADDUWM_VUI_VBI
+  vsll __builtin_vec_add (vbll, vsll);
+    VADDUDM  VADDUDM_VBLL_VSLL
+  vsll __builtin_vec_add (vsll, vbll);
+    VADDUDM  VADDUDM_VSLL_VBLL
+  vull __builtin_vec_add (vbll, vull);
+    VADDUDM  VADDUDM_VBLL_VULL
+  vull __builtin_vec_add (vull, vbll);
+    VADDUDM  VADDUDM_VULL_VBLL
+
+[VEC_ADDC, vec_addc, __builtin_vec_addc]
+  vsi __builtin_vec_addc (vsi, vsi);
+    VADDCUW  VADDCUW_VSI
+  vui __builtin_vec_addc (vui, vui);
+    VADDCUW  VADDCUW_VUI
+  vsq __builtin_vec_addc (vsq, vsq);
+    VADDCUQ  VADDCUQ_VSQ
+  vuq __builtin_vec_addc (vuq, vuq);
+    VADDCUQ  VADDCUQ_VUQ
+
+; TODO: Note that the entry for VEC_ADDE currently gets ignored in
+; altivec_resolve_overloaded_builtin.  Revisit whether we can remove
+; that.  We still need to register the legal builtin forms here.
+[VEC_ADDE, vec_adde, __builtin_vec_adde]
+  vsq __builtin_vec_adde (vsq, vsq, vsq);
+    VADDEUQM  VADDEUQM_VSQ
+  vuq __builtin_vec_adde (vuq, vuq, vuq);
+    VADDEUQM  VADDEUQM_VUQ
+
+; TODO: Note that the entry for VEC_ADDEC currently gets ignored in
+; altivec_resolve_overloaded_builtin.  Revisit whether we can remove
+; that.  We still need to register the legal builtin forms here.
+[VEC_ADDEC, vec_addec, __builtin_vec_addec]
+  vsq __builtin_vec_addec (vsq, vsq, vsq);
+    VADDECUQ  VADDECUQ_VSQ
+  vuq __builtin_vec_addec (vuq, vuq, vuq);
+    VADDECUQ  VADDECUQ_VUQ
+
+[VEC_ADDS, vec_adds, __builtin_vec_adds]
+  vuc __builtin_vec_adds (vuc, vuc);
+    VADDUBS
+  vsc __builtin_vec_adds (vsc, vsc);
+    VADDSBS
+  vus __builtin_vec_adds (vus, vus);
+    VADDUHS
+  vss __builtin_vec_adds (vss, vss);
+    VADDSHS
+  vui __builtin_vec_adds (vui, vui);
+    VADDUWS
+  vsi __builtin_vec_adds (vsi, vsi);
+    VADDSWS
+; The following variants are deprecated.
+  vuc __builtin_vec_adds (vbc, vuc);
+    VADDUBS  VADDUBS_BU
+  vuc __builtin_vec_adds (vuc, vbc);
+    VADDUBS  VADDUBS_UB
+  vsc __builtin_vec_adds (vbc, vsc);
+    VADDSBS  VADDSBS_BS
+  vsc __builtin_vec_adds (vsc, vbc);
+    VADDSBS  VADDSBS_SB
+  vus __builtin_vec_adds (vbs, vus);
+    VADDUHS  VADDUHS_BU
+  vus __builtin_vec_adds (vus, vbs);
+    VADDUHS  VADDUHS_UB
+  vss __builtin_vec_adds (vbs, vss);
+    VADDSHS  VADDSHS_BS
+  vss __builtin_vec_adds (vss, vbs);
+    VADDSHS  VADDSHS_SB
+  vui __builtin_vec_adds (vbi, vui);
+    VADDUWS  VADDUWS_BU
+  vui __builtin_vec_adds (vui, vbi);
+    VADDUWS  VADDUWS_UB
+  vsi __builtin_vec_adds (vbi, vsi);
+    VADDSWS  VADDSWS_BS
+  vsi __builtin_vec_adds (vsi, vbi);
+    VADDSWS  VADDSWS_SB
+
+[VEC_AND, vec_and, __builtin_vec_and]
+  vsc __builtin_vec_and (vsc, vsc);
+    VAND_V16QI
+  vuc __builtin_vec_and (vuc, vuc);
+    VAND_V16QI_UNS  VAND_VUC
+  vbc __builtin_vec_and (vbc, vbc);
+    VAND_V16QI_UNS  VAND_VBC
+  vss __builtin_vec_and (vss, vss);
+    VAND_V8HI
+  vus __builtin_vec_and (vus, vus);
+    VAND_V8HI_UNS  VAND_VUS
+  vbs __builtin_vec_and (vbs, vbs);
+    VAND_V8HI_UNS  VAND_VBS
+  vsi __builtin_vec_and (vsi, vsi);
+    VAND_V4SI
+  vui __builtin_vec_and (vui, vui);
+    VAND_V4SI_UNS  VAND_VUI
+  vbi __builtin_vec_and (vbi, vbi);
+    VAND_V4SI_UNS  VAND_VBI
+  vsll __builtin_vec_and (vsll, vsll);
+    VAND_V2DI
+  vull __builtin_vec_and (vull, vull);
+    VAND_V2DI_UNS  VAND_VULL
+  vbll __builtin_vec_and (vbll, vbll);
+    VAND_V2DI_UNS  VAND_VBLL
+  vf __builtin_vec_and (vf, vf);
+    VAND_V4SF
+  vd __builtin_vec_and (vd, vd);
+    VAND_V2DF
+; The following variants are deprecated.
+  vsc __builtin_vec_and (vsc, vbc);
+    VAND_V16QI  VAND_VSC_VBC
+  vsc __builtin_vec_and (vbc, vsc);
+    VAND_V16QI  VAND_VBC_VSC
+  vuc __builtin_vec_and (vuc, vbc);
+    VAND_V16QI_UNS  VAND_VUC_VBC
+  vuc __builtin_vec_and (vbc, vuc);
+    VAND_V16QI_UNS  VAND_VBC_VUC
+  vss __builtin_vec_and (vss, vbs);
+    VAND_V8HI  VAND_VSS_VBS
+  vss __builtin_vec_and (vbs, vss);
+    VAND_V8HI  VAND_VBS_VSS
+  vus __builtin_vec_and (vus, vbs);
+    VAND_V8HI_UNS  VAND_VUS_VBS
+  vus __builtin_vec_and (vbs, vus);
+    VAND_V8HI_UNS  VAND_VBS_VUS
+  vsi __builtin_vec_and (vsi, vbi);
+    VAND_V4SI  VAND_VSI_VBI
+  vsi __builtin_vec_and (vbi, vsi);
+    VAND_V4SI  VAND_VBI_VSI
+  vui __builtin_vec_and (vui, vbi);
+    VAND_V4SI_UNS  VAND_VUI_VBI
+  vui __builtin_vec_and (vbi, vui);
+    VAND_V4SI_UNS  VAND_VBI_VUI
+  vsll __builtin_vec_and (vsll, vbll);
+    VAND_V2DI  VAND_VSLL_VBLL
+  vsll __builtin_vec_and (vbll, vsll);
+    VAND_V2DI  VAND_VBLL_VSLL
+  vull __builtin_vec_and (vull, vbll);
+    VAND_V2DI_UNS  VAND_VULL_VBLL
+  vull __builtin_vec_and (vbll, vull);
+    VAND_V2DI_UNS  VAND_VBLL_VULL
+  vf __builtin_vec_and (vf, vbi);
+    VAND_V4SF  VAND_VF_VBI
+  vf __builtin_vec_and (vbi, vf);
+    VAND_V4SF  VAND_VBI_VF
+  vd __builtin_vec_and (vd, vbll);
+    VAND_V2DF  VAND_VD_VBLL
+  vd __builtin_vec_and (vbll, vd);
+    VAND_V2DF  VAND_VBLL_VD
+
+[VEC_ANDC, vec_andc, __builtin_vec_andc]
+  vbc __builtin_vec_andc (vbc, vbc);
+    VANDC_V16QI_UNS VANDC_VBC
+  vsc __builtin_vec_andc (vsc, vsc);
+    VANDC_V16QI
+  vuc __builtin_vec_andc (vuc, vuc);
+    VANDC_V16QI_UNS VANDC_VUC
+  vbs __builtin_vec_andc (vbs, vbs);
+    VANDC_V8HI_UNS VANDC_VBS
+  vss __builtin_vec_andc (vss, vss);
+    VANDC_V8HI
+  vus __builtin_vec_andc (vus, vus);
+    VANDC_V8HI_UNS VANDC_VUS
+  vbi __builtin_vec_andc (vbi, vbi);
+    VANDC_V4SI_UNS VANDC_VBI
+  vsi __builtin_vec_andc (vsi, vsi);
+    VANDC_V4SI
+  vui __builtin_vec_andc (vui, vui);
+    VANDC_V4SI_UNS VANDC_VUI
+  vbll __builtin_vec_andc (vbll, vbll);
+    VANDC_V2DI_UNS VANDC_VBLL
+  vsll __builtin_vec_andc (vsll, vsll);
+    VANDC_V2DI
+  vull __builtin_vec_andc (vull, vull);
+    VANDC_V2DI_UNS VANDC_VULL
+  vf __builtin_vec_andc (vf, vf);
+    VANDC_V4SF
+  vd __builtin_vec_andc (vd, vd);
+    VANDC_V2DF
+; The following variants are deprecated.
+  vsc __builtin_vec_andc (vsc, vbc);
+    VANDC_V16QI  VANDC_VSC_VBC
+  vsc __builtin_vec_andc (vbc, vsc);
+    VANDC_V16QI  VANDC_VBC_VSC
+  vuc __builtin_vec_andc (vuc, vbc);
+    VANDC_V16QI_UNS VANDC_VUC_VBC
+  vuc __builtin_vec_andc (vbc, vuc);
+    VANDC_V16QI_UNS VANDC_VBC_VUC
+  vss __builtin_vec_andc (vss, vbs);
+    VANDC_V8HI  VANDC_VSS_VBS
+  vss __builtin_vec_andc (vbs, vss);
+    VANDC_V8HI  VANDC_VBS_VSS
+  vus __builtin_vec_andc (vus, vbs);
+    VANDC_V8HI_UNS VANDC_VUS_VBS
+  vus __builtin_vec_andc (vbs, vus);
+    VANDC_V8HI_UNS VANDC_VBS_VUS
+  vsi __builtin_vec_andc (vsi, vbi);
+    VANDC_V4SI  VANDC_VSI_VBI
+  vsi __builtin_vec_andc (vbi, vsi);
+    VANDC_V4SI  VANDC_VBI_VSI
+  vui __builtin_vec_andc (vui, vbi);
+    VANDC_V4SI_UNS VANDC_VUI_VBI
+  vui __builtin_vec_andc (vbi, vui);
+    VANDC_V4SI_UNS VANDC_VBI_VUI
+  vsll __builtin_vec_andc (vsll, vbll);
+    VANDC_V2DI  VANDC_VSLL_VBLL
+  vsll __builtin_vec_andc (vbll, vsll);
+    VANDC_V2DI  VANDC_VBLL_VSLL
+  vull __builtin_vec_andc (vull, vbll);
+    VANDC_V2DI_UNS VANDC_VULL_VBLL
+  vull __builtin_vec_andc (vbll, vull);
+    VANDC_V2DI_UNS VANDC_VBLL_VULL
+  vf __builtin_vec_andc (vf, vbi);
+    VANDC_V4SF  VANDC_VF_VBI
+  vf __builtin_vec_andc (vbi, vf);
+    VANDC_V4SF  VANDC_VBI_VF
+  vd __builtin_vec_andc (vd, vbll);
+    VANDC_V2DF  VANDC_VD_VBLL
+  vd __builtin_vec_andc (vbll, vd);
+    VANDC_V2DF  VANDC_VBLL_VD
+
+[VEC_AVG, vec_avg, __builtin_vec_avg]
+  vsc __builtin_vec_avg (vsc, vsc);
+    VAVGSB
+  vuc __builtin_vec_avg (vuc, vuc);
+    VAVGUB
+  vss __builtin_vec_avg (vss, vss);
+    VAVGSH
+  vus __builtin_vec_avg (vus, vus);
+    VAVGUH
+  vsi __builtin_vec_avg (vsi, vsi);
+    VAVGSW
+  vui __builtin_vec_avg (vui, vui);
+    VAVGUW
+
+[VEC_BLENDV, vec_blendv, __builtin_vec_xxblend, _ARCH_PWR10]
+  vsc __builtin_vec_xxblend (vsc, vsc, vuc);
+    VXXBLEND_V16QI  VXXBLEND_VSC
+  vuc __builtin_vec_xxblend (vuc, vuc, vuc);
+    VXXBLEND_V16QI  VXXBLEND_VUC
+  vss __builtin_vec_xxblend (vss, vss, vus);
+    VXXBLEND_V8HI  VXXBLEND_VSS
+  vus __builtin_vec_xxblend (vus, vus, vus);
+    VXXBLEND_V8HI  VXXBLEND_VUS
+  vsi __builtin_vec_xxblend (vsi, vsi, vui);
+    VXXBLEND_V4SI  VXXBLEND_VSI
+  vui __builtin_vec_xxblend (vui, vui, vui);
+    VXXBLEND_V4SI  VXXBLEND_VUI
+  vsll __builtin_vec_xxblend (vsll, vsll, vull);
+    VXXBLEND_V2DI  VXXBLEND_VSLL
+  vull __builtin_vec_xxblend (vull, vull, vull);
+    VXXBLEND_V2DI  VXXBLEND_VULL
+  vf __builtin_vec_xxblend (vf, vf, vui);
+    VXXBLEND_V4SF
+  vd __builtin_vec_xxblend (vd, vd, vull);
+    VXXBLEND_V2DF
+
+[VEC_BPERM, vec_bperm, __builtin_vec_vbperm_api, _ARCH_PWR8]
+  vull __builtin_vec_vbperm_api (vull, vuc);
+    VBPERMD  VBPERMD_VULL
+  vull __builtin_vec_vbperm_api (vuq, vuc);
+    VBPERMQ  VBPERMQ_VUQ
+  vuc __builtin_vec_vbperm_api (vuc, vuc);
+    VBPERMQ2  VBPERMQ2_U
+  vsc __builtin_vec_vbperm_api (vsc, vsc);
+    VBPERMQ2  VBPERMQ2_S
+
+; #### XVRSPIP{TARGET_VSX};VRFIP
+[VEC_CEIL, vec_ceil, __builtin_vec_ceil]
+  vf __builtin_vec_ceil (vf);
+    VRFIP
+  vd __builtin_vec_ceil (vd);
+    XVRDPIP
+
+[VEC_CFUGE, vec_cfuge, __builtin_vec_cfuge, _ARCH_PWR10]
+  vull __builtin_vec_cfuge (vull, vull);
+    VCFUGED
+
+[VEC_CIPHER_BE, vec_cipher_be, __builtin_vec_vcipher_be, _ARCH_PWR8]
+  vuc __builtin_vec_vcipher_be (vuc, vuc);
+    VCIPHER_BE
+
+[VEC_CIPHERLAST_BE, vec_cipherlast_be, __builtin_vec_vcipherlast_be, _ARCH_PWR8]
+  vuc __builtin_vec_vcipherlast_be (vuc, vuc);
+    VCIPHERLAST_BE
+
+[VEC_CLRL, vec_clrl, __builtin_vec_clrl, _ARCH_PWR10]
+  vsc __builtin_vec_clrl (vsc, unsigned int);
+    VCLRLB  VCLRLB_S
+  vuc __builtin_vec_clrl (vuc, unsigned int);
+    VCLRLB  VCLRLB_U
+
+[VEC_CLRR, vec_clrr, __builtin_vec_clrr, _ARCH_PWR10]
+  vsc __builtin_vec_clrr (vsc, unsigned int);
+    VCLRRB  VCLRRB_S
+  vuc __builtin_vec_clrr (vuc, unsigned int);
+    VCLRRB  VCLRRB_U
+
+; We skip generating a #define because of the C-versus-C++ complexity
+; in altivec.h.  Look there for the template-y details.
+[VEC_CMPAE_P, SKIP, __builtin_vec_vcmpae_p]
+  signed int __builtin_vec_vcmpae_p (vsc, vsc);
+    VCMPAEB_P  VCMPAEB_VSC_P
+  signed int __builtin_vec_vcmpae_p (vuc, vuc);
+    VCMPAEB_P  VCMPAEB_VUC_P
+  signed int __builtin_vec_vcmpae_p (vbc, vbc);
+    VCMPAEB_P  VCMPAEB_VBC_P
+  signed int __builtin_vec_vcmpae_p (vss, vss);
+    VCMPAEH_P  VCMPAEH_VSS_P
+  signed int __builtin_vec_vcmpae_p (vus, vus);
+    VCMPAEH_P  VCMPAEH_VUS_P
+  signed int __builtin_vec_vcmpae_p (vbs, vbs);
+    VCMPAEH_P  VCMPAEH_VBS_P
+  signed int __builtin_vec_vcmpae_p (vp, vp);
+    VCMPAEH_P  VCMPAEH_VP_P
+  signed int __builtin_vec_vcmpae_p (vsi, vsi);
+    VCMPAEW_P  VCMPAEW_VSI_P
+  signed int __builtin_vec_vcmpae_p (vui, vui);
+    VCMPAEW_P  VCMPAEW_VUI_P
+  signed int __builtin_vec_vcmpae_p (vbi, vbi);
+    VCMPAEW_P  VCMPAEW_VBI_P
+  signed int __builtin_vec_vcmpae_p (vsll, vsll);
+    VCMPAED_P  VCMPAED_VSLL_P
+  signed int __builtin_vec_vcmpae_p (vull, vull);
+    VCMPAED_P  VCMPAED_VULL_P
+  signed int __builtin_vec_vcmpae_p (vbll, vbll);
+    VCMPAED_P  VCMPAED_VBLL_P
+  signed int __builtin_vec_vcmpae_p (vsq, vsq);
+    VCMPAET_P  VCMPAET_VSQ_P
+  signed int __builtin_vec_vcmpae_p (vuq, vuq);
+    VCMPAET_P  VCMPAET_VUQ_P
+  signed int __builtin_vec_vcmpae_p (vf, vf);
+    VCMPAEFP_P
+  signed int __builtin_vec_vcmpae_p (vd, vd);
+    VCMPAEDP_P
+; The following variants are deprecated.
+  signed int __builtin_vec_vcmpae_p (signed int, vbc, vuc);
+    VCMPAEB_P  VCMPAEB_P_BU
+  signed int __builtin_vec_vcmpae_p (signed int, vuc, vbc);
+    VCMPAEB_P  VCMPAEB_P_UB
+  signed int __builtin_vec_vcmpae_p (signed int, vbc, vsc);
+    VCMPAEB_P  VCMPAEB_P_BS
+  signed int __builtin_vec_vcmpae_p (signed int, vsc, vbc);
+    VCMPAEB_P  VCMPAEB_P_SB
+  signed int __builtin_vec_vcmpae_p (signed int, vbs, vus);
+    VCMPAEH_P  VCMPAEH_P_BU
+  signed int __builtin_vec_vcmpae_p (signed int, vus, vbs);
+    VCMPAEH_P  VCMPAEH_P_UB
+  signed int __builtin_vec_vcmpae_p (signed int, vbs, vss);
+    VCMPAEH_P  VCMPAEH_P_BS
+  signed int __builtin_vec_vcmpae_p (signed int, vss, vbs);
+    VCMPAEH_P  VCMPAEH_P_SB
+  signed int __builtin_vec_vcmpae_p (signed int, vbi, vui);
+    VCMPAEW_P  VCMPAEW_P_BU
+  signed int __builtin_vec_vcmpae_p (signed int, vui, vbi);
+    VCMPAEW_P  VCMPAEW_P_UB
+  signed int __builtin_vec_vcmpae_p (signed int, vbi, vsi);
+    VCMPAEW_P  VCMPAEW_P_BS
+  signed int __builtin_vec_vcmpae_p (signed int, vsi, vbi);
+    VCMPAEW_P  VCMPAEW_P_SB
+  signed int __builtin_vec_vcmpae_p (signed int, vbll, vull);
+    VCMPAED_P  VCMPAED_P_BU
+  signed int __builtin_vec_vcmpae_p (signed int, vull, vbll);
+    VCMPAED_P  VCMPAED_P_UB
+  signed int __builtin_vec_vcmpae_p (signed int, vbll, vsll);
+    VCMPAED_P  VCMPAED_P_BS
+  signed int __builtin_vec_vcmpae_p (signed int, vsll, vbll);
+    VCMPAED_P  VCMPAED_P_SB
+
+[VEC_CMPB, vec_cmpb, __builtin_vec_cmpb]
+  vsi __builtin_vec_cmpb (vf, vf);
+    VCMPBFP
+
+[VEC_CMPEQ, vec_cmpeq, __builtin_vec_cmpeq]
+; #### XVCMPEQSP{TARGET_VSX};VCMPEQFP
+  vbc __builtin_vec_cmpeq (vsc, vsc);
+    VCMPEQUB  VCMPEQUB_VSC
+  vbc __builtin_vec_cmpeq (vuc, vuc);
+    VCMPEQUB  VCMPEQUB_VUC
+  vbc __builtin_vec_cmpeq (vbc, vbc);
+    VCMPEQUB  VCMPEQUB_VBC
+  vbs __builtin_vec_cmpeq (vss, vss);
+    VCMPEQUH  VCMPEQUH_VSS
+  vbs __builtin_vec_cmpeq (vus, vus);
+    VCMPEQUH  VCMPEQUH_VUS
+  vbs __builtin_vec_cmpeq (vbs, vbs);
+    VCMPEQUH  VCMPEQUH_VBS
+  vbi __builtin_vec_cmpeq (vsi, vsi);
+    VCMPEQUW  VCMPEQUW_VSI
+  vbi __builtin_vec_cmpeq (vui, vui);
+    VCMPEQUW  VCMPEQUW_VUI
+  vbi __builtin_vec_cmpeq (vbi, vbi);
+    VCMPEQUW  VCMPEQUW_VBI
+  vbll __builtin_vec_cmpeq (vsll, vsll);
+    VCMPEQUD  VCMPEQUD_VSLL
+  vbll __builtin_vec_cmpeq (vull, vull);
+    VCMPEQUD  VCMPEQUD_VULL
+  vbll __builtin_vec_cmpeq (vbll, vbll);
+    VCMPEQUD  VCMPEQUD_VBLL
+  vbq __builtin_vec_cmpeq (vsq, vsq);
+    VCMPEQUT  VCMPEQUT_VSQ
+  vbq __builtin_vec_cmpeq (vuq, vuq);
+    VCMPEQUT  VCMPEQUT_VUQ
+  vbi __builtin_vec_cmpeq (vf, vf);
+    VCMPEQFP
+  vbll __builtin_vec_cmpeq (vd, vd);
+    XVCMPEQDP
+
+; We skip generating a #define because of the C-versus-C++ complexity
+; in altivec.h.  Look there for the template-y details.
+; #### XVCMPEQSP_P{TARGET_VSX};VCMPEQFP_P
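+; The leading "signed int" operand is the CR6 selector that altivec.h
+; supplies itself, along the lines of
+;   #define vec_all_eq(a1, a2) __builtin_vec_vcmpeq_p (__CR6_LT, a1, a2)
+;   #define vec_any_eq(a1, a2) __builtin_vec_vcmpeq_p (__CR6_EQ_REV, a1, a2)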
+[VEC_CMPEQ_P, SKIP, __builtin_vec_vcmpeq_p]
+  signed int __builtin_vec_vcmpeq_p (signed int, vuc, vuc);
+    VCMPEQUB_P  VCMPEQUB_PU
+  signed int __builtin_vec_vcmpeq_p (signed int, vsc, vsc);
+    VCMPEQUB_P  VCMPEQUB_PS
+  signed int __builtin_vec_vcmpeq_p (signed int, vbc, vbc);
+    VCMPEQUB_P  VCMPEQUB_PB
+  signed int __builtin_vec_vcmpeq_p (signed int, vus, vus);
+    VCMPEQUH_P  VCMPEQUH_PU
+  signed int __builtin_vec_vcmpeq_p (signed int, vss, vss);
+    VCMPEQUH_P  VCMPEQUH_PS
+  signed int __builtin_vec_vcmpeq_p (signed int, vbs, vbs);
+    VCMPEQUH_P  VCMPEQUH_PB
+  signed int __builtin_vec_vcmpeq_p (signed int, vp, vp);
+    VCMPEQUH_P  VCMPEQUH_PP
+  signed int __builtin_vec_vcmpeq_p (signed int, vui, vui);
+    VCMPEQUW_P  VCMPEQUW_PU
+  signed int __builtin_vec_vcmpeq_p (signed int, vsi, vsi);
+    VCMPEQUW_P  VCMPEQUW_PS
+  signed int __builtin_vec_vcmpeq_p (signed int, vbi, vbi);
+    VCMPEQUW_P  VCMPEQUW_PB
+  signed int __builtin_vec_vcmpeq_p (signed int, vull, vull);
+    VCMPEQUD_P  VCMPEQUD_PU
+  signed int __builtin_vec_vcmpeq_p (signed int, vsll, vsll);
+    VCMPEQUD_P  VCMPEQUD_PS
+  signed int __builtin_vec_vcmpeq_p (signed int, vbll, vbll);
+    VCMPEQUD_P  VCMPEQUD_PB
+  signed int __builtin_vec_vcmpeq_p (signed int, vsq, vsq);
+    VCMPEQUT_P  VCMPEQUT_P_VSQ
+  signed int __builtin_vec_vcmpeq_p (signed int, vuq, vuq);
+    VCMPEQUT_P  VCMPEQUT_P_VUQ
+  signed int __builtin_vec_vcmpeq_p (signed int, vf, vf);
+    VCMPEQFP_P
+  signed int __builtin_vec_vcmpeq_p (signed int, vd, vd);
+    XVCMPEQDP_P
+; The following variants are deprecated.
+  signed int __builtin_vec_vcmpeq_p (signed int, vbc, vuc);
+    VCMPEQUB_P  VCMPEQUB_P_BU
+  signed int __builtin_vec_vcmpeq_p (signed int, vuc, vbc);
+    VCMPEQUB_P  VCMPEQUB_P_UB
+  signed int __builtin_vec_vcmpeq_p (signed int, vbc, vsc);
+    VCMPEQUB_P  VCMPEQUB_P_BS
+  signed int __builtin_vec_vcmpeq_p (signed int, vsc, vbc);
+    VCMPEQUB_P  VCMPEQUB_P_SB
+  signed int __builtin_vec_vcmpeq_p (signed int, vbs, vus);
+    VCMPEQUH_P  VCMPEQUH_P_BU
+  signed int __builtin_vec_vcmpeq_p (signed int, vus, vbs);
+    VCMPEQUH_P  VCMPEQUH_P_UB
+  signed int __builtin_vec_vcmpeq_p (signed int, vbs, vss);
+    VCMPEQUH_P  VCMPEQUH_P_BS
+  signed int __builtin_vec_vcmpeq_p (signed int, vss, vbs);
+    VCMPEQUH_P  VCMPEQUH_P_SB
+  signed int __builtin_vec_vcmpeq_p (signed int, vbi, vui);
+    VCMPEQUW_P  VCMPEQUW_P_BU
+  signed int __builtin_vec_vcmpeq_p (signed int, vui, vbi);
+    VCMPEQUW_P  VCMPEQUW_P_UB
+  signed int __builtin_vec_vcmpeq_p (signed int, vbi, vsi);
+    VCMPEQUW_P  VCMPEQUW_P_BS
+  signed int __builtin_vec_vcmpeq_p (signed int, vsi, vbi);
+    VCMPEQUW_P  VCMPEQUW_P_SB
+  signed int __builtin_vec_vcmpeq_p (signed int, vbll, vull);
+    VCMPEQUD_P  VCMPEQUD_P_BU
+  signed int __builtin_vec_vcmpeq_p (signed int, vull, vbll);
+    VCMPEQUD_P  VCMPEQUD_P_UB
+  signed int __builtin_vec_vcmpeq_p (signed int, vbll, vsll);
+    VCMPEQUD_P  VCMPEQUD_P_BS
+  signed int __builtin_vec_vcmpeq_p (signed int, vsll, vbll);
+    VCMPEQUD_P  VCMPEQUD_P_SB
+
+[VEC_CMPEQB, SKIP, __builtin_byte_in_set]
+  signed int __builtin_byte_in_set (unsigned int, unsigned long long);
+    CMPEQB
+
+; #### XVCMPGESP{TARGET_VSX};VCMPGEFP
+[VEC_CMPGE, vec_cmpge, __builtin_vec_cmpge]
+  vbc __builtin_vec_cmpge (vsc, vsc);
+    CMPGE_16QI  CMPGE_16QI_VSC
+  vbc __builtin_vec_cmpge (vuc, vuc);
+    CMPGE_U16QI  CMPGE_16QI_VUC
+  vbs __builtin_vec_cmpge (vss, vss);
+    CMPGE_8HI  CMPGE_8HI_VSS
+  vbs __builtin_vec_cmpge (vus, vus);
+    CMPGE_U8HI  CMPGE_8HI_VUS
+  vbi __builtin_vec_cmpge (vsi, vsi);
+    CMPGE_4SI  CMPGE_4SI_VSI
+  vbi __builtin_vec_cmpge (vui, vui);
+    CMPGE_U4SI  CMPGE_4SI_VUI
+  vbll __builtin_vec_cmpge (vsll, vsll);
+    CMPGE_2DI  CMPGE_2DI_VSLL
+  vbll __builtin_vec_cmpge (vull, vull);
+    CMPGE_U2DI  CMPGE_2DI_VULL
+  vbq __builtin_vec_cmpge (vsq, vsq);
+    CMPGE_1TI
+  vbq __builtin_vec_cmpge (vuq, vuq);
+    CMPGE_U1TI
+  vbi __builtin_vec_cmpge (vf, vf);
+    VCMPGEFP
+  vbll __builtin_vec_cmpge (vd, vd);
+    XVCMPGEDP
+
+; We skip generating a #define because of the C-versus-C++ complexity
+; in altivec.h.  Look there for the template-y details.
+; See altivec_build_resolved_builtin for how we deal with VEC_CMPGE_P.
+; It's quite strange and horrible!
+; #### XVCMPGESP_P{TARGET_VSX};VCMPGEFP_P
+[VEC_CMPGE_P, SKIP, __builtin_vec_vcmpge_p]
+  signed int __builtin_vec_vcmpge_p (signed int, vuc, vuc);
+    VCMPGTUB_P  VCMPGTUB_PR
+  signed int __builtin_vec_vcmpge_p (signed int, vsc, vsc);
+    VCMPGTSB_P  VCMPGTSB_PR
+  signed int __builtin_vec_vcmpge_p (signed int, vus, vus);
+    VCMPGTUH_P  VCMPGTUH_PR
+  signed int __builtin_vec_vcmpge_p (signed int, vss, vss);
+    VCMPGTSH_P  VCMPGTSH_PR
+  signed int __builtin_vec_vcmpge_p (signed int, vui, vui);
+    VCMPGTUW_P  VCMPGTUW_PR
+  signed int __builtin_vec_vcmpge_p (signed int, vsi, vsi);
+    VCMPGTSW_P  VCMPGTSW_PR
+  signed int __builtin_vec_vcmpge_p (signed int, vull, vull);
+    VCMPGTUD_P  VCMPGTUD_PR
+  signed int __builtin_vec_vcmpge_p (signed int, vsll, vsll);
+    VCMPGTSD_P  VCMPGTSD_PR
+  signed int __builtin_vec_vcmpge_p (signed int, vuq, vuq);
+    VCMPGTUT_P  VCMPGTUT_PR
+  signed int __builtin_vec_vcmpge_p (signed int, vsq, vsq);
+    VCMPGTST_P  VCMPGTST_PR
+  signed int __builtin_vec_vcmpge_p (signed int, vf, vf);
+    VCMPGEFP_P
+  signed int __builtin_vec_vcmpge_p (signed int, vd, vd);
+    XVCMPGEDP_P
+; The following variants are deprecated.
+  signed int __builtin_vec_vcmpge_p (signed int, vbc, vuc);
+    VCMPGTUB_P  VCMPGTUB_PR_BU
+  signed int __builtin_vec_vcmpge_p (signed int, vuc, vbc);
+    VCMPGTUB_P  VCMPGTUB_PR_UB
+  signed int __builtin_vec_vcmpge_p (signed int, vbc, vsc);
+    VCMPGTSB_P  VCMPGTSB_PR_BS
+  signed int __builtin_vec_vcmpge_p (signed int, vsc, vbc);
+    VCMPGTSB_P  VCMPGTSB_PR_SB
+  signed int __builtin_vec_vcmpge_p (signed int, vbs, vus);
+    VCMPGTUH_P  VCMPGTUH_PR_BU
+  signed int __builtin_vec_vcmpge_p (signed int, vus, vbs);
+    VCMPGTUH_P  VCMPGTUH_PR_UB
+  signed int __builtin_vec_vcmpge_p (signed int, vbs, vss);
+    VCMPGTSH_P  VCMPGTSH_PR_BS
+  signed int __builtin_vec_vcmpge_p (signed int, vss, vbs);
+    VCMPGTSH_P  VCMPGTSH_PR_SB
+  signed int __builtin_vec_vcmpge_p (signed int, vbi, vui);
+    VCMPGTUW_P  VCMPGTUW_PR_BU
+  signed int __builtin_vec_vcmpge_p (signed int, vui, vbi);
+    VCMPGTUW_P  VCMPGTUW_PR_UB
+  signed int __builtin_vec_vcmpge_p (signed int, vbi, vsi);
+    VCMPGTSW_P  VCMPGTSW_PR_BS
+  signed int __builtin_vec_vcmpge_p (signed int, vsi, vbi);
+    VCMPGTSW_P  VCMPGTSW_PR_SB
+  signed int __builtin_vec_vcmpge_p (signed int, vbll, vull);
+    VCMPGTUD_P  VCMPGTUD_PR_BU
+  signed int __builtin_vec_vcmpge_p (signed int, vull, vbll);
+    VCMPGTUD_P  VCMPGTUD_PR_UB
+  signed int __builtin_vec_vcmpge_p (signed int, vbll, vsll);
+    VCMPGTSD_P  VCMPGTSD_PR_BS
+  signed int __builtin_vec_vcmpge_p (signed int, vsll, vbll);
+    VCMPGTSD_P  VCMPGTSD_PR_SB
+
+; #### XVCMPGTSP{TARGET_VSX};VCMPGTFP
+[VEC_CMPGT, vec_cmpgt, __builtin_vec_cmpgt]
+  vbc __builtin_vec_cmpgt (vsc, vsc);
+    VCMPGTSB
+  vbc __builtin_vec_cmpgt (vuc, vuc);
+    VCMPGTUB
+  vbs __builtin_vec_cmpgt (vss, vss);
+    VCMPGTSH
+  vbs __builtin_vec_cmpgt (vus, vus);
+    VCMPGTUH
+  vbi __builtin_vec_cmpgt (vsi, vsi);
+    VCMPGTSW
+  vbi __builtin_vec_cmpgt (vui, vui);
+    VCMPGTUW
+  vbll __builtin_vec_cmpgt (vsll, vsll);
+    VCMPGTSD
+  vbll __builtin_vec_cmpgt (vull, vull);
+    VCMPGTUD
+  vbq __builtin_vec_cmpgt (vsq, vsq);
+    VCMPGTST
+  vbq __builtin_vec_cmpgt (vuq, vuq);
+    VCMPGTUT
+  vbi __builtin_vec_cmpgt (vf, vf);
+    VCMPGTFP
+  vbll __builtin_vec_cmpgt (vd, vd);
+    XVCMPGTDP
+
+; We skip generating a #define because of the C-versus-C++ complexity
+; in altivec.h.  Look there for the template-y details.
+; #### XVCMPGTSP_P{TARGET_VSX};VCMPGTFP_P
+[VEC_CMPGT_P, SKIP, __builtin_vec_vcmpgt_p]
+  signed int __builtin_vec_vcmpgt_p (signed int, vuc, vuc);
+    VCMPGTUB_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vsc, vsc);
+    VCMPGTSB_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vus, vus);
+    VCMPGTUH_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vss, vss);
+    VCMPGTSH_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vui, vui);
+    VCMPGTUW_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vsi, vsi);
+    VCMPGTSW_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vull, vull);
+    VCMPGTUD_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vsll, vsll);
+    VCMPGTSD_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vuq, vuq);
+    VCMPGTUT_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vsq, vsq);
+    VCMPGTST_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vf, vf);
+    VCMPGTFP_P
+  signed int __builtin_vec_vcmpgt_p (signed int, vd, vd);
+    XVCMPGTDP_P
+; The following variants are deprecated.
+  signed int __builtin_vec_vcmpgt_p (signed int, vbc, vuc);
+    VCMPGTUB_P  VCMPGTUB_P_BU
+  signed int __builtin_vec_vcmpgt_p (signed int, vuc, vbc);
+    VCMPGTUB_P  VCMPGTUB_P_UB
+  signed int __builtin_vec_vcmpgt_p (signed int, vbc, vsc);
+    VCMPGTSB_P  VCMPGTSB_P_BS
+  signed int __builtin_vec_vcmpgt_p (signed int, vsc, vbc);
+    VCMPGTSB_P  VCMPGTSB_P_SB
+  signed int __builtin_vec_vcmpgt_p (signed int, vbs, vus);
+    VCMPGTUH_P  VCMPGTUH_P_BU
+  signed int __builtin_vec_vcmpgt_p (signed int, vus, vbs);
+    VCMPGTUH_P  VCMPGTUH_P_UB
+  signed int __builtin_vec_vcmpgt_p (signed int, vbs, vss);
+    VCMPGTSH_P  VCMPGTSH_P_BS
+  signed int __builtin_vec_vcmpgt_p (signed int, vss, vbs);
+    VCMPGTSH_P  VCMPGTSH_P_SB
+  signed int __builtin_vec_vcmpgt_p (signed int, vbi, vui);
+    VCMPGTUW_P  VCMPGTUW_P_BU
+  signed int __builtin_vec_vcmpgt_p (signed int, vui, vbi);
+    VCMPGTUW_P  VCMPGTUW_P_UB
+  signed int __builtin_vec_vcmpgt_p (signed int, vbi, vsi);
+    VCMPGTSW_P  VCMPGTSW_P_BS
+  signed int __builtin_vec_vcmpgt_p (signed int, vsi, vbi);
+    VCMPGTSW_P  VCMPGTSW_P_SB
+  signed int __builtin_vec_vcmpgt_p (signed int, vbll, vull);
+    VCMPGTUD_P  VCMPGTUD_P_BU
+  signed int __builtin_vec_vcmpgt_p (signed int, vull, vbll);
+    VCMPGTUD_P  VCMPGTUD_P_UB
+  signed int __builtin_vec_vcmpgt_p (signed int, vbll, vsll);
+    VCMPGTSD_P  VCMPGTSD_P_BS
+  signed int __builtin_vec_vcmpgt_p (signed int, vsll, vbll);
+    VCMPGTSD_P  VCMPGTSD_P_SB
+
+; Note that there is no entry for VEC_CMPLE.  VEC_CMPLE is implemented
+; using VEC_CMPGE with reversed arguments in altivec.h.
+
+; Note that there is no entry for VEC_CMPLT.  VEC_CMPLT is implemented
+; using VEC_CMPGT with reversed arguments in altivec.h.
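+; That is, roughly:
+;   #define vec_cmple(a1, a2) __builtin_vec_cmpge ((a2), (a1))
+;   #define vec_cmplt(a1, a2) __builtin_vec_cmpgt ((a2), (a1))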
+
+[VEC_CMPNE, vec_cmpne, __builtin_vec_cmpne]
+  vbc __builtin_vec_cmpne (vbc, vbc);
+    VCMPNEB  VCMPNEB_VBC
+  vbc __builtin_vec_cmpne (vsc, vsc);
+    VCMPNEB  VCMPNEB_VSC
+  vbc __builtin_vec_cmpne (vuc, vuc);
+    VCMPNEB  VCMPNEB_VUC
+  vbs __builtin_vec_cmpne (vbs, vbs);
+    VCMPNEH  VCMPNEH_VBS
+  vbs __builtin_vec_cmpne (vss, vss);
+    VCMPNEH  VCMPNEH_VSS
+  vbs __builtin_vec_cmpne (vus, vus);
+    VCMPNEH  VCMPNEH_VUS
+  vbi __builtin_vec_cmpne (vbi, vbi);
+    VCMPNEW  VCMPNEW_VBI
+  vbi __builtin_vec_cmpne (vsi, vsi);
+    VCMPNEW  VCMPNEW_VSI
+  vbi __builtin_vec_cmpne (vui, vui);
+    VCMPNEW  VCMPNEW_VUI
+  vbq __builtin_vec_cmpne (vsq, vsq);
+    VCMPNET  VCMPNET_VSQ
+  vbq __builtin_vec_cmpne (vuq, vuq);
+    VCMPNET  VCMPNET_VUQ
+
+; We skip generating a #define because of the C-versus-C++ complexity
+; in altivec.h.  Look there for the template-y details.
+[VEC_CMPNE_P, SKIP, __builtin_vec_vcmpne_p]
+  signed int __builtin_vec_vcmpne_p (vsc, vsc);
+    VCMPNEB_P  VCMPNEB_VSC_P
+  signed int __builtin_vec_vcmpne_p (vuc, vuc);
+    VCMPNEB_P  VCMPNEB_VUC_P
+  signed int __builtin_vec_vcmpne_p (vbc, vbc);
+    VCMPNEB_P  VCMPNEB_VBC_P
+  signed int __builtin_vec_vcmpne_p (vss, vss);
+    VCMPNEH_P  VCMPNEH_VSS_P
+  signed int __builtin_vec_vcmpne_p (vus, vus);
+    VCMPNEH_P  VCMPNEH_VUS_P
+  signed int __builtin_vec_vcmpne_p (vbs, vbs);
+    VCMPNEH_P  VCMPNEH_VBS_P
+  signed int __builtin_vec_vcmpne_p (vp, vp);
+    VCMPNEH_P  VCMPNEH_VP_P
+  signed int __builtin_vec_vcmpne_p (vsi, vsi);
+    VCMPNEW_P  VCMPNEW_VSI_P
+  signed int __builtin_vec_vcmpne_p (vui, vui);
+    VCMPNEW_P  VCMPNEW_VUI_P
+  signed int __builtin_vec_vcmpne_p (vbi, vbi);
+    VCMPNEW_P  VCMPNEW_VBI_P
+  signed int __builtin_vec_vcmpne_p (vsll, vsll);
+    VCMPNED_P  VCMPNED_VSLL_P
+  signed int __builtin_vec_vcmpne_p (vull, vull);
+    VCMPNED_P  VCMPNED_VULL_P
+  signed int __builtin_vec_vcmpne_p (vbll, vbll);
+    VCMPNED_P  VCMPNED_VBLL_P
+  signed int __builtin_vec_vcmpne_p (vsq, vsq);
+    VCMPNET_P  VCMPNET_VSQ_P
+  signed int __builtin_vec_vcmpne_p (vuq, vuq);
+    VCMPNET_P  VCMPNET_VUQ_P
+  signed int __builtin_vec_vcmpne_p (vf, vf);
+    VCMPNEFP_P
+  signed int __builtin_vec_vcmpne_p (vd, vd);
+    VCMPNEDP_P
+; The following variants are deprecated.
+  signed int __builtin_vec_vcmpne_p (signed int, vbc, vuc);
+    VCMPNEB_P  VCMPNEB_P_BU
+  signed int __builtin_vec_vcmpne_p (signed int, vuc, vbc);
+    VCMPNEB_P  VCMPNEB_P_UB
+  signed int __builtin_vec_vcmpne_p (signed int, vbc, vsc);
+    VCMPNEB_P  VCMPNEB_P_BS
+  signed int __builtin_vec_vcmpne_p (signed int, vsc, vbc);
+    VCMPNEB_P  VCMPNEB_P_SB
+  signed int __builtin_vec_vcmpne_p (signed int, vbs, vus);
+    VCMPNEH_P  VCMPNEH_P_BU
+  signed int __builtin_vec_vcmpne_p (signed int, vus, vbs);
+    VCMPNEH_P  VCMPNEH_P_UB
+  signed int __builtin_vec_vcmpne_p (signed int, vbs, vss);
+    VCMPNEH_P  VCMPNEH_P_BS
+  signed int __builtin_vec_vcmpne_p (signed int, vss, vbs);
+    VCMPNEH_P  VCMPNEH_P_SB
+  signed int __builtin_vec_vcmpne_p (signed int, vbi, vui);
+    VCMPNEW_P  VCMPNEW_P_BU
+  signed int __builtin_vec_vcmpne_p (signed int, vui, vbi);
+    VCMPNEW_P  VCMPNEW_P_UB
+  signed int __builtin_vec_vcmpne_p (signed int, vbi, vsi);
+    VCMPNEW_P  VCMPNEW_P_BS
+  signed int __builtin_vec_vcmpne_p (signed int, vsi, vbi);
+    VCMPNEW_P  VCMPNEW_P_SB
+  signed int __builtin_vec_vcmpne_p (signed int, vbll, vull);
+    VCMPNED_P  VCMPNED_P_BU
+  signed int __builtin_vec_vcmpne_p (signed int, vull, vbll);
+    VCMPNED_P  VCMPNED_P_UB
+  signed int __builtin_vec_vcmpne_p (signed int, vbll, vsll);
+    VCMPNED_P  VCMPNED_P_BS
+  signed int __builtin_vec_vcmpne_p (signed int, vsll, vbll);
+    VCMPNED_P  VCMPNED_P_SB
+
+[VEC_CMPNEZ, vec_cmpnez, __builtin_vec_vcmpnez, _ARCH_PWR9]
+  vbc __builtin_vec_vcmpnez (vsc, vsc);
+    CMPNEZB  CMPNEZB_S
+  vbc __builtin_vec_vcmpnez (vuc, vuc);
+    CMPNEZB  CMPNEZB_U
+  vbs __builtin_vec_vcmpnez (vss, vss);
+    CMPNEZH  CMPNEZH_S
+  vbs __builtin_vec_vcmpnez (vus, vus);
+    CMPNEZH  CMPNEZH_U
+  vbi __builtin_vec_vcmpnez (vsi, vsi);
+    CMPNEZW  CMPNEZW_S
+  vbi __builtin_vec_vcmpnez (vui, vui);
+    CMPNEZW  CMPNEZW_U
+
+; We skip generating a #define because of the C-versus-C++ complexity
+; in altivec.h.  Look there for the template-y details.
+[VEC_CMPNEZ_P, SKIP, __builtin_vec_vcmpnez_p]
+  signed int __builtin_vec_vcmpnez_p (signed int, vsc, vsc);
+    VCMPNEZB_P  VCMPNEZB_VSC_P
+  signed int __builtin_vec_vcmpnez_p (signed int, vuc, vuc);
+    VCMPNEZB_P  VCMPNEZB_VUC_P
+  signed int __builtin_vec_vcmpnez_p (signed int, vss, vss);
+    VCMPNEZH_P  VCMPNEZH_VSS_P
+  signed int __builtin_vec_vcmpnez_p (signed int, vus, vus);
+    VCMPNEZH_P  VCMPNEZH_VUS_P
+  signed int __builtin_vec_vcmpnez_p (signed int, vsi, vsi);
+    VCMPNEZW_P  VCMPNEZW_VSI_P
+  signed int __builtin_vec_vcmpnez_p (signed int, vui, vui);
+    VCMPNEZW_P  VCMPNEZW_VUI_P
+
+[VEC_CMPRB, SKIP, __builtin_byte_in_range]
+  signed int __builtin_byte_in_range (unsigned int, unsigned int);
+    CMPRB
+
+[VEC_CMPRB2, SKIP, __builtin_byte_in_either_range]
+  signed int __builtin_byte_in_either_range (unsigned int, unsigned int);
+    CMPRB2
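+; Roughly: the byte under test arrives in the low byte of the first
+; operand, and the second operand packs one inclusive range (two for
+; __builtin_byte_in_either_range) as hi/lo byte pairs; the result is
+; nonzero when the byte falls within a range.  (Sketch only; see the
+; ISA description of cmprb for the exact bit placement.)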
+
+[VEC_CNTLZ, vec_cntlz, __builtin_vec_vclz, _ARCH_PWR8]
+  vsc __builtin_vec_vclz (vsc);
+    VCLZB  VCLZB_S
+  vuc __builtin_vec_vclz (vuc);
+    VCLZB  VCLZB_U
+  vss __builtin_vec_vclz (vss);
+    VCLZH  VCLZH_S
+  vus __builtin_vec_vclz (vus);
+    VCLZH  VCLZH_U
+  vsi __builtin_vec_vclz (vsi);
+    VCLZW  VCLZW_S
+  vui __builtin_vec_vclz (vui);
+    VCLZW  VCLZW_U
+  vsll __builtin_vec_vclz (vsll);
+    VCLZD  VCLZD_S
+  vull __builtin_vec_vclz (vull);
+    VCLZD  VCLZD_U
+
+[VEC_CNTLZM, vec_cntlzm, __builtin_vec_vclzdm, _ARCH_PWR10]
+  vull __builtin_vec_vclzdm (vull, vull);
+    VCLZDM
+
+[VEC_CNTTZM, vec_cnttzm, __builtin_vec_vctzdm, _ARCH_PWR10]
+  vull __builtin_vec_vctzdm (vull, vull);
+    VCTZDM
+
+[VEC_CNTLZ_LSBB, vec_cntlz_lsbb, __builtin_vec_vclzlsbb, _ARCH_PWR9]
+  signed int __builtin_vec_vclzlsbb (vsc);
+    VCLZLSBB_V16QI  VCLZLSBB_VSC
+  signed int __builtin_vec_vclzlsbb (vuc);
+    VCLZLSBB_V16QI  VCLZLSBB_VUC
+  signed int __builtin_vec_vclzlsbb (vss);
+    VCLZLSBB_V8HI  VCLZLSBB_VSS
+  signed int __builtin_vec_vclzlsbb (vus);
+    VCLZLSBB_V8HI  VCLZLSBB_VUS
+  signed int __builtin_vec_vclzlsbb (vsi);
+    VCLZLSBB_V4SI  VCLZLSBB_VSI
+  signed int __builtin_vec_vclzlsbb (vui);
+    VCLZLSBB_V4SI  VCLZLSBB_VUI
+
+[VEC_CNTM, vec_cntm, __builtin_vec_cntm, _ARCH_PWR10]
+  unsigned long long __builtin_vec_cntm (vuc, const int);
+    VCNTMBB
+  unsigned long long __builtin_vec_cntm (vus, const int);
+    VCNTMBH
+  unsigned long long __builtin_vec_cntm (vui, const int);
+    VCNTMBW
+  unsigned long long __builtin_vec_cntm (vull, const int);
+    VCNTMBD
+
+[VEC_CNTTZ, vec_cnttz, __builtin_vec_vctz, _ARCH_PWR9]
+  vsc __builtin_vec_vctz (vsc);
+    VCTZB  VCTZB_S
+  vuc __builtin_vec_vctz (vuc);
+    VCTZB  VCTZB_U
+  vss __builtin_vec_vctz (vss);
+    VCTZH  VCTZH_S
+  vus __builtin_vec_vctz (vus);
+    VCTZH  VCTZH_U
+  vsi __builtin_vec_vctz (vsi);
+    VCTZW  VCTZW_S
+  vui __builtin_vec_vctz (vui);
+    VCTZW  VCTZW_U
+  vsll __builtin_vec_vctz (vsll);
+    VCTZD  VCTZD_S
+  vull __builtin_vec_vctz (vull);
+    VCTZD  VCTZD_U
+
+[VEC_CNTTZ_LSBB, vec_cnttz_lsbb, __builtin_vec_vctzlsbb, _ARCH_PWR9]
+  signed int __builtin_vec_vctzlsbb (vsc);
+    VCTZLSBB_V16QI  VCTZLSBB_VSC
+  signed int __builtin_vec_vctzlsbb (vuc);
+    VCTZLSBB_V16QI  VCTZLSBB_VUC
+  signed int __builtin_vec_vctzlsbb (vss);
+    VCTZLSBB_V8HI  VCTZLSBB_VSS
+  signed int __builtin_vec_vctzlsbb (vus);
+    VCTZLSBB_V8HI  VCTZLSBB_VUS
+  signed int __builtin_vec_vctzlsbb (vsi);
+    VCTZLSBB_V4SI  VCTZLSBB_VSI
+  signed int __builtin_vec_vctzlsbb (vui);
+    VCTZLSBB_V4SI  VCTZLSBB_VUI
+
+[VEC_CONVERT_4F32_8I16, SKIP, __builtin_vec_convert_4f32_8i16]
+  vus __builtin_vec_convert_4f32_8i16 (vf, vf);
+    CONVERT_4F32_8I16
+
+[VEC_CONVERT_4F32_8F16, vec_pack_to_short_fp32, __builtin_vec_convert_4f32_8f16, _ARCH_PWR9]
+  vus __builtin_vec_convert_4f32_8f16 (vf, vf);
+    CONVERT_4F32_8F16
+
+[VEC_COPYSIGN, vec_cpsgn, __builtin_vec_copysign]
+  vf __builtin_vec_copysign (vf, vf);
+    CPSGNSP
+  vd __builtin_vec_copysign (vd, vd);
+    CPSGNDP
+
+[VEC_CTF, vec_ctf, __builtin_vec_ctf]
+  vf __builtin_vec_ctf (vsi, const int);
+    VCFSX
+  vf __builtin_vec_ctf (vui, const int);
+    VCFUX
+  vd __builtin_vec_ctf (vsll, const int);
+    XVCVSXDDP_SCALE
+  vd __builtin_vec_ctf (vull, const int);
+    XVCVUXDDP_SCALE
+
+[VEC_CTS, vec_cts, __builtin_vec_cts]
+  vsi __builtin_vec_cts (vf, const int);
+    VCTSXS
+  vsll __builtin_vec_cts (vd, const int);
+    XVCVDPSXDS_SCALE
+
+[VEC_CTU, vec_ctu, __builtin_vec_ctu]
+  vui __builtin_vec_ctu (vf, const int);
+    VCTUXS
+  vull __builtin_vec_ctu (vd, const int);
+    XVCVDPUXDS_SCALE
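+; In the vec_ctf/vec_cts/vec_ctu stanzas the const int operand is a
+; power-of-two scale factor; roughly:
+;   vf x = vec_ctf (si, 3);   /* x[i] = (float) si[i] / 8.0f        */
+;   vsi y = vec_cts (x, 3);   /* y[i] = (int) (x[i] * 8.0f), sat.   */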
+
+[VEC_DIV, vec_div, __builtin_vec_div, __VSX__]
+  vsi __builtin_vec_div (vsi, vsi);
+    VDIVSW
+  vui __builtin_vec_div (vui, vui);
+    VDIVUW
+  vsll __builtin_vec_div (vsll, vsll);
+    DIV_V2DI
+  vull __builtin_vec_div (vull, vull);
+    UDIV_V2DI
+  vsq __builtin_vec_div (vsq, vsq);
+    DIV_V1TI
+  vuq __builtin_vec_div (vuq, vuq);
+    UDIV_V1TI
+  vf __builtin_vec_div (vf, vf);
+    XVDIVSP
+  vd __builtin_vec_div (vd, vd);
+    XVDIVDP
+
+[VEC_DIVE, vec_dive, __builtin_vec_dive, _ARCH_PWR10]
+  vsi __builtin_vec_dive (vsi, vsi);
+    VDIVESW
+  vui __builtin_vec_dive (vui, vui);
+    VDIVEUW
+  vsll __builtin_vec_dive (vsll, vsll);
+    VDIVESD
+  vull __builtin_vec_dive (vull, vull);
+    VDIVEUD
+  vsq __builtin_vec_dive (vsq, vsq);
+    DIVES_V1TI
+  vuq __builtin_vec_dive (vuq, vuq);
+    DIVEU_V1TI
+
+[VEC_DOUBLE, vec_double, __builtin_vec_double]
+  vd __builtin_vec_double (vsll);
+    XVCVSXDDP
+  vd __builtin_vec_double (vull);
+    XVCVUXDDP
+
+[VEC_DOUBLEE, vec_doublee, __builtin_vec_doublee]
+  vd __builtin_vec_doublee (vsi);
+    DOUBLEE_V4SI
+  vd __builtin_vec_doublee (vui);
+    UNS_DOUBLEE_V4SI
+  vd __builtin_vec_doublee (vf);
+    DOUBLEE_V4SF
+
+[VEC_DOUBLEH, vec_doubleh, __builtin_vec_doubleh]
+  vd __builtin_vec_doubleh (vsi);
+    DOUBLEH_V4SI
+  vd __builtin_vec_doubleh (vui);
+    UNS_DOUBLEH_V4SI
+  vd __builtin_vec_doubleh (vf);
+    DOUBLEH_V4SF
+
+[VEC_DOUBLEL, vec_doublel, __builtin_vec_doublel]
+  vd __builtin_vec_doublel (vsi);
+    DOUBLEL_V4SI
+  vd __builtin_vec_doublel (vui);
+    UNS_DOUBLEL_V4SI
+  vd __builtin_vec_doublel (vf);
+    DOUBLEL_V4SF
+
+[VEC_DOUBLEO, vec_doubleo, __builtin_vec_doubleo]
+  vd __builtin_vec_doubleo (vsi);
+    DOUBLEO_V4SI
+  vd __builtin_vec_doubleo (vui);
+    UNS_DOUBLEO_V4SI
+  vd __builtin_vec_doubleo (vf);
+    DOUBLEO_V4SF
+
+[VEC_DST, vec_dst, __builtin_vec_dst]
+  void __builtin_vec_dst (unsigned char *, const int, const int);
+    DST  DST_UC
+  void __builtin_vec_dst (signed char *, const int, const int);
+    DST  DST_SC
+  void __builtin_vec_dst (unsigned short *, const int, const int);
+    DST  DST_US
+  void __builtin_vec_dst (signed short *, const int, const int);
+    DST  DST_SS
+  void __builtin_vec_dst (unsigned int *, const int, const int);
+    DST  DST_UI
+  void __builtin_vec_dst (signed int *, const int, const int);
+    DST  DST_SI
+  void __builtin_vec_dst (unsigned long *, const int, const int);
+    DST  DST_UL
+  void __builtin_vec_dst (signed long *, const int, const int);
+    DST  DST_SL
+  void __builtin_vec_dst (unsigned long long *, const int, const int);
+    DST  DST_ULL
+  void __builtin_vec_dst (signed long long *, const int, const int);
+    DST  DST_SLL
+  void __builtin_vec_dst (float *, const int, const int);
+    DST  DST_F
+  void __builtin_vec_dst (vuc *, const int, const int);
+    DST  DST_VUC
+  void __builtin_vec_dst (vsc *, const int, const int);
+    DST  DST_VSC
+  void __builtin_vec_dst (vbc *, const int, const int);
+    DST  DST_VBC
+  void __builtin_vec_dst (vus *, const int, const int);
+    DST  DST_VUS
+  void __builtin_vec_dst (vss *, const int, const int);
+    DST  DST_VSS
+  void __builtin_vec_dst (vbs *, const int, const int);
+    DST  DST_VBS
+  void __builtin_vec_dst (vp *, const int, const int);
+    DST  DST_VP
+  void __builtin_vec_dst (vui *, const int, const int);
+    DST  DST_VUI
+  void __builtin_vec_dst (vsi *, const int, const int);
+    DST  DST_VSI
+  void __builtin_vec_dst (vbi *, const int, const int);
+    DST  DST_VBI
+  void __builtin_vec_dst (vf *, const int, const int);
+    DST  DST_VF
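+; Here and in the dstst/dststt/dstt stanzas below, the first const int
+; is the data-stream control word (block size, count and stride packed
+; together) and the second selects one of the four stream tags, e.g.
+;   vec_dst (p, ctl, 0);   /* begin touching stream 0 at address p */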
+
+[VEC_DSTST, vec_dstst, __builtin_vec_dstst]
+  void __builtin_vec_dstst (unsigned char *, const int, const int);
+    DSTST  DSTST_UC
+  void __builtin_vec_dstst (signed char *, const int, const int);
+    DSTST  DSTST_SC
+  void __builtin_vec_dstst (unsigned short *, const int, const int);
+    DSTST  DSTST_US
+  void __builtin_vec_dstst (signed short *, const int, const int);
+    DSTST  DSTST_SS
+  void __builtin_vec_dstst (unsigned int *, const int, const int);
+    DSTST  DSTST_UI
+  void __builtin_vec_dstst (signed int *, const int, const int);
+    DSTST  DSTST_SI
+  void __builtin_vec_dstst (unsigned long *, const int, const int);
+    DSTST  DSTST_UL
+  void __builtin_vec_dstst (signed long *, const int, const int);
+    DSTST  DSTST_SL
+  void __builtin_vec_dstst (unsigned long long *, const int, const int);
+    DSTST  DSTST_ULL
+  void __builtin_vec_dstst (signed long long *, const int, const int);
+    DSTST  DSTST_SLL
+  void __builtin_vec_dstst (float *, const int, const int);
+    DSTST  DSTST_F
+  void __builtin_vec_dstst (vuc *, const int, const int);
+    DSTST  DSTST_VUC
+  void __builtin_vec_dstst (vsc *, const int, const int);
+    DSTST  DSTST_VSC
+  void __builtin_vec_dstst (vbc *, const int, const int);
+    DSTST  DSTST_VBC
+  void __builtin_vec_dstst (vus *, const int, const int);
+    DSTST  DSTST_VUS
+  void __builtin_vec_dstst (vss *, const int, const int);
+    DSTST  DSTST_VSS
+  void __builtin_vec_dstst (vbs *, const int, const int);
+    DSTST  DSTST_VBS
+  void __builtin_vec_dstst (vp *, const int, const int);
+    DSTST  DSTST_VP
+  void __builtin_vec_dstst (vui *, const int, const int);
+    DSTST  DSTST_VUI
+  void __builtin_vec_dstst (vsi *, const int, const int);
+    DSTST  DSTST_VSI
+  void __builtin_vec_dstst (vbi *, const int, const int);
+    DSTST  DSTST_VBI
+  void __builtin_vec_dstst (vf *, const int, const int);
+    DSTST  DSTST_VF
+
+[VEC_DSTSTT, vec_dststt, __builtin_vec_dststt]
+  void __builtin_vec_dststt (unsigned char *, const int, const int);
+    DSTSTT  DSTSTT_UC
+  void __builtin_vec_dststt (signed char *, const int, const int);
+    DSTSTT  DSTSTT_SC
+  void __builtin_vec_dststt (unsigned short *, const int, const int);
+    DSTSTT  DSTSTT_US
+  void __builtin_vec_dststt (signed short *, const int, const int);
+    DSTSTT  DSTSTT_SS
+  void __builtin_vec_dststt (unsigned int *, const int, const int);
+    DSTSTT  DSTSTT_UI
+  void __builtin_vec_dststt (signed int *, const int, const int);
+    DSTSTT  DSTSTT_SI
+  void __builtin_vec_dststt (unsigned long *, const int, const int);
+    DSTSTT  DSTSTT_UL
+  void __builtin_vec_dststt (signed long *, const int, const int);
+    DSTSTT  DSTSTT_SL
+  void __builtin_vec_dststt (unsigned long long *, const int, const int);
+    DSTSTT  DSTSTT_ULL
+  void __builtin_vec_dststt (signed long long *, const int, const int);
+    DSTSTT  DSTSTT_SLL
+  void __builtin_vec_dststt (float *, const int, const int);
+    DSTSTT  DSTSTT_F
+  void __builtin_vec_dststt (vuc *, const int, const int);
+    DSTSTT  DSTSTT_VUC
+  void __builtin_vec_dststt (vsc *, const int, const int);
+    DSTSTT  DSTSTT_VSC
+  void __builtin_vec_dststt (vbc *, const int, const int);
+    DSTSTT  DSTSTT_VBC
+  void __builtin_vec_dststt (vus *, const int, const int);
+    DSTSTT  DSTSTT_VUS
+  void __builtin_vec_dststt (vss *, const int, const int);
+    DSTSTT  DSTSTT_VSS
+  void __builtin_vec_dststt (vbs *, const int, const int);
+    DSTSTT  DSTSTT_VBS
+  void __builtin_vec_dststt (vp *, const int, const int);
+    DSTSTT  DSTSTT_VP
+  void __builtin_vec_dststt (vui *, const int, const int);
+    DSTSTT  DSTSTT_VUI
+  void __builtin_vec_dststt (vsi *, const int, const int);
+    DSTSTT  DSTSTT_VSI
+  void __builtin_vec_dststt (vbi *, const int, const int);
+    DSTSTT  DSTSTT_VBI
+  void __builtin_vec_dststt (vf *, const int, const int);
+    DSTSTT  DSTSTT_VF
+
+[VEC_DSTT, vec_dstt, __builtin_vec_dstt]
+  void __builtin_vec_dstt (unsigned char *, const int, const int);
+    DSTT  DSTT_UC
+  void __builtin_vec_dstt (signed char *, const int, const int);
+    DSTT  DSTT_SC
+  void __builtin_vec_dstt (unsigned short *, const int, const int);
+    DSTT  DSTT_US
+  void __builtin_vec_dstt (signed short *, const int, const int);
+    DSTT  DSTT_SS
+  void __builtin_vec_dstt (unsigned int *, const int, const int);
+    DSTT  DSTT_UI
+  void __builtin_vec_dstt (signed int *, const int, const int);
+    DSTT  DSTT_SI
+  void __builtin_vec_dstt (unsigned long *, const int, const int);
+    DSTT  DSTT_UL
+  void __builtin_vec_dstt (signed long *, const int, const int);
+    DSTT  DSTT_SL
+  void __builtin_vec_dstt (unsigned long long *, const int, const int);
+    DSTT  DSTT_ULL
+  void __builtin_vec_dstt (signed long long *, const int, const int);
+    DSTT  DSTT_SLL
+  void __builtin_vec_dstt (float *, const int, const int);
+    DSTT  DSTT_F
+  void __builtin_vec_dstt (vuc *, const int, const int);
+    DSTT  DSTT_VUC
+  void __builtin_vec_dstt (vsc *, const int, const int);
+    DSTT  DSTT_VSC
+  void __builtin_vec_dstt (vbc *, const int, const int);
+    DSTT  DSTT_VBC
+  void __builtin_vec_dstt (vus *, const int, const int);
+    DSTT  DSTT_VUS
+  void __builtin_vec_dstt (vss *, const int, const int);
+    DSTT  DSTT_VSS
+  void __builtin_vec_dstt (vbs *, const int, const int);
+    DSTT  DSTT_VBS
+  void __builtin_vec_dstt (vp *, const int, const int);
+    DSTT  DSTT_VP
+  void __builtin_vec_dstt (vui *, const int, const int);
+    DSTT  DSTT_VUI
+  void __builtin_vec_dstt (vsi *, const int, const int);
+    DSTT  DSTT_VSI
+  void __builtin_vec_dstt (vbi *, const int, const int);
+    DSTT  DSTT_VBI
+  void __builtin_vec_dstt (vf *, const int, const int);
+    DSTT  DSTT_VF
+
+[VEC_EQV, vec_eqv, __builtin_vec_eqv, _ARCH_PWR8]
+  vsc __builtin_vec_eqv (vsc, vsc);
+    EQV_V16QI
+  vuc __builtin_vec_eqv (vuc, vuc);
+    EQV_V16QI_UNS  EQV_V16QI_VUC
+  vbc __builtin_vec_eqv (vbc, vbc);
+    EQV_V16QI_UNS  EQV_V16QI_VBC
+  vss __builtin_vec_eqv (vss, vss);
+    EQV_V8HI
+  vus __builtin_vec_eqv (vus, vus);
+    EQV_V8HI_UNS  EQV_V8HI_VUS
+  vbs __builtin_vec_eqv (vbs, vbs);
+    EQV_V8HI_UNS  EQV_V8HI_VBS
+  vsi __builtin_vec_eqv (vsi, vsi);
+    EQV_V4SI
+  vui __builtin_vec_eqv (vui, vui);
+    EQV_V4SI_UNS  EQV_V4SI_VUI
+  vbi __builtin_vec_eqv (vbi, vbi);
+    EQV_V4SI_UNS  EQV_V4SI_VBI
+  vsll __builtin_vec_eqv (vsll, vsll);
+    EQV_V2DI
+  vull __builtin_vec_eqv (vull, vull);
+    EQV_V2DI_UNS  EQV_V2DI_VULL
+  vbll __builtin_vec_eqv (vbll, vbll);
+    EQV_V2DI_UNS  EQV_V2DI_VBLL
+  vf __builtin_vec_eqv (vf, vf);
+    EQV_V4SF
+  vd __builtin_vec_eqv (vd, vd);
+    EQV_V2DF
+; The following variants are deprecated.
+  vsc __builtin_vec_eqv (vbc, vsc);
+    EQV_V16QI  EQV_VBC_VSC
+  vsc __builtin_vec_eqv (vsc, vbc);
+    EQV_V16QI  EQV_VSC_VBC
+  vuc __builtin_vec_eqv (vbc, vuc);
+    EQV_V16QI_UNS  EQV_VBC_VUC
+  vuc __builtin_vec_eqv (vuc, vbc);
+    EQV_V16QI_UNS  EQV_VUC_VBC
+  vss __builtin_vec_eqv (vbs, vss);
+    EQV_V8HI  EQV_VBS_VSS
+  vss __builtin_vec_eqv (vss, vbs);
+    EQV_V8HI  EQV_VSS_VBS
+  vus __builtin_vec_eqv (vbs, vus);
+    EQV_V8HI_UNS  EQV_VBS_VUS
+  vus __builtin_vec_eqv (vus, vbs);
+    EQV_V8HI_UNS  EQV_VUS_VBS
+  vsi __builtin_vec_eqv (vbi, vsi);
+    EQV_V4SI  EQV_VBI_VSI
+  vsi __builtin_vec_eqv (vsi, vbi);
+    EQV_V4SI  EQV_VSI_VBI
+  vui __builtin_vec_eqv (vbi, vui);
+    EQV_V4SI_UNS  EQV_VBI_VUI
+  vui __builtin_vec_eqv (vui, vbi);
+    EQV_V4SI_UNS  EQV_VUI_VBI
+  vsll __builtin_vec_eqv (vbll, vsll);
+    EQV_V2DI  EQV_VBLL_VSLL
+  vsll __builtin_vec_eqv (vsll, vbll);
+    EQV_V2DI  EQV_VSLL_VBLL
+  vull __builtin_vec_eqv (vbll, vull);
+    EQV_V2DI_UNS  EQV_VBLL_VULL
+  vull __builtin_vec_eqv (vull, vbll);
+    EQV_V2DI_UNS  EQV_VULL_VBLL
+
+[VEC_EXPANDM, vec_expandm, __builtin_vec_vexpandm, _ARCH_PWR10]
+  vuc __builtin_vec_vexpandm (vuc);
+    VEXPANDMB
+  vus __builtin_vec_vexpandm (vus);
+    VEXPANDMH
+  vui __builtin_vec_vexpandm (vui);
+    VEXPANDMW
+  vull __builtin_vec_vexpandm (vull);
+    VEXPANDMD
+  vuq __builtin_vec_vexpandm (vuq);
+    VEXPANDMQ
+
+[VEC_EXPTE, vec_expte, __builtin_vec_expte]
+  vf __builtin_vec_expte (vf);
+    VEXPTEFP
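+; This is the 2**x estimate operation, i.e. roughly
+;   vf y = vec_expte (x);   /* y[i] ~= exp2f (x[i]), an estimate */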
+
+; There are no actual builtins for vec_extract.  There is special handling for
+; this in altivec_resolve_overloaded_builtin in rs6000-c.c, where the call
+; is replaced by "pointer tricks."  The single overload here causes
+; __builtin_vec_extract to be registered with the front end so this can
+; happen.
+[VEC_EXTRACT, vec_extract, __builtin_vec_extract]
+  vsi __builtin_vec_extract (vsi, signed int);
+    VSPLTW  EXTRACT_FAKERY
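+; (Conceptually the effect is as if the vector were stored to memory
+; and the selected element loaded back, e.g.
+;   int e = ((int *) &v)[n];    /* rough model of vec_extract (v, n) */
+; with the actual rewrite in altivec_resolve_overloaded_builtin.)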
+
+[VEC_EXTRACT_FP_FROM_SHORTH, vec_extract_fp32_from_shorth, __builtin_vec_vextract_fp_from_shorth, _ARCH_PWR9]
+  vf __builtin_vec_vextract_fp_from_shorth (vus);
+    VEXTRACT_FP_FROM_SHORTH
+
+[VEC_EXTRACT_FP_FROM_SHORTL, vec_extract_fp32_from_shortl, __builtin_vec_vextract_fp_from_shortl, _ARCH_PWR9]
+  vf __builtin_vec_vextract_fp_from_shortl (vus);
+    VEXTRACT_FP_FROM_SHORTL
+
+[VEC_EXTRACTH, vec_extracth, __builtin_vec_extracth, _ARCH_PWR10]
+  vull __builtin_vec_extracth (vuc, vuc, unsigned char);
+    VEXTRACTBR
+  vull __builtin_vec_extracth (vus, vus, unsigned char);
+    VEXTRACTHR
+  vull __builtin_vec_extracth (vui, vui, unsigned char);
+    VEXTRACTWR
+  vull __builtin_vec_extracth (vull, vull, unsigned char);
+    VEXTRACTDR
+
+[VEC_EXTRACTL, vec_extractl, __builtin_vec_extractl, _ARCH_PWR10]
+  vull __builtin_vec_extractl (vuc, vuc, unsigned char);
+    VEXTRACTBL
+  vull __builtin_vec_extractl (vus, vus, unsigned char);
+    VEXTRACTHL
+  vull __builtin_vec_extractl (vui, vui, unsigned char);
+    VEXTRACTWL
+  vull __builtin_vec_extractl (vull, vull, unsigned char);
+    VEXTRACTDL
+
+[VEC_EXTRACTM, vec_extractm, __builtin_vec_vextractm, _ARCH_PWR10]
+  signed int __builtin_vec_vextractm (vuc);
+    VEXTRACTMB
+  signed int __builtin_vec_vextractm (vus);
+    VEXTRACTMH
+  signed int __builtin_vec_vextractm (vui);
+    VEXTRACTMW
+  signed int __builtin_vec_vextractm (vull);
+    VEXTRACTMD
+  signed int __builtin_vec_vextractm (vuq);
+    VEXTRACTMQ
+
+[VEC_EXTRACT4B, vec_extract4b, __builtin_vec_extract4b, _ARCH_PWR9]
+  vull __builtin_vec_extract4b (vuc, const int);
+    EXTRACT4B
+
+[VEC_EXTULX, vec_xlx, __builtin_vec_vextulx, _ARCH_PWR9]
+  signed char __builtin_vec_vextulx (unsigned int, vsc);
+    VEXTUBLX  VEXTUBLX_S
+  unsigned char __builtin_vec_vextulx (unsigned int, vuc);
+    VEXTUBLX  VEXTUBLX_U
+  signed short __builtin_vec_vextulx (unsigned int, vss);
+    VEXTUHLX  VEXTUHLX_S
+  unsigned short __builtin_vec_vextulx (unsigned int, vus);
+    VEXTUHLX  VEXTUHLX_U
+  signed int __builtin_vec_vextulx (unsigned int, vsi);
+    VEXTUWLX  VEXTUWLX_S
+  unsigned int __builtin_vec_vextulx (unsigned int, vui);
+    VEXTUWLX  VEXTUWLX_U
+  float __builtin_vec_vextulx (unsigned int, vf);
+    VEXTUWLX  VEXTUWLX_F
+
+[VEC_EXTURX, vec_xrx, __builtin_vec_vexturx, _ARCH_PWR9]
+  signed char __builtin_vec_vexturx (unsigned int, vsc);
+    VEXTUBRX  VEXTUBRX_S
+  unsigned char __builtin_vec_vexturx (unsigned int, vuc);
+    VEXTUBRX  VEXTUBRX_U
+  signed short __builtin_vec_vexturx (unsigned int, vss);
+    VEXTUHRX  VEXTUHRX_S
+  unsigned short __builtin_vec_vexturx (unsigned int, vus);
+    VEXTUHRX  VEXTUHRX_U
+  signed int __builtin_vec_vexturx (unsigned int, vsi);
+    VEXTUWRX  VEXTUWRX_S
+  unsigned int __builtin_vec_vexturx (unsigned int, vui);
+    VEXTUWRX  VEXTUWRX_U
+  float __builtin_vec_vexturx (unsigned int, vf);
+    VEXTUWRX  VEXTUWRX_F
+
+[VEC_FIRSTMATCHINDEX, vec_first_match_index, __builtin_vec_first_match_index, _ARCH_PWR9]
+  unsigned int __builtin_vec_first_match_index (vsc, vsc);
+    VFIRSTMATCHINDEX_V16QI FIRSTMATCHINDEX_VSC
+  unsigned int __builtin_vec_first_match_index (vuc, vuc);
+    VFIRSTMATCHINDEX_V16QI FIRSTMATCHINDEX_VUC
+  unsigned int __builtin_vec_first_match_index (vss, vss);
+    VFIRSTMATCHINDEX_V8HI FIRSTMATCHINDEX_VSS
+  unsigned int __builtin_vec_first_match_index (vus, vus);
+    VFIRSTMATCHINDEX_V8HI FIRSTMATCHINDEX_VUS
+  unsigned int __builtin_vec_first_match_index (vsi, vsi);
+    VFIRSTMATCHINDEX_V4SI FIRSTMATCHINDEX_VSI
+  unsigned int __builtin_vec_first_match_index (vui, vui);
+    VFIRSTMATCHINDEX_V4SI FIRSTMATCHINDEX_VUI
+
+[VEC_FIRSTMATCHOREOSINDEX, vec_first_match_or_eos_index, __builtin_vec_first_match_or_eos_index, _ARCH_PWR9]
+  unsigned int __builtin_vec_first_match_or_eos_index (vsc, vsc);
+    VFIRSTMATCHOREOSINDEX_V16QI FIRSTMATCHOREOSINDEX_VSC
+  unsigned int __builtin_vec_first_match_or_eos_index (vuc, vuc);
+    VFIRSTMATCHOREOSINDEX_V16QI FIRSTMATCHOREOSINDEX_VUC
+  unsigned int __builtin_vec_first_match_or_eos_index (vss, vss);
+    VFIRSTMATCHOREOSINDEX_V8HI FIRSTMATCHOREOSINDEX_VSS
+  unsigned int __builtin_vec_first_match_or_eos_index (vus, vus);
+    VFIRSTMATCHOREOSINDEX_V8HI FIRSTMATCHOREOSINDEX_VUS
+  unsigned int __builtin_vec_first_match_or_eos_index (vsi, vsi);
+    VFIRSTMATCHOREOSINDEX_V4SI FIRSTMATCHOREOSINDEX_VSI
+  unsigned int __builtin_vec_first_match_or_eos_index (vui, vui);
+    VFIRSTMATCHOREOSINDEX_V4SI FIRSTMATCHOREOSINDEX_VUI
+
+[VEC_FIRSTMISMATCHINDEX, vec_first_mismatch_index, __builtin_vec_first_mismatch_index, _ARCH_PWR9]
+  unsigned int __builtin_vec_first_mismatch_index (vsc, vsc);
+    VFIRSTMISMATCHINDEX_V16QI FIRSTMISMATCHINDEX_VSC
+  unsigned int __builtin_vec_first_mismatch_index (vuc, vuc);
+    VFIRSTMISMATCHINDEX_V16QI FIRSTMISMATCHINDEX_VUC
+  unsigned int __builtin_vec_first_mismatch_index (vss, vss);
+    VFIRSTMISMATCHINDEX_V8HI FIRSTMISMATCHINDEX_VSS
+  unsigned int __builtin_vec_first_mismatch_index (vus, vus);
+    VFIRSTMISMATCHINDEX_V8HI FIRSTMISMATCHINDEX_VUS
+  unsigned int __builtin_vec_first_mismatch_index (vsi, vsi);
+    VFIRSTMISMATCHINDEX_V4SI FIRSTMISMATCHINDEX_VSI
+  unsigned int __builtin_vec_first_mismatch_index (vui, vui);
+    VFIRSTMISMATCHINDEX_V4SI FIRSTMISMATCHINDEX_VUI
+
+[VEC_FIRSTMISMATCHOREOSINDEX, vec_first_mismatch_or_eos_index, __builtin_vec_first_mismatch_or_eos_index, _ARCH_PWR9]
+  unsigned int __builtin_vec_first_mismatch_or_eos_index (vsc, vsc);
+    VFIRSTMISMATCHOREOSINDEX_V16QI FIRSTMISMATCHOREOSINDEX_VSC
+  unsigned int __builtin_vec_first_mismatch_or_eos_index (vuc, vuc);
+    VFIRSTMISMATCHOREOSINDEX_V16QI FIRSTMISMATCHOREOSINDEX_VUC
+  unsigned int __builtin_vec_first_mismatch_or_eos_index (vss, vss);
+    VFIRSTMISMATCHOREOSINDEX_V8HI FIRSTMISMATCHOREOSINDEX_VSS
+  unsigned int __builtin_vec_first_mismatch_or_eos_index (vus, vus);
+    VFIRSTMISMATCHOREOSINDEX_V8HI FIRSTMISMATCHOREOSINDEX_VUS
+  unsigned int __builtin_vec_first_mismatch_or_eos_index (vsi, vsi);
+    VFIRSTMISMATCHOREOSINDEX_V4SI FIRSTMISMATCHOREOSINDEX_VSI
+  unsigned int __builtin_vec_first_mismatch_or_eos_index (vui, vui);
+    VFIRSTMISMATCHOREOSINDEX_V4SI FIRSTMISMATCHOREOSINDEX_VUI
+
+[VEC_FLOAT, vec_float, __builtin_vec_float]
+  vf __builtin_vec_float (vsi);
+    XVCVSXWSP
+  vf __builtin_vec_float (vui);
+    XVCVUXWSP
+
+[VEC_FLOAT2, vec_float2, __builtin_vec_float2]
+  vf __builtin_vec_float2 (vsll, vsll);
+    FLOAT2_V2DI
+  vf __builtin_vec_float2 (vull, vull);
+    UNS_FLOAT2_V2DI
+  vf __builtin_vec_float2 (vd, vd);
+    FLOAT2_V2DF
+
+[VEC_FLOATE, vec_floate, __builtin_vec_floate]
+  vf __builtin_vec_floate (vsll);
+    FLOATE_V2DI
+  vf __builtin_vec_floate (vull);
+    UNS_FLOATE_V2DI
+  vf __builtin_vec_floate (vd);
+    FLOATE_V2DF
+
+[VEC_FLOATO, vec_floato, __builtin_vec_floato]
+  vf __builtin_vec_floato (vsll);
+    FLOATO_V2DI
+  vf __builtin_vec_floato (vull);
+    UNS_FLOATO_V2DI
+  vf __builtin_vec_floato (vd);
+    FLOATO_V2DF
+
+; #### XVRSPIM{TARGET_VSX}; VRFIM
+[VEC_FLOOR, vec_floor, __builtin_vec_floor]
+  vf __builtin_vec_floor (vf);
+    VRFIM
+  vd __builtin_vec_floor (vd);
+    XVRDPIM
+
+[VEC_GB, vec_gb, __builtin_vec_vgbbd, _ARCH_PWR8]
+  vsc __builtin_vec_vgbbd (vsc);
+    VGBBD  VGBBD_S
+  vuc __builtin_vec_vgbbd (vuc);
+    VGBBD  VGBBD_U
+
+[VEC_GENBM, vec_genbm, __builtin_vec_mtvsrbm, _ARCH_PWR10]
+  vuc __builtin_vec_mtvsrbm (unsigned long long);
+    MTVSRBM
+
+[VEC_GENHM, vec_genhm, __builtin_vec_mtvsrhm, _ARCH_PWR10]
+  vus __builtin_vec_mtvsrhm (unsigned long long);
+    MTVSRHM
+
+[VEC_GENWM, vec_genwm, __builtin_vec_mtvsrwm, _ARCH_PWR10]
+  vui __builtin_vec_mtvsrwm (unsigned long long);
+    MTVSRWM
+
+[VEC_GENDM, vec_gendm, __builtin_vec_mtvsrdm, _ARCH_PWR10]
+  vull __builtin_vec_mtvsrdm (unsigned long long);
+    MTVSRDM
+
+[VEC_GENQM, vec_genqm, __builtin_vec_mtvsrqm, _ARCH_PWR10]
+  vuq __builtin_vec_mtvsrqm (unsigned long long);
+    MTVSRQM
+
+[VEC_GENPCVM, vec_genpcvm, __builtin_vec_xxgenpcvm, _ARCH_PWR10]
+  vuc __builtin_vec_xxgenpcvm (vuc, const int);
+    XXGENPCVM_V16QI
+  vus __builtin_vec_xxgenpcvm (vus, const int);
+    XXGENPCVM_V8HI
+  vui __builtin_vec_xxgenpcvm (vui, const int);
+    XXGENPCVM_V4SI
+  vull __builtin_vec_xxgenpcvm (vull, const int);
+    XXGENPCVM_V2DI
+
+[VEC_GNB, vec_gnb, __builtin_vec_gnb, _ARCH_PWR10]
+  unsigned long long __builtin_vec_gnb (vuq, const int);
+    VGNB
+
+; There are no actual builtins for vec_insert.  There is special handling for
+; this in altivec_resolve_overloaded_builtin in rs6000-c.c, where the call
+; is replaced by "pointer tricks."  The single overload here causes
+; __builtin_vec_insert to be registered with the front end so this can
+; happen.  (A sketch of the rewrite follows this stanza.)
+[VEC_INSERT, vec_insert, __builtin_vec_insert]
+  vsi __builtin_vec_insert (vsi, vsi, signed int);
+    XXPERMDI_4SI  INSERT_FAKERY
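+; A minimal sketch of the effect, for illustration only (not consumed by
+; the build machinery); the element index is masked by the number of
+; elements, per the PVIPR, so
+;   vector signed int v = vec_insert (42, src, 5);
+; behaves roughly like
+;   ({ vector signed int tmp = src; ((int *) &tmp)[5 & 3] = 42; tmp; })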
+
+[VEC_INSERTH, vec_inserth, __builtin_vec_inserth, _ARCH_PWR10]
+  vuc __builtin_vec_inserth (unsigned char, vuc, unsigned int);
+    VINSERTGPRBR
+  vuc __builtin_vec_inserth (vuc, vuc, unsigned int);
+    VINSERTVPRBR
+  vus __builtin_vec_inserth (unsigned short, vus, unsigned int);
+    VINSERTGPRHR
+  vus __builtin_vec_inserth (vus, vus, unsigned int);
+    VINSERTVPRHR
+  vui __builtin_vec_inserth (unsigned int, vui, unsigned int);
+    VINSERTGPRWR
+  vui __builtin_vec_inserth (vui, vui, unsigned int);
+    VINSERTVPRWR
+  vull __builtin_vec_inserth (unsigned long long, vull, unsigned int);
+    VINSERTGPRDR
+
+[VEC_INSERTL, vec_insertl, __builtin_vec_insertl, _ARCH_PWR10]
+  vuc __builtin_vec_insertl (unsigned char, vuc, unsigned int);
+    VINSERTGPRBL
+  vuc __builtin_vec_insertl (vuc, vuc, unsigned int);
+    VINSERTVPRBL
+  vus __builtin_vec_insertl (unsigned short, vus, unsigned int);
+    VINSERTGPRHL
+  vus __builtin_vec_insertl (vus, vus, unsigned int);
+    VINSERTVPRHL
+  vui __builtin_vec_insertl (unsigned int, vui, unsigned int);
+    VINSERTGPRWL
+  vui __builtin_vec_insertl (vui, vui, unsigned int);
+    VINSERTVPRWL
+  vull __builtin_vec_insertl (unsigned long long, vull, unsigned int);
+    VINSERTGPRDL
+
+[VEC_INSERT4B, vec_insert4b, __builtin_vec_insert4b, _ARCH_PWR9]
+  vuc __builtin_vec_insert4b (vsi, vuc, const int);
+    INSERT4B  INSERT4B_S
+  vuc __builtin_vec_insert4b (vui, vuc, const int);
+    INSERT4B  INSERT4B_U
+
+[VEC_LD, vec_ld, __builtin_vec_ld]
+  vsc __builtin_vec_ld (signed long, const vsc *);
+    LVX_V16QI  LVX_V16QI_VSC
+  vsc __builtin_vec_ld (signed long, const signed char *);
+    LVX_V16QI  LVX_V16QI_SC
+  vuc __builtin_vec_ld (signed long, const vuc *);
+    LVX_V16QI  LVX_V16QI_VUC
+  vuc __builtin_vec_ld (signed long, const unsigned char *);
+    LVX_V16QI  LVX_V16QI_UC
+  vbc __builtin_vec_ld (signed long, const vbc *);
+    LVX_V16QI  LVX_V16QI_VBC
+  vss __builtin_vec_ld (signed long, const vss *);
+    LVX_V8HI  LVX_V8HI_VSS
+  vss __builtin_vec_ld (signed long, const signed short *);
+    LVX_V8HI  LVX_V8HI_SS
+  vus __builtin_vec_ld (signed long, const vus *);
+    LVX_V8HI  LVX_V8HI_VUS
+  vus __builtin_vec_ld (signed long, const unsigned short *);
+    LVX_V8HI  LVX_V8HI_US
+  vbs __builtin_vec_ld (signed long, const vbs *);
+    LVX_V8HI  LVX_V8HI_VBS
+  vp __builtin_vec_ld (signed long, const vp *);
+    LVX_V8HI  LVX_V8HI_VP
+  vsi __builtin_vec_ld (signed long, const vsi *);
+    LVX_V4SI  LVX_V4SI_VSI
+  vsi __builtin_vec_ld (signed long, const signed int *);
+    LVX_V4SI  LVX_V4SI_SI
+  vui __builtin_vec_ld (signed long, const vui *);
+    LVX_V4SI  LVX_V4SI_VUI
+  vui __builtin_vec_ld (signed long, const unsigned int *);
+    LVX_V4SI  LVX_V4SI_UI
+  vbi __builtin_vec_ld (signed long, const vbi *);
+    LVX_V4SI  LVX_V4SI_VBI
+  vsll __builtin_vec_ld (signed long, const vsll *);
+    LVX_V2DI  LVX_V2DI_VSLL
+  vsll __builtin_vec_ld (signed long, const signed long long *);
+    LVX_V2DI  LVX_V2DI_SLL
+  vull __builtin_vec_ld (signed long, const vull *);
+    LVX_V2DI  LVX_V2DI_VULL
+  vull __builtin_vec_ld (signed long, const unsigned long long *);
+    LVX_V2DI  LVX_V2DI_ULL
+  vbll __builtin_vec_ld (signed long, const vbll *);
+    LVX_V2DI  LVX_V2DI_VBLL
+  vsq __builtin_vec_ld (signed long, const vsq *);
+    LVX_V1TI  LVX_V1TI_VSQ
+  vuq __builtin_vec_ld (signed long, const vuq *);
+    LVX_V1TI  LVX_V1TI_VUQ
+  vsq __builtin_vec_ld (signed long, const __int128 *);
+    LVX_V1TI  LVX_V1TI_TI
+  vuq __builtin_vec_ld (signed long, const unsigned __int128 *);
+    LVX_V1TI  LVX_V1TI_UTI
+  vf __builtin_vec_ld (signed long, const vf *);
+    LVX_V4SF  LVX_V4SF_VF
+  vf __builtin_vec_ld (signed long, const float *);
+    LVX_V4SF  LVX_V4SF_F
+  vd __builtin_vec_ld (signed long, const vd *);
+    LVX_V2DF  LVX_V2DF_VD
+  vd __builtin_vec_ld (signed long, const double *);
+    LVX_V2DF  LVX_V2DF_D
+; The following variants are deprecated.
+  vsi __builtin_vec_ld (signed long, const long *);
+    LVX_V4SI  LVX_V4SI_SL
+  vui __builtin_vec_ld (signed long, const unsigned long *);
+    LVX_V4SI  LVX_V4SI_UL
+
+[VEC_LDE, vec_lde, __builtin_vec_lde]
+  vsc __builtin_vec_lde (signed long, const signed char *);
+    LVEBX  LVEBX_SC
+  vuc __builtin_vec_lde (signed long, const unsigned char *);
+    LVEBX  LVEBX_UC
+  vss __builtin_vec_lde (signed long, const signed short *);
+    LVEHX  LVEHX_SS
+  vus __builtin_vec_lde (signed long, const unsigned short *);
+    LVEHX  LVEHX_US
+  vsi __builtin_vec_lde (signed long, const signed int *);
+    LVEWX  LVEWX_SI
+  vui __builtin_vec_lde (signed long, const unsigned int *);
+    LVEWX  LVEWX_UI
+  vf __builtin_vec_lde (signed long, const float *);
+    LVEWX  LVEWX_F
+; The following variants are deprecated.
+  vsi __builtin_vec_lde (signed long, const long *);
+    LVEWX  LVEWX_SL
+  vui __builtin_vec_lde (signed long, const unsigned long *);
+    LVEWX  LVEWX_UL
+
+[VEC_LDL, vec_ldl, __builtin_vec_ldl]
+  vsc __builtin_vec_ldl (signed long, const vsc *);
+    LVXL_V16QI  LVXL_V16QI_VSC
+  vsc __builtin_vec_ldl (signed long, const signed char *);
+    LVXL_V16QI  LVXL_V16QI_SC
+  vuc __builtin_vec_ldl (signed long, const vuc *);
+    LVXL_V16QI  LVXL_V16QI_VUC
+  vuc __builtin_vec_ldl (signed long, const unsigned char *);
+    LVXL_V16QI  LVXL_V16QI_UC
+  vbc __builtin_vec_ldl (signed long, const vbc *);
+    LVXL_V16QI  LVXL_V16QI_VBC
+  vss __builtin_vec_ldl (signed long, const vss *);
+    LVXL_V8HI  LVXL_V8HI_VSS
+  vss __builtin_vec_ldl (signed long, const signed short *);
+    LVXL_V8HI  LVXL_V8HI_SS
+  vus __builtin_vec_ldl (signed long, const vus *);
+    LVXL_V8HI  LVXL_V8HI_VUS
+  vus __builtin_vec_ldl (signed long, const unsigned short *);
+    LVXL_V8HI  LVXL_V8HI_US
+  vbs __builtin_vec_ldl (signed long, const vbs *);
+    LVXL_V8HI  LVXL_V8HI_VBS
+  vp __builtin_vec_ldl (signed long, const vp *);
+    LVXL_V8HI  LVXL_V8HI_VP
+  vsi __builtin_vec_ldl (signed long, const vsi *);
+    LVXL_V4SI  LVXL_V4SI_VSI
+  vsi __builtin_vec_ldl (signed long, const signed int *);
+    LVXL_V4SI  LVXL_V4SI_SI
+  vui __builtin_vec_ldl (signed long, const vui *);
+    LVXL_V4SI  LVXL_V4SI_VUI
+  vui __builtin_vec_ldl (signed long, const unsigned int *);
+    LVXL_V4SI  LVXL_V4SI_UI
+  vbi __builtin_vec_ldl (signed long, const vbi *);
+    LVXL_V4SI  LVXL_V4SI_VBI
+  vsll __builtin_vec_ldl (signed long, const vsll *);
+    LVXL_V2DI  LVXL_V2DI_VSLL
+  vsll __builtin_vec_ldl (signed long, const signed long long *);
+    LVXL_V2DI  LVXL_V2DI_SLL
+  vull __builtin_vec_ldl (signed long, const vull *);
+    LVXL_V2DI  LVXL_V2DI_VULL
+  vull __builtin_vec_ldl (signed long, const unsigned long long *);
+    LVXL_V2DI  LVXL_V2DI_ULL
+  vbll __builtin_vec_ldl (signed long, const vbll *);
+    LVXL_V2DI  LVXL_V2DI_VBLL
+  vf __builtin_vec_ldl (signed long, const vf *);
+    LVXL_V4SF  LVXL_V4SF_VF
+  vf __builtin_vec_ldl (signed long, const float *);
+    LVXL_V4SF  LVXL_V4SF_F
+  vd __builtin_vec_ldl (signed long, const vd *);
+    LVXL_V2DF  LVXL_V2DF_VD
+  vd __builtin_vec_ldl (signed long, const double *);
+    LVXL_V2DF  LVXL_V2DF_D
+
+[VEC_LOGE, vec_loge, __builtin_vec_loge]
+  vf __builtin_vec_loge (vf);
+    VLOGEFP
+
+[VEC_LVLX, vec_lvlx, __builtin_vec_lvlx, __PPU__]
+  vbc __builtin_vec_lvlx (signed long, const vbc *);
+    LVLX  LVLX_VBC
+  vsc __builtin_vec_lvlx (signed long, const vsc *);
+    LVLX  LVLX_VSC
+  vsc __builtin_vec_lvlx (signed long, const signed char *);
+    LVLX  LVLX_SC
+  vuc __builtin_vec_lvlx (signed long, const vuc *);
+    LVLX  LVLX_VUC
+  vuc __builtin_vec_lvlx (signed long, const unsigned char *);
+    LVLX  LVLX_UC
+  vbs __builtin_vec_lvlx (signed long, const vbs *);
+    LVLX  LVLX_VBS
+  vss __builtin_vec_lvlx (signed long, const vss *);
+    LVLX  LVLX_VSS
+  vss __builtin_vec_lvlx (signed long, const signed short *);
+    LVLX  LVLX_SS
+  vus __builtin_vec_lvlx (signed long, const vus *);
+    LVLX  LVLX_VUS
+  vus __builtin_vec_lvlx (signed long, const unsigned short *);
+    LVLX  LVLX_US
+  vp __builtin_vec_lvlx (signed long, const vp *);
+    LVLX  LVLX_VP
+  vbi __builtin_vec_lvlx (signed long, const vbi *);
+    LVLX  LVLX_VBI
+  vsi __builtin_vec_lvlx (signed long, const vsi *);
+    LVLX  LVLX_VSI
+  vsi __builtin_vec_lvlx (signed long, const signed int *);
+    LVLX  LVLX_SI
+  vui __builtin_vec_lvlx (signed long, const vui *);
+    LVLX  LVLX_VUI
+  vui __builtin_vec_lvlx (signed long, const unsigned int *);
+    LVLX  LVLX_UI
+  vf __builtin_vec_lvlx (signed long, const vf *);
+    LVLX  LVLX_VF
+  vf __builtin_vec_lvlx (signed long, const float *);
+    LVLX  LVLX_F
+
+[VEC_LVLXL, vec_lvlxl, __builtin_vec_lvlxl, __PPU__]
+  vbc __builtin_vec_lvlxl (signed long, const vbc *);
+    LVLXL  LVLXL_VBC
+  vsc __builtin_vec_lvlxl (signed long, const vsc *);
+    LVLXL  LVLXL_VSC
+  vsc __builtin_vec_lvlxl (signed long, const signed char *);
+    LVLXL  LVLXL_SC
+  vuc __builtin_vec_lvlxl (signed long, const vuc *);
+    LVLXL  LVLXL_VUC
+  vuc __builtin_vec_lvlxl (signed long, const unsigned char *);
+    LVLXL  LVLXL_UC
+  vbs __builtin_vec_lvlxl (signed long, const vbs *);
+    LVLXL  LVLXL_VBS
+  vss __builtin_vec_lvlxl (signed long, const vss *);
+    LVLXL  LVLXL_VSS
+  vss __builtin_vec_lvlxl (signed long, const signed short *);
+    LVLXL  LVLXL_SS
+  vus __builtin_vec_lvlxl (signed long, const vus *);
+    LVLXL  LVLXL_VUS
+  vus __builtin_vec_lvlxl (signed long, const unsigned short *);
+    LVLXL  LVLXL_US
+  vp __builtin_vec_lvlxl (signed long, const vp *);
+    LVLXL  LVLXL_VP
+  vbi __builtin_vec_lvlxl (signed long, const vbi *);
+    LVLXL  LVLXL_VBI
+  vsi __builtin_vec_lvlxl (signed long, const vsi *);
+    LVLXL  LVLXL_VSI
+  vsi __builtin_vec_lvlxl (signed long, const signed int *);
+    LVLXL  LVLXL_SI
+  vui __builtin_vec_lvlxl (signed long, const vui *);
+    LVLXL  LVLXL_VUI
+  vui __builtin_vec_lvlxl (signed long, const unsigned int *);
+    LVLXL  LVLXL_UI
+  vf __builtin_vec_lvlxl (signed long, const vf *);
+    LVLXL  LVLXL_VF
+  vf __builtin_vec_lvlxl (signed long, const float *);
+    LVLXL  LVLXL_F
+
+[VEC_LVRX, vec_lvrx, __builtin_vec_lvrx, __PPU__]
+  vbc __builtin_vec_lvrx (signed long, const vbc *);
+    LVRX  LVRX_VBC
+  vsc __builtin_vec_lvrx (signed long, const vsc *);
+    LVRX  LVRX_VSC
+  vsc __builtin_vec_lvrx (signed long, const signed char *);
+    LVRX  LVRX_SC
+  vuc __builtin_vec_lvrx (signed long, const vuc *);
+    LVRX  LVRX_VUC
+  vuc __builtin_vec_lvrx (signed long, const unsigned char *);
+    LVRX  LVRX_UC
+  vbs __builtin_vec_lvrx (signed long, const vbs *);
+    LVRX  LVRX_VBS
+  vss __builtin_vec_lvrx (signed long, const vss *);
+    LVRX  LVRX_VSS
+  vss __builtin_vec_lvrx (signed long, const signed short *);
+    LVRX  LVRX_SS
+  vus __builtin_vec_lvrx (signed long, const vus *);
+    LVRX  LVRX_VUS
+  vus __builtin_vec_lvrx (signed long, const unsigned short *);
+    LVRX  LVRX_US
+  vp __builtin_vec_lvrx (signed long, const vp *);
+    LVRX  LVRX_VP
+  vbi __builtin_vec_lvrx (signed long, const vbi *);
+    LVRX  LVRX_VBI
+  vsi __builtin_vec_lvrx (signed long, const vsi *);
+    LVRX  LVRX_VSI
+  vsi __builtin_vec_lvrx (signed long, const signed int *);
+    LVRX  LVRX_SI
+  vui __builtin_vec_lvrx (signed long, const vui *);
+    LVRX  LVRX_VUI
+  vui __builtin_vec_lvrx (signed long, const unsigned int *);
+    LVRX  LVRX_UI
+  vf __builtin_vec_lvrx (signed long, const vf *);
+    LVRX  LVRX_VF
+  vf __builtin_vec_lvrx (signed long, const float *);
+    LVRX  LVRX_F
+
+[VEC_LVRXL, vec_lvrxl, __builtin_vec_lvrxl, __PPU__]
+  vbc __builtin_vec_lvrxl (signed long, const vbc *);
+    LVRXL  LVRXL_VBC
+  vsc __builtin_vec_lvrxl (signed long, const vsc *);
+    LVRXL  LVRXL_VSC
+  vsc __builtin_vec_lvrxl (signed long, const signed char *);
+    LVRXL  LVRXL_SC
+  vuc __builtin_vec_lvrxl (signed long, const vuc *);
+    LVRXL  LVRXL_VUC
+  vuc __builtin_vec_lvrxl (signed long, const unsigned char *);
+    LVRXL  LVRXL_UC
+  vbs __builtin_vec_lvrxl (signed long, const vbs *);
+    LVRXL  LVRXL_VBS
+  vss __builtin_vec_lvrxl (signed long, const vss *);
+    LVRXL  LVRXL_VSS
+  vss __builtin_vec_lvrxl (signed long, const signed short *);
+    LVRXL  LVRXL_SS
+  vus __builtin_vec_lvrxl (signed long, const vus *);
+    LVRXL  LVRXL_VUS
+  vus __builtin_vec_lvrxl (signed long, const unsigned short *);
+    LVRXL  LVRXL_US
+  vp __builtin_vec_lvrxl (signed long, const vp *);
+    LVRXL  LVRXL_VP
+  vbi __builtin_vec_lvrxl (signed long, const vbi *);
+    LVRXL  LVRXL_VBI
+  vsi __builtin_vec_lvrxl (signed long, const vsi *);
+    LVRXL  LVRXL_VSI
+  vsi __builtin_vec_lvrxl (signed long, const signed int *);
+    LVRXL  LVRXL_SI
+  vui __builtin_vec_lvrxl (signed long, const vui *);
+    LVRXL  LVRXL_VUI
+  vui __builtin_vec_lvrxl (signed long, const unsigned int *);
+    LVRXL  LVRXL_UI
+  vf __builtin_vec_lvrxl (signed long, const vf *);
+    LVRXL  LVRXL_VF
+  vf __builtin_vec_lvrxl (signed long, const float *);
+    LVRXL  LVRXL_F
+
+[VEC_LVSL, vec_lvsl, __builtin_vec_lvsl]
+  vuc __builtin_vec_lvsl (signed long, const unsigned char *);
+    LVSL  LVSL_UC
+  vuc __builtin_vec_lvsl (signed long, const signed char *);
+    LVSL  LVSL_SC
+  vuc __builtin_vec_lvsl (signed long, const char *);
+    LVSL  LVSL_STR
+  vuc __builtin_vec_lvsl (signed long, const unsigned short *);
+    LVSL  LVSL_US
+  vuc __builtin_vec_lvsl (signed long, const signed short *);
+    LVSL  LVSL_SS
+  vuc __builtin_vec_lvsl (signed long, const unsigned int *);
+    LVSL  LVSL_UI
+  vuc __builtin_vec_lvsl (signed long, const signed int *);
+    LVSL  LVSL_SI
+  vuc __builtin_vec_lvsl (signed long, const unsigned long *);
+    LVSL  LVSL_UL
+  vuc __builtin_vec_lvsl (signed long, const signed long *);
+    LVSL  LVSL_SL
+  vuc __builtin_vec_lvsl (signed long, const unsigned long long *);
+    LVSL  LVSL_ULL
+  vuc __builtin_vec_lvsl (signed long, const signed long long *);
+    LVSL  LVSL_SLL
+  vuc __builtin_vec_lvsl (signed long, const float *);
+    LVSL  LVSL_F
+  vuc __builtin_vec_lvsl (signed long, const double *);
+    LVSL  LVSL_D
+
+[VEC_LVSR, vec_lvsr, __builtin_vec_lvsr]
+  vuc __builtin_vec_lvsr (signed long, const unsigned char *);
+    LVSR  LVSR_UC
+  vuc __builtin_vec_lvsr (signed long, const signed char *);
+    LVSR  LVSR_SC
+  vuc __builtin_vec_lvsr (signed long, const char *);
+    LVSR  LVSR_STR
+  vuc __builtin_vec_lvsr (signed long, const unsigned short *);
+    LVSR  LVSR_US
+  vuc __builtin_vec_lvsr (signed long, const signed short *);
+    LVSR  LVSR_SS
+  vuc __builtin_vec_lvsr (signed long, const unsigned int *);
+    LVSR  LVSR_UI
+  vuc __builtin_vec_lvsr (signed long, const signed int *);
+    LVSR  LVSR_SI
+  vuc __builtin_vec_lvsr (signed long, const unsigned long *);
+    LVSR  LVSR_UL
+  vuc __builtin_vec_lvsr (signed long, const signed long *);
+    LVSR  LVSR_SL
+  vuc __builtin_vec_lvsr (signed long, const unsigned long long *);
+    LVSR  LVSR_ULL
+  vuc __builtin_vec_lvsr (signed long, const signed long long *);
+    LVSR  LVSR_SLL
+  vuc __builtin_vec_lvsr (signed long, const float *);
+    LVSR  LVSR_F
+  vuc __builtin_vec_lvsr (signed long, const double *);
+    LVSR  LVSR_D
+
+[VEC_LXVL, vec_xl_len, __builtin_vec_lxvl, _ARCH_PPC64_PWR9]
+  vsc __builtin_vec_lxvl (const signed char *, unsigned int);
+    LXVL  LXVL_VSC
+  vuc __builtin_vec_lxvl (const unsigned char *, unsigned int);
+    LXVL  LXVL_VUC
+  vss __builtin_vec_lxvl (const signed short *, unsigned int);
+    LXVL  LXVL_VSS
+  vus __builtin_vec_lxvl (const unsigned short *, unsigned int);
+    LXVL  LXVL_VUS
+  vsi __builtin_vec_lxvl (const signed int *, unsigned int);
+    LXVL  LXVL_VSI
+  vui __builtin_vec_lxvl (const unsigned int *, unsigned int);
+    LXVL  LXVL_VUI
+  vsll __builtin_vec_lxvl (const signed long long *, unsigned int);
+    LXVL  LXVL_VSLL
+  vull __builtin_vec_lxvl (const unsigned long long *, unsigned int);
+    LXVL  LXVL_VULL
+  vsq __builtin_vec_lxvl (const signed __int128 *, unsigned int);
+    LXVL  LXVL_VSQ
+  vuq __builtin_vec_lxvl (const unsigned __int128 *, unsigned int);
+    LXVL  LXVL_VUQ
+  vf __builtin_vec_lxvl (const float *, unsigned int);
+    LXVL  LXVL_VF
+  vd __builtin_vec_lxvl (const double *, unsigned int);
+    LXVL  LXVL_VD
+
+; #### XVMADDSP{TARGET_VSX};VMADDFP
+[VEC_MADD, vec_madd, __builtin_vec_madd]
+  vss __builtin_vec_madd (vss, vss, vss);
+    VMLADDUHM  VMLADDUHM_VSS
+  vss __builtin_vec_madd (vss, vus, vus);
+    VMLADDUHM  VMLADDUHM_VSSVUS
+  vss __builtin_vec_madd (vus, vss, vss);
+    VMLADDUHM  VMLADDUHM_VUSVSS
+  vus __builtin_vec_madd (vus, vus, vus);
+    VMLADDUHM  VMLADDUHM_VUS
+  vf __builtin_vec_madd (vf, vf, vf);
+    VMADDFP
+  vd __builtin_vec_madd (vd, vd, vd);
+    XVMADDDP
+
+[VEC_MADDS, vec_madds, __builtin_vec_madds]
+  vss __builtin_vec_madds (vss, vss, vss);
+    VMHADDSHS
+
+; #### XVMAXSP{TARGET_VSX};VMAXFP
+[VEC_MAX, vec_max, __builtin_vec_max]
+  vsc __builtin_vec_max (vsc, vsc);
+    VMAXSB
+  vuc __builtin_vec_max (vuc, vuc);
+    VMAXUB
+  vss __builtin_vec_max (vss, vss);
+    VMAXSH
+  vus __builtin_vec_max (vus, vus);
+    VMAXUH
+  vsi __builtin_vec_max (vsi, vsi);
+    VMAXSW
+  vui __builtin_vec_max (vui, vui);
+    VMAXUW
+  vsll __builtin_vec_max (vsll, vsll);
+    VMAXSD
+  vull __builtin_vec_max (vull, vull);
+    VMAXUD
+  vf __builtin_vec_max (vf, vf);
+    VMAXFP
+  vd __builtin_vec_max (vd, vd);
+    XVMAXDP
+; The following variants are deprecated.
+  vsc __builtin_vec_max (vsc, vbc);
+    VMAXSB  VMAXSB_SB
+  vsc __builtin_vec_max (vbc, vsc);
+    VMAXSB  VMAXSB_BS
+  vuc __builtin_vec_max (vuc, vbc);
+    VMAXUB  VMAXUB_UB
+  vuc __builtin_vec_max (vbc, vuc);
+    VMAXUB  VMAXUB_BU
+  vss __builtin_vec_max (vss, vbs);
+    VMAXSH  VMAXSH_SB
+  vss __builtin_vec_max (vbs, vss);
+    VMAXSH  VMAXSH_BS
+  vus __builtin_vec_max (vus, vbs);
+    VMAXUH  VMAXUH_UB
+  vus __builtin_vec_max (vbs, vus);
+    VMAXUH  VMAXUH_BU
+  vsi __builtin_vec_max (vsi, vbi);
+    VMAXSW  VMAXSW_SB
+  vsi __builtin_vec_max (vbi, vsi);
+    VMAXSW  VMAXSW_BS
+  vui __builtin_vec_max (vui, vbi);
+    VMAXUW  VMAXUW_UB
+  vui __builtin_vec_max (vbi, vui);
+    VMAXUW  VMAXUW_BU
+  vsll __builtin_vec_max (vsll, vbll);
+    VMAXSD  VMAXSD_SB
+  vsll __builtin_vec_max (vbll, vsll);
+    VMAXSD  VMAXSD_BS
+  vull __builtin_vec_max (vull, vbll);
+    VMAXUD  VMAXUD_UB
+  vull __builtin_vec_max (vbll, vull);
+    VMAXUD  VMAXUD_BU
+
+[VEC_MERGEE, vec_mergee, __builtin_vec_vmrgew, _ARCH_PWR8]
+  vsi __builtin_vec_vmrgew (vsi, vsi);
+    VMRGEW_V4SI  VMRGEW_VSI
+  vui __builtin_vec_vmrgew (vui, vui);
+    VMRGEW_V4SI  VMRGEW_VUI
+  vbi __builtin_vec_vmrgew (vbi, vbi);
+    VMRGEW_V4SI  VMRGEW_VBI
+  vsll __builtin_vec_vmrgew (vsll, vsll);
+    VMRGEW_V2DI  VMRGEW_VSLL
+  vull __builtin_vec_vmrgew (vull, vull);
+    VMRGEW_V2DI  VMRGEW_VULL
+  vbll __builtin_vec_vmrgew (vbll, vbll);
+    VMRGEW_V2DI  VMRGEW_VBLL
+  vf __builtin_vec_vmrgew (vf, vf);
+    VMRGEW_V4SF
+  vd __builtin_vec_vmrgew (vd, vd);
+    VMRGEW_V2DF
+
+[VEC_MERGEH, vec_mergeh, __builtin_vec_mergeh]
+  vbc __builtin_vec_mergeh (vbc, vbc);
+    VMRGHB  VMRGHB_VBC
+  vsc __builtin_vec_mergeh (vsc, vsc);
+    VMRGHB  VMRGHB_VSC
+  vuc __builtin_vec_mergeh (vuc, vuc);
+    VMRGHB  VMRGHB_VUC
+  vbs __builtin_vec_mergeh (vbs, vbs);
+    VMRGHH  VMRGHH_VBS
+  vss __builtin_vec_mergeh (vss, vss);
+    VMRGHH  VMRGHH_VSS
+  vus __builtin_vec_mergeh (vus, vus);
+    VMRGHH  VMRGHH_VUS
+  vp __builtin_vec_mergeh (vp, vp);
+    VMRGHH  VMRGHH_VP
+  vbi __builtin_vec_mergeh (vbi, vbi);
+    VMRGHW  VMRGHW_VBI
+  vsi __builtin_vec_mergeh (vsi, vsi);
+    VMRGHW  VMRGHW_VSI
+  vui __builtin_vec_mergeh (vui, vui);
+    VMRGHW  VMRGHW_VUI
+  vbll __builtin_vec_mergeh (vbll, vbll);
+    VEC_MERGEH_V2DI  VEC_MERGEH_VBLL
+  vsll __builtin_vec_mergeh (vsll, vsll);
+    VEC_MERGEH_V2DI  VEC_MERGEH_VSLL
+  vull __builtin_vec_mergeh (vull, vull);
+    VEC_MERGEH_V2DI  VEC_MERGEH_VULL
+  vf __builtin_vec_mergeh (vf, vf);
+    VMRGHW  VMRGHW_VF
+  vd __builtin_vec_mergeh (vd, vd);
+    VEC_MERGEH_V2DF
+; The following variants are deprecated.
+  vsll __builtin_vec_mergeh (vsll, vbll);
+    VEC_MERGEH_V2DI  VEC_MERGEH_VSLL_VBLL
+  vsll __builtin_vec_mergeh (vbll, vsll);
+    VEC_MERGEH_V2DI  VEC_MERGEH_VBLL_VSLL
+  vull __builtin_vec_mergeh (vull, vbll);
+    VEC_MERGEH_V2DI  VEC_MERGEH_VULL_VBLL
+  vull __builtin_vec_mergeh (vbll, vull);
+    VEC_MERGEH_V2DI  VEC_MERGEH_VBLL_VULL
+
+[VEC_MERGEL, vec_mergel, __builtin_vec_mergel]
+  vbc __builtin_vec_mergel (vbc, vbc);
+    VMRGLB  VMRGLB_VBC
+  vsc __builtin_vec_mergel (vsc, vsc);
+    VMRGLB  VMRGLB_VSC
+  vuc __builtin_vec_mergel (vuc, vuc);
+    VMRGLB  VMRGLB_VUC
+  vbs __builtin_vec_mergel (vbs, vbs);
+    VMRGLH  VMRGLH_VBS
+  vss __builtin_vec_mergel (vss, vss);
+    VMRGLH  VMRGLH_VSS
+  vus __builtin_vec_mergel (vus, vus);
+    VMRGLH  VMRGLH_VUS
+  vp __builtin_vec_mergel (vp, vp);
+    VMRGLH  VMRGLH_VP
+  vbi __builtin_vec_mergel (vbi, vbi);
+    VMRGLW  VMRGLW_VBI
+  vsi __builtin_vec_mergel (vsi, vsi);
+    VMRGLW  VMRGLW_VSI
+  vui __builtin_vec_mergel (vui, vui);
+    VMRGLW  VMRGLW_VUI
+  vbll __builtin_vec_mergel (vbll, vbll);
+    VEC_MERGEL_V2DI  VEC_MERGEL_VBLL
+  vsll __builtin_vec_mergel (vsll, vsll);
+    VEC_MERGEL_V2DI  VEC_MERGEL_VSLL
+  vull __builtin_vec_mergel (vull, vull);
+    VEC_MERGEL_V2DI  VEC_MERGEL_VULL
+  vf __builtin_vec_mergel (vf, vf);
+    VMRGLW  VMRGLW_VF
+  vd __builtin_vec_mergel (vd, vd);
+    VEC_MERGEL_V2DF
+; The following variants are deprecated.
+  vsll __builtin_vec_mergel (vsll, vbll);
+    VEC_MERGEL_V2DI  VEC_MERGEL_VSLL_VBLL
+  vsll __builtin_vec_mergel (vbll, vsll);
+    VEC_MERGEL_V2DI  VEC_MERGEL_VBLL_VSLL
+  vull __builtin_vec_mergel (vull, vbll);
+    VEC_MERGEL_V2DI  VEC_MERGEL_VULL_VBLL
+  vull __builtin_vec_mergel (vbll, vull);
+    VEC_MERGEL_V2DI  VEC_MERGEL_VBLL_VULL
+
+[VEC_MERGEO, vec_mergeo, __builtin_vec_vmrgow, _ARCH_PWR8]
+  vsi __builtin_vec_vmrgow (vsi, vsi);
+    VMRGOW_V4SI  VMRGOW_VSI
+  vui __builtin_vec_vmrgow (vui, vui);
+    VMRGOW_V4SI  VMRGOW_VUI
+  vbi __builtin_vec_vmrgow (vbi, vbi);
+    VMRGOW_V4SI  VMRGOW_VBI
+  vsll __builtin_vec_vmrgow (vsll, vsll);
+    VMRGOW_V2DI  VMRGOW_VSLL
+  vull __builtin_vec_vmrgow (vull, vull);
+    VMRGOW_V2DI  VMRGOW_VULL
+  vbll __builtin_vec_vmrgow (vbll, vbll);
+    VMRGOW_V2DI  VMRGOW_VBLL
+  vf __builtin_vec_vmrgow (vf, vf);
+    VMRGOW_V4SF
+  vd __builtin_vec_vmrgow (vd, vd);
+    VMRGOW_V2DF
+
+[VEC_MFVSCR, vec_mfvscr, __builtin_vec_mfvscr]
+  vus __builtin_vec_mfvscr ();
+    MFVSCR
+
+; #### XVMINSP{TARGET_VSX};VMINFP
+[VEC_MIN, vec_min, __builtin_vec_min]
+  vsc __builtin_vec_min (vsc, vsc);
+    VMINSB
+  vuc __builtin_vec_min (vuc, vuc);
+    VMINUB
+  vss __builtin_vec_min (vss, vss);
+    VMINSH
+  vus __builtin_vec_min (vus, vus);
+    VMINUH
+  vsi __builtin_vec_min (vsi, vsi);
+    VMINSW
+  vui __builtin_vec_min (vui, vui);
+    VMINUW
+  vsll __builtin_vec_min (vsll, vsll);
+    VMINSD
+  vull __builtin_vec_min (vull, vull);
+    VMINUD
+  vf __builtin_vec_min (vf, vf);
+    VMINFP
+  vd __builtin_vec_min (vd, vd);
+    XVMINDP
+; The following variants are deprecated.
+  vsc __builtin_vec_min (vsc, vbc);
+    VMINSB  VMINSB_SB
+  vsc __builtin_vec_min (vbc, vsc);
+    VMINSB  VMINSB_BS
+  vuc __builtin_vec_min (vuc, vbc);
+    VMINUB  VMINUB_UB
+  vuc __builtin_vec_min (vbc, vuc);
+    VMINUB  VMINUB_BU
+  vss __builtin_vec_min (vss, vbs);
+    VMINSH  VMINSH_SB
+  vss __builtin_vec_min (vbs, vss);
+    VMINSH  VMINSH_BS
+  vus __builtin_vec_min (vus, vbs);
+    VMINUH  VMINUH_UB
+  vus __builtin_vec_min (vbs, vus);
+    VMINUH  VMINUH_BU
+  vsi __builtin_vec_min (vsi, vbi);
+    VMINSW  VMINSW_SB
+  vsi __builtin_vec_min (vbi, vsi);
+    VMINSW  VMINSW_BS
+  vui __builtin_vec_min (vui, vbi);
+    VMINUW  VMINUW_UB
+  vui __builtin_vec_min (vbi, vui);
+    VMINUW  VMINUW_BU
+  vsll __builtin_vec_min (vsll, vbll);
+    VMINSD  VMINSD_SB
+  vsll __builtin_vec_min (vbll, vsll);
+    VMINSD  VMINSD_BS
+  vull __builtin_vec_min (vull, vbll);
+    VMINUD  VMINUD_UB
+  vull __builtin_vec_min (vbll, vull);
+    VMINUD  VMINUD_BU
+
+[VEC_MLADD, vec_mladd, __builtin_vec_mladd]
+  vss __builtin_vec_mladd (vss, vss, vss);
+    VMLADDUHM  VMLADDUHM_VSS2
+  vss __builtin_vec_mladd (vss, vus, vus);
+    VMLADDUHM  VMLADDUHM_VSSVUS2
+  vss __builtin_vec_mladd (vus, vss, vss);
+    VMLADDUHM  VMLADDUHM_VUSVSS2
+  vus __builtin_vec_mladd (vus, vus, vus);
+    VMLADDUHM  VMLADDUHM_VUS2
+
+[VEC_MOD, vec_mod, __builtin_vec_mod, _ARCH_PWR10]
+  vsi __builtin_vec_mod (vsi, vsi);
+    VMODSW
+  vui __builtin_vec_mod (vui, vui);
+    VMODUW
+  vsll __builtin_vec_mod (vsll, vsll);
+    VMODSD
+  vull __builtin_vec_mod (vull, vull);
+    VMODUD
+  vsq __builtin_vec_mod (vsq, vsq);
+    MODS_V1TI
+  vuq __builtin_vec_mod (vuq, vuq);
+    MODU_V1TI
+
+[VEC_MRADDS, vec_mradds, __builtin_vec_mradds]
+  vss __builtin_vec_mradds (vss, vss, vss);
+    VMHRADDSHS
+
+[VEC_MSUB, vec_msub, __builtin_vec_msub, __VSX__]
+  vf __builtin_vec_msub (vf, vf, vf);
+    XVMSUBSP
+  vd __builtin_vec_msub (vd, vd, vd);
+    XVMSUBDP
+
+[VEC_MSUM, vec_msum, __builtin_vec_msum]
+  vui __builtin_vec_msum (vuc, vuc, vui);
+    VMSUMUBM
+  vsi __builtin_vec_msum (vsc, vuc, vsi);
+    VMSUMMBM
+  vui __builtin_vec_msum (vus, vus, vui);
+    VMSUMUHM
+  vsi __builtin_vec_msum (vss, vss, vsi);
+    VMSUMSHM
+  vsq __builtin_vec_msum (vsll, vsll, vsq);
+    VMSUMUDM  VMSUMUDM_S
+  vuq __builtin_vec_msum (vull, vull, vuq);
+    VMSUMUDM  VMSUMUDM_U
+
+[VEC_MSUMS, vec_msums, __builtin_vec_msums]
+  vui __builtin_vec_msums (vus, vus, vui);
+    VMSUMUHS
+  vsi __builtin_vec_msums (vss, vss, vsi);
+    VMSUMSHS
+
+[VEC_MTVSCR, vec_mtvscr, __builtin_vec_mtvscr]
+  void __builtin_vec_mtvscr (vbc);
+    MTVSCR  MTVSCR_VBC
+  void __builtin_vec_mtvscr (vsc);
+    MTVSCR  MTVSCR_VSC
+  void __builtin_vec_mtvscr (vuc);
+    MTVSCR  MTVSCR_VUC
+  void __builtin_vec_mtvscr (vbs);
+    MTVSCR  MTVSCR_VBS
+  void __builtin_vec_mtvscr (vss);
+    MTVSCR  MTVSCR_VSS
+  void __builtin_vec_mtvscr (vus);
+    MTVSCR  MTVSCR_VUS
+  void __builtin_vec_mtvscr (vp);
+    MTVSCR  MTVSCR_VP
+  void __builtin_vec_mtvscr (vbi);
+    MTVSCR  MTVSCR_VBI
+  void __builtin_vec_mtvscr (vsi);
+    MTVSCR  MTVSCR_VSI
+  void __builtin_vec_mtvscr (vui);
+    MTVSCR  MTVSCR_VUI
+
+; Note that the entries for VEC_MUL are currently ignored.  See rs6000-c.c:
+; altivec_resolve_overloaded_builtin, where there is special-case code for
+; VEC_MUL.  TODO: Is this really necessary?  Investigate.  Seven prototypes
+; are missing here; they have no corresponding builtins.  Also, P10 adds
+; "vmulld", which could conditionally be used instead of MUL_V2DI.  (A
+; sketch of the special-case expansion follows this stanza.)
+[VEC_MUL, vec_mul, __builtin_vec_mul]
+  vsll __builtin_vec_mul (vsll, vsll);
+    MUL_V2DI
+  vf __builtin_vec_mul (vf, vf);
+    XVMULSP
+  vd __builtin_vec_mul (vd, vd);
+    XVMULDP
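+; A minimal sketch of the special case above, for illustration only: the
+; resolver expands the call directly to an element-wise multiply instead
+; of consulting this table, so
+;   vector float p = vec_mul (a, b);
+; is folded to roughly
+;   vector float p = a * b;   /* a MULT_EXPR on the vector operands */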
+
+[VEC_MULE, vec_mule, __builtin_vec_mule]
+  vss __builtin_vec_mule (vsc, vsc);
+    VMULESB
+  vus __builtin_vec_mule (vuc, vuc);
+    VMULEUB
+  vsi __builtin_vec_mule (vss, vss);
+    VMULESH
+  vui __builtin_vec_mule (vus, vus);
+    VMULEUH
+  vsll __builtin_vec_mule (vsi, vsi);
+    VMULESW
+  vull __builtin_vec_mule (vui, vui);
+    VMULEUW
+  vsq __builtin_vec_mule (vsll, vsll);
+    VMULESD
+  vuq __builtin_vec_mule (vull, vull);
+    VMULEUD
+
+[VEC_MULH, vec_mulh, __builtin_vec_mulh, _ARCH_PWR10]
+  vsi __builtin_vec_mulh (vsi, vsi);
+    VMULHSW
+  vui __builtin_vec_mulh (vui, vui);
+    VMULHUW
+  vsll __builtin_vec_mulh (vsll, vsll);
+    VMULHSD
+  vull __builtin_vec_mulh (vull, vull);
+    VMULHUD
+
+[VEC_MULO, vec_mulo, __builtin_vec_mulo]
+  vss __builtin_vec_mulo (vsc, vsc);
+    VMULOSB
+  vus __builtin_vec_mulo (vuc, vuc);
+    VMULOUB
+  vsi __builtin_vec_mulo (vss, vss);
+    VMULOSH
+  vui __builtin_vec_mulo (vus, vus);
+    VMULOUH
+  vsll __builtin_vec_mulo (vsi, vsi);
+    VMULOSW
+  vull __builtin_vec_mulo (vui, vui);
+    VMULOUW
+  vsq __builtin_vec_mulo (vsll, vsll);
+    VMULOSD
+  vuq __builtin_vec_mulo (vull, vull);
+    VMULOUD
+
+[VEC_NABS, vec_nabs, __builtin_vec_nabs]
+  vsc __builtin_vec_nabs (vsc);
+    NABS_V16QI
+  vss __builtin_vec_nabs (vss);
+    NABS_V8HI
+  vsi __builtin_vec_nabs (vsi);
+    NABS_V4SI
+  vsll __builtin_vec_nabs (vsll);
+    NABS_V2DI
+  vf __builtin_vec_nabs (vf);
+    NABS_V4SF
+  vd __builtin_vec_nabs (vd);
+    NABS_V2DF
+
+[VEC_NAND, vec_nand, __builtin_vec_nand, _ARCH_PWR8]
+  vsc __builtin_vec_nand (vsc, vsc);
+    NAND_V16QI
+  vuc __builtin_vec_nand (vuc, vuc);
+    NAND_V16QI_UNS  NAND_VUC
+  vbc __builtin_vec_nand (vbc, vbc);
+    NAND_V16QI_UNS  NAND_VBC
+  vss __builtin_vec_nand (vss, vss);
+    NAND_V8HI
+  vus __builtin_vec_nand (vus, vus);
+    NAND_V8HI_UNS  NAND_VUS
+  vbs __builtin_vec_nand (vbs, vbs);
+    NAND_V8HI_UNS  NAND_VBS
+  vsi __builtin_vec_nand (vsi, vsi);
+    NAND_V4SI
+  vui __builtin_vec_nand (vui, vui);
+    NAND_V4SI_UNS  NAND_VUI
+  vbi __builtin_vec_nand (vbi, vbi);
+    NAND_V4SI_UNS  NAND_VBI
+  vsll __builtin_vec_nand (vsll, vsll);
+    NAND_V2DI
+  vull __builtin_vec_nand (vull, vull);
+    NAND_V2DI_UNS  NAND_VULL
+  vbll __builtin_vec_nand (vbll, vbll);
+    NAND_V2DI_UNS  NAND_VBLL
+  vf __builtin_vec_nand (vf, vf);
+    NAND_V4SF
+  vd __builtin_vec_nand (vd, vd);
+    NAND_V2DF
+; The following variants are deprecated.
+  vsc __builtin_vec_nand (vbc, vsc);
+    NAND_V16QI  NAND_VBC_VSC
+  vsc __builtin_vec_nand (vsc, vbc);
+    NAND_V16QI  NAND_VSC_VBC
+  vuc __builtin_vec_nand (vbc, vuc);
+    NAND_V16QI_UNS  NAND_VBC_VUC
+  vuc __builtin_vec_nand (vuc, vbc);
+    NAND_V16QI_UNS  NAND_VUC_VBC
+  vss __builtin_vec_nand (vbs, vss);
+    NAND_V8HI  NAND_VBS_VSS
+  vss __builtin_vec_nand (vss, vbs);
+    NAND_V8HI  NAND_VSS_VBS
+  vus __builtin_vec_nand (vbs, vus);
+    NAND_V8HI_UNS  NAND_VBS_VUS
+  vus __builtin_vec_nand (vus, vbs);
+    NAND_V8HI_UNS  NAND_VUS_VBS
+  vsi __builtin_vec_nand (vbi, vsi);
+    NAND_V4SI  NAND_VBI_VSI
+  vsi __builtin_vec_nand (vsi, vbi);
+    NAND_V4SI  NAND_VSI_VBI
+  vui __builtin_vec_nand (vbi, vui);
+    NAND_V4SI_UNS  NAND_VBI_VUI
+  vui __builtin_vec_nand (vui, vbi);
+    NAND_V4SI_UNS  NAND_VUI_VBI
+  vsll __builtin_vec_nand (vbll, vsll);
+    NAND_V2DI  NAND_VBLL_VSLL
+  vsll __builtin_vec_nand (vsll, vbll);
+    NAND_V2DI  NAND_VSLL_VBLL
+  vull __builtin_vec_nand (vbll, vull);
+    NAND_V2DI_UNS  NAND_VBLL_VULL
+  vull __builtin_vec_nand (vull, vbll);
+    NAND_V2DI_UNS  NAND_VULL_VBLL
+
+[VEC_NCIPHER_BE, vec_ncipher_be, __builtin_vec_vncipher_be, _ARCH_PWR8]
+  vuc __builtin_vec_vncipher_be (vuc, vuc);
+    VNCIPHER_BE
+
+[VEC_NCIPHERLAST_BE, vec_ncipherlast_be, __builtin_vec_vncipherlast_be, _ARCH_PWR8]
+  vuc __builtin_vec_vncipherlast_be (vuc, vuc);
+    VNCIPHERLAST_BE
+
+[VEC_NEARBYINT, vec_nearbyint, __builtin_vec_nearbyint, __VSX__]
+  vf __builtin_vec_nearbyint (vf);
+    XVRSPI  XVRSPI_NBI
+  vd __builtin_vec_nearbyint (vd);
+    XVRDPI  XVRDPI_NBI
+
+[VEC_NEG, vec_neg, __builtin_vec_neg]
+  vsc __builtin_vec_neg (vsc);
+    NEG_V16QI
+  vss __builtin_vec_neg (vss);
+    NEG_V8HI
+  vsi __builtin_vec_neg (vsi);
+    NEG_V4SI
+  vsll __builtin_vec_neg (vsll);
+    NEG_V2DI
+  vf __builtin_vec_neg (vf);
+    NEG_V4SF
+  vd __builtin_vec_neg (vd);
+    NEG_V2DF
+
+[VEC_NMADD, vec_nmadd, __builtin_vec_nmadd, __VSX__]
+  vf __builtin_vec_nmadd (vf, vf, vf);
+    XVNMADDSP
+  vd __builtin_vec_nmadd (vd, vd, vd);
+    XVNMADDDP
+
+; #### XVNMSUBSP{TARGET_VSX};VNMSUBFP
+[VEC_NMSUB, vec_nmsub, __builtin_vec_nmsub]
+  vf __builtin_vec_nmsub (vf, vf, vf);
+    VNMSUBFP
+  vd __builtin_vec_nmsub (vd, vd, vd);
+    XVNMSUBDP
+
+[VEC_NOR, vec_nor, __builtin_vec_nor]
+  vsc __builtin_vec_nor (vsc, vsc);
+    VNOR_V16QI
+  vuc __builtin_vec_nor (vuc, vuc);
+    VNOR_V16QI_UNS  VNOR_V16QI_U
+  vbc __builtin_vec_nor (vbc, vbc);
+    VNOR_V16QI_UNS  VNOR_V16QI_B
+  vss __builtin_vec_nor (vss, vss);
+    VNOR_V8HI
+  vus __builtin_vec_nor (vus, vus);
+    VNOR_V8HI_UNS  VNOR_V8HI_U
+  vbs __builtin_vec_nor (vbs, vbs);
+    VNOR_V8HI_UNS  VNOR_V8HI_B
+  vsi __builtin_vec_nor (vsi, vsi);
+    VNOR_V4SI
+  vui __builtin_vec_nor (vui, vui);
+    VNOR_V4SI_UNS  VNOR_V4SI_U
+  vbi __builtin_vec_nor (vbi, vbi);
+    VNOR_V4SI_UNS  VNOR_V4SI_B
+  vsll __builtin_vec_nor (vsll, vsll);
+    VNOR_V2DI
+  vull __builtin_vec_nor (vull, vull);
+    VNOR_V2DI_UNS  VNOR_V2DI_U
+  vbll __builtin_vec_nor (vbll, vbll);
+    VNOR_V2DI_UNS  VNOR_V2DI_B
+  vsq __builtin_vec_nor (vsq, vsq);
+    VNOR_V1TI  VNOR_V1TI_S
+  vuq __builtin_vec_nor (vuq, vuq);
+    VNOR_V1TI_UNS  VNOR_V1TI_U
+  vf __builtin_vec_nor (vf, vf);
+    VNOR_V4SF
+  vd __builtin_vec_nor (vd, vd);
+    VNOR_V2DF
+; The following variants are deprecated.
+  vsll __builtin_vec_nor (vsll, vbll);
+    VNOR_V2DI  VNOR_VSLL_VBLL
+  vsll __builtin_vec_nor (vbll, vsll);
+    VNOR_V2DI  VNOR_VBLL_VSLL
+  vull __builtin_vec_nor (vull, vbll);
+    VNOR_V2DI_UNS  VNOR_VULL_VBLL
+  vull __builtin_vec_nor (vbll, vull);
+    VNOR_V2DI_UNS  VNOR_VBLL_VULL
+  vsq __builtin_vec_nor (vsq, vbq);
+    VNOR_V1TI  VNOR_VSQ_VBQ
+  vsq __builtin_vec_nor (vbq, vsq);
+    VNOR_V1TI  VNOR_VBQ_VSQ
+  vuq __builtin_vec_nor (vuq, vbq);
+    VNOR_V1TI_UNS  VNOR_VUQ_VBQ
+  vuq __builtin_vec_nor (vbq, vuq);
+    VNOR_V1TI_UNS  VNOR_VBQ_VUQ
+
+[VEC_OR, vec_or, __builtin_vec_or]
+  vsc __builtin_vec_or (vsc, vsc);
+    VOR_V16QI
+  vuc __builtin_vec_or (vuc, vuc);
+    VOR_V16QI_UNS  VOR_V16QI_U
+  vbc __builtin_vec_or (vbc, vbc);
+    VOR_V16QI_UNS  VOR_V16QI_B
+  vss __builtin_vec_or (vss, vss);
+    VOR_V8HI
+  vus __builtin_vec_or (vus, vus);
+    VOR_V8HI_UNS  VOR_V8HI_U
+  vbs __builtin_vec_or (vbs, vbs);
+    VOR_V8HI_UNS  VOR_V8HI_B
+  vsi __builtin_vec_or (vsi, vsi);
+    VOR_V4SI
+  vui __builtin_vec_or (vui, vui);
+    VOR_V4SI_UNS  VOR_V4SI_U
+  vbi __builtin_vec_or (vbi, vbi);
+    VOR_V4SI_UNS  VOR_V4SI_B
+  vsll __builtin_vec_or (vsll, vsll);
+    VOR_V2DI
+  vull __builtin_vec_or (vull, vull);
+    VOR_V2DI_UNS  VOR_V2DI_U
+  vbll __builtin_vec_or (vbll, vbll);
+    VOR_V2DI_UNS  VOR_V2DI_B
+  vf __builtin_vec_or (vf, vf);
+    VOR_V4SF
+  vd __builtin_vec_or (vd, vd);
+    VOR_V2DF
+; The following variants are deprecated.
+  vsc __builtin_vec_or (vsc, vbc);
+    VOR_V16QI  VOR_VSC_VBC
+  vsc __builtin_vec_or (vbc, vsc);
+    VOR_V16QI  VOR_VBC_VSC
+  vuc __builtin_vec_or (vuc, vbc);
+    VOR_V16QI_UNS  VOR_V16QI_UB
+  vuc __builtin_vec_or (vbc, vuc);
+    VOR_V16QI_UNS  VOR_V16QI_BU
+  vss __builtin_vec_or (vss, vbs);
+    VOR_V8HI  VOR_VSS_VBS
+  vss __builtin_vec_or (vbs, vss);
+    VOR_V8HI  VOR_VBS_VSS
+  vus __builtin_vec_or (vus, vbs);
+    VOR_V8HI_UNS  VOR_V8HI_UB
+  vus __builtin_vec_or (vbs, vus);
+    VOR_V8HI_UNS  VOR_V8HI_BU
+  vsi __builtin_vec_or (vsi, vbi);
+    VOR_V4SI  VOR_VSI_VBI
+  vsi __builtin_vec_or (vbi, vsi);
+    VOR_V4SI  VOR_VBI_VSI
+  vui __builtin_vec_or (vui, vbi);
+    VOR_V4SI_UNS  VOR_V4SI_UB
+  vui __builtin_vec_or (vbi, vui);
+    VOR_V4SI_UNS  VOR_V4SI_BU
+  vsll __builtin_vec_or (vsll, vbll);
+    VOR_V2DI  VOR_VSLL_VBLL
+  vsll __builtin_vec_or (vbll, vsll);
+    VOR_V2DI  VOR_VBLL_VSLL
+  vull __builtin_vec_or (vull, vbll);
+    VOR_V2DI_UNS  VOR_V2DI_UB
+  vull __builtin_vec_or (vbll, vull);
+    VOR_V2DI_UNS  VOR_V2DI_BU
+  vf __builtin_vec_or (vf, vbi);
+    VOR_V4SF  VOR_VF_VBI
+  vf __builtin_vec_or (vbi, vf);
+    VOR_V4SF  VOR_VBI_VF
+  vd __builtin_vec_or (vd, vbll);
+    VOR_V2DF  VOR_VD_VBLL
+  vd __builtin_vec_or (vbll, vd);
+    VOR_V2DF  VOR_VBLL_VD
+
+[VEC_ORC, vec_orc, __builtin_vec_orc, _ARCH_PWR8]
+  vsc __builtin_vec_orc (vsc, vsc);
+    ORC_V16QI
+  vuc __builtin_vec_orc (vuc, vuc);
+    ORC_V16QI_UNS  ORC_VUC
+  vbc __builtin_vec_orc (vbc, vbc);
+    ORC_V16QI_UNS  ORC_VBC
+  vss __builtin_vec_orc (vss, vss);
+    ORC_V8HI
+  vus __builtin_vec_orc (vus, vus);
+    ORC_V8HI_UNS  ORC_VUS
+  vbs __builtin_vec_orc (vbs, vbs);
+    ORC_V8HI_UNS  ORC_VBS
+  vsi __builtin_vec_orc (vsi, vsi);
+    ORC_V4SI
+  vui __builtin_vec_orc (vui, vui);
+    ORC_V4SI_UNS  ORC_VUI
+  vbi __builtin_vec_orc (vbi, vbi);
+    ORC_V4SI_UNS  ORC_VBI
+  vsll __builtin_vec_orc (vsll, vsll);
+    ORC_V2DI
+  vull __builtin_vec_orc (vull, vull);
+    ORC_V2DI_UNS  ORC_VULL
+  vbll __builtin_vec_orc (vbll, vbll);
+    ORC_V2DI_UNS  ORC_VBLL
+  vf __builtin_vec_orc (vf, vf);
+    ORC_V4SF
+  vd __builtin_vec_orc (vd, vd);
+    ORC_V2DF
+; The following variants are deprecated.
+  vsc __builtin_vec_orc (vbc, vsc);
+    ORC_V16QI  ORC_VBC_VSC
+  vsc __builtin_vec_orc (vsc, vbc);
+    ORC_V16QI  ORC_VSC_VBC
+  vuc __builtin_vec_orc (vbc, vuc);
+    ORC_V16QI_UNS  ORC_VBC_VUC
+  vuc __builtin_vec_orc (vuc, vbc);
+    ORC_V16QI_UNS  ORC_VUC_VBC
+  vss __builtin_vec_orc (vbs, vss);
+    ORC_V8HI  ORC_VBS_VSS
+  vss __builtin_vec_orc (vss, vbs);
+    ORC_V8HI  ORC_VSS_VBS
+  vus __builtin_vec_orc (vbs, vus);
+    ORC_V8HI_UNS  ORC_VBS_VUS
+  vus __builtin_vec_orc (vus, vbs);
+    ORC_V8HI_UNS  ORC_VUS_VBS
+  vsi __builtin_vec_orc (vbi, vsi);
+    ORC_V4SI  ORC_VBI_VSI
+  vsi __builtin_vec_orc (vsi, vbi);
+    ORC_V4SI  ORC_VSI_VBI
+  vui __builtin_vec_orc (vbi, vui);
+    ORC_V4SI_UNS  ORC_VBI_VUI
+  vui __builtin_vec_orc (vui, vbi);
+    ORC_V4SI_UNS  ORC_VUI_VBI
+  vsll __builtin_vec_orc (vbll, vsll);
+    ORC_V2DI  ORC_VBLL_VSLL
+  vsll __builtin_vec_orc (vsll, vbll);
+    ORC_V2DI  ORC_VSLL_VBLL
+  vull __builtin_vec_orc (vbll, vull);
+    ORC_V2DI_UNS  ORC_VBLL_VULL
+  vull __builtin_vec_orc (vull, vbll);
+    ORC_V2DI_UNS  ORC_VULL_VBLL
+
+[VEC_PACK, vec_pack, __builtin_vec_pack]
+  vsc __builtin_vec_pack (vss, vss);
+    VPKUHUM  VPKUHUM_VSS
+  vuc __builtin_vec_pack (vus, vus);
+    VPKUHUM  VPKUHUM_VUS
+  vbc __builtin_vec_pack (vbs, vbs);
+    VPKUHUM  VPKUHUM_VBS
+  vss __builtin_vec_pack (vsi, vsi);
+    VPKUWUM  VPKUWUM_VSI
+  vus __builtin_vec_pack (vui, vui);
+    VPKUWUM  VPKUWUM_VUI
+  vbs __builtin_vec_pack (vbi, vbi);
+    VPKUWUM  VPKUWUM_VBI
+  vsi __builtin_vec_pack (vsll, vsll);
+    VPKUDUM  VPKUDUM_VSLL
+  vui __builtin_vec_pack (vull, vull);
+    VPKUDUM  VPKUDUM_VULL
+  vbi __builtin_vec_pack (vbll, vbll);
+    VPKUDUM  VPKUDUM_VBLL
+  vf __builtin_vec_pack (vd, vd);
+    FLOAT2_V2DF FLOAT2_V2DF_PACK
+
+[VEC_PACKPX, vec_packpx, __builtin_vec_packpx]
+  vp __builtin_vec_packpx (vui, vui);
+    VPKPX
+
+[VEC_PACKS, vec_packs, __builtin_vec_packs]
+  vuc __builtin_vec_packs (vus, vus);
+    VPKUHUS  VPKUHUS_S
+  vsc __builtin_vec_packs (vss, vss);
+    VPKSHSS
+  vus __builtin_vec_packs (vui, vui);
+    VPKUWUS  VPKUWUS_S
+  vss __builtin_vec_packs (vsi, vsi);
+    VPKSWSS
+  vui __builtin_vec_packs (vull, vull);
+    VPKUDUS  VPKUDUS_S
+  vsi __builtin_vec_packs (vsll, vsll);
+    VPKSDSS
+
+[VEC_PACKSU, vec_packsu, __builtin_vec_packsu]
+  vuc __builtin_vec_packsu (vus, vus);
+    VPKUHUS  VPKUHUS_U
+  vuc __builtin_vec_packsu (vss, vss);
+    VPKSHUS
+  vus __builtin_vec_packsu (vui, vui);
+    VPKUWUS  VPKUWUS_U
+  vus __builtin_vec_packsu (vsi, vsi);
+    VPKSWUS
+  vui __builtin_vec_packsu (vull, vull);
+    VPKUDUS  VPKUDUS_U
+  vui __builtin_vec_packsu (vsll, vsll);
+    VPKSDUS
+
+[VEC_PDEP, vec_pdep, __builtin_vec_vpdepd, _ARCH_PWR10]
+  vull __builtin_vec_vpdepd (vull, vull);
+    VPDEPD
+
+[VEC_PERM, vec_perm, __builtin_vec_perm]
+  vsc __builtin_vec_perm (vsc, vsc, vuc);
+    VPERM_16QI
+  vuc __builtin_vec_perm (vuc, vuc, vuc);
+    VPERM_16QI_UNS VPERM_16QI_VUC
+  vbc __builtin_vec_perm (vbc, vbc, vuc);
+    VPERM_16QI_UNS VPERM_16QI_VBC
+  vss __builtin_vec_perm (vss, vss, vuc);
+    VPERM_8HI
+  vus __builtin_vec_perm (vus, vus, vuc);
+    VPERM_8HI_UNS VPERM_8HI_VUS
+  vbs __builtin_vec_perm (vbs, vbs, vuc);
+    VPERM_8HI_UNS VPERM_8HI_VBS
+  vp __builtin_vec_perm (vp, vp, vuc);
+    VPERM_8HI_UNS VPERM_8HI_VP
+  vsi __builtin_vec_perm (vsi, vsi, vuc);
+    VPERM_4SI
+  vui __builtin_vec_perm (vui, vui, vuc);
+    VPERM_4SI_UNS VPERM_4SI_VUI
+  vbi __builtin_vec_perm (vbi, vbi, vuc);
+    VPERM_4SI_UNS VPERM_4SI_VBI
+  vsll __builtin_vec_perm (vsll, vsll, vuc);
+    VPERM_2DI
+  vull __builtin_vec_perm (vull, vull, vuc);
+    VPERM_2DI_UNS VPERM_2DI_VULL
+  vbll __builtin_vec_perm (vbll, vbll, vuc);
+    VPERM_2DI_UNS VPERM_2DI_VBLL
+  vf __builtin_vec_perm (vf, vf, vuc);
+    VPERM_4SF
+  vd __builtin_vec_perm (vd, vd, vuc);
+    VPERM_2DF
+  vsq __builtin_vec_perm (vsq, vsq, vuc);
+    VPERM_1TI
+  vuq __builtin_vec_perm (vuq, vuq, vuc);
+    VPERM_1TI_UNS
+; The following variants are deprecated.
+  vsc __builtin_vec_perm (vsc, vuc, vuc);
+    VPERM_16QI  VPERM_VSC_VUC_VUC
+  vbc __builtin_vec_perm (vbc, vbc, vbc);
+    VPERM_16QI  VPERM_VBC_VBC_VBC
+
+[VEC_PERMX, vec_permx, __builtin_vec_xxpermx, _ARCH_PWR10]
+  vsc __builtin_vec_xxpermx (vsc, vsc, vuc, const int);
+    XXPERMX_UV2DI  XXPERMX_VSC
+  vuc __builtin_vec_xxpermx (vuc, vuc, vuc, const int);
+    XXPERMX_UV2DI  XXPERMX_VUC
+  vss __builtin_vec_xxpermx (vss, vss, vuc, const int);
+    XXPERMX_UV2DI  XXPERMX_VSS
+  vus __builtin_vec_xxpermx (vus, vus, vuc, const int);
+    XXPERMX_UV2DI  XXPERMX_VUS
+  vsi __builtin_vec_xxpermx (vsi, vsi, vuc, const int);
+    XXPERMX_UV2DI  XXPERMX_VSI
+  vui __builtin_vec_xxpermx (vui, vui, vuc, const int);
+    XXPERMX_UV2DI  XXPERMX_VUI
+  vsll __builtin_vec_xxpermx (vsll, vsll, vuc, const int);
+    XXPERMX_UV2DI  XXPERMX_VSLL
+  vull __builtin_vec_xxpermx (vull, vull, vuc, const int);
+    XXPERMX_UV2DI  XXPERMX_VULL
+  vf __builtin_vec_xxpermx (vf, vf, vuc, const int);
+    XXPERMX_UV2DI  XXPERMX_VF
+  vd __builtin_vec_xxpermx (vd, vd, vuc, const int);
+    XXPERMX_UV2DI  XXPERMX_VD
+
+[VEC_PERMXOR, vec_permxor, __builtin_vec_vpermxor]
+  vsc __builtin_vec_vpermxor (vsc, vsc, vsc);
+    VPERMXOR  VPERMXOR_VSC
+  vuc __builtin_vec_vpermxor (vuc, vuc, vuc);
+    VPERMXOR  VPERMXOR_VUC
+  vbc __builtin_vec_vpermxor (vbc, vbc, vbc);
+    VPERMXOR  VPERMXOR_VBC
+
+[VEC_PEXT, vec_pext, __builtin_vec_vpextd, _ARCH_PWR10]
+  vull __builtin_vec_vpextd (vull, vull);
+    VPEXTD
+
+[VEC_PMSUM, vec_pmsum_be, __builtin_vec_vpmsum]
+  vus __builtin_vec_vpmsum (vuc, vuc);
+    VPMSUMB  VPMSUMB_V
+  vui __builtin_vec_vpmsum (vus, vus);
+    VPMSUMH  VPMSUMH_V
+  vull __builtin_vec_vpmsum (vui, vui);
+    VPMSUMW  VPMSUMW_V
+  vuq __builtin_vec_vpmsum (vull, vull);
+    VPMSUMD  VPMSUMD_V
+
+[VEC_POPCNT, vec_popcnt, __builtin_vec_vpopcntu, _ARCH_PWR8]
+  vuc __builtin_vec_vpopcntu (vsc);
+    VPOPCNTB
+  vuc __builtin_vec_vpopcntu (vuc);
+    VPOPCNTUB
+  vus __builtin_vec_vpopcntu (vss);
+    VPOPCNTH
+  vus __builtin_vec_vpopcntu (vus);
+    VPOPCNTUH
+  vui __builtin_vec_vpopcntu (vsi);
+    VPOPCNTW
+  vui __builtin_vec_vpopcntu (vui);
+    VPOPCNTUW
+  vull __builtin_vec_vpopcntu (vsll);
+    VPOPCNTD
+  vull __builtin_vec_vpopcntu (vull);
+    VPOPCNTUD
+
+[VEC_PARITY_LSBB, vec_parity_lsbb, __builtin_vec_vparity_lsbb, _ARCH_PWR9]
+  vui __builtin_vec_vparity_lsbb (vsi);
+    VPRTYBW  VPRTYBW_S
+  vui __builtin_vec_vparity_lsbb (vui);
+    VPRTYBW  VPRTYBW_U
+  vull __builtin_vec_vparity_lsbb (vsll);
+    VPRTYBD  VPRTYBD_S
+  vull __builtin_vec_vparity_lsbb (vull);
+    VPRTYBD  VPRTYBD_U
+  vuq __builtin_vec_vparity_lsbb (vsq);
+    VPRTYBQ  VPRTYBQ_S
+  vuq __builtin_vec_vparity_lsbb (vuq);
+    VPRTYBQ  VPRTYBQ_U
+
+; There are no actual builtins for vec_promote.  There is special handling for
+; this in altivec_resolve_overloaded_builtin in rs6000-c.c, where the call
+; is replaced by a constructor.  The single overload here causes
+; __builtin_vec_promote to be registered with the front end so that can
+; happen.  (A sketch of the rewrite follows this stanza.)
+[VEC_PROMOTE, vec_promote, __builtin_vec_promote]
+  vsi __builtin_vec_promote (vsi);
+    ABS_V4SI PROMOTE_FAKERY
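+; A minimal sketch of the effect, for illustration only: per the PVIPR,
+; vec_promote places the scalar at the (masked) element position and
+; leaves the other elements undefined, so
+;   vector signed int v = vec_promote (42, 1);
+; is rewritten into roughly
+;   vector signed int v;  v[1 & 3] = 42;   /* other elements undefined */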
+
+; Opportunity for improvement: We can use XVRESP instead of VREFP for
+; TARGET_VSX.  This would require conditional dispatch between the two
+; possibilities, using some syntax like "XVRESP{TARGET_VSX};VREFP"; a
+; sketch follows the stanza below.
+; TODO. ####
+[VEC_RE, vec_re, __builtin_vec_re]
+  vf __builtin_vec_re (vf);
+    VREFP
+  vd __builtin_vec_re (vd);
+    XVREDP
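+; A minimal sketch of the proposed conditional-dispatch notation (assuming
+; the "{condition};fallback" syntax above were implemented), for
+; illustration only; the vf overload would then read:
+;   vf __builtin_vec_re (vf);
+;     XVRESP{TARGET_VSX};VREFP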
+
+[VEC_RECIP, vec_recipdiv, __builtin_vec_recipdiv]
+  vf __builtin_vec_recipdiv (vf, vf);
+    RECIP_V4SF
+  vd __builtin_vec_recipdiv (vd, vd);
+    RECIP_V2DF
+
+[VEC_REPLACE_ELT, vec_replace_elt, __builtin_vec_replace_elt, _ARCH_PWR10]
+  vui __builtin_vec_replace_elt (vui, unsigned int, const int);
+    VREPLACE_ELT_UV4SI
+  vsi __builtin_vec_replace_elt (vsi, signed int, const int);
+    VREPLACE_ELT_V4SI
+  vull __builtin_vec_replace_elt (vull, unsigned long long, const int);
+    VREPLACE_ELT_UV2DI
+  vsll __builtin_vec_replace_elt (vsll, signed long long, const int);
+    VREPLACE_ELT_V2DI
+  vf __builtin_vec_replace_elt (vf, float, const int);
+    VREPLACE_ELT_V4SF
+  vd __builtin_vec_replace_elt (vd, double, const int);
+    VREPLACE_ELT_V2DF
+
+[VEC_REPLACE_UN, vec_replace_unaligned, __builtin_vec_replace_un, _ARCH_PWR10]
+  vui __builtin_vec_replace_un (vui, unsigned int, const int);
+    VREPLACE_UN_UV4SI
+  vsi __builtin_vec_replace_un (vsi, signed int, const int);
+    VREPLACE_UN_V4SI
+  vull __builtin_vec_replace_un (vull, unsigned long long, const int);
+    VREPLACE_UN_UV2DI
+  vsll __builtin_vec_replace_un (vsll, signed long long, const int);
+    VREPLACE_UN_V2DI
+  vf __builtin_vec_replace_un (vf, float, const int);
+    VREPLACE_UN_V4SF
+  vd __builtin_vec_replace_un (vd, double, const int);
+    VREPLACE_UN_V2DF
+
+[VEC_REVB, vec_revb, __builtin_vec_revb, _ARCH_PWR8]
+  vss __builtin_vec_revb (vss);
+    REVB_V8HI  REVB_VSS
+  vus __builtin_vec_revb (vus);
+    REVB_V8HI  REVB_VUS
+  vsi __builtin_vec_revb (vsi);
+    REVB_V4SI  REVB_VSI
+  vui __builtin_vec_revb (vui);
+    REVB_V4SI  REVB_VUI
+  vsll __builtin_vec_revb (vsll);
+    REVB_V2DI  REVB_VSLL
+  vull __builtin_vec_revb (vull);
+    REVB_V2DI  REVB_VULL
+  vsq __builtin_vec_revb (vsq);
+    REVB_V1TI  REVB_VSQ
+  vuq __builtin_vec_revb (vuq);
+    REVB_V1TI  REVB_VUQ
+  vf __builtin_vec_revb (vf);
+    REVB_V4SF
+  vd __builtin_vec_revb (vd);
+    REVB_V2DF
+; The following variants are deprecated.
+  vsc __builtin_vec_revb (vsc);
+    REVB_V16QI  REVB_VSC
+  vuc __builtin_vec_revb (vuc);
+    REVB_V16QI  REVB_VUC
+  vbc __builtin_vec_revb (vbc);
+    REVB_V16QI  REVB_VBC
+  vbs __builtin_vec_revb (vbs);
+    REVB_V8HI  REVB_VBS
+  vbi __builtin_vec_revb (vbi);
+    REVB_V4SI  REVB_VBI
+  vbll __builtin_vec_revb (vbll);
+    REVB_V2DI  REVB_VBLL
+
+[VEC_REVE, vec_reve, __builtin_vec_vreve]
+  vsc __builtin_vec_vreve (vsc);
+    VREVE_V16QI  VREVE_VSC
+  vuc __builtin_vec_vreve (vuc);
+    VREVE_V16QI  VREVE_VUC
+  vbc __builtin_vec_vreve (vbc);
+    VREVE_V16QI  VREVE_VBC
+  vss __builtin_vec_vreve (vss);
+    VREVE_V8HI  VREVE_VSS
+  vus __builtin_vec_vreve (vus);
+    VREVE_V8HI  VREVE_VUS
+  vbs __builtin_vec_vreve (vbs);
+    VREVE_V8HI  VREVE_VBS
+  vsi __builtin_vec_vreve (vsi);
+    VREVE_V4SI  VREVE_VSI
+  vui __builtin_vec_vreve (vui);
+    VREVE_V4SI  VREVE_VUI
+  vbi __builtin_vec_vreve (vbi);
+    VREVE_V4SI  VREVE_VBI
+  vsll __builtin_vec_vreve (vsll);
+    VREVE_V2DI  VREVE_VSLL
+  vull __builtin_vec_vreve (vull);
+    VREVE_V2DI  VREVE_VULL
+  vbll __builtin_vec_vreve (vbll);
+    VREVE_V2DI  VREVE_VBLL
+  vf __builtin_vec_vreve (vf);
+    VREVE_V4SF
+  vd __builtin_vec_vreve (vd);
+    VREVE_V2DF
+
+[VEC_RINT, vec_rint, __builtin_vec_rint, __VSX__]
+  vf __builtin_vec_rint (vf);
+    XVRSPIC
+  vd __builtin_vec_rint (vd);
+    XVRDPIC
+
+[VEC_RL, vec_rl, __builtin_vec_rl]
+  vsc __builtin_vec_rl (vsc, vuc);
+    VRLB  VRLB_VSC
+  vuc __builtin_vec_rl (vuc, vuc);
+    VRLB  VRLB_VUC
+  vss __builtin_vec_rl (vss, vus);
+    VRLH  VRLH_VSS
+  vus __builtin_vec_rl (vus, vus);
+    VRLH  VRLH_VUS
+  vsi __builtin_vec_rl (vsi, vui);
+    VRLW  VRLW_VSI
+  vui __builtin_vec_rl (vui, vui);
+    VRLW  VRLW_VUI
+  vsll __builtin_vec_rl (vsll, vull);
+    VRLD  VRLD_VSLL
+  vull __builtin_vec_rl (vull, vull);
+    VRLD  VRLD_VULL
+  vsq __builtin_vec_rl (vsq, vuq);
+    VRLQ  VRLQ_VSQ
+  vuq __builtin_vec_rl (vuq, vuq);
+    VRLQ  VRLQ_VUQ
+
+[VEC_RLMI, vec_rlmi, __builtin_vec_rlmi, _ARCH_PWR9]
+  vui __builtin_vec_rlmi (vui, vui, vui);
+    VRLWMI
+  vull __builtin_vec_rlmi (vull, vull, vull);
+    VRLDMI
+  vsq __builtin_vec_rlmi (vsq, vsq, vuq);
+    VRLQMI  VRLQMI_VSQ
+  vuq __builtin_vec_rlmi (vuq, vuq, vuq);
+    VRLQMI  VRLQMI_VUQ
+
+[VEC_RLNM, vec_vrlnm, __builtin_vec_rlnm, _ARCH_PWR9]
+  vui __builtin_vec_rlnm (vui, vui);
+    VRLWNM
+  vull __builtin_vec_rlnm (vull, vull);
+    VRLDNM
+  vsq __builtin_vec_rlnm (vsq, vuq);
+    VRLQNM  VRLQNM_VSQ
+  vuq __builtin_vec_rlnm (vuq, vuq);
+    VRLQNM  VRLQNM_VUQ
+
+; #### XVRSPI{TARGET_VSX};VRFIN
+[VEC_ROUND, vec_round, __builtin_vec_round]
+  vf __builtin_vec_round (vf);
+    VRFIN
+  vd __builtin_vec_round (vd);
+    XVRDPI
+
+[VEC_RSQRT, vec_rsqrt, __builtin_vec_rsqrt]
+  vf __builtin_vec_rsqrt (vf);
+    RSQRT_4SF
+  vd __builtin_vec_rsqrt (vd);
+    RSQRT_2DF
+
+; #### XVRSQRTESP{TARGET_VSX};VRSQRTEFP
+[VEC_RSQRTE, vec_rsqrte, __builtin_vec_rsqrte]
+  vf __builtin_vec_rsqrte (vf);
+    VRSQRTEFP
+  vd __builtin_vec_rsqrte (vd);
+    XVRSQRTEDP
+
+[VEC_SBOX_BE, vec_sbox_be, __builtin_vec_sbox_be, _ARCH_PWR8]
+  vuc __builtin_vec_sbox_be (vuc);
+    VSBOX_BE
+
+[VEC_SEL, vec_sel, __builtin_vec_sel]
+  vsc __builtin_vec_sel (vsc, vsc, vbc);
+    VSEL_16QI  VSEL_16QI_B
+  vsc __builtin_vec_sel (vsc, vsc, vuc);
+    VSEL_16QI  VSEL_16QI_U
+  vuc __builtin_vec_sel (vuc, vuc, vbc);
+    VSEL_16QI_UNS  VSEL_16QI_UB
+  vuc __builtin_vec_sel (vuc, vuc, vuc);
+    VSEL_16QI_UNS  VSEL_16QI_UU
+  vbc __builtin_vec_sel (vbc, vbc, vbc);
+    VSEL_16QI_UNS  VSEL_16QI_BB
+  vbc __builtin_vec_sel (vbc, vbc, vuc);
+    VSEL_16QI_UNS  VSEL_16QI_BU
+  vss __builtin_vec_sel (vss, vss, vbs);
+    VSEL_8HI  VSEL_8HI_B
+  vss __builtin_vec_sel (vss, vss, vus);
+    VSEL_8HI  VSEL_8HI_U
+  vus __builtin_vec_sel (vus, vus, vbs);
+    VSEL_8HI_UNS  VSEL_8HI_UB
+  vus __builtin_vec_sel (vus, vus, vus);
+    VSEL_8HI_UNS  VSEL_8HI_UU
+  vbs __builtin_vec_sel (vbs, vbs, vbs);
+    VSEL_8HI_UNS  VSEL_8HI_BB
+  vbs __builtin_vec_sel (vbs, vbs, vus);
+    VSEL_8HI_UNS  VSEL_8HI_BU
+  vsi __builtin_vec_sel (vsi, vsi, vbi);
+    VSEL_4SI  VSEL_4SI_B
+  vsi __builtin_vec_sel (vsi, vsi, vui);
+    VSEL_4SI  VSEL_4SI_U
+  vui __builtin_vec_sel (vui, vui, vbi);
+    VSEL_4SI_UNS  VSEL_4SI_UB
+  vui __builtin_vec_sel (vui, vui, vui);
+    VSEL_4SI_UNS  VSEL_4SI_UU
+  vbi __builtin_vec_sel (vbi, vbi, vbi);
+    VSEL_4SI_UNS  VSEL_4SI_BB
+  vbi __builtin_vec_sel (vbi, vbi, vui);
+    VSEL_4SI_UNS  VSEL_4SI_BU
+  vsll __builtin_vec_sel (vsll, vsll, vbll);
+    VSEL_2DI_B  VSEL_2DI_B
+  vsll __builtin_vec_sel (vsll, vsll, vull);
+    VSEL_2DI_B  VSEL_2DI_U
+  vull __builtin_vec_sel (vull, vull, vbll);
+    VSEL_2DI_UNS  VSEL_2DI_UB
+  vull __builtin_vec_sel (vull, vull, vull);
+    VSEL_2DI_UNS  VSEL_2DI_UU
+  vbll __builtin_vec_sel (vbll, vbll, vbll);
+    VSEL_2DI_UNS  VSEL_2DI_BB
+  vbll __builtin_vec_sel (vbll, vbll, vull);
+    VSEL_2DI_UNS  VSEL_2DI_BU
+  vf __builtin_vec_sel (vf, vf, vbi);
+    VSEL_4SF  VSEL_4SF_B
+  vf __builtin_vec_sel (vf, vf, vui);
+    VSEL_4SF  VSEL_4SF_U
+  vd __builtin_vec_sel (vd, vd, vbll);
+    VSEL_2DF  VSEL_2DF_B
+  vd __builtin_vec_sel (vd, vd, vull);
+    VSEL_2DF  VSEL_2DF_U
+; The following variants are deprecated.
+  vsll __builtin_vec_sel (vsll, vsll, vsll);
+    VSEL_2DI_B  VSEL_2DI_S
+  vull __builtin_vec_sel (vull, vull, vsll);
+    VSEL_2DI_UNS  VSEL_2DI_US
+  vf __builtin_vec_sel (vf, vf, vf);
+    VSEL_4SF  VSEL_4SF_F
+  vf __builtin_vec_sel (vf, vf, vsi);
+    VSEL_4SF  VSEL_4SF_S
+  vd __builtin_vec_sel (vd, vd, vsll);
+    VSEL_2DF  VSEL_2DF_S
+  vd __builtin_vec_sel (vd, vd, vd);
+    VSEL_2DF  VSEL_2DF_D
+
+[VEC_SHASIGMA_BE, vec_shasigma_be, __builtin_crypto_vshasigma]
+  vui __builtin_crypto_vshasigma (vui, const int, const int);
+    VSHASIGMAW
+  vull __builtin_crypto_vshasigma (vull, const int, const int);
+    VSHASIGMAD
+
+[VEC_SIGNED, vec_signed, __builtin_vec_vsigned]
+  vsi __builtin_vec_vsigned (vf);
+    VEC_VSIGNED_V4SF
+  vsll __builtin_vec_vsigned (vd);
+    VEC_VSIGNED_V2DF
+
+[VEC_SIGNED2, vec_signed2, __builtin_vec_vsigned2]
+  vsi __builtin_vec_vsigned2 (vd, vd);
+    VEC_VSIGNED2_V2DF
+
+[VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
+  vsi __builtin_vec_vsignede (vd);
+    VEC_VSIGNEDE_V2DF
+
+[VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
+  vsi __builtin_vec_vsignedo (vd);
+    VEC_VSIGNEDO_V2DF
+
+[VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti, _ARCH_PWR9]
+  vsi __builtin_vec_signexti (vsc);
+    VSIGNEXTSB2W
+  vsi __builtin_vec_signexti (vss);
+    VSIGNEXTSH2W
+
+[VEC_SIGNEXTLL, vec_signextll, __builtin_vec_signextll, _ARCH_PWR9]
+  vsll __builtin_vec_signextll (vsc);
+    VSIGNEXTSB2D
+  vsll __builtin_vec_signextll (vss);
+    VSIGNEXTSH2D
+  vsll __builtin_vec_signextll (vsi);
+    VSIGNEXTSW2D
+
+[VEC_SIGNEXTQ, vec_signextq, __builtin_vec_signextq, _ARCH_PWR10]
+  vsq __builtin_vec_signextq (vsll);
+    VSIGNEXTSD2Q
+
+[VEC_SL, vec_sl, __builtin_vec_sl]
+  vsc __builtin_vec_sl (vsc, vuc);
+    VSLB  VSLB_VSC
+  vuc __builtin_vec_sl (vuc, vuc);
+    VSLB  VSLB_VUC
+  vss __builtin_vec_sl (vss, vus);
+    VSLH  VSLH_VSS
+  vus __builtin_vec_sl (vus, vus);
+    VSLH  VSLH_VUS
+  vsi __builtin_vec_sl (vsi, vui);
+    VSLW  VSLW_VSI
+  vui __builtin_vec_sl (vui, vui);
+    VSLW  VSLW_VUI
+  vsll __builtin_vec_sl (vsll, vull);
+    VSLD  VSLD_VSLL
+  vull __builtin_vec_sl (vull, vull);
+    VSLD  VSLD_VULL
+  vsq __builtin_vec_sl (vsq, vuq);
+    VSLQ  VSLQ_VSQ
+  vuq __builtin_vec_sl (vuq, vuq);
+    VSLQ  VSLQ_VUQ
+
+[VEC_SLD, vec_sld, __builtin_vec_sld]
+  vsc __builtin_vec_sld (vsc, vsc, const int);
+    VSLDOI_16QI  VSLDOI_VSC
+  vbc __builtin_vec_sld (vbc, vbc, const int);
+    VSLDOI_16QI  VSLDOI_VBC
+  vuc __builtin_vec_sld (vuc, vuc, const int);
+    VSLDOI_16QI  VSLDOI_VUC
+  vss __builtin_vec_sld (vss, vss, const int);
+    VSLDOI_8HI  VSLDOI_VSS
+  vbs __builtin_vec_sld (vbs, vbs, const int);
+    VSLDOI_8HI  VSLDOI_VBS
+  vus __builtin_vec_sld (vus, vus, const int);
+    VSLDOI_8HI  VSLDOI_VUS
+  vp __builtin_vec_sld (vp, vp, const int);
+    VSLDOI_8HI  VSLDOI_VP
+  vsi __builtin_vec_sld (vsi, vsi, const int);
+    VSLDOI_4SI  VSLDOI_VSI
+  vbi __builtin_vec_sld (vbi, vbi, const int);
+    VSLDOI_4SI  VSLDOI_VBI
+  vui __builtin_vec_sld (vui, vui, const int);
+    VSLDOI_4SI  VSLDOI_VUI
+  vsll __builtin_vec_sld (vsll, vsll, const int);
+    VSLDOI_2DI  VSLDOI_VSLL
+  vbll __builtin_vec_sld (vbll, vbll, const int);
+    VSLDOI_2DI  VSLDOI_VBLL
+  vull __builtin_vec_sld (vull, vull, const int);
+    VSLDOI_2DI  VSLDOI_VULL
+  vf __builtin_vec_sld (vf, vf, const int);
+    VSLDOI_4SF
+  vd __builtin_vec_sld (vd, vd, const int);
+    VSLDOI_2DF
+
+[VEC_SLDB, vec_sldb, __builtin_vec_sldb, _ARCH_PWR10]
+  vsc __builtin_vec_sldb (vsc, vsc, const int);
+    VSLDB_V16QI  VSLDB_VSC
+  vuc __builtin_vec_sldb (vuc, vuc, const int);
+    VSLDB_V16QI  VSLDB_VUC
+  vss __builtin_vec_sldb (vss, vss, const int);
+    VSLDB_V8HI  VSLDB_VSS
+  vus __builtin_vec_sldb (vus, vus, const int);
+    VSLDB_V8HI  VSLDB_VUS
+  vsi __builtin_vec_sldb (vsi, vsi, const int);
+    VSLDB_V4SI  VSLDB_VSI
+  vui __builtin_vec_sldb (vui, vui, const int);
+    VSLDB_V4SI  VSLDB_VUI
+  vsll __builtin_vec_sldb (vsll, vsll, const int);
+    VSLDB_V2DI  VSLDB_VSLL
+  vull __builtin_vec_sldb (vull, vull, const int);
+    VSLDB_V2DI  VSLDB_VULL
+
+[VEC_SLDW, vec_sldw, __builtin_vec_sldw]
+  vsc __builtin_vec_sldw (vsc, vsc, const int);
+    XXSLDWI_16QI  XXSLDWI_VSC
+  vuc __builtin_vec_sldw (vuc, vuc, const int);
+    XXSLDWI_16QI  XXSLDWI_VUC
+  vss __builtin_vec_sldw (vss, vss, const int);
+    XXSLDWI_8HI  XXSLDWI_VSS
+  vus __builtin_vec_sldw (vus, vus, const int);
+    XXSLDWI_8HI  XXSLDWI_VUS
+  vsi __builtin_vec_sldw (vsi, vsi, const int);
+    XXSLDWI_4SI  XXSLDWI_VSI
+  vui __builtin_vec_sldw (vui, vui, const int);
+    XXSLDWI_4SI  XXSLDWI_VUI
+  vsll __builtin_vec_sldw (vsll, vsll, const int);
+    XXSLDWI_2DI  XXSLDWI_VSLL
+  vull __builtin_vec_sldw (vull, vull, const int);
+    XXSLDWI_2DI  XXSLDWI_VULL
+
+[VEC_SLL, vec_sll, __builtin_vec_sll]
+  vsc __builtin_vec_sll (vsc, vuc);
+    VSL  VSL_VSC
+  vuc __builtin_vec_sll (vuc, vuc);
+    VSL  VSL_VUC
+  vss __builtin_vec_sll (vss, vuc);
+    VSL  VSL_VSS
+  vus __builtin_vec_sll (vus, vuc);
+    VSL  VSL_VUS
+  vp __builtin_vec_sll (vp, vuc);
+    VSL  VSL_VP
+  vsi __builtin_vec_sll (vsi, vuc);
+    VSL  VSL_VSI
+  vui __builtin_vec_sll (vui, vuc);
+    VSL  VSL_VUI
+  vsll __builtin_vec_sll (vsll, vuc);
+    VSL  VSL_VSLL
+  vull __builtin_vec_sll (vull, vuc);
+    VSL  VSL_VULL
+; The following variants are deprecated.
+  vsc __builtin_vec_sll (vsc, vus);
+    VSL  VSL_VSC_VUS
+  vsc __builtin_vec_sll (vsc, vui);
+    VSL  VSL_VSC_VUI
+  vuc __builtin_vec_sll (vuc, vus);
+    VSL  VSL_VUC_VUS
+  vuc __builtin_vec_sll (vuc, vui);
+    VSL  VSL_VUC_VUI
+  vbc __builtin_vec_sll (vbc, vuc);
+    VSL  VSL_VBC_VUC
+  vbc __builtin_vec_sll (vbc, vus);
+    VSL  VSL_VBC_VUS
+  vbc __builtin_vec_sll (vbc, vui);
+    VSL  VSL_VBC_VUI
+  vss __builtin_vec_sll (vss, vus);
+    VSL  VSL_VSS_VUS
+  vss __builtin_vec_sll (vss, vui);
+    VSL  VSL_VSS_VUI
+  vus __builtin_vec_sll (vus, vus);
+    VSL  VSL_VUS_VUS
+  vus __builtin_vec_sll (vus, vui);
+    VSL  VSL_VUS_VUI
+  vbs __builtin_vec_sll (vbs, vuc);
+    VSL  VSL_VBS_VUC
+  vbs __builtin_vec_sll (vbs, vus);
+    VSL  VSL_VBS_VUS
+  vbs __builtin_vec_sll (vbs, vui);
+    VSL  VSL_VBS_VUI
+  vp __builtin_vec_sll (vp, vus);
+    VSL  VSL_VP_VUS
+  vp __builtin_vec_sll (vp, vui);
+    VSL  VSL_VP_VUI
+  vsi __builtin_vec_sll (vsi, vus);
+    VSL  VSL_VSI_VUS
+  vsi __builtin_vec_sll (vsi, vui);
+    VSL  VSL_VSI_VUI
+  vui __builtin_vec_sll (vui, vus);
+    VSL  VSL_VUI_VUS
+  vui __builtin_vec_sll (vui, vui);
+    VSL  VSL_VUI_VUI
+  vbi __builtin_vec_sll (vbi, vuc);
+    VSL  VSL_VBI_VUC
+  vbi __builtin_vec_sll (vbi, vus);
+    VSL  VSL_VBI_VUS
+  vbi __builtin_vec_sll (vbi, vui);
+    VSL  VSL_VBI_VUI
+  vbll __builtin_vec_sll (vbll, vuc);
+    VSL  VSL_VBLL_VUC
+  vbll __builtin_vec_sll (vbll, vus);
+    VSL  VSL_VBLL_VUS
+  vbll __builtin_vec_sll (vbll, vull);
+    VSL  VSL_VBLL_VULL
+
+[VEC_SLO, vec_slo, __builtin_vec_slo]
+  vsc __builtin_vec_slo (vsc, vsc);
+    VSLO  VSLO_VSCS
+  vsc __builtin_vec_slo (vsc, vuc);
+    VSLO  VSLO_VSCU
+  vuc __builtin_vec_slo (vuc, vsc);
+    VSLO  VSLO_VUCS
+  vuc __builtin_vec_slo (vuc, vuc);
+    VSLO  VSLO_VUCU
+  vss __builtin_vec_slo (vss, vsc);
+    VSLO  VSLO_VSSS
+  vss __builtin_vec_slo (vss, vuc);
+    VSLO  VSLO_VSSU
+  vus __builtin_vec_slo (vus, vsc);
+    VSLO  VSLO_VUSS
+  vus __builtin_vec_slo (vus, vuc);
+    VSLO  VSLO_VUSU
+  vp __builtin_vec_slo (vp, vsc);
+    VSLO  VSLO_VPS
+  vp __builtin_vec_slo (vp, vuc);
+    VSLO  VSLO_VPU
+  vsi __builtin_vec_slo (vsi, vsc);
+    VSLO  VSLO_VSIS
+  vsi __builtin_vec_slo (vsi, vuc);
+    VSLO  VSLO_VSIU
+  vui __builtin_vec_slo (vui, vsc);
+    VSLO  VSLO_VUIS
+  vui __builtin_vec_slo (vui, vuc);
+    VSLO  VSLO_VUIU
+  vsll __builtin_vec_slo (vsll, vsc);
+    VSLO  VSLO_VSLLS
+  vsll __builtin_vec_slo (vsll, vuc);
+    VSLO  VSLO_VSLLU
+  vull __builtin_vec_slo (vull, vsc);
+    VSLO  VSLO_VULLS
+  vull __builtin_vec_slo (vull, vuc);
+    VSLO  VSLO_VULLU
+  vf __builtin_vec_slo (vf, vsc);
+    VSLO  VSLO_VFS
+  vf __builtin_vec_slo (vf, vuc);
+    VSLO  VSLO_VFU
+
+[VEC_SLV, vec_slv, __builtin_vec_vslv, _ARCH_PWR9]
+  vuc __builtin_vec_vslv (vuc, vuc);
+    VSLV
+
+[VEC_SPLAT, vec_splat, __builtin_vec_splat]
+  vsc __builtin_vec_splat (vsc, signed int);
+    VSPLTB  VSPLTB_VSC
+  vuc __builtin_vec_splat (vuc, signed int);
+    VSPLTB  VSPLTB_VUC
+  vbc __builtin_vec_splat (vbc, signed int);
+    VSPLTB  VSPLTB_VBC
+  vss __builtin_vec_splat (vss, signed int);
+    VSPLTH  VSPLTH_VSS
+  vus __builtin_vec_splat (vus, signed int);
+    VSPLTH  VSPLTH_VUS
+  vbs __builtin_vec_splat (vbs, signed int);
+    VSPLTH  VSPLTH_VBS
+  vp __builtin_vec_splat (vp, signed int);
+    VSPLTH  VSPLTH_VP
+  vf __builtin_vec_splat (vf, signed int);
+    VSPLTW  VSPLTW_VF
+  vsi __builtin_vec_splat (vsi, signed int);
+    VSPLTW  VSPLTW_VSI
+  vui __builtin_vec_splat (vui, signed int);
+    VSPLTW  VSPLTW_VUI
+  vbi __builtin_vec_splat (vbi, signed int);
+    VSPLTW  VSPLTW_VBI
+  vd __builtin_vec_splat (vd, signed int);
+    XXSPLTD_V2DF
+  vsll __builtin_vec_splat (vsll, signed int);
+    XXSPLTD_V2DI  XXSPLTD_VSLL
+  vull __builtin_vec_splat (vull, signed int);
+    XXSPLTD_V2DI  XXSPLTD_VULL
+  vbll __builtin_vec_splat (vbll, signed int);
+    XXSPLTD_V2DI  XXSPLTD_VBLL
+
+[VEC_SPLAT_S8, vec_splat_s8, __builtin_vec_splat_s8]
+  vsc __builtin_vec_splat_s8 (signed int);
+    VSPLTISB
+
+[VEC_SPLAT_S16, vec_splat_s16, __builtin_vec_splat_s16]
+  vss __builtin_vec_splat_s16 (signed int);
+    VSPLTISH
+
+[VEC_SPLAT_S32, vec_splat_s32, __builtin_vec_splat_s32]
+  vsi __builtin_vec_splat_s32 (signed int);
+    VSPLTISW
+
+; There are no entries for vec_splat_u{8,16,32}.  These are handled
+; in altivec.h with a #define and a cast.
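+; For illustration only (not part of this file's grammar), the altivec.h
+; handling is along these lines:
+;   #define vec_splat_u8(x)  ((__vector unsigned char)  vec_splat_s8 ((x)))
+;   #define vec_splat_u16(x) ((__vector unsigned short) vec_splat_s16 ((x)))
+;   #define vec_splat_u32(x) ((__vector unsigned int)   vec_splat_s32 ((x)))
+; so each unsigned splat-immediate reuses the corresponding signed
+; builtin registered above.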
+
+[VEC_SPLATI, vec_splati, __builtin_vec_xxspltiw, _ARCH_PWR10]
+  vsi __builtin_vec_xxspltiw (signed int);
+    VXXSPLTIW_V4SI
+  vf __builtin_vec_xxspltiw (float);
+    VXXSPLTIW_V4SF
+
+[VEC_SPLATID, vec_splatid, __builtin_vec_xxspltid, _ARCH_PWR10]
+  vd __builtin_vec_xxspltid (float);
+    VXXSPLTIDP
+
+[VEC_SPLATI_INS, vec_splati_ins, __builtin_vec_xxsplti32dx, _ARCH_PWR10]
+  vsi __builtin_vec_xxsplti32dx (vsi, const int, signed int);
+    VXXSPLTI32DX_V4SI  VXXSPLTI32DX_VSI
+  vui __builtin_vec_xxsplti32dx (vui, const int, unsigned int);
+    VXXSPLTI32DX_V4SI  VXXSPLTI32DX_VUI
+  vf __builtin_vec_xxsplti32dx (vf, const int, float);
+    VXXSPLTI32DX_V4SF
+
+; There are no actual builtins for vec_splats.  There is special handling
+; for this in altivec_resolve_overloaded_builtin in rs6000-c.c, where the
+; call is replaced by a vector constructor (see the example following the
+; stanza).  The single overload here causes __builtin_vec_splats to be
+; registered with the front end so that this replacement can happen.
+[VEC_SPLATS, vec_splats, __builtin_vec_splats]
+  vsi __builtin_vec_splats (vsi);
+    ABS_V4SI  SPLATS_FAKERY
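+; For illustration, a call such as
+;   vector signed int v = vec_splats (42);
+; is resolved during parsing into the equivalent of the constructor
+;   vector signed int v = (vector signed int) { 42, 42, 42, 42 };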
+
+[VEC_SQRT, vec_sqrt, __builtin_vec_sqrt, __VSX__]
+  vf __builtin_vec_sqrt (vf);
+    XVSQRTSP
+  vd __builtin_vec_sqrt (vd);
+    XVSQRTDP
+
+[VEC_SR, vec_sr, __builtin_vec_sr]
+  vsc __builtin_vec_sr (vsc, vuc);
+    VSRB  VSRB_VSC
+  vuc __builtin_vec_sr (vuc, vuc);
+    VSRB  VSRB_VUC
+  vss __builtin_vec_sr (vss, vus);
+    VSRH  VSRH_VSS
+  vus __builtin_vec_sr (vus, vus);
+    VSRH  VSRH_VUS
+  vsi __builtin_vec_sr (vsi, vui);
+    VSRW  VSRW_VSI
+  vui __builtin_vec_sr (vui, vui);
+    VSRW  VSRW_VUI
+  vsll __builtin_vec_sr (vsll, vull);
+    VSRD  VSRD_VSLL
+  vull __builtin_vec_sr (vull, vull);
+    VSRD  VSRD_VULL
+  vsq __builtin_vec_sr (vsq, vuq);
+    VSRQ  VSRQ_VSQ
+  vuq __builtin_vec_sr (vuq, vuq);
+    VSRQ  VSRQ_VUQ
+
+[VEC_SRA, vec_sra, __builtin_vec_sra]
+  vsc __builtin_vec_sra (vsc, vuc);
+    VSRAB  VSRAB_VSC
+  vuc __builtin_vec_sra (vuc, vuc);
+    VSRAB  VSRAB_VUC
+  vss __builtin_vec_sra (vss, vus);
+    VSRAH  VSRAH_VSS
+  vus __builtin_vec_sra (vus, vus);
+    VSRAH  VSRAH_VUS
+  vsi __builtin_vec_sra (vsi, vui);
+    VSRAW  VSRAW_VSI
+  vui __builtin_vec_sra (vui, vui);
+    VSRAW  VSRAW_VUI
+  vsll __builtin_vec_sra (vsll, vull);
+    VSRAD  VSRAD_VSLL
+  vull __builtin_vec_sra (vull, vull);
+    VSRAD  VSRAD_VULL
+  vsq __builtin_vec_sra (vsq, vuq);
+    VSRAQ  VSRAQ_VSQ
+  vuq __builtin_vec_sra (vuq, vuq);
+    VSRAQ  VSRAQ_VUQ
+
+[VEC_SRDB, vec_srdb, __builtin_vec_srdb, _ARCH_PWR10]
+  vsc __builtin_vec_srdb (vsc, vsc, const int);
+    VSRDB_V16QI  VSRDB_VSC
+  vuc __builtin_vec_srdb (vuc, vuc, const int);
+    VSRDB_V16QI  VSRDB_VUC
+  vss __builtin_vec_srdb (vss, vss, const int);
+    VSRDB_V8HI  VSRDB_VSS
+  vus __builtin_vec_srdb (vus, vus, const int);
+    VSRDB_V8HI  VSRDB_VUS
+  vsi __builtin_vec_srdb (vsi, vsi, const int);
+    VSRDB_V4SI  VSRDB_VSI
+  vui __builtin_vec_srdb (vui, vui, const int);
+    VSRDB_V4SI  VSRDB_VUI
+  vsll __builtin_vec_srdb (vsll, vsll, const int);
+    VSRDB_V2DI  VSRDB_VSLL
+  vull __builtin_vec_srdb (vull, vull, const int);
+    VSRDB_V2DI  VSRDB_VULL
+
+[VEC_SRL, vec_srl, __builtin_vec_srl]
+  vsc __builtin_vec_srl (vsc, vuc);
+    VSR  VSR_VSC
+  vuc __builtin_vec_srl (vuc, vuc);
+    VSR  VSR_VUC
+  vss __builtin_vec_srl (vss, vuc);
+    VSR  VSR_VSS
+  vus __builtin_vec_srl (vus, vuc);
+    VSR  VSR_VUS
+  vp __builtin_vec_srl (vp, vuc);
+    VSR  VSR_VP
+  vsi __builtin_vec_srl (vsi, vuc);
+    VSR  VSR_VSI
+  vui __builtin_vec_srl (vui, vuc);
+    VSR  VSR_VUI
+  vsll __builtin_vec_srl (vsll, vuc);
+    VSR  VSR_VSLL
+  vull __builtin_vec_srl (vull, vuc);
+    VSR  VSR_VULL
+; The following variants are deprecated.
+  vsc __builtin_vec_srl (vsc, vus);
+    VSR  VSR_VSC_VUS
+  vsc __builtin_vec_srl (vsc, vui);
+    VSR  VSR_VSC_VUI
+  vuc __builtin_vec_srl (vuc, vus);
+    VSR  VSR_VUC_VUS
+  vuc __builtin_vec_srl (vuc, vui);
+    VSR  VSR_VUC_VUI
+  vbc __builtin_vec_srl (vbc, vuc);
+    VSR  VSR_VBC_VUC
+  vbc __builtin_vec_srl (vbc, vus);
+    VSR  VSR_VBC_VUS
+  vbc __builtin_vec_srl (vbc, vui);
+    VSR  VSR_VBC_VUI
+  vss __builtin_vec_srl (vss, vus);
+    VSR  VSR_VSS_VUS
+  vss __builtin_vec_srl (vss, vui);
+    VSR  VSR_VSS_VUI
+  vus __builtin_vec_srl (vus, vus);
+    VSR  VSR_VUS_VUS
+  vus __builtin_vec_srl (vus, vui);
+    VSR  VSR_VUS_VUI
+  vbs __builtin_vec_srl (vbs, vuc);
+    VSR  VSR_VBS_VUC
+  vbs __builtin_vec_srl (vbs, vus);
+    VSR  VSR_VBS_VUS
+  vbs __builtin_vec_srl (vbs, vui);
+    VSR  VSR_VBS_VUI
+  vp __builtin_vec_srl (vp, vus);
+    VSR  VSR_VP_VUS
+  vp __builtin_vec_srl (vp, vui);
+    VSR  VSR_VP_VUI
+  vsi __builtin_vec_srl (vsi, vus);
+    VSR  VSR_VSI_VUS
+  vsi __builtin_vec_srl (vsi, vui);
+    VSR  VSR_VSI_VUI
+  vui __builtin_vec_srl (vui, vus);
+    VSR  VSR_VUI_VUS
+  vui __builtin_vec_srl (vui, vui);
+    VSR  VSR_VUI_VUI
+  vbi __builtin_vec_srl (vbi, vuc);
+    VSR  VSR_VBI_VUC
+  vbi __builtin_vec_srl (vbi, vus);
+    VSR  VSR_VBI_VUS
+  vbi __builtin_vec_srl (vbi, vui);
+    VSR  VSR_VBI_VUI
+
+[VEC_SRO, vec_sro, __builtin_vec_sro]
+  vsc __builtin_vec_sro (vsc, vsc);
+    VSRO  VSRO_VSCS
+  vsc __builtin_vec_sro (vsc, vuc);
+    VSRO  VSRO_VSCU
+  vuc __builtin_vec_sro (vuc, vsc);
+    VSRO  VSRO_VUCS
+  vuc __builtin_vec_sro (vuc, vuc);
+    VSRO  VSRO_VUCU
+  vss __builtin_vec_sro (vss, vsc);
+    VSRO  VSRO_VSSS
+  vss __builtin_vec_sro (vss, vuc);
+    VSRO  VSRO_VSSU
+  vus __builtin_vec_sro (vus, vsc);
+    VSRO  VSRO_VUSS
+  vus __builtin_vec_sro (vus, vuc);
+    VSRO  VSRO_VUSU
+  vp __builtin_vec_sro (vp, vsc);
+    VSRO  VSRO_VPS
+  vp __builtin_vec_sro (vp, vuc);
+    VSRO  VSRO_VPU
+  vsi __builtin_vec_sro (vsi, vsc);
+    VSRO  VSRO_VSIS
+  vsi __builtin_vec_sro (vsi, vuc);
+    VSRO  VSRO_VSIU
+  vui __builtin_vec_sro (vui, vsc);
+    VSRO  VSRO_VUIS
+  vui __builtin_vec_sro (vui, vuc);
+    VSRO  VSRO_VUIU
+  vsll __builtin_vec_sro (vsll, vsc);
+    VSRO  VSRO_VSLLS
+  vsll __builtin_vec_sro (vsll, vuc);
+    VSRO  VSRO_VSLLU
+  vull __builtin_vec_sro (vull, vsc);
+    VSRO  VSRO_VULLS
+  vull __builtin_vec_sro (vull, vuc);
+    VSRO  VSRO_VULLU
+  vf __builtin_vec_sro (vf, vsc);
+    VSRO  VSRO_VFS
+  vf __builtin_vec_sro (vf, vuc);
+    VSRO  VSRO_VFU
+
+[VEC_SRV, vec_srv, __builtin_vec_vsrv, _ARCH_PWR9]
+  vuc __builtin_vec_vsrv (vuc, vuc);
+    VSRV
+
+[VEC_ST, vec_st, __builtin_vec_st]
+  void __builtin_vec_st (vsc, signed long long, vsc *);
+    STVX_V16QI  STVX_VSC
+  void __builtin_vec_st (vsc, signed long long, signed char *);
+    STVX_V16QI  STVX_SC
+  void __builtin_vec_st (vuc, signed long long, vuc *);
+    STVX_V16QI  STVX_VUC
+  void __builtin_vec_st (vuc, signed long long, unsigned char *);
+    STVX_V16QI  STVX_UC
+  void __builtin_vec_st (vbc, signed long long, vbc *);
+    STVX_V16QI  STVX_VBC
+  void __builtin_vec_st (vbc, signed long long, signed char *);
+    STVX_V16QI  STVX_SC_B
+  void __builtin_vec_st (vbc, signed long long, unsigned char *);
+    STVX_V16QI  STVX_UC_B
+  void __builtin_vec_st (vss, signed long long, vss *);
+    STVX_V8HI  STVX_VSS
+  void __builtin_vec_st (vss, signed long long, signed short *);
+    STVX_V8HI  STVX_SS
+  void __builtin_vec_st (vus, signed long long, vus *);
+    STVX_V8HI  STVX_VUS
+  void __builtin_vec_st (vus, signed long long, unsigned short *);
+    STVX_V8HI  STVX_US
+  void __builtin_vec_st (vbs, signed long long, vbs *);
+    STVX_V8HI  STVX_VBS
+  void __builtin_vec_st (vbs, signed long long, signed short *);
+    STVX_V8HI  STVX_SS_B
+  void __builtin_vec_st (vbs, signed long long, unsigned short *);
+    STVX_V8HI  STVX_US_B
+  void __builtin_vec_st (vp, signed long long, vp *);
+    STVX_V8HI  STVX_P
+  void __builtin_vec_st (vsi, signed long long, vsi *);
+    STVX_V4SI  STVX_VSI
+  void __builtin_vec_st (vsi, signed long long, signed int *);
+    STVX_V4SI  STVX_SI
+  void __builtin_vec_st (vui, signed long long, vui *);
+    STVX_V4SI  STVX_VUI
+  void __builtin_vec_st (vui, signed long long, unsigned int *);
+    STVX_V4SI  STVX_UI
+  void __builtin_vec_st (vbi, signed long long, vbi *);
+    STVX_V4SI  STVX_VBI
+  void __builtin_vec_st (vbi, signed long long, signed int *);
+    STVX_V4SI  STVX_SI_B
+  void __builtin_vec_st (vbi, signed long long, unsigned int *);
+    STVX_V4SI  STVX_UI_B
+  void __builtin_vec_st (vsll, signed long long, vsll *);
+    STVX_V2DI  STVX_VSLL
+  void __builtin_vec_st (vsll, signed long long, signed long long *);
+    STVX_V2DI  STVX_SLL
+  void __builtin_vec_st (vull, signed long long, vull *);
+    STVX_V2DI  STVX_VULL
+  void __builtin_vec_st (vull, signed long long, unsigned long long *);
+    STVX_V2DI  STVX_ULL
+  void __builtin_vec_st (vbll, signed long long, vbll *);
+    STVX_V2DI  STVX_VBLL
+  void __builtin_vec_st (vf, signed long long, vf *);
+    STVX_V4SF  STVX_VF
+  void __builtin_vec_st (vf, signed long long, float *);
+    STVX_V4SF  STVX_F
+  void __builtin_vec_st (vd, signed long long, vd *);
+    STVX_V2DF  STVX_VD
+  void __builtin_vec_st (vd, signed long long, double *);
+    STVX_V2DF  STVX_D
+; The following variants are deprecated.
+  void __builtin_vec_st (vbll, signed long long, signed long long *);
+    STVX_V2DI  STVX_SLL_B
+  void __builtin_vec_st (vbll, signed long long, unsigned long long *);
+    STVX_V2DI  STVX_ULL_B
+
+[VEC_STE, vec_ste, __builtin_vec_ste]
+  void __builtin_vec_ste (vsc, signed long long, signed char *);
+    STVEBX  STVEBX_S
+  void __builtin_vec_ste (vuc, signed long long, unsigned char *);
+    STVEBX  STVEBX_U
+  void __builtin_vec_ste (vbc, signed long long, signed char *);
+    STVEBX  STVEBX_BS
+  void __builtin_vec_ste (vbc, signed long long, unsigned char *);
+    STVEBX  STVEBX_BU
+  void __builtin_vec_ste (vss, signed long long, signed short *);
+    STVEHX  STVEHX_S
+  void __builtin_vec_ste (vus, signed long long, unsigned short *);
+    STVEHX  STVEHX_U
+  void __builtin_vec_ste (vbs, signed long long, signed short *);
+    STVEHX  STVEHX_BS
+  void __builtin_vec_ste (vbs, signed long long, unsigned short *);
+    STVEHX  STVEHX_BU
+  void __builtin_vec_ste (vp, signed long long, signed short *);
+    STVEHX  STVEHX_PS
+  void __builtin_vec_ste (vp, signed long long, unsigned short *);
+    STVEHX  STVEHX_PU
+  void __builtin_vec_ste (vsi, signed long long, signed int *);
+    STVEWX  STVEWX_S
+  void __builtin_vec_ste (vui, signed long long, unsigned int *);
+    STVEWX  STVEWX_U
+  void __builtin_vec_ste (vbi, signed long long, signed int *);
+    STVEWX  STVEWX_BS
+  void __builtin_vec_ste (vbi, signed long long, unsigned int *);
+    STVEWX  STVEWX_BU
+  void __builtin_vec_ste (vf, signed long long, float *);
+    STVEWX  STVEWX_F
+
+; There are no builtins for VEC_STEP; this is handled directly
+; with a constant replacement in altivec_resolve_overloaded_builtin.
+; The single overload registers __builtin_vec_step with the front end
+; so this can happen (see the example following the stanza).
+[VEC_STEP, vec_step, __builtin_vec_step]
+  signed int __builtin_vec_step (vsi);
+    VCLZLSBB_V4SI  STEP_FAKERY
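+; For illustration, vec_step evaluates to the number of elements in
+; its argument's vector type; given
+;   vector signed int a;
+;   int n = vec_step (a);
+; the call is replaced at parse time by the constant 4.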
+
+[VEC_STL, vec_stl, __builtin_vec_stl]
+  void __builtin_vec_stl (vsc, signed long long, vsc *);
+    STVXL_V16QI  STVXL_VSC
+  void __builtin_vec_stl (vsc, signed long long, signed char *);
+    STVXL_V16QI  STVXL_SC
+  void __builtin_vec_stl (vuc, signed long long, vuc *);
+    STVXL_V16QI  STVXL_VUC
+  void __builtin_vec_stl (vuc, signed long long, unsigned char *);
+    STVXL_V16QI  STVXL_UC
+  void __builtin_vec_stl (vbc, signed long long, vbc *);
+    STVXL_V16QI  STVXL_VBC
+  void __builtin_vec_stl (vbc, signed long long, signed char *);
+    STVXL_V16QI  STVXL_SC_B
+  void __builtin_vec_stl (vbc, signed long long, unsigned char *);
+    STVXL_V16QI  STVXL_UC_B
+  void __builtin_vec_stl (vss, signed long long, vss *);
+    STVXL_V8HI  STVXL_VSS
+  void __builtin_vec_stl (vss, signed long long, signed short *);
+    STVXL_V8HI  STVXL_SS
+  void __builtin_vec_stl (vus, signed long long, vus *);
+    STVXL_V8HI  STVXL_VUS
+  void __builtin_vec_stl (vus, signed long long, unsigned short *);
+    STVXL_V8HI  STVXL_US
+  void __builtin_vec_stl (vbs, signed long long, vbs *);
+    STVXL_V8HI  STVXL_VBS
+  void __builtin_vec_stl (vbs, signed long long, signed short *);
+    STVXL_V8HI  STVXL_SS_B
+  void __builtin_vec_stl (vbs, signed long long, unsigned short *);
+    STVXL_V8HI  STVXL_US_B
+  void __builtin_vec_stl (vp, signed long long, vp *);
+    STVXL_V8HI  STVXL_P
+  void __builtin_vec_stl (vsi, signed long long, vsi *);
+    STVXL_V4SI  STVXL_VSI
+  void __builtin_vec_stl (vsi, signed long long, signed int *);
+    STVXL_V4SI  STVXL_SI
+  void __builtin_vec_stl (vui, signed long long, vui *);
+    STVXL_V4SI  STVXL_VUI
+  void __builtin_vec_stl (vui, signed long long, unsigned int *);
+    STVXL_V4SI  STVXL_UI
+  void __builtin_vec_stl (vbi, signed long long, vbi *);
+    STVXL_V4SI  STVXL_VBI
+  void __builtin_vec_stl (vbi, signed long long, signed int *);
+    STVXL_V4SI  STVXL_SI_B
+  void __builtin_vec_stl (vbi, signed long long, unsigned int *);
+    STVXL_V4SI  STVXL_UI_B
+  void __builtin_vec_stl (vsll, signed long long, vsll *);
+    STVXL_V2DI  STVXL_VSLL
+  void __builtin_vec_stl (vsll, signed long long, signed long long *);
+    STVXL_V2DI  STVXL_SLL
+  void __builtin_vec_stl (vull, signed long long, vull *);
+    STVXL_V2DI  STVXL_VULL
+  void __builtin_vec_stl (vull, signed long long, unsigned long long *);
+    STVXL_V2DI  STVXL_ULL
+  void __builtin_vec_stl (vbll, signed long long, vbll *);
+    STVXL_V2DI  STVXL_VBLL
+  void __builtin_vec_stl (vbll, signed long long, signed long long *);
+    STVXL_V2DI  STVXL_SLL_B
+  void __builtin_vec_stl (vbll, signed long long, unsigned long long *);
+    STVXL_V2DI  STVXL_ULL_B
+  void __builtin_vec_stl (vf, signed long long, vf *);
+    STVXL_V4SF  STVXL_VF
+  void __builtin_vec_stl (vf, signed long long, float *);
+    STVXL_V4SF  STVXL_F
+  void __builtin_vec_stl (vd, signed long long, vd *);
+    STVXL_V2DF  STVXL_VD
+  void __builtin_vec_stl (vd, signed long long, double *);
+    STVXL_V2DF  STVXL_D
+
+[VEC_STRIL, vec_stril, __builtin_vec_stril, _ARCH_PWR10]
+  vuc __builtin_vec_stril (vuc);
+    VSTRIBL  VSTRIBL_U
+  vsc __builtin_vec_stril (vsc);
+    VSTRIBL  VSTRIBL_S
+  vus __builtin_vec_stril (vus);
+    VSTRIHL  VSTRIHL_U
+  vss __builtin_vec_stril (vss);
+    VSTRIHL  VSTRIHL_S
+
+[VEC_STRIL_P, vec_stril_p, __builtin_vec_stril_p, _ARCH_PWR10]
+  signed int __builtin_vec_stril_p (vuc);
+    VSTRIBL_P  VSTRIBL_PU
+  signed int __builtin_vec_stril_p (vsc);
+    VSTRIBL_P  VSTRIBL_PS
+  signed int __builtin_vec_stril_p (vus);
+    VSTRIHL_P  VSTRIHL_PU
+  signed int __builtin_vec_stril_p (vss);
+    VSTRIHL_P  VSTRIHL_PS
+
+[VEC_STRIR, vec_strir, __builtin_vec_strir, _ARCH_PWR10]
+  vuc __builtin_vec_strir (vuc);
+    VSTRIBR  VSTRIBR_U
+  vsc __builtin_vec_strir (vsc);
+    VSTRIBR  VSTRIBR_S
+  vus __builtin_vec_strir (vus);
+    VSTRIHR  VSTRIHR_U
+  vss __builtin_vec_strir (vss);
+    VSTRIHR  VSTRIHR_S
+
+[VEC_STRIR_P, vec_strir_p, __builtin_vec_strir_p, _ARCH_PWR10]
+  signed int __builtin_vec_strir_p (vuc);
+    VSTRIBR_P  VSTRIBR_PU
+  signed int __builtin_vec_strir_p (vsc);
+    VSTRIBR_P  VSTRIBR_PS
+  signed int __builtin_vec_strir_p (vus);
+    VSTRIHR_P  VSTRIHR_PU
+  signed int __builtin_vec_strir_p (vss);
+    VSTRIHR_P  VSTRIHR_PS
+
+[VEC_STVLX, vec_stvlx, __builtin_vec_stvlx, __PPU__]
+  void __builtin_vec_stvlx (vbc, signed long long, vbc *);
+    STVLX  STVLX_VBC
+  void __builtin_vec_stvlx (vsc, signed long long, vsc *);
+    STVLX  STVLX_VSC
+  void __builtin_vec_stvlx (vsc, signed long long, signed char *);
+    STVLX  STVLX_SC
+  void __builtin_vec_stvlx (vuc, signed long long, vuc *);
+    STVLX  STVLX_VUC
+  void __builtin_vec_stvlx (vuc, signed long long, unsigned char *);
+    STVLX  STVLX_UC
+  void __builtin_vec_stvlx (vbs, signed long long, vbs *);
+    STVLX  STVLX_VBS
+  void __builtin_vec_stvlx (vss, signed long long, vss *);
+    STVLX  STVLX_VSS
+  void __builtin_vec_stvlx (vss, signed long long, signed short *);
+    STVLX  STVLX_SS
+  void __builtin_vec_stvlx (vus, signed long long, vus *);
+    STVLX  STVLX_VUS
+  void __builtin_vec_stvlx (vus, signed long long, unsigned short *);
+    STVLX  STVLX_US
+  void __builtin_vec_stvlx (vp, signed long long, vp *);
+    STVLX  STVLX_VP
+  void __builtin_vec_stvlx (vbi, signed long long, vbi *);
+    STVLX  STVLX_VBI
+  void __builtin_vec_stvlx (vsi, signed long long, vsi *);
+    STVLX  STVLX_VSI
+  void __builtin_vec_stvlx (vsi, signed long long, signed int *);
+    STVLX  STVLX_SI
+  void __builtin_vec_stvlx (vui, signed long long, vui *);
+    STVLX  STVLX_VUI
+  void __builtin_vec_stvlx (vui, signed long long, unsigned int *);
+    STVLX  STVLX_UI
+  void __builtin_vec_stvlx (vf, signed long long, vf *);
+    STVLX  STVLX_VF
+  void __builtin_vec_stvlx (vf, signed long long, float *);
+    STVLX  STVLX_F
+
+[VEC_STVLXL, vec_stvlxl, __builtin_vec_stvlxl, __PPU__]
+  void __builtin_vec_stvlxl (vbc, signed long long, vbc *);
+    STVLXL  STVLXL_VBC
+  void __builtin_vec_stvlxl (vsc, signed long long, vsc *);
+    STVLXL  STVLXL_VSC
+  void __builtin_vec_stvlxl (vsc, signed long long, signed char *);
+    STVLXL  STVLXL_SC
+  void __builtin_vec_stvlxl (vuc, signed long long, vuc *);
+    STVLXL  STVLXL_VUC
+  void __builtin_vec_stvlxl (vuc, signed long long, unsigned char *);
+    STVLXL  STVLXL_UC
+  void __builtin_vec_stvlxl (vbs, signed long long, vbs *);
+    STVLXL  STVLXL_VBS
+  void __builtin_vec_stvlxl (vss, signed long long, vss *);
+    STVLXL  STVLXL_VSS
+  void __builtin_vec_stvlxl (vss, signed long long, signed short *);
+    STVLXL  STVLXL_SS
+  void __builtin_vec_stvlxl (vus, signed long long, vus *);
+    STVLXL  STVLXL_VUS
+  void __builtin_vec_stvlxl (vus, signed long long, unsigned short *);
+    STVLXL  STVLXL_US
+  void __builtin_vec_stvlxl (vp, signed long long, vp *);
+    STVLXL  STVLXL_VP
+  void __builtin_vec_stvlxl (vbi, signed long long, vbi *);
+    STVLXL  STVLXL_VBI
+  void __builtin_vec_stvlxl (vsi, signed long long, vsi *);
+    STVLXL  STVLXL_VSI
+  void __builtin_vec_stvlxl (vsi, signed long long, signed int *);
+    STVLXL  STVLXL_SI
+  void __builtin_vec_stvlxl (vui, signed long long, vui *);
+    STVLXL  STVLXL_VUI
+  void __builtin_vec_stvlxl (vui, signed long long, unsigned int *);
+    STVLXL  STVLXL_UI
+  void __builtin_vec_stvlxl (vf, signed long long, vf *);
+    STVLXL  STVLXL_VF
+  void __builtin_vec_stvlxl (vf, signed long long, float *);
+    STVLXL  STVLXL_F
+
+[VEC_STVRX, vec_stvrx, __builtin_vec_stvrx, __PPU__]
+  void __builtin_vec_stvrx (vbc, signed long long, vbc *);
+    STVRX  STVRX_VBC
+  void __builtin_vec_stvrx (vsc, signed long long, vsc *);
+    STVRX  STVRX_VSC
+  void __builtin_vec_stvrx (vsc, signed long long, signed char *);
+    STVRX  STVRX_SC
+  void __builtin_vec_stvrx (vuc, signed long long, vuc *);
+    STVRX  STVRX_VUC
+  void __builtin_vec_stvrx (vuc, signed long long, unsigned char *);
+    STVRX  STVRX_UC
+  void __builtin_vec_stvrx (vbs, signed long long, vbs *);
+    STVRX  STVRX_VBS
+  void __builtin_vec_stvrx (vss, signed long long, vss *);
+    STVRX  STVRX_VSS
+  void __builtin_vec_stvrx (vss, signed long long, signed short *);
+    STVRX  STVRX_SS
+  void __builtin_vec_stvrx (vus, signed long long, vus *);
+    STVRX  STVRX_VUS
+  void __builtin_vec_stvrx (vus, signed long long, unsigned short *);
+    STVRX  STVRX_US
+  void __builtin_vec_stvrx (vp, signed long long, vp *);
+    STVRX  STVRX_VP
+  void __builtin_vec_stvrx (vbi, signed long long, vbi *);
+    STVRX  STVRX_VBI
+  void __builtin_vec_stvrx (vsi, signed long long, vsi *);
+    STVRX  STVRX_VSI
+  void __builtin_vec_stvrx (vsi, signed long long, signed int *);
+    STVRX  STVRX_SI
+  void __builtin_vec_stvrx (vui, signed long long, vui *);
+    STVRX  STVRX_VUI
+  void __builtin_vec_stvrx (vui, signed long long, unsigned int *);
+    STVRX  STVRX_UI
+  void __builtin_vec_stvrx (vf, signed long long, vf *);
+    STVRX  STVRX_VF
+  void __builtin_vec_stvrx (vf, signed long long, float *);
+    STVRX  STVRX_F
+
+[VEC_STVRXL, vec_stvrxl, __builtin_vec_stvrxl, __PPU__]
+  void __builtin_vec_stvrxl (vbc, signed long long, vbc *);
+    STVRXL  STVRXL_VBC
+  void __builtin_vec_stvrxl (vsc, signed long long, vsc *);
+    STVRXL  STVRXL_VSC
+  void __builtin_vec_stvrxl (vsc, signed long long, signed char *);
+    STVRXL  STVRXL_SC
+  void __builtin_vec_stvrxl (vuc, signed long long, vuc *);
+    STVRXL  STVRXL_VUC
+  void __builtin_vec_stvrxl (vuc, signed long long, unsigned char *);
+    STVRXL  STVRXL_UC
+  void __builtin_vec_stvrxl (vbs, signed long long, vbs *);
+    STVRXL  STVRXL_VBS
+  void __builtin_vec_stvrxl (vss, signed long long, vss *);
+    STVRXL  STVRXL_VSS
+  void __builtin_vec_stvrxl (vss, signed long long, signed short *);
+    STVRXL  STVRXL_SS
+  void __builtin_vec_stvrxl (vus, signed long long, vus *);
+    STVRXL  STVRXL_VUS
+  void __builtin_vec_stvrxl (vus, signed long long, unsigned short *);
+    STVRXL  STVRXL_US
+  void __builtin_vec_stvrxl (vp, signed long long, vp *);
+    STVRXL  STVRXL_VP
+  void __builtin_vec_stvrxl (vbi, signed long long, vbi *);
+    STVRXL  STVRXL_VBI
+  void __builtin_vec_stvrxl (vsi, signed long long, vsi *);
+    STVRXL  STVRXL_VSI
+  void __builtin_vec_stvrxl (vsi, signed long long, signed int *);
+    STVRXL  STVRXL_SI
+  void __builtin_vec_stvrxl (vui, signed long long, vui *);
+    STVRXL  STVRXL_VUI
+  void __builtin_vec_stvrxl (vui, signed long long, unsigned int *);
+    STVRXL  STVRXL_UI
+  void __builtin_vec_stvrxl (vf, signed long long, vf *);
+    STVRXL  STVRXL_VF
+  void __builtin_vec_stvrxl (vf, signed long long, float *);
+    STVRXL  STVRXL_F
+
+[VEC_STXVL, vec_xst_len, __builtin_vec_stxvl, _ARCH_PPC64_PWR9]
+  void __builtin_vec_stxvl (vsc, signed char *, unsigned int);
+    STXVL  STXVL_VSC
+  void __builtin_vec_stxvl (vuc, unsigned char *, unsigned int);
+    STXVL  STXVL_VUC
+  void __builtin_vec_stxvl (vss, signed short *, unsigned int);
+    STXVL  STXVL_VSS
+  void __builtin_vec_stxvl (vus, unsigned short *, unsigned int);
+    STXVL  STXVL_VUS
+  void __builtin_vec_stxvl (vsi, signed int *, unsigned int);
+    STXVL  STXVL_VSI
+  void __builtin_vec_stxvl (vui, unsigned int *, unsigned int);
+    STXVL  STXVL_VUI
+  void __builtin_vec_stxvl (vsll, signed long long *, unsigned int);
+    STXVL  STXVL_VSLL
+  void __builtin_vec_stxvl (vull, unsigned long long *, unsigned int);
+    STXVL  STXVL_VULL
+  void __builtin_vec_stxvl (vsq, signed __int128 *, unsigned int);
+    STXVL  STXVL_VSQ
+  void __builtin_vec_stxvl (vuq, unsigned __int128 *, unsigned int);
+    STXVL  STXVL_VUQ
+  void __builtin_vec_stxvl (vf, float *, unsigned int);
+    STXVL  STXVL_VF
+  void __builtin_vec_stxvl (vd, double *, unsigned int);
+    STXVL  STXVL_VD
+
+; #### XVSUBSP{TARGET_VSX}; VSUBFP
+[VEC_SUB, vec_sub, __builtin_vec_sub]
+  vsc __builtin_vec_sub (vsc, vsc);
+    VSUBUBM  VSUBUBM_VSC
+  vuc __builtin_vec_sub (vuc, vuc);
+    VSUBUBM  VSUBUBM_VUC
+  vss __builtin_vec_sub (vss, vss);
+    VSUBUHM  VSUBUHM_VSS
+  vus __builtin_vec_sub (vus, vus);
+    VSUBUHM  VSUBUHM_VUS
+  vsi __builtin_vec_sub (vsi, vsi);
+    VSUBUWM  VSUBUWM_VSI
+  vui __builtin_vec_sub (vui, vui);
+    VSUBUWM  VSUBUWM_VUI
+  vsll __builtin_vec_sub (vsll, vsll);
+    VSUBUDM  VSUBUDM_VSLL
+  vull __builtin_vec_sub (vull, vull);
+    VSUBUDM  VSUBUDM_VULL
+  vsq __builtin_vec_sub (vsq, vsq);
+    VSUBUQM  VSUBUQM_VSQ
+  vuq __builtin_vec_sub (vuq, vuq);
+    VSUBUQM  VSUBUQM_VUQ
+  vf __builtin_vec_sub (vf, vf);
+    VSUBFP
+  vd __builtin_vec_sub (vd, vd);
+    XVSUBDP
+; The following variants are deprecated.
+  vsc __builtin_vec_sub (vsc, vbc);
+    VSUBUBM  VSUBUBM_VSC_VBC
+  vsc __builtin_vec_sub (vbc, vsc);
+    VSUBUBM  VSUBUBM_VBC_VSC
+  vuc __builtin_vec_sub (vuc, vbc);
+    VSUBUBM  VSUBUBM_VUC_VBC
+  vuc __builtin_vec_sub (vbc, vuc);
+    VSUBUBM  VSUBUBM_VBC_VUC
+  vss __builtin_vec_sub (vss, vbs);
+    VSUBUHM  VSUBUHM_VSS_VBS
+  vss __builtin_vec_sub (vbs, vss);
+    VSUBUHM  VSUBUHM_VBS_VSS
+  vus __builtin_vec_sub (vus, vbs);
+    VSUBUHM  VSUBUHM_VUS_VBS
+  vus __builtin_vec_sub (vbs, vus);
+    VSUBUHM  VSUBUHM_VBS_VUS
+  vsi __builtin_vec_sub (vsi, vbi);
+    VSUBUWM  VSUBUWM_VSI_VBI
+  vsi __builtin_vec_sub (vbi, vsi);
+    VSUBUWM  VSUBUWM_VBI_VSI
+  vui __builtin_vec_sub (vui, vbi);
+    VSUBUWM  VSUBUWM_VUI_VBI
+  vui __builtin_vec_sub (vbi, vui);
+    VSUBUWM  VSUBUWM_VBI_VUI
+  vsll __builtin_vec_sub (vsll, vbll);
+    VSUBUDM  VSUBUDM_VSLL_VBLL
+  vsll __builtin_vec_sub (vbll, vsll);
+    VSUBUDM  VSUBUDM_VBLL_VSLL
+  vull __builtin_vec_sub (vull, vbll);
+    VSUBUDM  VSUBUDM_VULL_VBLL
+  vull __builtin_vec_sub (vbll, vull);
+    VSUBUDM  VSUBUDM_VBLL_VULL
+
+[VEC_SUBC, vec_subc, __builtin_vec_subc]
+  vsi __builtin_vec_subc (vsi, vsi);
+    VSUBCUW  VSUBCUW_VSI
+  vui __builtin_vec_subc (vui, vui);
+    VSUBCUW  VSUBCUW_VUI
+  vsq __builtin_vec_subc (vsq, vsq);
+    VSUBCUQ  VSUBCUQ_VSQ
+  vuq __builtin_vec_subc (vuq, vuq);
+    VSUBCUQ  VSUBCUQ_VUQ
+
+; TODO: Note that the entry for VEC_SUBE currently gets ignored in
+; altivec_resolve_overloaded_builtin.  Revisit whether we can remove
+; that special handling.  We still need to register the legal builtin
+; forms here.
+[VEC_SUBE, vec_sube, __builtin_vec_sube]
+  vsq __builtin_vec_sube (vsq, vsq, vsq);
+    VSUBEUQM  VSUBEUQM_VSQ
+  vuq __builtin_vec_sube (vuq, vuq, vuq);
+    VSUBEUQM  VSUBEUQM_VUQ
+
+; TODO: Note that the entry for VEC_SUBEC currently gets ignored in
+; altivec_resolve_overloaded_builtin.  Revisit whether we can remove
+; that special handling.  We still need to register the legal builtin
+; forms here.
+[VEC_SUBEC, vec_subec, __builtin_vec_subec]
+  vsq __builtin_vec_subec (vsq, vsq, vsq);
+    VSUBECUQ  VSUBECUQ_VSQ
+  vuq __builtin_vec_subec (vuq, vuq, vuq);
+    VSUBECUQ  VSUBECUQ_VUQ
+
+[VEC_SUBS, vec_subs, __builtin_vec_subs]
+  vuc __builtin_vec_subs (vuc, vuc);
+    VSUBUBS
+  vsc __builtin_vec_subs (vsc, vsc);
+    VSUBSBS
+  vus __builtin_vec_subs (vus, vus);
+    VSUBUHS
+  vss __builtin_vec_subs (vss, vss);
+    VSUBSHS
+  vui __builtin_vec_subs (vui, vui);
+    VSUBUWS
+  vsi __builtin_vec_subs (vsi, vsi);
+    VSUBSWS
+; The following variants are deprecated.
+  vuc __builtin_vec_subs (vuc, vbc);
+    VSUBUBS  VSUBUBS_UB
+  vuc __builtin_vec_subs (vbc, vuc);
+    VSUBUBS  VSUBUBS_BU
+  vsc __builtin_vec_subs (vsc, vbc);
+    VSUBSBS  VSUBSBS_SB
+  vsc __builtin_vec_subs (vbc, vsc);
+    VSUBSBS  VSUBSBS_BS
+  vus __builtin_vec_subs (vus, vbs);
+    VSUBUHS  VSUBUHS_UB
+  vus __builtin_vec_subs (vbs, vus);
+    VSUBUHS  VSUBUHS_BU
+  vss __builtin_vec_subs (vss, vbs);
+    VSUBSHS  VSUBSHS_SB
+  vss __builtin_vec_subs (vbs, vss);
+    VSUBSHS  VSUBSHS_BS
+  vui __builtin_vec_subs (vui, vbi);
+    VSUBUWS  VSUBUWS_UB
+  vui __builtin_vec_subs (vbi, vui);
+    VSUBUWS  VSUBUWS_BU
+  vsi __builtin_vec_subs (vsi, vbi);
+    VSUBSWS  VSUBSWS_SB
+  vsi __builtin_vec_subs (vbi, vsi);
+    VSUBSWS  VSUBSWS_BS
+
+[VEC_SUM2S, vec_sum2s, __builtin_vec_sum2s]
+  vsi __builtin_vec_sum2s (vsi, vsi);
+    VSUM2SWS
+
+[VEC_SUM4S, vec_sum4s, __builtin_vec_sum4s]
+  vui __builtin_vec_sum4s (vuc, vui);
+    VSUM4UBS
+  vsi __builtin_vec_sum4s (vsc, vsi);
+    VSUM4SBS
+  vsi __builtin_vec_sum4s (vss, vsi);
+    VSUM4SHS
+
+[VEC_SUMS, vec_sums, __builtin_vec_sums]
+  vsi __builtin_vec_sums (vsi, vsi);
+    VSUMSWS
+
+[VEC_TERNARYLOGIC, vec_ternarylogic, __builtin_vec_xxeval, _ARCH_PWR10]
+  vuc __builtin_vec_xxeval (vuc, vuc, vuc, const int);
+    XXEVAL  XXEVAL_VUC
+  vus __builtin_vec_xxeval (vus, vus, vus, const int);
+    XXEVAL  XXEVAL_VUS
+  vui __builtin_vec_xxeval (vui, vui, vui, const int);
+    XXEVAL  XXEVAL_VUI
+  vull __builtin_vec_xxeval (vull, vull, vull, const int);
+    XXEVAL  XXEVAL_VULL
+  vuq __builtin_vec_xxeval (vuq, vuq, vuq, const int);
+    XXEVAL  XXEVAL_VUQ
+
+[VEC_TEST_LSBB_ALL_ONES, vec_test_lsbb_all_ones, __builtin_vec_xvtlsbb_all_ones, _ARCH_PWR10]
+  signed int __builtin_vec_xvtlsbb_all_ones (vuc);
+    XVTLSBB_ONES
+
+[VEC_TEST_LSBB_ALL_ZEROS, vec_test_lsbb_all_zeros, __builtin_vec_xvtlsbb_all_zeros, _ARCH_PWR10]
+  signed int __builtin_vec_xvtlsbb_all_zeros (vuc);
+    XVTLSBB_ZEROS
+
+; #### XVRSPIZ{TARGET_VSX}; VRFIZ
+[VEC_TRUNC, vec_trunc, __builtin_vec_trunc]
+  vf __builtin_vec_trunc (vf);
+    VRFIZ
+  vd __builtin_vec_trunc (vd);
+    XVRDPIZ
+
+[VEC_TSTSFI_GT, SKIP, __builtin_dfp_dtstsfi_gt]
+  signed int __builtin_dfp_dtstsfi_gt (const int, _Decimal64);
+    TSTSFI_GT_DD
+  signed int __builtin_dfp_dtstsfi_gt (const int, _Decimal128);
+    TSTSFI_GT_TD
+
+[VEC_TSTSFI_EQ, SKIP, __builtin_dfp_dtstsfi_eq]
+  signed int __builtin_dfp_dtstsfi_eq (const int, _Decimal64);
+    TSTSFI_EQ_DD
+  signed int __builtin_dfp_dtstsfi_eq (const int, _Decimal128);
+    TSTSFI_EQ_TD
+
+[VEC_TSTSFI_LT, SKIP, __builtin_dfp_dtstsfi_lt]
+  signed int __builtin_dfp_dtstsfi_lt (const int, _Decimal64);
+    TSTSFI_LT_DD
+  signed int __builtin_dfp_dtstsfi_lt (const int, _Decimal128);
+    TSTSFI_LT_TD
+
+[VEC_TSTSFI_OV, SKIP, __builtin_dfp_dtstsfi_ov]
+  signed int __builtin_dfp_dtstsfi_ov (const int, _Decimal64);
+    TSTSFI_OV_DD
+  signed int __builtin_dfp_dtstsfi_ov (const int, _Decimal128);
+    TSTSFI_OV_TD
+
+[VEC_UNPACKH, vec_unpackh, __builtin_vec_unpackh]
+  vss __builtin_vec_unpackh (vsc);
+    VUPKHSB  VUPKHSB_VSC
+  vbs __builtin_vec_unpackh (vbc);
+    VUPKHSB  VUPKHSB_VBC
+  vsi __builtin_vec_unpackh (vss);
+    VUPKHSH  VUPKHSH_VSS
+  vbi __builtin_vec_unpackh (vbs);
+    VUPKHSH  VUPKHSH_VBS
+  vui __builtin_vec_unpackh (vp);
+    VUPKHPX
+  vsll __builtin_vec_unpackh (vsi);
+    VUPKHSW  VUPKHSW_VSI
+  vbll __builtin_vec_unpackh (vbi);
+    VUPKHSW  VUPKHSW_VBI
+  vd __builtin_vec_unpackh (vf);
+    DOUBLEH_V4SF  VUPKHF
+
+[VEC_UNPACKL, vec_unpackl, __builtin_vec_unpackl]
+  vss __builtin_vec_unpackl (vsc);
+    VUPKLSB  VUPKLSB_VSC
+  vbs __builtin_vec_unpackl (vbc);
+    VUPKLSB  VUPKLSB_VBC
+  vsi __builtin_vec_unpackl (vss);
+    VUPKLSH  VUPKLSH_VSS
+  vbi __builtin_vec_unpackl (vbs);
+    VUPKLSH  VUPKLSH_VBS
+  vui __builtin_vec_unpackl (vp);
+    VUPKLPX
+  vsll __builtin_vec_unpackl (vsi);
+    VUPKLSW  VUPKLSW_VSI
+  vbll __builtin_vec_unpackl (vbi);
+    VUPKLSW  VUPKLSW_VBI
+  vd __builtin_vec_unpackl (vf);
+    DOUBLEL_V4SF  VUPKLF
+
+[VEC_UNSIGNED, vec_unsigned, __builtin_vec_vunsigned]
+  vui __builtin_vec_vunsigned (vf);
+    VEC_VUNSIGNED_V4SF
+  vull __builtin_vec_vunsigned (vd);
+    VEC_VUNSIGNED_V2DF
+
+[VEC_UNSIGNED2, vec_unsigned2, __builtin_vec_vunsigned2]
+  vui __builtin_vec_vunsigned2 (vd, vd);
+    VEC_VUNSIGNED2_V2DF
+
+[VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
+  vui __builtin_vec_vunsignede (vd);
+    VEC_VUNSIGNEDE_V2DF
+
+[VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
+  vui __builtin_vec_vunsignedo (vd);
+    VEC_VUNSIGNEDO_V2DF
+
+[VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp, _ARCH_PWR9]
+  vui __builtin_vec_extract_exp (vf);
+    VEESP
+  vull __builtin_vec_extract_exp (vd);
+    VEEDP
+
+[VEC_VES, vec_extract_sig, __builtin_vec_extract_sig, _ARCH_PWR9]
+  vui __builtin_vec_extract_sig (vf);
+    VESSP
+  vull __builtin_vec_extract_sig (vd);
+    VESDP
+
+[VEC_VIE, vec_insert_exp, __builtin_vec_insert_exp, _ARCH_PWR9]
+  vf __builtin_vec_insert_exp (vf, vui);
+    VIESP  VIESP_VF
+  vf __builtin_vec_insert_exp (vui, vui);
+    VIESP  VIESP_VUI
+  vd __builtin_vec_insert_exp (vd, vull);
+    VIEDP  VIEDP_VD
+  vd __builtin_vec_insert_exp (vull, vull);
+    VIEDP  VIEDP_VULL
+
+; It is truly unfortunate that vec_vprtyb's set of interfaces is
+; incompatible with that of vec_parity_lsbb, so we can't even
+; deprecate this (see the note following the stanza).
+[VEC_VPRTYB, vec_vprtyb, __builtin_vec_vprtyb, _ARCH_PWR9]
+  vsi __builtin_vec_vprtyb (vsi);
+    VPRTYBW  VPRTYB_VSI
+  vui __builtin_vec_vprtyb (vui);
+    VPRTYBW  VPRTYB_VUI
+  vsll __builtin_vec_vprtyb (vsll);
+    VPRTYBD  VPRTYB_VSLL
+  vull __builtin_vec_vprtyb (vull);
+    VPRTYBD  VPRTYB_VULL
+  vsq __builtin_vec_vprtyb (vsq);
+    VPRTYBQ  VPRTYB_VSQ
+  vuq __builtin_vec_vprtyb (vuq);
+    VPRTYBQ  VPRTYB_VUQ
+  signed __int128 __builtin_vec_vprtyb (signed __int128);
+    VPRTYBQ  VPRTYB_SQ
+  unsigned __int128 __builtin_vec_vprtyb (unsigned __int128);
+    VPRTYBQ  VPRTYB_UQ
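+; Illustrative note (this assumes the vec_parity_lsbb forms registered
+; earlier in this file, which return unsigned types): vec_parity_lsbb
+; (vsi) yields vui, while vec_vprtyb (vsi) above yields vsi, and
+; vec_vprtyb also accepts scalar __int128 operands, so neither name
+; can simply be mapped onto the other's overload entry.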
+
+[VEC_VSCEEQ, scalar_cmp_exp_eq, __builtin_vec_scalar_cmp_exp_eq, _ARCH_PWR9]
+  signed int __builtin_vec_scalar_cmp_exp_eq (double, double);
+    VSCEDPEQ
+  signed int __builtin_vec_scalar_cmp_exp_eq (_Float128, _Float128);
+    VSCEQPEQ
+
+[VEC_VSCEGT, scalar_cmp_exp_gt, __builtin_vec_scalar_cmp_exp_gt, _ARCH_PWR9]
+  signed int __builtin_vec_scalar_cmp_exp_gt (double, double);
+    VSCEDPGT
+  signed int __builtin_vec_scalar_cmp_exp_gt (_Float128, _Float128);
+    VSCEQPGT
+
+[VEC_VSCELT, scalar_cmp_exp_lt, __builtin_vec_scalar_cmp_exp_lt, _ARCH_PWR9]
+  signed int __builtin_vec_scalar_cmp_exp_lt (double, double);
+    VSCEDPLT
+  signed int __builtin_vec_scalar_cmp_exp_lt (_Float128, _Float128);
+    VSCEQPLT
+
+[VEC_VSCEUO, scalar_cmp_exp_unordered, __builtin_vec_scalar_cmp_exp_unordered, _ARCH_PWR9]
+  signed int __builtin_vec_scalar_cmp_exp_unordered (double, double);
+    VSCEDPUO
+  signed int __builtin_vec_scalar_cmp_exp_unordered (_Float128, _Float128);
+    VSCEQPUO
+
+[VEC_VSEE, scalar_extract_exp, __builtin_vec_scalar_extract_exp, _ARCH_PWR9]
+  unsigned int __builtin_vec_scalar_extract_exp (double);
+    VSEEDP
+  unsigned int __builtin_vec_scalar_extract_exp (_Float128);
+    VSEEQP
+
+[VEC_VSES, scalar_extract_sig, __builtin_vec_scalar_extract_sig, _ARCH_PWR9]
+  unsigned long long __builtin_vec_scalar_extract_sig (double);
+    VSESDP
+  unsigned __int128 __builtin_vec_scalar_extract_sig (_Float128);
+    VSESQP
+
+[VEC_VSIE, scalar_insert_exp, __builtin_vec_scalar_insert_exp, _ARCH_PWR9]
+  double __builtin_vec_scalar_insert_exp (unsigned long long, unsigned long long);
+    VSIEDP
+  double __builtin_vec_scalar_insert_exp (double, unsigned long long);
+    VSIEDPF
+  _Float128 __builtin_vec_scalar_insert_exp (unsigned __int128, unsigned long long);
+    VSIEQP
+  _Float128 __builtin_vec_scalar_insert_exp (_Float128, unsigned long long);
+    VSIEQPF
+
+[VEC_VSTDC, scalar_test_data_class, __builtin_vec_scalar_test_data_class, _ARCH_PWR9]
+  unsigned int __builtin_vec_scalar_test_data_class (float, const int);
+    VSTDCSP
+  unsigned int __builtin_vec_scalar_test_data_class (double, const int);
+    VSTDCDP
+  unsigned int __builtin_vec_scalar_test_data_class (_Float128, const int);
+    VSTDCQP
+
+[VEC_VSTDCN, scalar_test_neg, __builtin_vec_scalar_test_neg, _ARCH_PWR9]
+  unsigned int __builtin_vec_scalar_test_neg (float);
+    VSTDCNSP
+  unsigned int __builtin_vec_scalar_test_neg (double);
+    VSTDCNDP
+  unsigned int __builtin_vec_scalar_test_neg (_Float128);
+    VSTDCNQP
+
+[VEC_VTDC, vec_test_data_class, __builtin_vec_test_data_class, _ARCH_PWR9]
+  vbi __builtin_vec_test_data_class (vf, const int);
+    VTDCSP
+  vbll __builtin_vec_test_data_class (vd, const int);
+    VTDCDP
+
+[VEC_XL, vec_xl, __builtin_vec_vsx_ld, __VSX__]
+  vsc __builtin_vec_vsx_ld (signed long long, const vsc *);
+    LXVW4X_V16QI  LXVW4X_VSC
+  vsc __builtin_vec_vsx_ld (signed long long, const signed char *);
+    LXVW4X_V16QI  LXVW4X_SC
+  vuc __builtin_vec_vsx_ld (signed long long, const vuc *);
+    LXVW4X_V16QI  LXVW4X_VUC
+  vuc __builtin_vec_vsx_ld (signed long long, const unsigned char *);
+    LXVW4X_V16QI  LXVW4X_UC
+  vbc __builtin_vec_vsx_ld (signed long long, const vbc *);
+    LXVW4X_V16QI  LXVW4X_VBC
+  vss __builtin_vec_vsx_ld (signed long long, const vss *);
+    LXVW4X_V8HI  LXVW4X_VSS
+  vss __builtin_vec_vsx_ld (signed long long, const signed short *);
+    LXVW4X_V8HI  LXVW4X_SS
+  vus __builtin_vec_vsx_ld (signed long long, const vus *);
+    LXVW4X_V8HI  LXVW4X_VUS
+  vus __builtin_vec_vsx_ld (signed long long, const unsigned short *);
+    LXVW4X_V8HI  LXVW4X_US
+  vbs __builtin_vec_vsx_ld (signed long long, const vbs *);
+    LXVW4X_V8HI  LXVW4X_VBS
+  vp __builtin_vec_vsx_ld (signed long long, const vp *);
+    LXVW4X_V8HI  LXVW4X_P
+  vsi __builtin_vec_vsx_ld (signed long long, const vsi *);
+    LXVW4X_V4SI  LXVW4X_VSI
+  vsi __builtin_vec_vsx_ld (signed long long, const signed int *);
+    LXVW4X_V4SI  LXVW4X_SI
+  vui __builtin_vec_vsx_ld (signed long long, const vui *);
+    LXVW4X_V4SI  LXVW4X_VUI
+  vui __builtin_vec_vsx_ld (signed long long, const unsigned int *);
+    LXVW4X_V4SI  LXVW4X_UI
+  vbi __builtin_vec_vsx_ld (signed long long, const vbi *);
+    LXVW4X_V4SI  LXVW4X_VBI
+  vsll __builtin_vec_vsx_ld (signed long long, const vsll *);
+    LXVD2X_V2DI  LXVD2X_VSLL
+  vsll __builtin_vec_vsx_ld (signed long long, const signed long long *);
+    LXVD2X_V2DI  LXVD2X_SLL
+  vull __builtin_vec_vsx_ld (signed long long, const vull *);
+    LXVD2X_V2DI  LXVD2X_VULL
+  vull __builtin_vec_vsx_ld (signed long long, const unsigned long long *);
+    LXVD2X_V2DI  LXVD2X_ULL
+  vbll __builtin_vec_vsx_ld (signed long long, const vbll *);
+    LXVD2X_V2DI  LXVD2X_VBLL
+  vsq __builtin_vec_vsx_ld (signed long long, const vsq *);
+    LXVD2X_V1TI  LXVD2X_VSQ
+  vsq __builtin_vec_vsx_ld (signed long long, const signed __int128 *);
+    LXVD2X_V1TI  LXVD2X_SQ
+  vuq __builtin_vec_vsx_ld (signed long long, const unsigned __int128 *);
+    LXVD2X_V1TI  LXVD2X_UQ
+  vf __builtin_vec_vsx_ld (signed long long, const vf *);
+    LXVW4X_V4SF  LXVW4X_VF
+  vf __builtin_vec_vsx_ld (signed long long, const float *);
+    LXVW4X_V4SF  LXVW4X_F
+  vd __builtin_vec_vsx_ld (signed long long, const vd *);
+    LXVD2X_V2DF  LXVD2X_VD
+  vd __builtin_vec_vsx_ld (signed long long, const double *);
+    LXVD2X_V2DF  LXVD2X_D
+
+[VEC_XL_BE, vec_xl_be, __builtin_vec_xl_be, __VSX__]
+  vsc __builtin_vec_xl_be (signed long long, const vsc *);
+    LD_ELEMREV_V16QI  LD_ELEMREV_VSC
+  vsc __builtin_vec_xl_be (signed long long, const signed char *);
+    LD_ELEMREV_V16QI  LD_ELEMREV_SC
+  vuc __builtin_vec_xl_be (signed long long, const vuc *);
+    LD_ELEMREV_V16QI  LD_ELEMREV_VUC
+  vuc __builtin_vec_xl_be (signed long long, const unsigned char *);
+    LD_ELEMREV_V16QI  LD_ELEMREV_UC
+  vss __builtin_vec_xl_be (signed long long, const vss *);
+    LD_ELEMREV_V8HI  LD_ELEMREV_VSS
+  vss __builtin_vec_xl_be (signed long long, const signed short *);
+    LD_ELEMREV_V8HI  LD_ELEMREV_SS
+  vus __builtin_vec_xl_be (signed long long, const vus *);
+    LD_ELEMREV_V8HI  LD_ELEMREV_VUS
+  vus __builtin_vec_xl_be (signed long long, const unsigned short *);
+    LD_ELEMREV_V8HI  LD_ELEMREV_US
+  vsi __builtin_vec_xl_be (signed long long, const vsi *);
+    LD_ELEMREV_V4SI  LD_ELEMREV_VSI
+  vsi __builtin_vec_xl_be (signed long long, const signed int *);
+    LD_ELEMREV_V4SI  LD_ELEMREV_SI
+  vui __builtin_vec_xl_be (signed long long, const vui *);
+    LD_ELEMREV_V4SI  LD_ELEMREV_VUI
+  vui __builtin_vec_xl_be (signed long long, const unsigned int *);
+    LD_ELEMREV_V4SI  LD_ELEMREV_UI
+  vsll __builtin_vec_xl_be (signed long long, const vsll *);
+    LD_ELEMREV_V2DI  LD_ELEMREV_VSLL
+  vsll __builtin_vec_xl_be (signed long long, const signed long long *);
+    LD_ELEMREV_V2DI  LD_ELEMREV_SLL
+  vull __builtin_vec_xl_be (signed long long, const vull *);
+    LD_ELEMREV_V2DI  LD_ELEMREV_VULL
+  vull __builtin_vec_xl_be (signed long long, const unsigned long long *);
+    LD_ELEMREV_V2DI  LD_ELEMREV_ULL
+  vsq __builtin_vec_xl_be (signed long long, const signed __int128 *);
+    LD_ELEMREV_V1TI  LD_ELEMREV_SQ
+  vuq __builtin_vec_xl_be (signed long long, const unsigned __int128 *);
+    LD_ELEMREV_V1TI  LD_ELEMREV_UQ
+  vf __builtin_vec_xl_be (signed long long, const vf *);
+    LD_ELEMREV_V4SF  LD_ELEMREV_VF
+  vf __builtin_vec_xl_be (signed long long, const float *);
+    LD_ELEMREV_V4SF  LD_ELEMREV_F
+  vd __builtin_vec_xl_be (signed long long, const vd *);
+    LD_ELEMREV_V2DF  LD_ELEMREV_VD
+  vd __builtin_vec_xl_be (signed long long, const double *);
+    LD_ELEMREV_V2DF  LD_ELEMREV_D
+
+[VEC_XL_LEN_R, vec_xl_len_r, __builtin_vec_xl_len_r, _ARCH_PPC64_PWR9]
+  vuc __builtin_vec_xl_len_r (const unsigned char *, unsigned int);
+    XL_LEN_R
+
+[VEC_XL_SEXT, vec_xl_sext, __builtin_vec_xl_sext, _ARCH_PWR10]
+  vsq __builtin_vec_xl_sext (signed long long, const signed char *);
+    SE_LXVRBX
+  vsq __builtin_vec_xl_sext (signed long long, const signed short *);
+    SE_LXVRHX
+  vsq __builtin_vec_xl_sext (signed long long, const signed int *);
+    SE_LXVRWX
+  vsq __builtin_vec_xl_sext (signed long long, const signed long long *);
+    SE_LXVRDX
+
+[VEC_XL_ZEXT, vec_xl_zext, __builtin_vec_xl_zext, _ARCH_PWR10]
+  vuq __builtin_vec_xl_zext (signed long long, const unsigned char *);
+    ZE_LXVRBX
+  vuq __builtin_vec_xl_zext (signed long long, const unsigned short *);
+    ZE_LXVRHX
+  vuq __builtin_vec_xl_zext (signed long long, const unsigned int *);
+    ZE_LXVRWX
+  vuq __builtin_vec_xl_zext (signed long long, const unsigned long long *);
+    ZE_LXVRDX
+
+[VEC_XOR, vec_xor, __builtin_vec_xor]
+  vsc __builtin_vec_xor (vsc, vsc);
+    VXOR_V16QI
+  vuc __builtin_vec_xor (vuc, vuc);
+    VXOR_V16QI_UNS  VXOR_VUC
+  vbc __builtin_vec_xor (vbc, vbc);
+    VXOR_V16QI_UNS  VXOR_VBC
+  vss __builtin_vec_xor (vss, vss);
+    VXOR_V8HI
+  vus __builtin_vec_xor (vus, vus);
+    VXOR_V8HI_UNS  VXOR_VUS
+  vbs __builtin_vec_xor (vbs, vbs);
+    VXOR_V8HI_UNS  VXOR_VBS
+  vsi __builtin_vec_xor (vsi, vsi);
+    VXOR_V4SI
+  vui __builtin_vec_xor (vui, vui);
+    VXOR_V4SI_UNS  VXOR_VUI
+  vbi __builtin_vec_xor (vbi, vbi);
+    VXOR_V4SI_UNS  VXOR_VBI
+  vsll __builtin_vec_xor (vsll, vsll);
+    VXOR_V2DI
+  vull __builtin_vec_xor (vull, vull);
+    VXOR_V2DI_UNS  VXOR_VULL
+  vbll __builtin_vec_xor (vbll, vbll);
+    VXOR_V2DI_UNS  VXOR_VBLL
+  vf __builtin_vec_xor (vf, vf);
+    VXOR_V4SF
+  vd __builtin_vec_xor (vd, vd);
+    VXOR_V2DF
+; The following variants are deprecated.
+  vsc __builtin_vec_xor (vsc, vbc);
+    VXOR_V16QI  VXOR_VSC_VBC
+  vsc __builtin_vec_xor (vbc, vsc);
+    VXOR_V16QI  VXOR_VBC_VSC
+  vsc __builtin_vec_xor (vsc, vuc);
+    VXOR_V16QI  VXOR_VSC_VUC
+  vuc __builtin_vec_xor (vuc, vbc);
+    VXOR_V16QI_UNS  VXOR_VUC_VBC
+  vuc __builtin_vec_xor (vbc, vuc);
+    VXOR_V16QI_UNS  VXOR_VBC_VUC
+  vuc __builtin_vec_xor (vuc, vsc);
+    VXOR_V16QI_UNS  VXOR_VUC_VSC
+  vss __builtin_vec_xor (vss, vbs);
+    VXOR_V8HI  VXOR_VSS_VBS
+  vss __builtin_vec_xor (vbs, vss);
+    VXOR_V8HI  VXOR_VBS_VSS
+  vus __builtin_vec_xor (vus, vbs);
+    VXOR_V8HI_UNS  VXOR_VUS_VBS
+  vus __builtin_vec_xor (vbs, vus);
+    VXOR_V8HI_UNS  VXOR_VBS_VUS
+  vsi __builtin_vec_xor (vsi, vbi);
+    VXOR_V4SI  VXOR_VSI_VBI
+  vsi __builtin_vec_xor (vbi, vsi);
+    VXOR_V4SI  VXOR_VBI_VSI
+  vui __builtin_vec_xor (vui, vbi);
+    VXOR_V4SI_UNS  VXOR_VUI_VBI
+  vui __builtin_vec_xor (vbi, vui);
+    VXOR_V4SI_UNS  VXOR_VBI_VUI
+  vsll __builtin_vec_xor (vsll, vbll);
+    VXOR_V2DI  VXOR_VSLL_VBLL
+  vsll __builtin_vec_xor (vbll, vsll);
+    VXOR_V2DI  VXOR_VBLL_VSLL
+  vull __builtin_vec_xor (vull, vbll);
+    VXOR_V2DI_UNS  VXOR_VULL_VBLL
+  vull __builtin_vec_xor (vbll, vull);
+    VXOR_V2DI_UNS  VXOR_VBLL_VULL
+  vf __builtin_vec_xor (vf, vbi);
+    VXOR_V4SF  VXOR_VF_VBI
+  vf __builtin_vec_xor (vbi, vf);
+    VXOR_V4SF  VXOR_VBI_VF
+  vd __builtin_vec_xor (vd, vbll);
+    VXOR_V2DF  VXOR_VD_VBLL
+  vd __builtin_vec_xor (vbll, vd);
+    VXOR_V2DF  VXOR_VBLL_VD
+
+[VEC_XST, vec_xst, __builtin_vec_vsx_st, __VSX__]
+  void __builtin_vec_vsx_st (vsc, signed long long, vsc *);
+    STXVW4X_V16QI  STXVW4X_VSC
+  void __builtin_vec_vsx_st (vsc, signed long long, signed char *);
+    STXVW4X_V16QI  STXVW4X_SC
+  void __builtin_vec_vsx_st (vuc, signed long long, vuc *);
+    STXVW4X_V16QI  STXVW4X_VUC
+  void __builtin_vec_vsx_st (vuc, signed long long, unsigned char *);
+    STXVW4X_V16QI  STXVW4X_UC
+  void __builtin_vec_vsx_st (vbc, signed long long, vbc *);
+    STXVW4X_V16QI  STXVW4X_VBC
+  void __builtin_vec_vsx_st (vbc, signed long long, signed char *);
+    STXVW4X_V16QI  STXVW4X_VBC_S
+  void __builtin_vec_vsx_st (vbc, signed long long, unsigned char *);
+    STXVW4X_V16QI  STXVW4X_VBC_U
+  void __builtin_vec_vsx_st (vss, signed long long, vss *);
+    STXVW4X_V8HI  STXVW4X_VSS
+  void __builtin_vec_vsx_st (vss, signed long long, signed short *);
+    STXVW4X_V8HI  STXVW4X_SS
+  void __builtin_vec_vsx_st (vus, signed long long, vus *);
+    STXVW4X_V8HI  STXVW4X_VUS
+  void __builtin_vec_vsx_st (vus, signed long long, unsigned short *);
+    STXVW4X_V8HI  STXVW4X_US
+  void __builtin_vec_vsx_st (vbs, signed long long, vbs *);
+    STXVW4X_V8HI  STXVW4X_VBS
+  void __builtin_vec_vsx_st (vbs, signed long long, signed short *);
+    STXVW4X_V8HI  STXVW4X_VBS_S
+  void __builtin_vec_vsx_st (vbs, signed long long, unsigned short *);
+    STXVW4X_V8HI  STXVW4X_VBS_U
+  void __builtin_vec_vsx_st (vp, signed long long, vp *);
+    STXVW4X_V8HI  STXVW4X_VP
+  void __builtin_vec_vsx_st (vsi, signed long long, vsi *);
+    STXVW4X_V4SI  STXVW4X_VSI
+  void __builtin_vec_vsx_st (vsi, signed long long, signed int *);
+    STXVW4X_V4SI  STXVW4X_SI
+  void __builtin_vec_vsx_st (vui, signed long long, vui *);
+    STXVW4X_V4SI  STXVW4X_VUI
+  void __builtin_vec_vsx_st (vui, signed long long, unsigned int *);
+    STXVW4X_V4SI  STXVW4X_UI
+  void __builtin_vec_vsx_st (vbi, signed long long, vbi *);
+    STXVW4X_V4SI  STXVW4X_VBI
+  void __builtin_vec_vsx_st (vbi, signed long long, signed int *);
+    STXVW4X_V4SI  STXVW4X_VBI_S
+  void __builtin_vec_vsx_st (vbi, signed long long, unsigned int *);
+    STXVW4X_V4SI  STXVW4X_VBI_U
+  void __builtin_vec_vsx_st (vsll, signed long long, vsll *);
+    STXVD2X_V2DI  STXVD2X_VSLL
+  void __builtin_vec_vsx_st (vsll, signed long long, signed long long *);
+    STXVD2X_V2DI  STXVD2X_SLL
+  void __builtin_vec_vsx_st (vull, signed long long, vull *);
+    STXVD2X_V2DI  STXVD2X_VULL
+  void __builtin_vec_vsx_st (vull, signed long long, unsigned long long *);
+    STXVD2X_V2DI  STXVD2X_ULL
+  void __builtin_vec_vsx_st (vbll, signed long long, vbll *);
+    STXVD2X_V2DI  STXVD2X_VBLL
+  void __builtin_vec_vsx_st (vsq, signed long long, signed __int128 *);
+    STXVD2X_V1TI  STXVD2X_SQ
+  void __builtin_vec_vsx_st (vuq, signed long long, unsigned __int128 *);
+    STXVD2X_V1TI  STXVD2X_UQ
+  void __builtin_vec_vsx_st (vf, signed long long, vf *);
+    STXVW4X_V4SF  STXVW4X_VF
+  void __builtin_vec_vsx_st (vf, signed long long, float *);
+    STXVW4X_V4SF  STXVW4X_F
+  void __builtin_vec_vsx_st (vd, signed long long, vd *);
+    STXVD2X_V2DF  STXVD2X_VD
+  void __builtin_vec_vsx_st (vd, signed long long, double *);
+    STXVD2X_V2DF  STXVD2X_D
+
+[VEC_XST_BE, vec_xst_be, __builtin_vec_xst_be, __VSX__]
+  void __builtin_vec_xst_be (vsc, signed long long, vsc *);
+    ST_ELEMREV_V16QI  ST_ELEMREV_VSC
+  void __builtin_vec_xst_be (vsc, signed long long, signed char *);
+    ST_ELEMREV_V16QI  ST_ELEMREV_SC
+  void __builtin_vec_xst_be (vuc, signed long long, vuc *);
+    ST_ELEMREV_V16QI  ST_ELEMREV_VUC
+  void __builtin_vec_xst_be (vuc, signed long long, unsigned char *);
+    ST_ELEMREV_V16QI  ST_ELEMREV_UC
+  void __builtin_vec_xst_be (vss, signed long long, vss *);
+    ST_ELEMREV_V8HI  ST_ELEMREV_VSS
+  void __builtin_vec_xst_be (vss, signed long long, signed short *);
+    ST_ELEMREV_V8HI  ST_ELEMREV_SS
+  void __builtin_vec_xst_be (vus, signed long long, vus *);
+    ST_ELEMREV_V8HI  ST_ELEMREV_VUS
+  void __builtin_vec_xst_be (vus, signed long long, unsigned short *);
+    ST_ELEMREV_V8HI  ST_ELEMREV_US
+  void __builtin_vec_xst_be (vsi, signed long long, vsi *);
+    ST_ELEMREV_V4SI  ST_ELEMREV_VSI
+  void __builtin_vec_xst_be (vsi, signed long long, signed int *);
+    ST_ELEMREV_V4SI  ST_ELEMREV_SI
+  void __builtin_vec_xst_be (vui, signed long long, vui *);
+    ST_ELEMREV_V4SI  ST_ELEMREV_VUI
+  void __builtin_vec_xst_be (vui, signed long long, unsigned int *);
+    ST_ELEMREV_V4SI  ST_ELEMREV_UI
+  void __builtin_vec_xst_be (vsll, signed long long, vsll *);
+    ST_ELEMREV_V2DI  ST_ELEMREV_VSLL
+  void __builtin_vec_xst_be (vsll, signed long long, signed long long *);
+    ST_ELEMREV_V2DI  ST_ELEMREV_SLL
+  void __builtin_vec_xst_be (vull, signed long long, vull *);
+    ST_ELEMREV_V2DI  ST_ELEMREV_VULL
+  void __builtin_vec_xst_be (vull, signed long long, unsigned long long *);
+    ST_ELEMREV_V2DI  ST_ELEMREV_ULL
+  void __builtin_vec_xst_be (vsq, signed long long, signed __int128 *);
+    ST_ELEMREV_V1TI  ST_ELEMREV_SQ
+  void __builtin_vec_xst_be (vuq, signed long long, unsigned __int128 *);
+    ST_ELEMREV_V1TI  ST_ELEMREV_UQ
+  void __builtin_vec_xst_be (vf, signed long long, vf *);
+    ST_ELEMREV_V4SF  ST_ELEMREV_VF
+  void __builtin_vec_xst_be (vf, signed long long, float *);
+    ST_ELEMREV_V4SF  ST_ELEMREV_F
+  void __builtin_vec_xst_be (vd, signed long long, vd *);
+    ST_ELEMREV_V2DF  ST_ELEMREV_VD
+  void __builtin_vec_xst_be (vd, signed long long, double *);
+    ST_ELEMREV_V2DF  ST_ELEMREV_D
+
+[VEC_XST_LEN_R, vec_xst_len_r, __builtin_vec_xst_len_r, _ARCH_PPC64_PWR9]
+  void __builtin_vec_xst_len_r (vuc, unsigned char *, unsigned int);
+    XST_LEN_R
+
+[VEC_XST_TRUNC, vec_xst_trunc, __builtin_vec_xst_trunc, _ARCH_PWR10]
+  void __builtin_vec_xst_trunc (vsq, signed long long, signed char *);
+    TR_STXVRBX  TR_STXVRBX_S
+  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned char *);
+    TR_STXVRBX  TR_STXVRBX_U
+  void __builtin_vec_xst_trunc (vsq, signed long long, signed short *);
+    TR_STXVRHX  TR_STXVRHX_S
+  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned short *);
+    TR_STXVRHX  TR_STXVRHX_U
+  void __builtin_vec_xst_trunc (vsq, signed long long, signed int *);
+    TR_STXVRWX  TR_STXVRWX_S
+  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned int *);
+    TR_STXVRWX  TR_STXVRWX_U
+  void __builtin_vec_xst_trunc (vsq, signed long long, signed long long *);
+    TR_STXVRDX  TR_STXVRDX_S
+  void __builtin_vec_xst_trunc (vuq, signed long long, unsigned long long *);
+    TR_STXVRDX  TR_STXVRDX_U
+
+[VEC_XXPERMDI, vec_xxpermdi, __builtin_vsx_xxpermdi, __VSX__]
+  vsc __builtin_vsx_xxpermdi (vsc, vsc, const int);
+    XXPERMDI_16QI  XXPERMDI_VSC
+  vuc __builtin_vsx_xxpermdi (vuc, vuc, const int);
+    XXPERMDI_16QI  XXPERMDI_VUC
+  vss __builtin_vsx_xxpermdi (vss, vss, const int);
+    XXPERMDI_8HI  XXPERMDI_VSS
+  vus __builtin_vsx_xxpermdi (vus, vus, const int);
+    XXPERMDI_8HI  XXPERMDI_VUS
+  vsi __builtin_vsx_xxpermdi (vsi, vsi, const int);
+    XXPERMDI_4SI  XXPERMDI_VSI
+  vui __builtin_vsx_xxpermdi (vui, vui, const int);
+    XXPERMDI_4SI  XXPERMDI_VUI
+  vsll __builtin_vsx_xxpermdi (vsll, vsll, const int);
+    XXPERMDI_2DI  XXPERMDI_VSLL
+  vull __builtin_vsx_xxpermdi (vull, vull, const int);
+    XXPERMDI_2DI  XXPERMDI_VULL
+  vf __builtin_vsx_xxpermdi (vf, vf, const int);
+    XXPERMDI_4SF  XXPERMDI_VF
+  vd __builtin_vsx_xxpermdi (vd, vd, const int);
+    XXPERMDI_2DF  XXPERMDI_VD
+
+[VEC_XXSLDWI, vec_xxsldwi, __builtin_vsx_xxsldwi, __VSX__]
+  vsc __builtin_vsx_xxsldwi (vsc, vsc, const int);
+    XXSLDWI_16QI  XXSLDWI_VSC2
+  vuc __builtin_vsx_xxsldwi (vuc, vuc, const int);
+    XXSLDWI_16QI  XXSLDWI_VUC2
+  vss __builtin_vsx_xxsldwi (vss, vss, const int);
+    XXSLDWI_8HI  XXSLDWI_VSS2
+  vus __builtin_vsx_xxsldwi (vus, vus, const int);
+    XXSLDWI_8HI  XXSLDWI_VUS2
+  vsi __builtin_vsx_xxsldwi (vsi, vsi, const int);
+    XXSLDWI_4SI  XXSLDWI_VSI2
+  vui __builtin_vsx_xxsldwi (vui, vui, const int);
+    XXSLDWI_4SI  XXSLDWI_VUI2
+  vsll __builtin_vsx_xxsldwi (vsll, vsll, const int);
+    XXSLDWI_2DI  XXSLDWI_VSLL2
+  vull __builtin_vsx_xxsldwi (vull, vull, const int);
+    XXSLDWI_2DI  XXSLDWI_VULL2
+  vf __builtin_vsx_xxsldwi (vf, vf, const int);
+    XXSLDWI_4SF  XXSLDWI_VF2
+  vd __builtin_vsx_xxsldwi (vd, vd, const int);
+    XXSLDWI_2DF  XXSLDWI_VD2
+
+
+; **************************************************************************
+; **************************************************************************
+; ****    Deprecated overloads that should never have existed at all    ****
+; **************************************************************************
+; **************************************************************************
+
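+; The stanzas in this section follow the same format as the rest of
+; the file: a header [<overload-id>, <user-visible name or SKIP>,
+; <builtin name>, <optional enabling predicate>], followed by one
+; prototype line per accepted signature, each paired with the builtin
+; instance it resolves to and a unique identifier for that overload.
+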
+[VEC_LVEBX, vec_lvebx, __builtin_vec_lvebx]
+  vsc __builtin_vec_lvebx (signed long, signed char *);
+    LVEBX  LVEBX_DEPR1
+  vuc __builtin_vec_lvebx (signed long, unsigned char *);
+    LVEBX  LVEBX_DEPR2
+
+[VEC_LVEHX, vec_lvehx, __builtin_vec_lvehx]
+  vss __builtin_vec_lvehx (signed long, signed short *);
+    LVEHX  LVEHX_DEPR1
+  vus __builtin_vec_lvehx (signed long, unsigned short *);
+    LVEHX  LVEHX_DEPR2
+
+[VEC_LVEWX, vec_lvewx, __builtin_vec_lvewx]
+  vf __builtin_vec_lvewx (signed long, float *);
+    LVEWX  LVEWX_DEPR1
+  vsi __builtin_vec_lvewx (signed long, signed int *);
+    LVEWX  LVEWX_DEPR2
+  vui __builtin_vec_lvewx (signed long, unsigned int *);
+    LVEWX  LVEWX_DEPR3
+  vsi __builtin_vec_lvewx (signed long, signed long *);
+    LVEWX  LVEWX_DEPR4
+  vui __builtin_vec_lvewx (signed long, unsigned long *);
+    LVEWX  LVEWX_DEPR5
+
+[VEC_STVEBX, vec_stvebx, __builtin_vec_stvebx]
+  void __builtin_vec_stvebx (vsc, signed long, signed char *);
+    STVEBX  STVEBX_DEPR1
+  void __builtin_vec_stvebx (vuc, signed long, unsigned char *);
+    STVEBX  STVEBX_DEPR2
+  void __builtin_vec_stvebx (vbc, signed long, signed char *);
+    STVEBX  STVEBX_DEPR3
+  void __builtin_vec_stvebx (vbc, signed long, unsigned char *);
+    STVEBX  STVEBX_DEPR4
+  void __builtin_vec_stvebx (vsc, signed long, void *);
+    STVEBX  STVEBX_DEPR5
+  void __builtin_vec_stvebx (vuc, signed long, void *);
+    STVEBX  STVEBX_DEPR6
+
+[VEC_STVEHX, vec_stvehx, __builtin_vec_stvehx]
+  void __builtin_vec_stvehx (vss, signed long, signed short *);
+    STVEHX  STVEHX_DEPR1
+  void __builtin_vec_stvehx (vus, signed long, unsigned short *);
+    STVEHX  STVEHX_DEPR2
+  void __builtin_vec_stvehx (vbs, signed long, signed short *);
+    STVEHX  STVEHX_DEPR3
+  void __builtin_vec_stvehx (vbs, signed long, unsigned short *);
+    STVEHX  STVEHX_DEPR4
+  void __builtin_vec_stvehx (vss, signed long, void *);
+    STVEHX  STVEHX_DEPR5
+  void __builtin_vec_stvehx (vus, signed long, void *);
+    STVEHX  STVEHX_DEPR6
+
+[VEC_STVEWX, vec_stvewx, __builtin_vec_stvewx]
+  void __builtin_vec_stvewx (vf, signed long, float *);
+    STVEWX  STVEWX_DEPR1
+  void __builtin_vec_stvewx (vsi, signed long, signed int *);
+    STVEWX  STVEWX_DEPR2
+  void __builtin_vec_stvewx (vui, signed long, unsigned int *);
+    STVEWX  STVEWX_DEPR3
+  void __builtin_vec_stvewx (vbi, signed long, signed int *);
+    STVEWX  STVEWX_DEPR4
+  void __builtin_vec_stvewx (vbi, signed long, unsigned int *);
+    STVEWX  STVEWX_DEPR5
+  void __builtin_vec_stvewx (vf, signed long, void *);
+    STVEWX  STVEWX_DEPR6
+  void __builtin_vec_stvewx (vsi, signed long, void *);
+    STVEWX  STVEWX_DEPR7
+  void __builtin_vec_stvewx (vui, signed long, void *);
+    STVEWX  STVEWX_DEPR8
+
+[VEC_TSTSFI_EQ_DD, SKIP, __builtin_dfp_dtstsfi_eq_dd, _ARCH_PWR9]
+  signed int __builtin_dfp_dtstsfi_eq_dd (const int, _Decimal64);
+    TSTSFI_EQ_DD  TSTSFI_EQ_DD_DEPR1
+
+[VEC_TSTSFI_EQ_TD, SKIP, __builtin_dfp_dtstsfi_eq_td, _ARCH_PWR9]
+  signed int __builtin_dfp_dtstsfi_eq_td (const int, _Decimal128);
+    TSTSFI_EQ_TD  TSTSFI_EQ_TD_DEPR1
+
+[VEC_TSTSFI_GT_DD, SKIP, __builtin_dfp_dtstsfi_gt_dd, _ARCH_PWR9]
+  signed int __builtin_dfp_dtstsfi_gt_dd (const int, _Decimal64);
+    TSTSFI_GT_DD  TSTSFI_GT_DD_DEPR1
+
+[VEC_TSTSFI_GT_TD, SKIP, __builtin_dfp_dtstsfi_gt_td, _ARCH_PWR9]
+  signed int __builtin_dfp_dtstsfi_gt_td (const int, _Decimal128);
+    TSTSFI_GT_TD  TSTSFI_GT_TD_DEPR1
+
+[VEC_TSTSFI_LT_DD, SKIP, __builtin_dfp_dtstsfi_lt_dd, _ARCH_PWR9]
+  signed int __builtin_dfp_dtstsfi_lt_dd (const int, _Decimal64);
+    TSTSFI_LT_DD  TSTSFI_LT_DD_DEPR1
+
+[VEC_TSTSFI_LT_TD, SKIP, __builtin_dfp_dtstsfi_lt_td, _ARCH_PWR9]
+  signed int __builtin_dfp_dtstsfi_lt_td (const int, _Decimal128);
+    TSTSFI_LT_TD  TSTSFI_LT_TD_DEPR1
+
+[VEC_TSTSFI_OV_DD, SKIP, __builtin_dfp_dtstsfi_ov_dd, _ARCH_PWR9]
+  signed int __builtin_dfp_dtstsfi_ov_dd (const int, _Decimal64);
+    TSTSFI_OV_DD  TSTSFI_OV_DD_DEPR1
+
+[VEC_TSTSFI_OV_TD, SKIP, __builtin_dfp_dtstsfi_ov_td, _ARCH_PWR9]
+  signed int __builtin_dfp_dtstsfi_ov_td (const int, _Decimal128);
+    TSTSFI_OV_TD  TSTSFI_OV_TD_DEPR1
+
+[VEC_VADDCUQ, vec_vaddcuq, __builtin_vec_vaddcuq, _ARCH_PWR8]
+  vsq __builtin_vec_vaddcuq (vsq, vsq);
+    VADDCUQ  VADDCUQ_DEPR1
+  vuq __builtin_vec_vaddcuq (vuq, vuq);
+    VADDCUQ  VADDCUQ_DEPR2
+
+[VEC_VADDECUQ, vec_vaddecuq, __builtin_vec_vaddecuq, _ARCH_PWR8]
+  vsq __builtin_vec_vaddecuq (vsq, vsq, vsq);
+    VADDECUQ  VADDECUQ_DEPR1
+  vuq __builtin_vec_vaddecuq (vuq, vuq, vuq);
+    VADDECUQ  VADDECUQ_DEPR2
+
+[VEC_VADDEUQM, vec_vaddeuqm, __builtin_vec_vaddeuqm, _ARCH_PWR8]
+  vsq __builtin_vec_vaddeuqm (vsq, vsq, vsq);
+    VADDEUQM  VADDEUQM_DEPR1
+  vuq __builtin_vec_vaddeuqm (vuq, vuq, vuq);
+    VADDEUQM  VADDEUQM_DEPR2
+
+[VEC_VADDFP, vec_vaddfp, __builtin_vec_vaddfp]
+  vf __builtin_vec_vaddfp (vf, vf);
+    VADDFP  VADDFP_DEPR1
+
+[VEC_VADDSBS, vec_vaddsbs, __builtin_vec_vaddsbs]
+  vsc __builtin_vec_vaddsbs (vsc, vsc);
+    VADDSBS  VADDSBS_DEPR1
+  vsc __builtin_vec_vaddsbs (vbc, vsc);
+    VADDSBS  VADDSBS_DEPR2
+  vsc __builtin_vec_vaddsbs (vsc, vbc);
+    VADDSBS  VADDSBS_DEPR3
+
+[VEC_VADDSHS, vec_vaddshs, __builtin_vec_vaddshs]
+  vss __builtin_vec_vaddshs (vss, vss);
+    VADDSHS  VADDSHS_DEPR1
+  vss __builtin_vec_vaddshs (vbs, vss);
+    VADDSHS  VADDSHS_DEPR2
+  vss __builtin_vec_vaddshs (vss, vbs);
+    VADDSHS  VADDSHS_DEPR3
+
+[VEC_VADDSWS, vec_vaddsws, __builtin_vec_vaddsws]
+  vsi __builtin_vec_vaddsws (vsi, vsi);
+    VADDSWS  VADDSWS_DEPR1
+  vsi __builtin_vec_vaddsws (vbi, vsi);
+    VADDSWS  VADDSWS_DEPR2
+  vsi __builtin_vec_vaddsws (vsi, vbi);
+    VADDSWS  VADDSWS_DEPR3
+
+[VEC_VADDUBM, vec_vaddubm, __builtin_vec_vaddubm]
+  vsc __builtin_vec_vaddubm (vsc, vsc);
+    VADDUBM  VADDUBM_DEPR1
+  vuc __builtin_vec_vaddubm (vsc, vuc);
+    VADDUBM  VADDUBM_DEPR2
+  vuc __builtin_vec_vaddubm (vuc, vsc);
+    VADDUBM  VADDUBM_DEPR3
+  vuc __builtin_vec_vaddubm (vuc, vuc);
+    VADDUBM  VADDUBM_DEPR4
+  vsc __builtin_vec_vaddubm (vbc, vsc);
+    VADDUBM  VADDUBM_DEPR5
+  vsc __builtin_vec_vaddubm (vsc, vbc);
+    VADDUBM  VADDUBM_DEPR6
+  vuc __builtin_vec_vaddubm (vbc, vuc);
+    VADDUBM  VADDUBM_DEPR7
+  vuc __builtin_vec_vaddubm (vuc, vbc);
+    VADDUBM  VADDUBM_DEPR8
+
+[VEC_VADDUBS, vec_vaddubs, __builtin_vec_vaddubs]
+  vuc __builtin_vec_vaddubs (vsc, vuc);
+    VADDUBS  VADDUBS_DEPR1
+  vuc __builtin_vec_vaddubs (vuc, vsc);
+    VADDUBS  VADDUBS_DEPR2
+  vuc __builtin_vec_vaddubs (vuc, vuc);
+    VADDUBS  VADDUBS_DEPR3
+  vuc __builtin_vec_vaddubs (vbc, vuc);
+    VADDUBS  VADDUBS_DEPR4
+  vuc __builtin_vec_vaddubs (vuc, vbc);
+    VADDUBS  VADDUBS_DEPR5
+
+[VEC_VADDUDM, vec_vaddudm, __builtin_vec_vaddudm, _ARCH_PWR8]
+  vsll __builtin_vec_vaddudm (vbll, vsll);
+    VADDUDM  VADDUDM_DEPR1
+  vsll __builtin_vec_vaddudm (vsll, vbll);
+    VADDUDM  VADDUDM_DEPR2
+  vsll __builtin_vec_vaddudm (vsll, vsll);
+    VADDUDM  VADDUDM_DEPR3
+  vull __builtin_vec_vaddudm (vbll, vull);
+    VADDUDM  VADDUDM_DEPR4
+  vull __builtin_vec_vaddudm (vull, vbll);
+    VADDUDM  VADDUDM_DEPR5
+  vull __builtin_vec_vaddudm (vull, vull);
+    VADDUDM  VADDUDM_DEPR6
+
+[VEC_VADDUHM, vec_vadduhm, __builtin_vec_vadduhm]
+  vss __builtin_vec_vadduhm (vss, vss);
+    VADDUHM  VADDUHM_DEPR1
+  vus __builtin_vec_vadduhm (vss, vus);
+    VADDUHM  VADDUHM_DEPR2
+  vus __builtin_vec_vadduhm (vus, vss);
+    VADDUHM  VADDUHM_DEPR3
+  vus __builtin_vec_vadduhm (vus, vus);
+    VADDUHM  VADDUHM_DEPR4
+  vss __builtin_vec_vadduhm (vbs, vss);
+    VADDUHM  VADDUHM_DEPR5
+  vss __builtin_vec_vadduhm (vss, vbs);
+    VADDUHM  VADDUHM_DEPR6
+  vus __builtin_vec_vadduhm (vbs, vus);
+    VADDUHM  VADDUHM_DEPR7
+  vus __builtin_vec_vadduhm (vus, vbs);
+    VADDUHM  VADDUHM_DEPR8
+
+[VEC_VADDUHS, vec_vadduhs, __builtin_vec_vadduhs]
+  vus __builtin_vec_vadduhs (vss, vus);
+    VADDUHS  VADDUHS_DEPR1
+  vus __builtin_vec_vadduhs (vus, vss);
+    VADDUHS  VADDUHS_DEPR2
+  vus __builtin_vec_vadduhs (vus, vus);
+    VADDUHS  VADDUHS_DEPR3
+  vus __builtin_vec_vadduhs (vbs, vus);
+    VADDUHS  VADDUHS_DEPR4
+  vus __builtin_vec_vadduhs (vus, vbs);
+    VADDUHS  VADDUHS_DEPR5
+
+[VEC_VADDUQM, vec_vadduqm, __builtin_vec_vadduqm, _ARCH_PWR8]
+  vsq __builtin_vec_vadduqm (vsq, vsq);
+    VADDUQM  VADDUQM_DEPR1
+  vuq __builtin_vec_vadduqm (vuq, vuq);
+    VADDUQM  VADDUQM_DEPR2
+
+[VEC_VADDUWM, vec_vadduwm, __builtin_vec_vadduwm]
+  vsi __builtin_vec_vadduwm (vsi, vsi);
+    VADDUWM  VADDUWM_DEPR1
+  vui __builtin_vec_vadduwm (vsi, vui);
+    VADDUWM  VADDUWM_DEPR2
+  vui __builtin_vec_vadduwm (vui, vsi);
+    VADDUWM  VADDUWM_DEPR3
+  vui __builtin_vec_vadduwm (vui, vui);
+    VADDUWM  VADDUWM_DEPR4
+  vsi __builtin_vec_vadduwm (vbi, vsi);
+    VADDUWM  VADDUWM_DEPR5
+  vsi __builtin_vec_vadduwm (vsi, vbi);
+    VADDUWM  VADDUWM_DEPR6
+  vui __builtin_vec_vadduwm (vbi, vui);
+    VADDUWM  VADDUWM_DEPR7
+  vui __builtin_vec_vadduwm (vui, vbi);
+    VADDUWM  VADDUWM_DEPR8
+
+[VEC_VADDUWS, vec_vadduws, __builtin_vec_vadduws]
+  vui __builtin_vec_vadduws (vsi, vui);
+    VADDUWS  VADDUWS_DEPR1
+  vui __builtin_vec_vadduws (vui, vsi);
+    VADDUWS  VADDUWS_DEPR2
+  vui __builtin_vec_vadduws (vui, vui);
+    VADDUWS  VADDUWS_DEPR3
+  vui __builtin_vec_vadduws (vbi, vui);
+    VADDUWS  VADDUWS_DEPR4
+  vui __builtin_vec_vadduws (vui, vbi);
+    VADDUWS  VADDUWS_DEPR5
+
+[VEC_VADUB, vec_absdb, __builtin_vec_vadub]
+  vuc __builtin_vec_vadub (vuc, vuc);
+    VADUB  VADUB_DEPR1
+
+[VEC_VADUH, vec_absdh, __builtin_vec_vaduh]
+  vus __builtin_vec_vaduh (vus, vus);
+    VADUH  VADUH_DEPR1
+
+[VEC_VADUW, vec_absdw, __builtin_vec_vaduw]
+  vui __builtin_vec_vaduw (vui, vui);
+    VADUW  VADUW_DEPR1
+
+[VEC_VAVGSB, vec_vavgsb, __builtin_vec_vavgsb]
+  vsc __builtin_vec_vavgsb (vsc, vsc);
+    VAVGSB  VAVGSB_DEPR1
+
+[VEC_VAVGSH, vec_vavgsh, __builtin_vec_vavgsh]
+  vss __builtin_vec_vavgsh (vss, vss);
+    VAVGSH  VAVGSH_DEPR1
+
+[VEC_VAVGSW, vec_vavgsw, __builtin_vec_vavgsw]
+  vsi __builtin_vec_vavgsw (vsi, vsi);
+    VAVGSW  VAVGSW_DEPR1
+
+[VEC_VAVGUB, vec_vavgub, __builtin_vec_vavgub]
+  vuc __builtin_vec_vavgub (vuc, vuc);
+    VAVGUB  VAVGUB_DEPR1
+
+[VEC_VAVGUH, vec_vavguh, __builtin_vec_vavguh]
+  vus __builtin_vec_vavguh (vus, vus);
+    VAVGUH  VAVGUH_DEPR1
+
+[VEC_VAVGUW, vec_vavguw, __builtin_vec_vavguw]
+  vui __builtin_vec_vavguw (vui, vui);
+    VAVGUW  VAVGUW_DEPR1
+
+[VEC_VBPERMQ, vec_vbpermq, __builtin_vec_vbpermq, _ARCH_PWR8]
+  vull __builtin_vec_vbpermq (vull, vuc);
+    VBPERMQ  VBPERMQ_DEPR1
+  vsll __builtin_vec_vbpermq (vsc, vsc);
+    VBPERMQ  VBPERMQ_DEPR2
+  vull __builtin_vec_vbpermq (vuc, vuc);
+    VBPERMQ  VBPERMQ_DEPR3
+  vull __builtin_vec_vbpermq (vuq, vuc);
+    VBPERMQ  VBPERMQ_DEPR4
+
+[VEC_VCFSX, vec_vcfsx, __builtin_vec_vcfsx]
+  vf __builtin_vec_vcfsx (vsi, const int);
+    VCFSX  VCFSX_DEPR1
+
+[VEC_VCFUX, vec_vcfux, __builtin_vec_vcfux]
+  vf __builtin_vec_vcfux (vui, const int);
+    VCFUX  VCFUX_DEPR1
+
+[VEC_VCLZB, vec_vclzb, __builtin_vec_vclzb, _ARCH_PWR8]
+  vsc __builtin_vec_vclzb (vsc);
+    VCLZB  VCLZB_DEPR1
+  vuc __builtin_vec_vclzb (vuc);
+    VCLZB  VCLZB_DEPR2
+
+[VEC_VCLZD, vec_vclzd, __builtin_vec_vclzd, _ARCH_PWR8]
+  vsll __builtin_vec_vclzd (vsll);
+    VCLZD  VCLZD_DEPR1
+  vull __builtin_vec_vclzd (vull);
+    VCLZD  VCLZD_DEPR2
+
+[VEC_VCLZH, vec_vclzh, __builtin_vec_vclzh, _ARCH_PWR8]
+  vss __builtin_vec_vclzh (vss);
+    VCLZH  VCLZH_DEPR1
+  vus __builtin_vec_vclzh (vus);
+    VCLZH  VCLZH_DEPR2
+
+[VEC_VCLZW, vec_vclzw, __builtin_vec_vclzw, _ARCH_PWR8]
+  vsi __builtin_vec_vclzw (vsi);
+    VCLZW  VCLZW_DEPR1
+  vui __builtin_vec_vclzw (vui);
+    VCLZW  VCLZW_DEPR2
+
+[VEC_VCMPEQFP, vec_vcmpeqfp, __builtin_vec_vcmpeqfp]
+  vbi __builtin_vec_vcmpeqfp (vf, vf);
+    VCMPEQFP  VCMPEQFP_DEPR1
+
+[VEC_VCMPEQUB, vec_vcmpequb, __builtin_vec_vcmpequb]
+  vbc __builtin_vec_vcmpequb (vsc, vsc);
+    VCMPEQUB  VCMPEQUB_DEPR1
+  vbc __builtin_vec_vcmpequb (vuc, vuc);
+    VCMPEQUB  VCMPEQUB_DEPR2
+
+[VEC_VCMPEQUH, vec_vcmpequh, __builtin_vec_vcmpequh]
+  vbs __builtin_vec_vcmpequh (vss, vss);
+    VCMPEQUH  VCMPEQUH_DEPR1
+  vbs __builtin_vec_vcmpequh (vus, vus);
+    VCMPEQUH  VCMPEQUH_DEPR2
+
+[VEC_VCMPEQUW, vec_vcmpequw, __builtin_vec_vcmpequw]
+  vbi __builtin_vec_vcmpequw (vsi, vsi);
+    VCMPEQUW  VCMPEQUW_DEPR1
+  vbi __builtin_vec_vcmpequw (vui, vui);
+    VCMPEQUW  VCMPEQUW_DEPR2
+
+[VEC_VCMPGTFP, vec_vcmpgtfp, __builtin_vec_vcmpgtfp]
+  vbi __builtin_vec_vcmpgtfp (vf, vf);
+    VCMPGTFP  VCMPGTFP_DEPR1
+
+[VEC_VCMPGTSB, vec_vcmpgtsb, __builtin_vec_vcmpgtsb]
+  vbc __builtin_vec_vcmpgtsb (vsc, vsc);
+    VCMPGTSB  VCMPGTSB_DEPR1
+
+[VEC_VCMPGTSH, vec_vcmpgtsh, __builtin_vec_vcmpgtsh]
+  vbs __builtin_vec_vcmpgtsh (vss, vss);
+    VCMPGTSH  VCMPGTSH_DEPR1
+
+[VEC_VCMPGTSW, vec_vcmpgtsw, __builtin_vec_vcmpgtsw]
+  vbi __builtin_vec_vcmpgtsw (vsi, vsi);
+    VCMPGTSW  VCMPGTSW_DEPR1
+
+[VEC_VCMPGTUB, vec_vcmpgtub, __builtin_vec_vcmpgtub]
+  vbc __builtin_vec_vcmpgtub (vuc, vuc);
+    VCMPGTUB  VCMPGTUB_DEPR1
+
+[VEC_VCMPGTUH, vec_vcmpgtuh, __builtin_vec_vcmpgtuh]
+  vbs __builtin_vec_vcmpgtuh (vus, vus);
+    VCMPGTUH  VCMPGTUH_DEPR1
+
+[VEC_VCMPGTUW, vec_vcmpgtuw, __builtin_vec_vcmpgtuw]
+  vbi __builtin_vec_vcmpgtuw (vui, vui);
+    VCMPGTUW  VCMPGTUW_DEPR1
+
+[VEC_VCTZB, vec_vctzb, __builtin_vec_vctzb, _ARCH_PWR9]
+  vsc __builtin_vec_vctzb (vsc);
+    VCTZB  VCTZB_DEPR1
+  vuc __builtin_vec_vctzb (vuc);
+    VCTZB  VCTZB_DEPR2
+
+[VEC_VCTZD, vec_vctzd, __builtin_vec_vctzd, _ARCH_PWR9]
+  vsll __builtin_vec_vctzd (vsll);
+    VCTZD  VCTZD_DEPR1
+  vull __builtin_vec_vctzd (vull);
+    VCTZD  VCTZD_DEPR2
+
+[VEC_VCTZH, vec_vctzh, __builtin_vec_vctzh, _ARCH_PWR9]
+  vss __builtin_vec_vctzh (vss);
+    VCTZH  VCTZH_DEPR1
+  vus __builtin_vec_vctzh (vus);
+    VCTZH  VCTZH_DEPR2
+
+[VEC_VCTZW, vec_vctzw, __builtin_vec_vctzw, _ARCH_PWR9]
+  vsi __builtin_vec_vctzw (vsi);
+    VCTZW  VCTZW_DEPR1
+  vui __builtin_vec_vctzw (vui);
+    VCTZW  VCTZW_DEPR2
+
+[VEC_VEEDP, vec_extract_exp_dp, __builtin_vec_extract_exp_dp, _ARCH_PWR9]
+  vull __builtin_vec_extract_exp_dp (vd);
+    VEEDP  VEEDP_DEPR1
+
+[VEC_VEESP, vec_extract_exp_sp, __builtin_vec_extract_exp_sp, _ARCH_PWR9]
+  vui __builtin_vec_extract_exp_sp (vf);
+    VEESP  VEESP_DEPR1
+
+[VEC_VESDP, vec_extract_sig_dp, __builtin_vec_extract_sig_dp, _ARCH_PWR9]
+  vull __builtin_vec_extract_sig_dp (vd);
+    VESDP  VESDP_DEPR1
+
+[VEC_VESSP, vec_extract_sig_sp, __builtin_vec_extract_sig_sp, _ARCH_PWR9]
+  vui __builtin_vec_extract_sig_sp (vf);
+    VESSP  VESSP_DEPR1
+
+[VEC_VIEDP, vec_insert_exp_dp, __builtin_vec_insert_exp_dp, _ARCH_PWR9]
+  vd __builtin_vec_insert_exp_dp (vd, vull);
+    VIEDP  VIEDP_DEPR1
+  vd __builtin_vec_insert_exp_dp (vull, vull);
+    VIEDP  VIEDP_DEPR2
+
+[VEC_VIESP, vec_insert_exp_sp, __builtin_vec_insert_exp_sp, _ARCH_PWR9]
+  vf __builtin_vec_insert_exp_sp (vf, vui);
+    VIESP  VIESP_DEPR1
+  vf __builtin_vec_insert_exp_sp (vui, vui);
+    VIESP  VIESP_DEPR2
+
+[VEC_VMAXFP, vec_vmaxfp, __builtin_vec_vmaxfp]
+  vf __builtin_vec_vmaxfp (vf, vf);
+    VMAXFP  VMAXFP_DEPR1
+
+[VEC_VMAXSB, vec_vmaxsb, __builtin_vec_vmaxsb]
+  vsc __builtin_vec_vmaxsb (vsc, vsc);
+    VMAXSB  VMAXSB_DEPR1
+  vsc __builtin_vec_vmaxsb (vbc, vsc);
+    VMAXSB  VMAXSB_DEPR2
+  vsc __builtin_vec_vmaxsb (vsc, vbc);
+    VMAXSB  VMAXSB_DEPR3
+
+[VEC_VMAXSD, vec_vmaxsd, __builtin_vec_vmaxsd]
+  vsll __builtin_vec_vmaxsd (vsll, vsll);
+    VMAXSD  VMAXSD_DEPR1
+  vsll __builtin_vec_vmaxsd (vbll, vsll);
+    VMAXSD  VMAXSD_DEPR2
+  vsll __builtin_vec_vmaxsd (vsll, vbll);
+    VMAXSD  VMAXSD_DEPR3
+
+[VEC_VMAXSH, vec_vmaxsh, __builtin_vec_vmaxsh]
+  vss __builtin_vec_vmaxsh (vss, vss);
+    VMAXSH  VMAXSH_DEPR1
+  vss __builtin_vec_vmaxsh (vbs, vss);
+    VMAXSH  VMAXSH_DEPR2
+  vss __builtin_vec_vmaxsh (vss, vbs);
+    VMAXSH  VMAXSH_DEPR3
+
+[VEC_VMAXSW, vec_vmaxsw, __builtin_vec_vmaxsw]
+  vsi __builtin_vec_vmaxsw (vsi, vsi);
+    VMAXSW  VMAXSW_DEPR1
+  vsi __builtin_vec_vmaxsw (vbi, vsi);
+    VMAXSW  VMAXSW_DEPR2
+  vsi __builtin_vec_vmaxsw (vsi, vbi);
+    VMAXSW  VMAXSW_DEPR3
+
+[VEC_VMAXUB, vec_vmaxub, __builtin_vec_vmaxub]
+  vuc __builtin_vec_vmaxub (vsc, vuc);
+    VMAXUB  VMAXUB_DEPR1
+  vuc __builtin_vec_vmaxub (vuc, vsc);
+    VMAXUB  VMAXUB_DEPR2
+  vuc __builtin_vec_vmaxub (vuc, vuc);
+    VMAXUB  VMAXUB_DEPR3
+  vuc __builtin_vec_vmaxub (vbc, vuc);
+    VMAXUB  VMAXUB_DEPR4
+  vuc __builtin_vec_vmaxub (vuc, vbc);
+    VMAXUB  VMAXUB_DEPR5
+
+[VEC_VMAXUD, vec_vmaxud, __builtin_vec_vmaxud]
+  vull __builtin_vec_vmaxud (vull, vull);
+    VMAXUD  VMAXUD_DEPR1
+  vull __builtin_vec_vmaxud (vbll, vull);
+    VMAXUD  VMAXUD_DEPR2
+  vull __builtin_vec_vmaxud (vull, vbll);
+    VMAXUD  VMAXUD_DEPR3
+
+[VEC_VMAXUH, vec_vmaxuh, __builtin_vec_vmaxuh]
+  vus __builtin_vec_vmaxuh (vss, vus);
+    VMAXUH  VMAXUH_DEPR1
+  vus __builtin_vec_vmaxuh (vus, vss);
+    VMAXUH  VMAXUH_DEPR2
+  vus __builtin_vec_vmaxuh (vus, vus);
+    VMAXUH  VMAXUH_DEPR3
+  vus __builtin_vec_vmaxuh (vbs, vus);
+    VMAXUH  VMAXUH_DEPR4
+  vus __builtin_vec_vmaxuh (vus, vbs);
+    VMAXUH  VMAXUH_DEPR5
+
+[VEC_VMAXUW, vec_vmaxuw, __builtin_vec_vmaxuw]
+  vui __builtin_vec_vmaxuw (vsi, vui);
+    VMAXUW  VMAXUW_DEPR1
+  vui __builtin_vec_vmaxuw (vui, vsi);
+    VMAXUW  VMAXUW_DEPR2
+  vui __builtin_vec_vmaxuw (vui, vui);
+    VMAXUW  VMAXUW_DEPR3
+  vui __builtin_vec_vmaxuw (vbi, vui);
+    VMAXUW  VMAXUW_DEPR4
+  vui __builtin_vec_vmaxuw (vui, vbi);
+    VMAXUW  VMAXUW_DEPR5
+
+[VEC_VMINFP, vec_vminfp, __builtin_vec_vminfp]
+  vf __builtin_vec_vminfp (vf, vf);
+    VMINFP  VMINFP_DEPR1
+
+[VEC_VMINSB, vec_vminsb, __builtin_vec_vminsb]
+  vsc __builtin_vec_vminsb (vsc, vsc);
+    VMINSB  VMINSB_DEPR1
+  vsc __builtin_vec_vminsb (vbc, vsc);
+    VMINSB  VMINSB_DEPR2
+  vsc __builtin_vec_vminsb (vsc, vbc);
+    VMINSB  VMINSB_DEPR3
+
+[VEC_VMINSD, vec_vminsd, __builtin_vec_vminsd]
+  vsll __builtin_vec_vminsd (vsll, vsll);
+    VMINSD  VMINSD_DEPR1
+  vsll __builtin_vec_vminsd (vbll, vsll);
+    VMINSD  VMINSD_DEPR2
+  vsll __builtin_vec_vminsd (vsll, vbll);
+    VMINSD  VMINSD_DEPR3
+
+[VEC_VMINSH, vec_vminsh, __builtin_vec_vminsh]
+  vss __builtin_vec_vminsh (vss, vss);
+    VMINSH  VMINSH_DEPR1
+  vss __builtin_vec_vminsh (vbs, vss);
+    VMINSH  VMINSH_DEPR2
+  vss __builtin_vec_vminsh (vss, vbs);
+    VMINSH  VMINSH_DEPR3
+
+[VEC_VMINSW, vec_vminsw, __builtin_vec_vminsw]
+  vsi __builtin_vec_vminsw (vsi, vsi);
+    VMINSW  VMINSW_DEPR1
+  vsi __builtin_vec_vminsw (vbi, vsi);
+    VMINSW  VMINSW_DEPR2
+  vsi __builtin_vec_vminsw (vsi, vbi);
+    VMINSW  VMINSW_DEPR3
+
+[VEC_VMINUB, vec_vminub, __builtin_vec_vminub]
+  vuc __builtin_vec_vminub (vsc, vuc);
+    VMINUB  VMINUB_DEPR1
+  vuc __builtin_vec_vminub (vuc, vsc);
+    VMINUB  VMINUB_DEPR2
+  vuc __builtin_vec_vminub (vuc, vuc);
+    VMINUB  VMINUB_DEPR3
+  vuc __builtin_vec_vminub (vbc, vuc);
+    VMINUB  VMINUB_DEPR4
+  vuc __builtin_vec_vminub (vuc, vbc);
+    VMINUB  VMINUB_DEPR5
+
+[VEC_VMINUD, vec_vminud, __builtin_vec_vminud]
+  vull __builtin_vec_vminud (vull, vull);
+    VMINUD  VMINUD_DEPR1
+  vull __builtin_vec_vminud (vbll, vull);
+    VMINUD  VMINUD_DEPR2
+  vull __builtin_vec_vminud (vull, vbll);
+    VMINUD  VMINUD_DEPR3
+
+[VEC_VMINUH, vec_vminuh, __builtin_vec_vminuh]
+  vus __builtin_vec_vminuh (vss, vus);
+    VMINUH  VMINUH_DEPR1
+  vus __builtin_vec_vminuh (vus, vss);
+    VMINUH  VMINUH_DEPR2
+  vus __builtin_vec_vminuh (vus, vus);
+    VMINUH  VMINUH_DEPR3
+  vus __builtin_vec_vminuh (vbs, vus);
+    VMINUH  VMINUH_DEPR4
+  vus __builtin_vec_vminuh (vus, vbs);
+    VMINUH  VMINUH_DEPR5
+
+[VEC_VMINUW, vec_vminuw, __builtin_vec_vminuw]
+  vui __builtin_vec_vminuw (vsi, vui);
+    VMINUW  VMINUW_DEPR1
+  vui __builtin_vec_vminuw (vui, vsi);
+    VMINUW  VMINUW_DEPR2
+  vui __builtin_vec_vminuw (vui, vui);
+    VMINUW  VMINUW_DEPR3
+  vui __builtin_vec_vminuw (vbi, vui);
+    VMINUW  VMINUW_DEPR4
+  vui __builtin_vec_vminuw (vui, vbi);
+    VMINUW  VMINUW_DEPR5
+
+[VEC_VMRGHB, vec_vmrghb, __builtin_vec_vmrghb]
+  vsc __builtin_vec_vmrghb (vsc, vsc);
+    VMRGHB  VMRGHB_DEPR1
+  vuc __builtin_vec_vmrghb (vuc, vuc);
+    VMRGHB  VMRGHB_DEPR2
+  vbc __builtin_vec_vmrghb (vbc, vbc);
+    VMRGHB  VMRGHB_DEPR3
+
+[VEC_VMRGHH, vec_vmrghh, __builtin_vec_vmrghh]
+  vss __builtin_vec_vmrghh (vss, vss);
+    VMRGHH  VMRGHH_DEPR1
+  vus __builtin_vec_vmrghh (vus, vus);
+    VMRGHH  VMRGHH_DEPR2
+  vbs __builtin_vec_vmrghh (vbs, vbs);
+    VMRGHH  VMRGHH_DEPR3
+  vp __builtin_vec_vmrghh (vp, vp);
+    VMRGHH  VMRGHH_DEPR4
+
+[VEC_VMRGHW, vec_vmrghw, __builtin_vec_vmrghw]
+  vf __builtin_vec_vmrghw (vf, vf);
+    VMRGHW  VMRGHW_DEPR1
+  vsi __builtin_vec_vmrghw (vsi, vsi);
+    VMRGHW  VMRGHW_DEPR2
+  vui __builtin_vec_vmrghw (vui, vui);
+    VMRGHW  VMRGHW_DEPR3
+  vbi __builtin_vec_vmrghw (vbi, vbi);
+    VMRGHW  VMRGHW_DEPR4
+
+[VEC_VMRGLB, vec_vmrglb, __builtin_vec_vmrglb]
+  vsc __builtin_vec_vmrglb (vsc, vsc);
+    VMRGLB  VMRGLB_DEPR1
+  vuc __builtin_vec_vmrglb (vuc, vuc);
+    VMRGLB  VMRGLB_DEPR2
+  vbc __builtin_vec_vmrglb (vbc, vbc);
+    VMRGLB  VMRGLB_DEPR3
+
+[VEC_VMRGLH, vec_vmrglh, __builtin_vec_vmrglh]
+  vss __builtin_vec_vmrglh (vss, vss);
+    VMRGLH  VMRGLH_DEPR1
+  vus __builtin_vec_vmrglh (vus, vus);
+    VMRGLH  VMRGLH_DEPR2
+  vbs __builtin_vec_vmrglh (vbs, vbs);
+    VMRGLH  VMRGLH_DEPR3
+  vp __builtin_vec_vmrglh (vp, vp);
+    VMRGLH  VMRGLH_DEPR4
+
+[VEC_VMRGLW, vec_vmrglw, __builtin_vec_vmrglw]
+  vf __builtin_vec_vmrglw (vf, vf);
+    VMRGLW  VMRGLW_DEPR1
+  vsi __builtin_vec_vmrglw (vsi, vsi);
+    VMRGLW  VMRGLW_DEPR2
+  vui __builtin_vec_vmrglw (vui, vui);
+    VMRGLW  VMRGLW_DEPR3
+  vbi __builtin_vec_vmrglw (vbi, vbi);
+    VMRGLW  VMRGLW_DEPR4
+
+[VEC_VMSUMMBM, vec_vmsummbm, __builtin_vec_vmsummbm]
+  vsi __builtin_vec_vmsummbm (vsc, vuc, vsi);
+    VMSUMMBM  VMSUMMBM_DEPR1
+
+[VEC_VMSUMSHM, vec_vmsumshm, __builtin_vec_vmsumshm]
+  vsi __builtin_vec_vmsumshm (vss, vss, vsi);
+    VMSUMSHM  VMSUMSHM_DEPR1
+
+[VEC_VMSUMSHS, vec_vmsumshs, __builtin_vec_vmsumshs]
+  vsi __builtin_vec_vmsumshs (vss, vss, vsi);
+    VMSUMSHS  VMSUMSHS_DEPR1
+
+[VEC_VMSUMUBM, vec_vmsumubm, __builtin_vec_vmsumubm]
+  vui __builtin_vec_vmsumubm (vuc, vuc, vui);
+    VMSUMUBM  VMSUMUBM_DEPR1
+
+[VEC_VMSUMUDM, vec_vmsumudm, __builtin_vec_vmsumudm]
+  vuq __builtin_vec_vmsumudm (vull, vull, vuq);
+    VMSUMUDM  VMSUMUDM_DEPR1
+
+[VEC_VMSUMUHM, vec_vmsumuhm, __builtin_vec_vmsumuhm]
+  vui __builtin_vec_vmsumuhm (vus, vus, vui);
+    VMSUMUHM  VMSUMUHM_DEPR1
+
+[VEC_VMSUMUHS, vec_vmsumuhs, __builtin_vec_vmsumuhs]
+  vui __builtin_vec_vmsumuhs (vus, vus, vui);
+    VMSUMUHS  VMSUMUHS_DEPR1
+
+[VEC_VMULESB, vec_vmulesb, __builtin_vec_vmulesb]
+  vss __builtin_vec_vmulesb (vsc, vsc);
+    VMULESB  VMULESB_DEPR1
+
+[VEC_VMULESH, vec_vmulesh, __builtin_vec_vmulesh]
+  vsi __builtin_vec_vmulesh (vss, vss);
+    VMULESH  VMULESH_DEPR1
+
+[VEC_VMULESW, SKIP, __builtin_vec_vmulesw]
+  vsll __builtin_vec_vmulesw (vsi, vsi);
+    VMULESW  VMULESW_DEPR1
+
+[VEC_VMULEUB, vec_vmuleub, __builtin_vec_vmuleub]
+  vus __builtin_vec_vmuleub (vuc, vuc);
+    VMULEUB  VMULEUB_DEPR1
+
+[VEC_VMULEUH, vec_vmuleuh, __builtin_vec_vmuleuh]
+  vui __builtin_vec_vmuleuh (vus, vus);
+    VMULEUH  VMULEUH_DEPR1
+
+[VEC_VMULEUW, SKIP, __builtin_vec_vmuleuw]
+  vull __builtin_vec_vmuleuw (vui, vui);
+    VMULEUW  VMULEUW_DEPR1
+
+[VEC_VMULOSB, vec_vmulosb, __builtin_vec_vmulosb]
+  vss __builtin_vec_vmulosb (vsc, vsc);
+    VMULOSB  VMULOSB_DEPR1
+
+[VEC_VMULOSH, vec_vmulosh, __builtin_vec_vmulosh]
+  vsi __builtin_vec_vmulosh (vss, vss);
+    VMULOSH  VMULOSH_DEPR1
+
+[VEC_VMULOSW, SKIP, __builtin_vec_vmulosw]
+  vsll __builtin_vec_vmulosw (vsi, vsi);
+    VMULOSW  VMULOSW_DEPR1
+
+[VEC_VMULOUB, vec_vmuloub, __builtin_vec_vmuloub]
+  vus __builtin_vec_vmuloub (vuc, vuc);
+    VMULOUB  VMULOUB_DEPR1
+
+[VEC_VMULOUH, vec_vmulouh, __builtin_vec_vmulouh]
+  vui __builtin_vec_vmulouh (vus, vus);
+    VMULOUH  VMULOUH_DEPR1
+
+[VEC_VMULOUW, SKIP, __builtin_vec_vmulouw]
+  vull __builtin_vec_vmulouw (vui, vui);
+    VMULOUW  VMULOUW_DEPR1
+
+[VEC_VPKSDSS, vec_vpksdss, __builtin_vec_vpksdss, _ARCH_PWR8]
+  vsi __builtin_vec_vpksdss (vsll, vsll);
+    VPKSDSS  VPKSDSS_DEPR1
+
+[VEC_VPKSDUS, vec_vpksdus, __builtin_vec_vpksdus, _ARCH_PWR8]
+  vui __builtin_vec_vpksdus (vsll, vsll);
+    VPKSDUS  VPKSDUS_DEPR1
+
+[VEC_VPKSHSS, vec_vpkshss, __builtin_vec_vpkshss]
+  vsc __builtin_vec_vpkshss (vss, vss);
+    VPKSHSS  VPKSHSS_DEPR1
+
+[VEC_VPKSHUS, vec_vpkshus, __builtin_vec_vpkshus]
+  vuc __builtin_vec_vpkshus (vss, vss);
+    VPKSHUS  VPKSHUS_DEPR1
+
+[VEC_VPKSWSS, vec_vpkswss, __builtin_vec_vpkswss]
+  vss __builtin_vec_vpkswss (vsi, vsi);
+    VPKSWSS  VPKSWSS_DEPR1
+
+[VEC_VPKSWUS, vec_vpkswus, __builtin_vec_vpkswus]
+  vus __builtin_vec_vpkswus (vsi, vsi);
+    VPKSWUS  VPKSWUS_DEPR1
+
+[VEC_VPKUDUM, vec_vpkudum, __builtin_vec_vpkudum, _ARCH_PWR8]
+  vsi __builtin_vec_vpkudum (vsll, vsll);
+    VPKUDUM  VPKUDUM_DEPR1
+  vui __builtin_vec_vpkudum (vull, vull);
+    VPKUDUM  VPKUDUM_DEPR2
+  vbi __builtin_vec_vpkudum (vbll, vbll);
+    VPKUDUM  VPKUDUM_DEPR3
+
+[VEC_VPKUDUS, vec_vpkudus, __builtin_vec_vpkudus, _ARCH_PWR8]
+  vui __builtin_vec_vpkudus (vull, vull);
+    VPKUDUS  VPKUDUS_DEPR1
+
+[VEC_VPKUHUM, vec_vpkuhum, __builtin_vec_vpkuhum]
+  vsc __builtin_vec_vpkuhum (vss, vss);
+    VPKUHUM  VPKUHUM_DEPR1
+  vuc __builtin_vec_vpkuhum (vus, vus);
+    VPKUHUM  VPKUHUM_DEPR2
+  vbc __builtin_vec_vpkuhum (vbs, vbs);
+    VPKUHUM  VPKUHUM_DEPR3
+
+[VEC_VPKUHUS, vec_vpkuhus, __builtin_vec_vpkuhus]
+  vuc __builtin_vec_vpkuhus (vus, vus);
+    VPKUHUS  VPKUHUS_DEPR1
+
+[VEC_VPKUWUM, vec_vpkuwum, __builtin_vec_vpkuwum]
+  vss __builtin_vec_vpkuwum (vsi, vsi);
+    VPKUWUM  VPKUWUM_DEPR1
+  vus __builtin_vec_vpkuwum (vui, vui);
+    VPKUWUM  VPKUWUM_DEPR2
+  vbs __builtin_vec_vpkuwum (vbi, vbi);
+    VPKUWUM  VPKUWUM_DEPR3
+
+[VEC_VPKUWUS, vec_vpkuwus, __builtin_vec_vpkuwus]
+  vus __builtin_vec_vpkuwus (vui, vui);
+    VPKUWUS  VPKUWUS_DEPR1
+
+[VEC_VPOPCNT, vec_vpopcnt, __builtin_vec_vpopcnt, _ARCH_PWR8]
+  vsc __builtin_vec_vpopcnt (vsc);
+    VPOPCNTB  VPOPCNT_DEPR1
+  vuc __builtin_vec_vpopcnt (vuc);
+    VPOPCNTB  VPOPCNT_DEPR2
+  vss __builtin_vec_vpopcnt (vss);
+    VPOPCNTH  VPOPCNT_DEPR3
+  vus __builtin_vec_vpopcnt (vus);
+    VPOPCNTH  VPOPCNT_DEPR4
+  vsi __builtin_vec_vpopcnt (vsi);
+    VPOPCNTW  VPOPCNT_DEPR5
+  vui __builtin_vec_vpopcnt (vui);
+    VPOPCNTW  VPOPCNT_DEPR6
+  vsll __builtin_vec_vpopcnt (vsll);
+    VPOPCNTD  VPOPCNT_DEPR7
+  vull __builtin_vec_vpopcnt (vull);
+    VPOPCNTD  VPOPCNT_DEPR8
+
+[VEC_VPOPCNTB, vec_vpopcntb, __builtin_vec_vpopcntb, _ARCH_PWR8]
+  vsc __builtin_vec_vpopcntb (vsc);
+    VPOPCNTB  VPOPCNTB_DEPR1
+  vuc __builtin_vec_vpopcntb (vuc);
+    VPOPCNTB  VPOPCNTB_DEPR2
+
+[VEC_VPOPCNTD, vec_vpopcntd, __builtin_vec_vpopcntd, _ARCH_PWR8]
+  vsll __builtin_vec_vpopcntd (vsll);
+    VPOPCNTD  VPOPCNTD_DEPR1
+  vull __builtin_vec_vpopcntd (vull);
+    VPOPCNTD  VPOPCNTD_DEPR2
+
+[VEC_VPOPCNTH, vec_vpopcnth, __builtin_vec_vpopcnth, _ARCH_PWR8]
+  vss __builtin_vec_vpopcnth (vss);
+    VPOPCNTH  VPOPCNTH_DEPR1
+  vus __builtin_vec_vpopcnth (vus);
+    VPOPCNTH  VPOPCNTH_DEPR2
+
+[VEC_VPOPCNTW, vec_vpopcntw, __builtin_vec_vpopcntw, _ARCH_PWR8]
+  vsi __builtin_vec_vpopcntw (vsi);
+    VPOPCNTW  VPOPCNTW_DEPR1
+  vui __builtin_vec_vpopcntw (vui);
+    VPOPCNTW  VPOPCNTW_DEPR2
+
+[VEC_VPRTYBD, vec_vprtybd, __builtin_vec_vprtybd, _ARCH_PWR9]
+  vsll __builtin_vec_vprtybd (vsll);
+    VPRTYBD  VPRTYBD_DEPR1
+  vull __builtin_vec_vprtybd (vull);
+    VPRTYBD  VPRTYBD_DEPR2
+
+[VEC_VPRTYBQ, vec_vprtybq, __builtin_vec_vprtybq, _ARCH_PPC64_PWR9]
+  vsq __builtin_vec_vprtybq (vsq);
+    VPRTYBQ  VPRTYBQ_DEPR1
+  vuq __builtin_vec_vprtybq (vuq);
+    VPRTYBQ  VPRTYBQ_DEPR2
+  signed __int128 __builtin_vec_vprtybq (signed __int128);
+    VPRTYBQ  VPRTYBQ_DEPR3
+  unsigned __int128 __builtin_vec_vprtybq (unsigned __int128);
+    VPRTYBQ  VPRTYBQ_DEPR4
+
+[VEC_VPRTYBW, vec_vprtybw, __builtin_vec_vprtybw, _ARCH_PWR9]
+  vsi __builtin_vec_vprtybw (vsi);
+    VPRTYBW  VPRTYBW_DEPR1
+  vui __builtin_vec_vprtybw (vui);
+    VPRTYBW  VPRTYBW_DEPR2
+
+[VEC_VRLB, vec_vrlb, __builtin_vec_vrlb]
+  vsc __builtin_vec_vrlb (vsc, vuc);
+    VRLB  VRLB_DEPR1
+  vuc __builtin_vec_vrlb (vuc, vuc);
+    VRLB  VRLB_DEPR2
+
+[VEC_VRLD, SKIP, __builtin_vec_vrld, _ARCH_PWR8]
+  vsll __builtin_vec_vrld (vsll, vull);
+    VRLD  VRLD_DEPR1
+  vull __builtin_vec_vrld (vull, vull);
+    VRLD  VRLD_DEPR2
+
+[VEC_VRLH, vec_vrlh, __builtin_vec_vrlh]
+  vss __builtin_vec_vrlh (vss, vus);
+    VRLH  VRLH_DEPR1
+  vus __builtin_vec_vrlh (vus, vus);
+    VRLH  VRLH_DEPR2
+
+[VEC_VRLW, vec_vrlw, __builtin_vec_vrlw]
+  vsi __builtin_vec_vrlw (vsi, vui);
+    VRLW  VRLW_DEPR1
+  vui __builtin_vec_vrlw (vui, vui);
+    VRLW  VRLW_DEPR2
+
+[VEC_VSLB, vec_vslb, __builtin_vec_vslb]
+  vsc __builtin_vec_vslb (vsc, vuc);
+    VSLB  VSLB_DEPR1
+  vuc __builtin_vec_vslb (vuc, vuc);
+    VSLB  VSLB_DEPR2
+
+[VEC_VSLD, SKIP, __builtin_vec_vsld, _ARCH_PWR8]
+  vsll __builtin_vec_vsld (vsll, vull);
+    VSLD  VSLD_DEPR1
+  vull __builtin_vec_vsld (vull, vull);
+    VSLD  VSLD_DEPR2
+
+[VEC_VSLH, vec_vslh, __builtin_vec_vslh]
+  vss __builtin_vec_vslh (vss, vus);
+    VSLH  VSLH_DEPR1
+  vus __builtin_vec_vslh (vus, vus);
+    VSLH  VSLH_DEPR2
+
+[VEC_VSLW, vec_vslw, __builtin_vec_vslw]
+  vsi __builtin_vec_vslw (vsi, vui);
+    VSLW  VSLW_DEPR1
+  vui __builtin_vec_vslw (vui, vui);
+    VSLW  VSLW_DEPR2
+
+[VEC_VSPLTB, vec_vspltb, __builtin_vec_vspltb]
+  vsc __builtin_vec_vspltb (vsc, const int);
+    VSPLTB  VSPLTB_DEPR1
+  vuc __builtin_vec_vspltb (vuc, const int);
+    VSPLTB  VSPLTB_DEPR2
+  vbc __builtin_vec_vspltb (vbc, const int);
+    VSPLTB  VSPLTB_DEPR3
+
+[VEC_VSPLTH, vec_vsplth, __builtin_vec_vsplth]
+  vss __builtin_vec_vsplth (vss, const int);
+    VSPLTH  VSPLTH_DEPR1
+  vus __builtin_vec_vsplth (vus, const int);
+    VSPLTH  VSPLTH_DEPR2
+  vbs __builtin_vec_vsplth (vbs, const int);
+    VSPLTH  VSPLTH_DEPR3
+  vp __builtin_vec_vsplth (vp, const int);
+    VSPLTH  VSPLTH_DEPR4
+
+[VEC_VSPLTW, vec_vspltw, __builtin_vec_vspltw]
+  vsi __builtin_vec_vspltw (vsi, const int);
+    VSPLTW  VSPLTW_DEPR1
+  vui __builtin_vec_vspltw (vui, const int);
+    VSPLTW  VSPLTW_DEPR2
+  vbi __builtin_vec_vspltw (vbi, const int);
+    VSPLTW  VSPLTW_DEPR3
+  vf __builtin_vec_vspltw (vf, const int);
+    VSPLTW  VSPLTW_DEPR4
+
+[VEC_VSRAB, vec_vsrab, __builtin_vec_vsrab]
+  vsc __builtin_vec_vsrab (vsc, vuc);
+    VSRAB  VSRAB_DEPR1
+  vuc __builtin_vec_vsrab (vuc, vuc);
+    VSRAB  VSRAB_DEPR2
+
+[VEC_VSRAD, SKIP, __builtin_vec_vsrad, _ARCH_PWR8]
+  vsll __builtin_vec_vsrad (vsll, vull);
+    VSRAD  VSRAD_DEPR1
+  vull __builtin_vec_vsrad (vull, vull);
+    VSRAD  VSRAD_DEPR2
+
+[VEC_VSRAH, vec_vsrah, __builtin_vec_vsrah]
+  vss __builtin_vec_vsrah (vss, vus);
+    VSRAH  VSRAH_DEPR1
+  vus __builtin_vec_vsrah (vus, vus);
+    VSRAH  VSRAH_DEPR2
+
+[VEC_VSRAW, vec_vsraw, __builtin_vec_vsraw]
+  vsi __builtin_vec_vsraw (vsi, vui);
+    VSRAW  VSRAW_DEPR1
+  vui __builtin_vec_vsraw (vui, vui);
+    VSRAW  VSRAW_DEPR2
+
+[VEC_VSRB, vec_vsrb, __builtin_vec_vsrb]
+  vsc __builtin_vec_vsrb (vsc, vuc);
+    VSRB  VSRB_DEPR1
+  vuc __builtin_vec_vsrb (vuc, vuc);
+    VSRB  VSRB_DEPR2
+
+[VEC_VSRD, SKIP, __builtin_vec_vsrd, _ARCH_PWR8]
+  vsll __builtin_vec_vsrd (vsll, vull);
+    VSRD  VSRD_DEPR1
+  vull __builtin_vec_vsrd (vull, vull);
+    VSRD  VSRD_DEPR2
+
+[VEC_VSRH, vec_vsrh, __builtin_vec_vsrh]
+  vss __builtin_vec_vsrh (vss, vus);
+    VSRH  VSRH_DEPR1
+  vus __builtin_vec_vsrh (vus, vus);
+    VSRH  VSRH_DEPR2
+
+[VEC_VSRW, vec_vsrw, __builtin_vec_vsrw]
+  vsi __builtin_vec_vsrw (vsi, vui);
+    VSRW  VSRW_DEPR1
+  vui __builtin_vec_vsrw (vui, vui);
+    VSRW  VSRW_DEPR2
+
+[VEC_VSTDCDP, scalar_test_data_class_dp, __builtin_vec_scalar_test_data_class_dp, _ARCH_PWR9]
+  unsigned int __builtin_vec_scalar_test_data_class_dp (double, const int);
+    VSTDCDP  VSTDCDP_DEPR1
+
+[VEC_VSTDCNDP, scalar_test_neg_dp, __builtin_vec_scalar_test_neg_dp, _ARCH_PWR9]
+  unsigned int __builtin_vec_scalar_test_neg_dp (double);
+    VSTDCNDP  VSTDCNDP_DEPR1
+
+[VEC_VSTDCNQP, scalar_test_neg_qp, __builtin_vec_scalar_test_neg_qp, _ARCH_PWR9]
+  unsigned int __builtin_vec_scalar_test_neg_qp (_Float128);
+    VSTDCNQP  VSTDCNQP_DEPR1
+
+[VEC_VSTDCNSP, scalar_test_neg_sp, __builtin_vec_scalar_test_neg_sp, _ARCH_PWR9]
+  unsigned int __builtin_vec_scalar_test_neg_sp (float);
+    VSTDCNSP  VSTDCNSP_DEPR1
+
+[VEC_VSTDCQP, scalar_test_data_class_qp, __builtin_vec_scalar_test_data_class_qp, _ARCH_PWR9]
+  unsigned int __builtin_vec_scalar_test_data_class_qp (_Float128, const int);
+    VSTDCQP  VSTDCQP_DEPR1
+
+[VEC_VSTDCSP, scalar_test_data_class_sp, __builtin_vec_scalar_test_data_class_sp, _ARCH_PWR9]
+  unsigned int __builtin_vec_scalar_test_data_class_sp (float, const int);
+    VSTDCSP  VSTDCSP_DEPR1
+
+[VEC_VSUBCUQ, vec_vsubcuq, __builtin_vec_vsubcuq, _ARCH_PWR8]
+  vsq __builtin_vec_vsubcuq (vsq, vsq);
+    VSUBCUQ  VSUBCUQ_DEPR1
+  vuq __builtin_vec_vsubcuq (vuq, vuq);
+    VSUBCUQ  VSUBCUQ_DEPR2
+
+[VEC_VSUBECUQ, vec_vsubecuq, __builtin_vec_vsubecuq, _ARCH_PWR8]
+  vsq __builtin_vec_vsubecuq (vsq, vsq, vsq);
+    VSUBECUQ  VSUBECUQ_DEPR1
+  vuq __builtin_vec_vsubecuq (vuq, vuq, vuq);
+    VSUBECUQ  VSUBECUQ_DEPR2
+
+[VEC_VSUBEUQM, vec_vsubeuqm, __builtin_vec_vsubeuqm, _ARCH_PWR8]
+  vsq __builtin_vec_vsubeuqm (vsq, vsq, vsq);
+    VSUBEUQM  VSUBEUQM_DEPR1
+  vuq __builtin_vec_vsubeuqm (vuq, vuq, vuq);
+    VSUBEUQM  VSUBEUQM_DEPR2
+
+[VEC_VSUBFP, vec_vsubfp, __builtin_vec_vsubfp]
+  vf __builtin_vec_vsubfp (vf, vf);
+    VSUBFP  VSUBFP_DEPR1
+
+[VEC_VSUBSBS, vec_vsubsbs, __builtin_vec_vsubsbs]
+  vsc __builtin_vec_vsubsbs (vsc, vsc);
+    VSUBSBS  VSUBSBS_DEPR1
+  vsc __builtin_vec_vsubsbs (vbc, vsc);
+    VSUBSBS  VSUBSBS_DEPR2
+  vsc __builtin_vec_vsubsbs (vsc, vbc);
+    VSUBSBS  VSUBSBS_DEPR3
+
+[VEC_VSUBSHS, vec_vsubshs, __builtin_vec_vsubshs]
+  vss __builtin_vec_vsubshs (vss, vss);
+    VSUBSHS  VSUBSHS_DEPR1
+  vss __builtin_vec_vsubshs (vbs, vss);
+    VSUBSHS  VSUBSHS_DEPR2
+  vss __builtin_vec_vsubshs (vss, vbs);
+    VSUBSHS  VSUBSHS_DEPR3
+
+[VEC_VSUBSWS, vec_vsubsws, __builtin_vec_vsubsws]
+  vsi __builtin_vec_vsubsws (vsi, vsi);
+    VSUBSWS  VSUBSWS_DEPR1
+  vsi __builtin_vec_vsubsws (vbi, vsi);
+    VSUBSWS  VSUBSWS_DEPR2
+  vsi __builtin_vec_vsubsws (vsi, vbi);
+    VSUBSWS  VSUBSWS_DEPR3
+
+[VEC_VSUBUBM, vec_vsububm, __builtin_vec_vsububm]
+  vsc __builtin_vec_vsububm (vsc, vsc);
+    VSUBUBM  VSUBUBM_DEPR1
+  vuc __builtin_vec_vsububm (vsc, vuc);
+    VSUBUBM  VSUBUBM_DEPR2
+  vuc __builtin_vec_vsububm (vuc, vsc);
+    VSUBUBM  VSUBUBM_DEPR3
+  vuc __builtin_vec_vsububm (vuc, vuc);
+    VSUBUBM  VSUBUBM_DEPR4
+  vsc __builtin_vec_vsububm (vbc, vsc);
+    VSUBUBM  VSUBUBM_DEPR5
+  vsc __builtin_vec_vsububm (vsc, vbc);
+    VSUBUBM  VSUBUBM_DEPR6
+  vuc __builtin_vec_vsububm (vbc, vuc);
+    VSUBUBM  VSUBUBM_DEPR7
+  vuc __builtin_vec_vsububm (vuc, vbc);
+    VSUBUBM  VSUBUBM_DEPR8
+
+[VEC_VSUBUBS, vec_vsububs, __builtin_vec_vsububs]
+  vsc __builtin_vec_vsububs (vsc, vsc);
+    VSUBUBS  VSUBUBS_DEPR1
+  vsc __builtin_vec_vsububs (vbc, vsc);
+    VSUBUBS  VSUBUBS_DEPR2
+  vsc __builtin_vec_vsububs (vsc, vbc);
+    VSUBUBS  VSUBUBS_DEPR3
+  vuc __builtin_vec_vsububs (vsc, vuc);
+    VSUBUBS  VSUBUBS_DEPR4
+  vuc __builtin_vec_vsububs (vuc, vsc);
+    VSUBUBS  VSUBUBS_DEPR5
+  vuc __builtin_vec_vsububs (vuc, vuc);
+    VSUBUBS  VSUBUBS_DEPR6
+  vuc __builtin_vec_vsububs (vbc, vuc);
+    VSUBUBS  VSUBUBS_DEPR7
+  vuc __builtin_vec_vsububs (vuc, vbc);
+    VSUBUBS  VSUBUBS_DEPR8
+
+[VEC_VSUBUDM, vec_vsubudm, __builtin_vec_vsubudm, _ARCH_PWR8]
+  vsll __builtin_vec_vsubudm (vbll, vsll);
+    VSUBUDM  VSUBUDM_DEPR1
+  vsll __builtin_vec_vsubudm (vsll, vbll);
+    VSUBUDM  VSUBUDM_DEPR2
+  vsll __builtin_vec_vsubudm (vsll, vsll);
+    VSUBUDM  VSUBUDM_DEPR3
+  vull __builtin_vec_vsubudm (vbll, vull);
+    VSUBUDM  VSUBUDM_DEPR4
+  vull __builtin_vec_vsubudm (vull, vbll);
+    VSUBUDM  VSUBUDM_DEPR5
+  vull __builtin_vec_vsubudm (vull, vull);
+    VSUBUDM  VSUBUDM_DEPR6
+
+[VEC_VSUBUHM, vec_vsubuhm, __builtin_vec_vsubuhm]
+  vss __builtin_vec_vsubuhm (vss, vss);
+    VSUBUHM  VSUBUHM_DEPR1
+  vus __builtin_vec_vsubuhm (vss, vus);
+    VSUBUHM  VSUBUHM_DEPR2
+  vus __builtin_vec_vsubuhm (vus, vss);
+    VSUBUHM  VSUBUHM_DEPR3
+  vus __builtin_vec_vsubuhm (vus, vus);
+    VSUBUHM  VSUBUHM_DEPR4
+  vss __builtin_vec_vsubuhm (vbs, vss);
+    VSUBUHM  VSUBUHM_DEPR5
+  vss __builtin_vec_vsubuhm (vss, vbs);
+    VSUBUHM  VSUBUHM_DEPR6
+  vus __builtin_vec_vsubuhm (vbs, vus);
+    VSUBUHM  VSUBUHM_DEPR7
+  vus __builtin_vec_vsubuhm (vus, vbs);
+    VSUBUHM  VSUBUHM_DEPR8
+
+[VEC_VSUBUHS, vec_vsubuhs, __builtin_vec_vsubuhs]
+  vus __builtin_vec_vsubuhs (vss, vus);
+    VSUBUHS  VSUBUHS_DEPR1
+  vus __builtin_vec_vsubuhs (vus, vss);
+    VSUBUHS  VSUBUHS_DEPR2
+  vus __builtin_vec_vsubuhs (vus, vus);
+    VSUBUHS  VSUBUHS_DEPR3
+  vus __builtin_vec_vsubuhs (vbs, vus);
+    VSUBUHS  VSUBUHS_DEPR4
+  vus __builtin_vec_vsubuhs (vus, vbs);
+    VSUBUHS  VSUBUHS_DEPR5
+
+[VEC_VSUBUQM, vec_vsubuqm, __builtin_vec_vsubuqm, _ARCH_PWR8]
+  vsq __builtin_vec_vsubuqm (vsq, vsq);
+    VSUBUQM  VSUBUQM_DEPR1
+  vuq __builtin_vec_vsubuqm (vuq, vuq);
+    VSUBUQM  VSUBUQM_DEPR2
+
+[VEC_VSUBUWM, vec_vsubuwm, __builtin_vec_vsubuwm]
+  vsi __builtin_vec_vsubuwm (vbi, vsi);
+    VSUBUWM  VSUBUWM_DEPR1
+  vsi __builtin_vec_vsubuwm (vsi, vbi);
+    VSUBUWM  VSUBUWM_DEPR2
+  vui __builtin_vec_vsubuwm (vbi, vui);
+    VSUBUWM  VSUBUWM_DEPR3
+  vui __builtin_vec_vsubuwm (vui, vbi);
+    VSUBUWM  VSUBUWM_DEPR4
+  vsi __builtin_vec_vsubuwm (vsi, vsi);
+    VSUBUWM  VSUBUWM_DEPR5
+  vui __builtin_vec_vsubuwm (vsi, vui);
+    VSUBUWM  VSUBUWM_DEPR6
+  vui __builtin_vec_vsubuwm (vui, vsi);
+    VSUBUWM  VSUBUWM_DEPR7
+  vui __builtin_vec_vsubuwm (vui, vui);
+    VSUBUWM  VSUBUWM_DEPR8
+
+[VEC_VSUBUWS, vec_vsubuws, __builtin_vec_vsubuws]
+  vui __builtin_vec_vsubuws (vsi, vui);
+    VSUBUWS  VSUBUWS_DEPR1
+  vui __builtin_vec_vsubuws (vui, vsi);
+    VSUBUWS  VSUBUWS_DEPR2
+  vui __builtin_vec_vsubuws (vui, vui);
+    VSUBUWS  VSUBUWS_DEPR3
+  vui __builtin_vec_vsubuws (vbi, vui);
+    VSUBUWS  VSUBUWS_DEPR4
+  vui __builtin_vec_vsubuws (vui, vbi);
+    VSUBUWS  VSUBUWS_DEPR5
+
+[VEC_VSUM4SBS, vec_vsum4sbs, __builtin_vec_vsum4sbs]
+  vsi __builtin_vec_vsum4sbs (vsc, vsi);
+    VSUM4SBS  VSUM4SBS_DEPR1
+
+[VEC_VSUM4SHS, vec_vsum4shs, __builtin_vec_vsum4shs]
+  vsi __builtin_vec_vsum4shs (vss, vsi);
+    VSUM4SHS  VSUM4SHS_DEPR1
+
+[VEC_VSUM4UBS, vec_vsum4ubs, __builtin_vec_vsum4ubs]
+  vui __builtin_vec_vsum4ubs (vuc, vui);
+    VSUM4UBS  VSUM4UBS_DEPR1
+
+[VEC_VTDCDP, vec_test_data_class_dp, __builtin_vec_test_data_class_dp, _ARCH_PWR9]
+  vbll __builtin_vec_test_data_class_dp (vd, const int);
+    VTDCDP  VTDCDP_DEPR1
+
+[VEC_VTDCSP, vec_test_data_class_sp, __builtin_vec_test_data_class_sp, _ARCH_PWR9]
+  vbi __builtin_vec_test_data_class_sp (vf, const int);
+    VTDCSP  VTDCSP_DEPR1
+
+[VEC_UNS_DOUBLEE, vec_uns_doublee, __builtin_vec_uns_doublee]
+  vd __builtin_vec_uns_doublee (vui);
+    UNS_DOUBLEE_V4SI  UNS_DOUBLEE_DEPR1
+
+[VEC_UNS_DOUBLEH, vec_uns_doubleh, __builtin_vec_uns_doubleh]
+  vd __builtin_vec_uns_doubleh (vui);
+    UNS_DOUBLEH_V4SI  UNS_DOUBLEH_DEPR1
+
+[VEC_UNS_DOUBLEL, vec_uns_doublel, __builtin_vec_uns_doublel]
+  vd __builtin_vec_uns_doublel (vui);
+    UNS_DOUBLEL_V4SI  UNS_DOUBLEL_DEPR1
+
+[VEC_UNS_DOUBLEO, vec_uns_doubleo, __builtin_vec_uns_doubleo]
+  vd __builtin_vec_uns_doubleo (vui);
+    UNS_DOUBLEO_V4SI  UNS_DOUBLEO_DEPR1
+
+[VEC_VUPKHPX, vec_vupkhpx, __builtin_vec_vupkhpx]
+  vui __builtin_vec_vupkhpx (vus);
+    VUPKHPX  VUPKHPX_DEPR1
+  vui __builtin_vec_vupkhpx (vp);
+    VUPKHPX  VUPKHPX_DEPR2
+
+[VEC_VUPKHSB, vec_vupkhsb, __builtin_vec_vupkhsb]
+  vss __builtin_vec_vupkhsb (vsc);
+    VUPKHSB  VUPKHSB_DEPR1
+  vbs __builtin_vec_vupkhsb (vbc);
+    VUPKHSB  VUPKHSB_DEPR2
+
+[VEC_VUPKHSH, vec_vupkhsh, __builtin_vec_vupkhsh]
+  vsi __builtin_vec_vupkhsh (vss);
+    VUPKHSH  VUPKHSH_DEPR1
+  vbi __builtin_vec_vupkhsh (vbs);
+    VUPKHSH  VUPKHSH_DEPR2
+
+[VEC_VUPKHSW, vec_vupkhsw, __builtin_vec_vupkhsw, _ARCH_PWR8]
+  vsll __builtin_vec_vupkhsw (vsi);
+    VUPKHSW  VUPKHSW_DEPR1
+  vbll __builtin_vec_vupkhsw (vbi);
+    VUPKHSW  VUPKHSW_DEPR2
+
+[VEC_VUPKLPX, vec_vupklpx, __builtin_vec_vupklpx]
+  vui __builtin_vec_vupklpx (vus);
+    VUPKLPX  VUPKLPX_DEPR1
+  vui __builtin_vec_vupklpx (vp);
+    VUPKLPX  VUPKLPX_DEPR2
+
+[VEC_VUPKLSB, vec_vupklsb, __builtin_vec_vupklsb]
+  vss __builtin_vec_vupklsb (vsc);
+    VUPKLSB  VUPKLSB_DEPR1
+  vbs __builtin_vec_vupklsb (vbc);
+    VUPKLSB  VUPKLSB_DEPR2
+
+[VEC_VUPKLSH, vec_vupklsh, __builtin_vec_vupklsh]
+  vsi __builtin_vec_vupklsh (vss);
+    VUPKLSH  VUPKLSH_DEPR1
+  vbi __builtin_vec_vupklsh (vbs);
+    VUPKLSH  VUPKLSH_DEPR2
+
+[VEC_VUPKLSW, vec_vupklsw, __builtin_vec_vupklsw, _ARCH_PWR8]
+  vsll __builtin_vec_vupklsw (vsi);
+    VUPKLSW  VUPKLSW_DEPR1
+  vbll __builtin_vec_vupklsw (vbi);
+    VUPKLSW  VUPKLSW_DEPR2
-- 
2.27.0


* [PATCH 15/34] rs6000: Execute the automatic built-in initialization code
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (13 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 14/34] rs6000: Add remaining overloads Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-08-26 23:15   ` Segher Boessenkool
  2021-07-29 13:31 ` [PATCH 16/34] rs6000: Darwin builtin support Bill Schmidt
                   ` (18 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-03-04  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (rs6000-builtins.h): New #include.
	(rs6000_init_builtins): Call rs6000_autoinit_builtins; skip the old
	initialization logic when new builtins are enabled.
---
 gcc/config/rs6000/rs6000-call.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index b1338191926..be34a196be0 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -69,6 +69,7 @@
 #include "opts.h"
 
 #include "rs6000-internal.h"
+#include "rs6000-builtins.h"
 
 #if TARGET_MACHO
 #include "gstab.h"  /* for N_SLINE */
@@ -13648,6 +13649,17 @@ rs6000_init_builtins (void)
     = build_pointer_type (build_qualified_type (void_type_node,
 						TYPE_QUAL_CONST));
 
+  /* Execute the autogenerated initialization code for builtins.  */
+  rs6000_autoinit_builtins ();
+
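+  /* With the new builtin machinery live, the hand-written
+     initialization below is no longer needed; just give the subtarget
+     a chance to register its own builtins, then return.  */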
+  if (new_builtins_are_live)
+    {
+#ifdef SUBTARGET_INIT_BUILTINS
+      SUBTARGET_INIT_BUILTINS;
+#endif
+      return;
+    }
+
   /* Create Altivec, VSX and MMA builtins on machines with at least the
      general purpose extensions (970 and newer) to allow the use of
      the target attribute.  */
-- 
2.27.0


* [PATCH 16/34] rs6000: Darwin builtin support
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (14 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 15/34] rs6000: Execute the automatic built-in initialization code Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-08-27 18:01   ` Segher Boessenkool
  2021-07-29 13:31 ` [PATCH 17/34] rs6000: Add sanity to V2DI_type_node definitions Bill Schmidt
                   ` (17 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-03-04  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/darwin.h (SUBTARGET_INIT_BUILTINS): Use the new
	decl when new_builtins_are_live.
	* config/rs6000/rs6000-builtin-new.def (__builtin_cfstring): New
	built-in.
---
 gcc/config/rs6000/darwin.h               | 8 ++++++--
 gcc/config/rs6000/rs6000-builtin-new.def | 6 ++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
index 42f39e60305..6abf8e84f54 100644
--- a/gcc/config/rs6000/darwin.h
+++ b/gcc/config/rs6000/darwin.h
@@ -504,8 +504,12 @@
 #define SUBTARGET_INIT_BUILTINS						\
 do {									\
   darwin_patch_builtins ();						\
-  rs6000_builtin_decls[(unsigned) (RS6000_BUILTIN_CFSTRING)]		\
-    = darwin_init_cfstring_builtins ((unsigned) (RS6000_BUILTIN_CFSTRING)); \
+  if (new_builtins_are_live)						\
+    rs6000_builtin_decls_x[(unsigned) (RS6000_BIF_CFSTRING)]		\
+      = darwin_init_cfstring_builtins ((unsigned) (RS6000_BIF_CFSTRING)); \
+  else									\
+    rs6000_builtin_decls[(unsigned) (RS6000_BUILTIN_CFSTRING)]		\
+      = darwin_init_cfstring_builtins ((unsigned) (RS6000_BUILTIN_CFSTRING)); \
 } while(0)
 
 /* So far, there is no rs6000_fold_builtin, if one is introduced, then
diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 322dbe1f713..91dce7fbc91 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -187,6 +187,12 @@
 ; Builtins that have been around since time immemorial or are just
 ; considered available everywhere.
 [always]
+; __builtin_cfstring is for Darwin, which will replace the decl we
+; create here with another one during subtarget processing.  We just
+; need to ensure it has a slot in the builtin enumeration.
+  void __builtin_cfstring ();
+    CFSTRING nothing {}
+
   void __builtin_cpu_init ();
     CPU_INIT nothing {cpu}
 
-- 
2.27.0


* [PATCH 17/34] rs6000: Add sanity to V2DI_type_node definitions
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (15 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 16/34] rs6000: Darwin builtin support Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-08-27 19:27   ` Segher Boessenkool
  2021-07-29 13:31 ` [PATCH 18/34] rs6000: Always initialize vector_pair and vector_quad nodes Bill Schmidt
                   ` (16 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

It seems quite strange for these to be "vector long" for 64-bit and
"vector long long" for 32-bit, when "vector long long" will do for both.

2021-03-04  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (rs6000_init_builtins): Change
	initialization of V2DI_type_node and unsigned_V2DI_type_node.
---
 gcc/config/rs6000/rs6000-call.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index be34a196be0..7a8bc5f537c 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13296,9 +13296,13 @@ rs6000_init_builtins (void)
 	     (TARGET_ALTIVEC)	   ? ", altivec" : "",
 	     (TARGET_VSX)	   ? ", vsx"	 : "");
 
-  V2DI_type_node = rs6000_vector_type (TARGET_POWERPC64 ? "__vector long"
-				       : "__vector long long",
-				       long_long_integer_type_node, 2);
+  if (new_builtins_are_live)
+    V2DI_type_node = rs6000_vector_type ("__vector long long",
+					 long_long_integer_type_node, 2);
+  else
+    V2DI_type_node = rs6000_vector_type (TARGET_POWERPC64 ? "__vector long"
+					 : "__vector long long",
+					 long_long_integer_type_node, 2);
   ptr_V2DI_type_node
     = build_pointer_type (build_qualified_type (V2DI_type_node,
 						TYPE_QUAL_CONST));
@@ -13349,7 +13353,12 @@ rs6000_init_builtins (void)
     = build_pointer_type (build_qualified_type (unsigned_V4SI_type_node,
 						TYPE_QUAL_CONST));
 
-  unsigned_V2DI_type_node = rs6000_vector_type (TARGET_POWERPC64
+  if (new_builtins_are_live)
+    unsigned_V2DI_type_node
+      = rs6000_vector_type ("__vector unsigned long long",
+			    long_long_unsigned_type_node, 2);
+  else
+    unsigned_V2DI_type_node = rs6000_vector_type (TARGET_POWERPC64
 				       ? "__vector unsigned long"
 				       : "__vector unsigned long long",
 				       long_long_unsigned_type_node, 2);
-- 
2.27.0


* [PATCH 18/34] rs6000: Always initialize vector_pair and vector_quad nodes
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (16 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 17/34] rs6000: Add sanity to V2DI_type_node definitions Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-08-27 19:34   ` Segher Boessenkool
  2021-07-29 13:31 ` [PATCH 19/34] rs6000: Handle overloads during program parsing Bill Schmidt
                   ` (15 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

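With this change the __vector_pair and __vector_quad type nodes are
created unconditionally rather than only under TARGET_EXTRA_BUILTINS,
so they are always registered before anything refers to them.  As a
sketch of the user-visible effect (illustrative, not part of the
original posting), declarations such as

  __vector_pair vp;
  __vector_quad acc;

now always name valid opaque types; whether the MMA builtins that
operate on them are enabled remains a separate question.
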
2021-03-24  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (rs6000_init_builtins): Remove
	TARGET_EXTRA_BUILTINS guard.
---
 gcc/config/rs6000/rs6000-call.c | 51 ++++++++++++++++-----------------
 1 file changed, 24 insertions(+), 27 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 7a8bc5f537c..0c555f29f7d 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13542,34 +13542,31 @@ rs6000_init_builtins (void)
     ieee128_float_type_node = ibm128_float_type_node = long_double_type_node;
 
   /* Vector pair and vector quad support.  */
-  if (TARGET_EXTRA_BUILTINS)
-    {
-      vector_pair_type_node = make_node (OPAQUE_TYPE);
-      SET_TYPE_MODE (vector_pair_type_node, OOmode);
-      TYPE_SIZE (vector_pair_type_node) = bitsize_int (GET_MODE_BITSIZE (OOmode));
-      TYPE_PRECISION (vector_pair_type_node) = GET_MODE_BITSIZE (OOmode);
-      TYPE_SIZE_UNIT (vector_pair_type_node) = size_int (GET_MODE_SIZE (OOmode));
-      SET_TYPE_ALIGN (vector_pair_type_node, 256);
-      TYPE_USER_ALIGN (vector_pair_type_node) = 0;
-      lang_hooks.types.register_builtin_type (vector_pair_type_node,
-					      "__vector_pair");
-      ptr_vector_pair_type_node
-	= build_pointer_type (build_qualified_type (vector_pair_type_node,
-						    TYPE_QUAL_CONST));
+  vector_pair_type_node = make_node (OPAQUE_TYPE);
+  SET_TYPE_MODE (vector_pair_type_node, OOmode);
+  TYPE_SIZE (vector_pair_type_node) = bitsize_int (GET_MODE_BITSIZE (OOmode));
+  TYPE_PRECISION (vector_pair_type_node) = GET_MODE_BITSIZE (OOmode);
+  TYPE_SIZE_UNIT (vector_pair_type_node) = size_int (GET_MODE_SIZE (OOmode));
+  SET_TYPE_ALIGN (vector_pair_type_node, 256);
+  TYPE_USER_ALIGN (vector_pair_type_node) = 0;
+  lang_hooks.types.register_builtin_type (vector_pair_type_node,
+					  "__vector_pair");
+  ptr_vector_pair_type_node
+    = build_pointer_type (build_qualified_type (vector_pair_type_node,
+						TYPE_QUAL_CONST));
 
-      vector_quad_type_node = make_node (OPAQUE_TYPE);
-      SET_TYPE_MODE (vector_quad_type_node, XOmode);
-      TYPE_SIZE (vector_quad_type_node) = bitsize_int (GET_MODE_BITSIZE (XOmode));
-      TYPE_PRECISION (vector_quad_type_node) = GET_MODE_BITSIZE (XOmode);
-      TYPE_SIZE_UNIT (vector_quad_type_node) = size_int (GET_MODE_SIZE (XOmode));
-      SET_TYPE_ALIGN (vector_quad_type_node, 512);
-      TYPE_USER_ALIGN (vector_quad_type_node) = 0;
-      lang_hooks.types.register_builtin_type (vector_quad_type_node,
-					      "__vector_quad");
-      ptr_vector_quad_type_node
-	= build_pointer_type (build_qualified_type (vector_quad_type_node,
-						    TYPE_QUAL_CONST));
-    }
+  vector_quad_type_node = make_node (OPAQUE_TYPE);
+  SET_TYPE_MODE (vector_quad_type_node, XOmode);
+  TYPE_SIZE (vector_quad_type_node) = bitsize_int (GET_MODE_BITSIZE (XOmode));
+  TYPE_PRECISION (vector_quad_type_node) = GET_MODE_BITSIZE (XOmode);
+  TYPE_SIZE_UNIT (vector_quad_type_node) = size_int (GET_MODE_SIZE (XOmode));
+  SET_TYPE_ALIGN (vector_quad_type_node, 512);
+  TYPE_USER_ALIGN (vector_quad_type_node) = 0;
+  lang_hooks.types.register_builtin_type (vector_quad_type_node,
+					  "__vector_quad");
+  ptr_vector_quad_type_node
+    = build_pointer_type (build_qualified_type (vector_quad_type_node,
+						TYPE_QUAL_CONST));
 
   /* Initialize the modes for builtin_function_type, mapping a machine mode to
      tree type node.  */
-- 
2.27.0


* [PATCH 19/34] rs6000: Handle overloads during program parsing
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (17 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 18/34] rs6000: Always initialize vector_pair and vector_quad nodes Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-08-27 23:07   ` Segher Boessenkool
  2021-07-29 13:31 ` [PATCH 20/34] rs6000: Handle gimple folding of target built-ins Bill Schmidt
                   ` (14 subsequent siblings)
  33 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

Although this patch looks quite large, the changes are fairly minimal.
Most of it is duplicating the large function that does the overload
resolution using the automatically generated data structures instead of
the old hand-generated ones.  This doesn't make the patch terribly easy to
review, unfortunately.  Just be aware that generally we aren't changing
the logic and functionality of overload handling.
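
As a rough sketch of what the resolution does (illustrative, not part
of the original posting):

  vector signed int a, b;
  vector signed int c = vec_add (a, b);    /* overloaded */

The call is matched against the registered VEC_ADD signatures and
rewritten into a call to the specific builtin instance for V4SI
addition (__builtin_altivec_vadduwm), exactly as the old hand-written
resolver did.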

2021-06-07  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-c.c (rs6000-builtins.h): New include.
	(altivec_resolve_new_overloaded_builtin): New forward decl.
	(rs6000_new_builtin_type_compatible): New function.
	(altivec_resolve_overloaded_builtin): Call
	altivec_resolve_new_overloaded_builtin.
	(altivec_build_new_resolved_builtin): New function.
	(altivec_resolve_new_overloaded_builtin): Likewise.
	* config/rs6000/rs6000-call.c (rs6000_new_builtin_is_supported_p):
	Likewise.
---
 gcc/config/rs6000/rs6000-c.c    | 1083 +++++++++++++++++++++++++++++++
 gcc/config/rs6000/rs6000-call.c |   91 +++
 2 files changed, 1174 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index afcb5bb6e39..a986e57fe7d 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -35,6 +35,10 @@
 #include "langhooks.h"
 #include "c/c-tree.h"
 
+#include "rs6000-builtins.h"
+
+static tree
+altivec_resolve_new_overloaded_builtin (location_t, tree, void *);
 
 
 /* Handle the machine specific pragma longcall.  Its syntax is
@@ -811,6 +815,30 @@ is_float128_p (tree t)
 	      && t == long_double_type_node));
 }
   
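+/* Return whether the type T of an actual argument is acceptable for
+   parameter type U of an overloaded builtin: any two integral types
+   match; 128-bit floating-point types match each other when long
+   double is IEEE-128; pointer types are compared by their target
+   types after const-qualification is normalized; anything else
+   defers to the language's type-compatibility hook.  */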
+static bool
+rs6000_new_builtin_type_compatible (tree t, tree u)
+{
+  if (t == error_mark_node)
+    return false;
+
+  if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (u))
+    return true;
+
+  if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
+      && is_float128_p (t) && is_float128_p (u))
+    return true;
+
+  if (POINTER_TYPE_P (t) && POINTER_TYPE_P (u))
+    {
+      t = TREE_TYPE (t);
+      u = TREE_TYPE (u);
+      if (TYPE_READONLY (u))
+	t = build_qualified_type (t, TYPE_QUAL_CONST);
+    }
+
+  return lang_hooks.types_compatible_p (t, u);
+}
+
 static inline bool
 rs6000_builtin_type_compatible (tree t, int id)
 {
@@ -927,6 +955,10 @@ tree
 altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
 				    void *passed_arglist)
 {
+  if (new_builtins_are_live)
+    return altivec_resolve_new_overloaded_builtin (loc, fndecl,
+						   passed_arglist);
+
   vec<tree, va_gc> *arglist = static_cast<vec<tree, va_gc> *> (passed_arglist);
   unsigned int nargs = vec_safe_length (arglist);
   enum rs6000_builtins fcode
@@ -1930,3 +1962,1054 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
     return error_mark_node;
   }
 }
+
+/* Build a tree for a function call to an Altivec non-overloaded builtin.
+   The overloaded builtin that matched the types and args is described
+   by BIF_ID and OVLD_ID.  The N arguments are given in ARGS.
+
+   Actually the only thing this does is call fold_convert on ARGS, with
+   a small exception for vec_{all,any}_{ge,le} predicates.  */
+
+static tree
+altivec_build_new_resolved_builtin (tree *args, int n, tree fntype,
+				    tree ret_type,
+				    rs6000_gen_builtins bif_id,
+				    rs6000_gen_builtins ovld_id)
+{
+  tree argtypes = TYPE_ARG_TYPES (fntype);
+  tree arg_type[MAX_OVLD_ARGS];
+  tree fndecl = rs6000_builtin_decls_x[bif_id];
+  tree call;
+
+  for (int i = 0; i < n; i++)
+    arg_type[i] = TREE_VALUE (argtypes), argtypes = TREE_CHAIN (argtypes);
+
+  /* The AltiVec overloading implementation is overall gross, but this
+     is particularly disgusting.  The vec_{all,any}_{ge,le} builtins
+     are completely different for floating-point vs. integer vector
+     types, because the former has vcmpgefp, but the latter should use
+     vcmpgtXX.
+
+     In practice, the second and third arguments are swapped, and the
+     condition (LT vs. EQ, which is recognizable by bit 1 of the first
+     argument) is reversed.  Patch the arguments here before building
+     the resolved CALL_EXPR.  */
+  if (n == 3
+      && ovld_id == RS6000_OVLD_VEC_CMPGE_P
+      && bif_id != RS6000_BIF_VCMPGEFP_P
+      && bif_id != RS6000_BIF_XVCMPGEDP_P)
+    {
+      std::swap (args[1], args[2]);
+      std::swap (arg_type[1], arg_type[2]);
+
+      args[0] = fold_build2 (BIT_XOR_EXPR, TREE_TYPE (args[0]), args[0],
+			     build_int_cst (NULL_TREE, 2));
+    }
+
+  /* If the number of arguments to an overloaded function increases,
+     we must expand this switch.  */
+  gcc_assert (MAX_OVLD_ARGS <= 4);
+
+  switch (n)
+    {
+    case 0:
+      call = build_call_expr (fndecl, 0);
+      break;
+    case 1:
+      call = build_call_expr (fndecl, 1,
+			      fully_fold_convert (arg_type[0], args[0]));
+      break;
+    case 2:
+      call = build_call_expr (fndecl, 2,
+			      fully_fold_convert (arg_type[0], args[0]),
+			      fully_fold_convert (arg_type[1], args[1]));
+      break;
+    case 3:
+      call = build_call_expr (fndecl, 3,
+			      fully_fold_convert (arg_type[0], args[0]),
+			      fully_fold_convert (arg_type[1], args[1]),
+			      fully_fold_convert (arg_type[2], args[2]));
+      break;
+    case 4:
+      call = build_call_expr (fndecl, 4,
+			      fully_fold_convert (arg_type[0], args[0]),
+			      fully_fold_convert (arg_type[1], args[1]),
+			      fully_fold_convert (arg_type[2], args[2]),
+			      fully_fold_convert (arg_type[3], args[3]));
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  return fold_convert (ret_type, call);
+}
+
+/* Implementation of the resolve_overloaded_builtin target hook, to
+   support Altivec's overloaded builtins.  */
+
+static tree
+altivec_resolve_new_overloaded_builtin (location_t loc, tree fndecl,
+					void *passed_arglist)
+{
+  vec<tree, va_gc> *arglist = static_cast<vec<tree, va_gc> *> (passed_arglist);
+  unsigned int nargs = vec_safe_length (arglist);
+  enum rs6000_gen_builtins fcode
+    = (enum rs6000_gen_builtins) DECL_MD_FUNCTION_CODE (fndecl);
+  tree fnargs = TYPE_ARG_TYPES (TREE_TYPE (fndecl));
+  tree types[MAX_OVLD_ARGS], args[MAX_OVLD_ARGS];
+  unsigned int n;
+
+  /* Return immediately if this isn't an overload.  */
+  if (fcode <= RS6000_OVLD_NONE)
+    return NULL_TREE;
+
+  unsigned int adj_fcode = fcode - RS6000_OVLD_NONE;
+
+  if (TARGET_DEBUG_BUILTIN)
+    fprintf (stderr,
+	     "altivec_resolve_new_overloaded_builtin, code = %4d, %s\n",
+	     (int) fcode, IDENTIFIER_POINTER (DECL_NAME (fndecl)));
+
+  /* vec_lvsl and vec_lvsr are deprecated for use with LE element order.  */
+  if (fcode == RS6000_OVLD_VEC_LVSL && !BYTES_BIG_ENDIAN)
+    warning (OPT_Wdeprecated,
+	     "%<vec_lvsl%> is deprecated for little endian; use "
+	     "assignment for unaligned loads and stores");
+  else if (fcode == RS6000_OVLD_VEC_LVSR && !BYTES_BIG_ENDIAN)
+    warning (OPT_Wdeprecated,
+	     "%<vec_lvsr%> is deprecated for little endian; use "
+	     "assignment for unaligned loads and stores");
+
+  if (fcode == RS6000_OVLD_VEC_MUL)
+    {
+      /* vec_mul needs to be special cased because there are no instructions
+	 for it for the {un}signed char, {un}signed short, and {un}signed int
+	 types.  */
+      if (nargs != 2)
+	{
+	  error ("builtin %qs only accepts 2 arguments", "vec_mul");
+	  return error_mark_node;
+	}
+
+      tree arg0 = (*arglist)[0];
+      tree arg0_type = TREE_TYPE (arg0);
+      tree arg1 = (*arglist)[1];
+      tree arg1_type = TREE_TYPE (arg1);
+
+      /* Both arguments must be vectors and the types must be compatible.  */
+      if (TREE_CODE (arg0_type) != VECTOR_TYPE)
+	goto bad;
+      if (!lang_hooks.types_compatible_p (arg0_type, arg1_type))
+	goto bad;
+
+      switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+	{
+	  case E_QImode:
+	  case E_HImode:
+	  case E_SImode:
+	  case E_DImode:
+	  case E_TImode:
+	    {
+	      /* For scalar types just use a multiply expression.  */
+	      return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0), arg0,
+				      fold_convert (TREE_TYPE (arg0), arg1));
+	    }
+	  case E_SFmode:
+	    {
+	      /* For floats use the xvmulsp instruction directly.  */
+	      tree call = rs6000_builtin_decls_x[RS6000_BIF_XVMULSP];
+	      return build_call_expr (call, 2, arg0, arg1);
+	    }
+	  case E_DFmode:
+	    {
+	      /* For doubles use the xvmuldp instruction directly.  */
+	      tree call = rs6000_builtin_decls_x[RS6000_BIF_XVMULDP];
+	      return build_call_expr (call, 2, arg0, arg1);
+	    }
+	  /* Other types are errors.  */
+	  default:
+	    goto bad;
+	}
+    }
+
+  if (fcode == RS6000_OVLD_VEC_CMPNE)
+    {
+      /* vec_cmpne needs to be special cased because there are no
+	 instructions for it (prior to Power9).  */
+      if (nargs != 2)
+	{
+	  error ("builtin %qs only accepts 2 arguments", "vec_cmpne");
+	  return error_mark_node;
+	}
+
+      tree arg0 = (*arglist)[0];
+      tree arg0_type = TREE_TYPE (arg0);
+      tree arg1 = (*arglist)[1];
+      tree arg1_type = TREE_TYPE (arg1);
+
+      /* Both arguments must be vectors and the types must be compatible.  */
+      if (TREE_CODE (arg0_type) != VECTOR_TYPE)
+	goto bad;
+      if (!lang_hooks.types_compatible_p (arg0_type, arg1_type))
+	goto bad;
+
+      /* Power9 provides vcmpne[bhw] instructions for the QImode, HImode,
+	 and SImode element types, which give the most efficient
+	 implementation of vec_cmpne.  For other element types, or when
+	 not targeting Power9, open-code the operation below.  */
+      if (!TARGET_P9_VECTOR
+	  || (TYPE_MODE (TREE_TYPE (arg0_type)) == DImode)
+	  || (TYPE_MODE (TREE_TYPE (arg0_type)) == TImode)
+	  || (TYPE_MODE (TREE_TYPE (arg0_type)) == SFmode)
+	  || (TYPE_MODE (TREE_TYPE (arg0_type)) == DFmode))
+	{
+	  switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+	    {
+	      /* vec_cmpne (va, vb) == vec_nor (vec_cmpeq (va, vb),
+		 vec_cmpeq (va, vb)).  */
+	      /* Note: vec_nand also works, but the optimizer changes
+		 vec_nand to vec_nor anyway.  */
+	    case E_QImode:
+	    case E_HImode:
+	    case E_SImode:
+	    case E_DImode:
+	    case E_TImode:
+	    case E_SFmode:
+	    case E_DFmode:
+	      {
+		/* call = vec_cmpeq (va, vb)
+		   result = vec_nor (call, call).  */
+		vec<tree, va_gc> *params = make_tree_vector ();
+		vec_safe_push (params, arg0);
+		vec_safe_push (params, arg1);
+		tree call = altivec_resolve_new_overloaded_builtin
+		  (loc, rs6000_builtin_decls_x[RS6000_OVLD_VEC_CMPEQ],
+		   params);
+		/* Use save_expr to ensure that operands used more than once
+		   that may have side effects (like calls) are only evaluated
+		   once.  */
+		call = save_expr (call);
+		params = make_tree_vector ();
+		vec_safe_push (params, call);
+		vec_safe_push (params, call);
+		return altivec_resolve_new_overloaded_builtin
+		  (loc, rs6000_builtin_decls_x[RS6000_OVLD_VEC_NOR], params);
+	      }
+	      /* Other types are errors.  */
+	    default:
+	      goto bad;
+	    }
+	}
+      /* Otherwise, fall through and process the Power9 alternative
+	 below.  */
+    }
+
+  if (fcode == RS6000_OVLD_VEC_ADDE || fcode == RS6000_OVLD_VEC_SUBE)
+    {
+      /* vec_adde and vec_sube need to be special cased because there
+	 is no instruction for the {un}signed int version.  */
+      if (nargs != 3)
+	{
+	  const char *name
+	    = fcode == RS6000_OVLD_VEC_ADDE ? "vec_adde" : "vec_sube";
+	  error ("builtin %qs only accepts 3 arguments", name);
+	  return error_mark_node;
+	}
+
+      tree arg0 = (*arglist)[0];
+      tree arg0_type = TREE_TYPE (arg0);
+      tree arg1 = (*arglist)[1];
+      tree arg1_type = TREE_TYPE (arg1);
+      tree arg2 = (*arglist)[2];
+      tree arg2_type = TREE_TYPE (arg2);
+
+      /* All 3 arguments must be vectors of (signed or unsigned) (int or
+	 __int128) and the types must be compatible.  */
+      if (TREE_CODE (arg0_type) != VECTOR_TYPE)
+	goto bad;
+      if (!lang_hooks.types_compatible_p (arg0_type, arg1_type)
+	  || !lang_hooks.types_compatible_p (arg1_type, arg2_type))
+	goto bad;
+
+      switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+	{
+	  /* For {un}signed ints,
+	     vec_adde (va, vb, carryv) == vec_add (vec_add (va, vb),
+						   vec_and (carryv, 1)).
+	     vec_sube (va, vb, carryv) == vec_sub (vec_sub (va, vb),
+						   vec_and (carryv, 1)).  */
+	  case E_SImode:
+	    {
+	      tree add_sub_builtin;
+
+	      vec<tree, va_gc> *params = make_tree_vector ();
+	      vec_safe_push (params, arg0);
+	      vec_safe_push (params, arg1);
+
+	      if (fcode == RS6000_OVLD_VEC_ADDE)
+		add_sub_builtin = rs6000_builtin_decls_x[RS6000_OVLD_VEC_ADD];
+	      else
+		add_sub_builtin = rs6000_builtin_decls_x[RS6000_OVLD_VEC_SUB];
+
+	      tree call
+		= altivec_resolve_new_overloaded_builtin (loc,
+							  add_sub_builtin,
+							  params);
+	      tree const1 = build_int_cstu (TREE_TYPE (arg0_type), 1);
+	      tree ones_vector = build_vector_from_val (arg0_type, const1);
+	      tree and_expr = fold_build2_loc (loc, BIT_AND_EXPR, arg0_type,
+					       arg2, ones_vector);
+	      params = make_tree_vector ();
+	      vec_safe_push (params, call);
+	      vec_safe_push (params, and_expr);
+	      return altivec_resolve_new_overloaded_builtin (loc,
+							     add_sub_builtin,
+							     params);
+	    }
+	  /* For {un}signed __int128s use the vaddeuqm/vsubeuqm instruction
+	     directly.  */
+	  case E_TImode:
+	    break;
+
+	  /* Types other than {un}signed int and {un}signed __int128
+		are errors.  */
+	  default:
+	    goto bad;
+	}
+    }
+
+  if (fcode == RS6000_OVLD_VEC_ADDEC || fcode == RS6000_OVLD_VEC_SUBEC)
+    {
+      /* vec_addec and vec_subec need to be special cased because there
+	 is no instruction for the {un}signed int version.  */
+      if (nargs != 3)
+	{
+	  const char *name
+	    = fcode == RS6000_OVLD_VEC_ADDEC ? "vec_addec" : "vec_subec";
+	  error ("builtin %qs only accepts 3 arguments", name);
+	  return error_mark_node;
+	}
+
+      tree arg0 = (*arglist)[0];
+      tree arg0_type = TREE_TYPE (arg0);
+      tree arg1 = (*arglist)[1];
+      tree arg1_type = TREE_TYPE (arg1);
+      tree arg2 = (*arglist)[2];
+      tree arg2_type = TREE_TYPE (arg2);
+
+      /* All 3 arguments must be vectors of (signed or unsigned) (int or
+	 __int128) and the types must be compatible.  */
+      if (TREE_CODE (arg0_type) != VECTOR_TYPE)
+	goto bad;
+      if (!lang_hooks.types_compatible_p (arg0_type, arg1_type)
+	  || !lang_hooks.types_compatible_p (arg1_type, arg2_type))
+	goto bad;
+
+      switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+	{
+	  /* For {un}signed ints,
+	      vec_addec (va, vb, carryv) ==
+				vec_or (vec_addc (va, vb),
+					vec_addc (vec_add (va, vb),
+						  vec_and (carryv, 0x1))).  */
+	  case E_SImode:
+	    {
+	    /* Use save_expr to ensure that operands used more than once
+		that may have side effects (like calls) are only evaluated
+		once.  */
+	    tree as_builtin;
+	    tree as_c_builtin;
+
+	    arg0 = save_expr (arg0);
+	    arg1 = save_expr (arg1);
+	    vec<tree, va_gc> *params = make_tree_vector ();
+	    vec_safe_push (params, arg0);
+	    vec_safe_push (params, arg1);
+
+	    if (fcode == RS6000_OVLD_VEC_ADDEC)
+	      as_c_builtin = rs6000_builtin_decls_x[RS6000_OVLD_VEC_ADDC];
+	    else
+	      as_c_builtin = rs6000_builtin_decls_x[RS6000_OVLD_VEC_SUBC];
+
+	    tree call1 = altivec_resolve_new_overloaded_builtin (loc,
+								 as_c_builtin,
+								 params);
+	    params = make_tree_vector ();
+	    vec_safe_push (params, arg0);
+	    vec_safe_push (params, arg1);
+
+	    if (fcode == RS6000_OVLD_VEC_ADDEC)
+	      as_builtin = rs6000_builtin_decls_x[RS6000_OVLD_VEC_ADD];
+	    else
+	      as_builtin = rs6000_builtin_decls_x[RS6000_OVLD_VEC_SUB];
+
+	    tree call2 = altivec_resolve_new_overloaded_builtin (loc,
+								 as_builtin,
+								 params);
+	    tree const1 = build_int_cstu (TREE_TYPE (arg0_type), 1);
+	    tree ones_vector = build_vector_from_val (arg0_type, const1);
+	    tree and_expr = fold_build2_loc (loc, BIT_AND_EXPR, arg0_type,
+					     arg2, ones_vector);
+	    params = make_tree_vector ();
+	    vec_safe_push (params, call2);
+	    vec_safe_push (params, and_expr);
+	    call2 = altivec_resolve_new_overloaded_builtin (loc, as_c_builtin,
+							    params);
+	    params = make_tree_vector ();
+	    vec_safe_push (params, call1);
+	    vec_safe_push (params, call2);
+	    tree or_builtin = rs6000_builtin_decls_x[RS6000_OVLD_VEC_OR];
+	    return altivec_resolve_new_overloaded_builtin (loc, or_builtin,
+							   params);
+	    }
+	  /* For {un}signed __int128s use the vaddecuq/vsubecuq
+	     instructions.  This occurs through normal processing.  */
+	  case E_TImode:
+	    break;
+
+	  /* Types other than {un}signed int and {un}signed __int128
+		are errors.  */
+	  default:
+	    goto bad;
+	}
+    }
+
+  /* For now treat vec_splats and vec_promote as the same.  */
+  if (fcode == RS6000_OVLD_VEC_SPLATS || fcode == RS6000_OVLD_VEC_PROMOTE)
+    {
+      tree type, arg;
+      int size;
+      int i;
+      bool unsigned_p;
+      vec<constructor_elt, va_gc> *vec;
+      const char *name
+	= fcode == RS6000_OVLD_VEC_SPLATS ? "vec_splats" : "vec_promote";
+
+      if (fcode == RS6000_OVLD_VEC_SPLATS && nargs != 1)
+	{
+	  error ("builtin %qs only accepts 1 argument", name);
+	  return error_mark_node;
+	}
+      if (fcode == RS6000_OVLD_VEC_PROMOTE && nargs != 2)
+	{
+	  error ("builtin %qs only accepts 2 arguments", name);
+	  return error_mark_node;
+	}
+      /* Ignore promote's element argument.  */
+      if (fcode == RS6000_OVLD_VEC_PROMOTE
+	  && !INTEGRAL_TYPE_P (TREE_TYPE ((*arglist)[1])))
+	goto bad;
+
+      arg = (*arglist)[0];
+      type = TREE_TYPE (arg);
+      if (!SCALAR_FLOAT_TYPE_P (type)
+	  && !INTEGRAL_TYPE_P (type))
+	goto bad;
+      unsigned_p = TYPE_UNSIGNED (type);
+      switch (TYPE_MODE (type))
+	{
+	  case E_TImode:
+	    type = (unsigned_p ? unsigned_V1TI_type_node : V1TI_type_node);
+	    size = 1;
+	    break;
+	  case E_DImode:
+	    type = (unsigned_p ? unsigned_V2DI_type_node : V2DI_type_node);
+	    size = 2;
+	    break;
+	  case E_SImode:
+	    type = (unsigned_p ? unsigned_V4SI_type_node : V4SI_type_node);
+	    size = 4;
+	    break;
+	  case E_HImode:
+	    type = (unsigned_p ? unsigned_V8HI_type_node : V8HI_type_node);
+	    size = 8;
+	    break;
+	  case E_QImode:
+	    type = (unsigned_p ? unsigned_V16QI_type_node : V16QI_type_node);
+	    size = 16;
+	    break;
+	  case E_SFmode: type = V4SF_type_node; size = 4; break;
+	  case E_DFmode: type = V2DF_type_node; size = 2; break;
+	  default:
+	    goto bad;
+	}
+      arg = save_expr (fold_convert (TREE_TYPE (type), arg));
+      vec_alloc (vec, size);
+      for (i = 0; i < size; i++)
+	{
+	  constructor_elt elt = {NULL_TREE, arg};
+	  vec->quick_push (elt);
+	}
+      return build_constructor (type, vec);
+    }
+
+  /* For now use pointer tricks to do the extraction, unless we are on VSX
+     extracting a double from a constant offset.  */
+  if (fcode == RS6000_OVLD_VEC_EXTRACT)
+    {
+      tree arg1;
+      tree arg1_type;
+      tree arg2;
+      tree arg1_inner_type;
+      tree decl, stmt;
+      tree innerptrtype;
+      machine_mode mode;
+
+      /* Require exactly two arguments.  */
+      if (nargs != 2)
+	{
+	  error ("builtin %qs only accepts 2 arguments", "vec_extract");
+	  return error_mark_node;
+	}
+
+      arg2 = (*arglist)[1];
+      arg1 = (*arglist)[0];
+      arg1_type = TREE_TYPE (arg1);
+
+      if (TREE_CODE (arg1_type) != VECTOR_TYPE)
+	goto bad;
+      if (!INTEGRAL_TYPE_P (TREE_TYPE (arg2)))
+	goto bad;
+
+      /* See if we can optimize vec_extracts with the current VSX instruction
+	 set.  */
+      mode = TYPE_MODE (arg1_type);
+      if (VECTOR_MEM_VSX_P (mode))
+	{
+	  tree call = NULL_TREE;
+	  int nunits = GET_MODE_NUNITS (mode);
+
+	  arg2 = fold_for_warn (arg2);
+
+	  /* If the second argument is an integer constant, generate
+	     the built-in code if we can.  We need 64-bit and direct
+	     move to extract the small integer vectors.  */
+	  if (TREE_CODE (arg2) == INTEGER_CST)
+	    {
+	      wide_int selector = wi::to_wide (arg2);
+	      selector = wi::umod_trunc (selector, nunits);
+	      arg2 = wide_int_to_tree (TREE_TYPE (arg2), selector);
+	      switch (mode)
+		{
+		default:
+		  break;
+
+		case E_V1TImode:
+		  call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V1TI];
+		  break;
+
+		case E_V2DFmode:
+		  call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V2DF];
+		  break;
+
+		case E_V2DImode:
+		  call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V2DI];
+		  break;
+
+		case E_V4SFmode:
+		  call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V4SF];
+		  break;
+
+		case E_V4SImode:
+		  if (TARGET_DIRECT_MOVE_64BIT)
+		    call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V4SI];
+		  break;
+
+		case E_V8HImode:
+		  if (TARGET_DIRECT_MOVE_64BIT)
+		    call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V8HI];
+		  break;
+
+		case E_V16QImode:
+		  if (TARGET_DIRECT_MOVE_64BIT)
+		    call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V16QI];
+		  break;
+		}
+	    }
+
+	  /* If the second argument is variable, we can optimize it if we are
+	     generating 64-bit code on a machine with direct move.  */
+	  else if (TREE_CODE (arg2) != INTEGER_CST && TARGET_DIRECT_MOVE_64BIT)
+	    {
+	      switch (mode)
+		{
+		default:
+		  break;
+
+		case E_V2DFmode:
+		  call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V2DF];
+		  break;
+
+		case E_V2DImode:
+		  call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V2DI];
+		  break;
+
+		case E_V4SFmode:
+		  call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V4SF];
+		  break;
+
+		case E_V4SImode:
+		  call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V4SI];
+		  break;
+
+		case E_V8HImode:
+		  call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V8HI];
+		  break;
+
+		case E_V16QImode:
+		  call = rs6000_builtin_decls_x[RS6000_BIF_VEC_EXT_V16QI];
+		  break;
+		}
+	    }
+
+	  if (call)
+	    {
+	      tree result = build_call_expr (call, 2, arg1, arg2);
+	      /* Coerce the result to vector element type.  May be no-op.  */
+	      arg1_inner_type = TREE_TYPE (arg1_type);
+	      result = fold_convert (arg1_inner_type, result);
+	      return result;
+	    }
+	}
+
+      /* Build *(((arg1_inner_type*)&(vector type){arg1})+arg2). */
+      arg1_inner_type = TREE_TYPE (arg1_type);
+      arg2 = build_binary_op (loc, BIT_AND_EXPR, arg2,
+			      build_int_cst (TREE_TYPE (arg2),
+					     TYPE_VECTOR_SUBPARTS (arg1_type)
+					     - 1), 0);
+      decl = build_decl (loc, VAR_DECL, NULL_TREE, arg1_type);
+      DECL_EXTERNAL (decl) = 0;
+      TREE_PUBLIC (decl) = 0;
+      DECL_CONTEXT (decl) = current_function_decl;
+      TREE_USED (decl) = 1;
+      TREE_TYPE (decl) = arg1_type;
+      TREE_READONLY (decl) = TYPE_READONLY (arg1_type);
+      if (c_dialect_cxx ())
+	{
+	  stmt = build4 (TARGET_EXPR, arg1_type, decl, arg1,
+			 NULL_TREE, NULL_TREE);
+	  SET_EXPR_LOCATION (stmt, loc);
+	}
+      else
+	{
+	  DECL_INITIAL (decl) = arg1;
+	  stmt = build1 (DECL_EXPR, arg1_type, decl);
+	  TREE_ADDRESSABLE (decl) = 1;
+	  SET_EXPR_LOCATION (stmt, loc);
+	  stmt = build1 (COMPOUND_LITERAL_EXPR, arg1_type, stmt);
+	}
+
+      innerptrtype = build_pointer_type (arg1_inner_type);
+
+      stmt = build_unary_op (loc, ADDR_EXPR, stmt, 0);
+      stmt = convert (innerptrtype, stmt);
+      stmt = build_binary_op (loc, PLUS_EXPR, stmt, arg2, 1);
+      stmt = build_indirect_ref (loc, stmt, RO_NULL);
+
+      /* PR83660: We mark this as having side effects so that
+	 downstream in fold_build_cleanup_point_expr () it will get a
+	 CLEANUP_POINT_EXPR.  If it does not we can run into an ICE
+	 later in gimplify_cleanup_point_expr ().  Potentially this
+	 causes missed optimization because there actually is no side
+	 effect.  */
+      if (c_dialect_cxx ())
+	TREE_SIDE_EFFECTS (stmt) = 1;
+
+      return stmt;
+    }
+
+  /* For now use pointer tricks to do the insertion, unless we are on VSX
+     inserting a double at a constant offset.  */
+  if (fcode == RS6000_OVLD_VEC_INSERT)
+    {
+      tree arg0;
+      tree arg1;
+      tree arg2;
+      tree arg1_type;
+      tree decl, stmt;
+      machine_mode mode;
+
+      /* Require exactly three arguments.  */
+      if (nargs != 3)
+	{
+	  error ("builtin %qs only accepts 3 arguments", "vec_insert");
+	  return error_mark_node;
+	}
+
+      arg0 = (*arglist)[0];
+      arg1 = (*arglist)[1];
+      arg1_type = TREE_TYPE (arg1);
+      arg2 = fold_for_warn ((*arglist)[2]);
+
+      if (TREE_CODE (arg1_type) != VECTOR_TYPE)
+	goto bad;
+      if (!INTEGRAL_TYPE_P (TREE_TYPE (arg2)))
+	goto bad;
+
+      /* If we can use the VSX xxpermdi instruction, use that for insert.  */
+      mode = TYPE_MODE (arg1_type);
+      if ((mode == V2DFmode || mode == V2DImode) && VECTOR_UNIT_VSX_P (mode)
+	  && TREE_CODE (arg2) == INTEGER_CST)
+	{
+	  wide_int selector = wi::to_wide (arg2);
+	  selector = wi::umod_trunc (selector, 2);
+	  tree call = NULL_TREE;
+
+	  arg2 = wide_int_to_tree (TREE_TYPE (arg2), selector);
+	  if (mode == V2DFmode)
+	    call = rs6000_builtin_decls_x[RS6000_BIF_VEC_SET_V2DF];
+	  else if (mode == V2DImode)
+	    call = rs6000_builtin_decls_x[RS6000_BIF_VEC_SET_V2DI];
+
+	  /* Note, __builtin_vec_insert_<xxx> has vector and scalar types
+	     reversed.  */
+	  if (call)
+	    return build_call_expr (call, 3, arg1, arg0, arg2);
+	}
+      else if (mode == V1TImode && VECTOR_UNIT_VSX_P (mode)
+	       && TREE_CODE (arg2) == INTEGER_CST)
+	{
+	  tree call = rs6000_builtin_decls_x[RS6000_BIF_VEC_SET_V1TI];
+	  wide_int selector = wi::zero (32);
+
+	  arg2 = wide_int_to_tree (TREE_TYPE (arg2), selector);
+	  /* Note, __builtin_vec_insert_<xxx> has vector and scalar types
+	     reversed.  */
+	  return build_call_expr (call, 3, arg1, arg0, arg2);
+	}
+
+      /* Build *(((arg1_inner_type*)&(vector type){arg1})+arg2) = arg0 with
+	 VIEW_CONVERT_EXPR.  i.e.:
+	 D.3192 = v1;
+	 _1 = n & 3;
+	 VIEW_CONVERT_EXPR<int[4]>(D.3192)[_1] = i;
+	 v1 = D.3192;
+	 D.3194 = v1;  */
+      if (TYPE_VECTOR_SUBPARTS (arg1_type) == 1)
+	arg2 = build_int_cst (TREE_TYPE (arg2), 0);
+      else
+	arg2 = build_binary_op (loc, BIT_AND_EXPR, arg2,
+				build_int_cst (TREE_TYPE (arg2),
+					       TYPE_VECTOR_SUBPARTS (arg1_type)
+					       - 1), 0);
+      decl = build_decl (loc, VAR_DECL, NULL_TREE, arg1_type);
+      DECL_EXTERNAL (decl) = 0;
+      TREE_PUBLIC (decl) = 0;
+      DECL_CONTEXT (decl) = current_function_decl;
+      TREE_USED (decl) = 1;
+      TREE_TYPE (decl) = arg1_type;
+      TREE_READONLY (decl) = TYPE_READONLY (arg1_type);
+      TREE_ADDRESSABLE (decl) = 1;
+      if (c_dialect_cxx ())
+	{
+	  stmt = build4 (TARGET_EXPR, arg1_type, decl, arg1,
+			 NULL_TREE, NULL_TREE);
+	  SET_EXPR_LOCATION (stmt, loc);
+	}
+      else
+	{
+	  DECL_INITIAL (decl) = arg1;
+	  stmt = build1 (DECL_EXPR, arg1_type, decl);
+	  SET_EXPR_LOCATION (stmt, loc);
+	  stmt = build1 (COMPOUND_LITERAL_EXPR, arg1_type, stmt);
+	}
+
+      if (TARGET_VSX)
+	{
+	  stmt = build_array_ref (loc, stmt, arg2);
+	  stmt = fold_build2 (MODIFY_EXPR, TREE_TYPE (arg0), stmt,
+			      convert (TREE_TYPE (stmt), arg0));
+	  stmt = build2 (COMPOUND_EXPR, arg1_type, stmt, decl);
+	}
+      else
+	{
+	  tree arg1_inner_type;
+	  tree innerptrtype;
+	  arg1_inner_type = TREE_TYPE (arg1_type);
+	  innerptrtype = build_pointer_type (arg1_inner_type);
+
+	  stmt = build_unary_op (loc, ADDR_EXPR, stmt, 0);
+	  stmt = convert (innerptrtype, stmt);
+	  stmt = build_binary_op (loc, PLUS_EXPR, stmt, arg2, 1);
+	  stmt = build_indirect_ref (loc, stmt, RO_NULL);
+	  stmt = build2 (MODIFY_EXPR, TREE_TYPE (stmt), stmt,
+			 convert (TREE_TYPE (stmt), arg0));
+	  stmt = build2 (COMPOUND_EXPR, arg1_type, stmt, decl);
+	}
+      return stmt;
+    }
+
+  for (n = 0;
+       !VOID_TYPE_P (TREE_VALUE (fnargs)) && n < nargs;
+       fnargs = TREE_CHAIN (fnargs), n++)
+    {
+      tree decl_type = TREE_VALUE (fnargs);
+      tree arg = (*arglist)[n];
+      tree type;
+
+      if (arg == error_mark_node)
+	return error_mark_node;
+
+      gcc_assert (n < MAX_OVLD_ARGS);
+
+      arg = default_conversion (arg);
+
+      /* The C++ front-end converts float * to const void * using
+	 NOP_EXPR<const void *> (NOP_EXPR<void *> (x)).  */
+      type = TREE_TYPE (arg);
+      if (POINTER_TYPE_P (type)
+	  && TREE_CODE (arg) == NOP_EXPR
+	  && lang_hooks.types_compatible_p (TREE_TYPE (arg),
+					    const_ptr_type_node)
+	  && lang_hooks.types_compatible_p (TREE_TYPE (TREE_OPERAND (arg, 0)),
+					    ptr_type_node))
+	{
+	  arg = TREE_OPERAND (arg, 0);
+	  type = TREE_TYPE (arg);
+	}
+
+      /* Remove the const from the pointers to simplify the overload
+	 matching further down.  */
+      if (POINTER_TYPE_P (decl_type)
+	  && POINTER_TYPE_P (type)
+	  && TYPE_QUALS (TREE_TYPE (type)) != 0)
+	{
+	  if (TYPE_READONLY (TREE_TYPE (type))
+	      && !TYPE_READONLY (TREE_TYPE (decl_type)))
+	    warning (0, "passing argument %d of %qE discards qualifiers from "
+		     "pointer target type", n + 1, fndecl);
+	  type = build_pointer_type (build_qualified_type (TREE_TYPE (type),
+							   0));
+	  arg = fold_convert (type, arg);
+	}
+
+      /* For RS6000_OVLD_VEC_LXVL, convert any const pointer to its
+	 non-const equivalent to simplify the overload matching below.  */
+      if (fcode == RS6000_OVLD_VEC_LXVL)
+	{
+	  if (POINTER_TYPE_P (type)
+	      && TYPE_READONLY (TREE_TYPE (type)))
+	    {
+	      type = build_pointer_type
+		(build_qualified_type (TREE_TYPE (type), 0));
+	      arg = fold_convert (type, arg);
+	    }
+	}
+
+      args[n] = arg;
+      types[n] = type;
+    }
+
+  /* If the number of arguments did not match the prototype, return NULL
+     and the generic code will issue the appropriate error message.  */
+  if (!VOID_TYPE_P (TREE_VALUE (fnargs)) || n < nargs)
+    return NULL;
+
+  if (fcode == RS6000_OVLD_VEC_STEP)
+    {
+      if (TREE_CODE (types[0]) != VECTOR_TYPE)
+	goto bad;
+
+      return build_int_cst (NULL_TREE, TYPE_VECTOR_SUBPARTS (types[0]));
+    }
+
+  {
+    bool unsupported_builtin = false;
+    enum rs6000_gen_builtins overloaded_code;
+    bool supported = false;
+    ovlddata *instance = rs6000_overload_info[adj_fcode].first_instance;
+    gcc_assert (instance != NULL);
+
+    /* Need to special case __builtin_cmpb because the overloaded forms
+       of this function take (unsigned int, unsigned int) or (unsigned
+       long long int, unsigned long long int).  Since C conventions
+       allow the respective argument types to be implicitly coerced into
+       each other, the default handling does not provide adequate
+       discrimination between the desired forms of the function.  */
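+    /* For example, a call such as __builtin_cmpb (1u, 2u) must resolve
+       to the 32-bit variant even though its arguments would also coerce
+       to unsigned long long int, so we discriminate on argument modes
+       below.  */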
+    if (fcode == RS6000_OVLD_SCAL_CMPB)
+      {
+	machine_mode arg1_mode = TYPE_MODE (types[0]);
+	machine_mode arg2_mode = TYPE_MODE (types[1]);
+
+	if (nargs != 2)
+	  {
+	    error ("builtin %qs only accepts 2 arguments", "__builtin_cmpb");
+	    return error_mark_node;
+	  }
+
+	/* If any supplied arguments are wider than 32 bits, resolve to
+	   the 64-bit variant of the built-in function.  */
+	if ((GET_MODE_PRECISION (arg1_mode) > 32)
+	    || (GET_MODE_PRECISION (arg2_mode) > 32))
+	  {
+	    /* Ensure all argument and result types are compatible with
+	       the built-in function represented by RS6000_BIF_CMPB.  */
+	    overloaded_code = RS6000_BIF_CMPB;
+	  }
+	else
+	  {
+	    /* Ensure all argument and result types are compatible with
+	       the built-in function represented by RS6000_BIF_CMPB_32.  */
+	    overloaded_code = RS6000_BIF_CMPB_32;
+	  }
+
+	while (instance && instance->bifid != overloaded_code)
+	  instance = instance->next;
+
+	gcc_assert (instance != NULL);
+	tree fntype = rs6000_builtin_info_x[instance->bifid].fntype;
+	tree parmtype0 = TREE_VALUE (TYPE_ARG_TYPES (fntype));
+	tree parmtype1 = TREE_VALUE (TREE_CHAIN (TYPE_ARG_TYPES (fntype)));
+
+	if (rs6000_new_builtin_type_compatible (types[0], parmtype0)
+	    && rs6000_new_builtin_type_compatible (types[1], parmtype1))
+	  {
+	    if (rs6000_builtin_decl (instance->bifid, false) != error_mark_node
+		&& rs6000_new_builtin_is_supported_p (instance->bifid))
+	      {
+		tree ret_type = TREE_TYPE (instance->fntype);
+		return altivec_build_new_resolved_builtin (args, n, fntype,
+							   ret_type,
+							   instance->bifid,
+							   fcode);
+	      }
+	    else
+	      unsupported_builtin = true;
+	  }
+      }
+    else if (fcode == RS6000_OVLD_VEC_VSIE)
+      {
+	machine_mode arg1_mode = TYPE_MODE (types[0]);
+
+	if (nargs != 2)
+	  {
+	    error ("builtin %qs only accepts 2 arguments",
+		   "scalar_insert_exp");
+	    return error_mark_node;
+	  }
+
+	/* If the supplied first argument is wider than 64 bits, resolve
+	   to the 128-bit variant of the built-in function.  */
+	if (GET_MODE_PRECISION (arg1_mode) > 64)
+	  {
+	    /* If first argument is of float variety, choose variant
+	       that expects __ieee128 argument.  Otherwise, expect
+	       __int128 argument.  */
+	    if (GET_MODE_CLASS (arg1_mode) == MODE_FLOAT)
+	      overloaded_code = RS6000_BIF_VSIEQPF;
+	    else
+	      overloaded_code = RS6000_BIF_VSIEQP;
+	  }
+	else
+	  {
+	    /* If first argument is of float variety, choose variant
+	       that expects double argument.  Otherwise, expect
+	       long long int argument.  */
+	    if (GET_MODE_CLASS (arg1_mode) == MODE_FLOAT)
+	      overloaded_code = RS6000_BIF_VSIEDPF;
+	    else
+	      overloaded_code = RS6000_BIF_VSIEDP;
+	  }
+
+	while (instance && instance->bifid != overloaded_code)
+	  instance = instance->next;
+
+	gcc_assert (instance != NULL);
+	tree fntype = rs6000_builtin_info_x[instance->bifid].fntype;
+	tree parmtype0 = TREE_VALUE (TYPE_ARG_TYPES (fntype));
+	tree parmtype1 = TREE_VALUE (TREE_CHAIN (TYPE_ARG_TYPES (fntype)));
+
+	if (rs6000_new_builtin_type_compatible (types[0], parmtype0)
+	    && rs6000_new_builtin_type_compatible (types[1], parmtype1))
+	  {
+	    if (rs6000_builtin_decl (instance->bifid, false) != error_mark_node
+		&& rs6000_new_builtin_is_supported_p (instance->bifid))
+	      {
+		tree ret_type = TREE_TYPE (instance->fntype);
+		return altivec_build_new_resolved_builtin (args, n, fntype,
+							   ret_type,
+							   instance->bifid,
+							   fcode);
+	      }
+	    else
+	      unsupported_builtin = true;
+	  }
+      }
+    else
+      {
+	/* Functions with no arguments can have only one overloaded
+	   instance.  */
+	gcc_assert (n > 0 || !instance->next);
+
+	for (; instance != NULL; instance = instance->next)
+	  {
+	    bool mismatch = false;
+	    tree nextparm = TYPE_ARG_TYPES (instance->fntype);
+
+	    for (unsigned int arg_i = 0;
+		 arg_i < nargs && nextparm != NULL;
+		 arg_i++)
+	      {
+		tree parmtype = TREE_VALUE (nextparm);
+		if (!rs6000_new_builtin_type_compatible (types[arg_i],
+							 parmtype))
+		  {
+		    mismatch = true;
+		    break;
+		  }
+		nextparm = TREE_CHAIN (nextparm);
+	      }
+
+	    if (mismatch)
+	      continue;
+
+	    supported = rs6000_new_builtin_is_supported_p (instance->bifid);
+	    if (rs6000_builtin_decl (instance->bifid, false) != error_mark_node
+		&& supported)
+	      {
+		tree fntype = rs6000_builtin_info_x[instance->bifid].fntype;
+		tree ret_type = TREE_TYPE (instance->fntype);
+		return altivec_build_new_resolved_builtin (args, n, fntype,
+							   ret_type,
+							   instance->bifid,
+							   fcode);
+	      }
+	    else
+	      {
+		unsupported_builtin = true;
+		break;
+	      }
+	  }
+      }
+
+    if (unsupported_builtin)
+      {
+	const char *name = rs6000_overload_info[adj_fcode].ovld_name;
+	if (!supported)
+	  {
+	    const char *internal_name
+	      = rs6000_builtin_info_x[instance->bifid].bifname;
+	    /* An error message making reference to the name of the
+	       non-overloaded function has already been issued.  Add
+	       clarification of the previous message.  */
+	    rich_location richloc (line_table, input_location);
+	    inform (&richloc, "builtin %qs requires builtin %qs",
+		    name, internal_name);
+	  }
+	else
+	  error ("%qs is not supported in this compiler configuration", name);
+	return error_mark_node;
+      }
+  }
+ bad:
+  {
+    const char *name = rs6000_overload_info[adj_fcode].ovld_name;
+    error ("invalid parameter combination for AltiVec intrinsic %qs", name);
+    return error_mark_node;
+  }
+}
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 0c555f29f7d..b08440fd074 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -12965,6 +12965,97 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return false;
 }
 
+/* Check whether a builtin function is supported in this target
+   configuration.  */
+bool
+rs6000_new_builtin_is_supported_p (enum rs6000_gen_builtins fncode)
+{
+  switch (rs6000_builtin_info_x[(size_t) fncode].enable)
+    {
+    default:
+      gcc_unreachable ();
+    case ENB_ALWAYS:
+      return true;
+    case ENB_P5:
+      if (!TARGET_POPCNTB)
+	return false;
+      break;
+    case ENB_P6:
+      if (!TARGET_CMPB)
+	return false;
+      break;
+    case ENB_ALTIVEC:
+      if (!TARGET_ALTIVEC)
+	return false;
+      break;
+    case ENB_CELL:
+      if (!TARGET_ALTIVEC || rs6000_cpu != PROCESSOR_CELL)
+	return false;
+      break;
+    case ENB_VSX:
+      if (!TARGET_VSX)
+	return false;
+      break;
+    case ENB_P7:
+      if (!TARGET_POPCNTD)
+	return false;
+      break;
+    case ENB_P7_64:
+      if (!TARGET_POPCNTD || !TARGET_POWERPC64)
+	return false;
+      break;
+    case ENB_P8:
+      if (!TARGET_DIRECT_MOVE)
+	return false;
+      break;
+    case ENB_P8V:
+      if (!TARGET_P8_VECTOR)
+	return false;
+      break;
+    case ENB_P9:
+      if (!TARGET_MODULO)
+	return false;
+      break;
+    case ENB_P9_64:
+      if (!TARGET_MODULO || !TARGET_POWERPC64)
+	return false;
+      break;
+    case ENB_P9V:
+      if (!TARGET_P9_VECTOR)
+	return false;
+      break;
+    case ENB_IEEE128_HW:
+      if (!TARGET_FLOAT128_HW)
+	return false;
+      break;
+    case ENB_DFP:
+      if (!TARGET_DFP)
+	return false;
+      break;
+    case ENB_CRYPTO:
+      if (!TARGET_CRYPTO)
+	return false;
+      break;
+    case ENB_HTM:
+      if (!TARGET_HTM)
+	return false;
+      break;
+    case ENB_P10:
+      if (!TARGET_POWER10)
+	return false;
+      break;
+    case ENB_P10_64:
+      if (!TARGET_POWER10 || !TARGET_POWERPC64)
+	return false;
+      break;
+    case ENB_MMA:
+      if (!TARGET_MMA)
+	return false;
+      break;
+    }
+  return true;
+}
+
 /* Expand an expression EXP that calls a built-in function,
    with result going to TARGET if that's convenient
    (and in mode MODE if that's convenient).
-- 
2.27.0



* [PATCH 20/34] rs6000: Handle gimple folding of target built-ins
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (18 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 19/34] rs6000: Handle overloads during program parsing Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 21/34] rs6000: Handle some recent MMA builtin changes Bill Schmidt
                   ` (13 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

This is another patch that looks bigger than it really is.  Because the
builtins now have a new namespace, which lets the old and new builtin
infrastructures be supported at once, we need versions of these
functions that use the new namespace.  Otherwise the code is unchanged.
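
As an illustrative sketch (not part of this patch), the effect of the
gimple folding is to replace a target built-in call with equivalent
generic gimple wherever possible.  For instance, after overload
resolution a vector add is a call like

  c = __builtin_altivec_vadduwm (a, b);

and gimple folding turns it into a plain vector addition (a PLUS_EXPR)

  c = a + b;

that the middle end can then optimize like any other addition.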

2021-07-29  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin):
	New forward decl.
	(rs6000_gimple_fold_builtin): Call rs6000_gimple_fold_new_builtin.
	(rs6000_new_builtin_valid_without_lhs): New function.
	(rs6000_gimple_fold_new_mma_builtin): Likewise.
	(rs6000_gimple_fold_new_builtin): Likewise.
---
 gcc/config/rs6000/rs6000-call.c | 1160 +++++++++++++++++++++++++++++++
 1 file changed, 1160 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index b08440fd074..cb2503351c4 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -190,6 +190,7 @@ static tree builtin_function_type (machine_mode, machine_mode,
 static void rs6000_common_init_builtins (void);
 static void htm_init_builtins (void);
 static void mma_init_builtins (void);
+static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);
 
 
 /* Hash table to keep track of the argument types for builtin functions.  */
@@ -12018,6 +12019,9 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
 bool
 rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 {
+  if (new_builtins_are_live)
+    return rs6000_gimple_fold_new_builtin (gsi);
+
   gimple *stmt = gsi_stmt (*gsi);
   tree fndecl = gimple_call_fndecl (stmt);
   gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD);
@@ -12965,6 +12969,35 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return false;
 }
 
+/* Helper function to sort out which built-ins may be valid without
+   having an LHS.  */
+static bool
+rs6000_new_builtin_valid_without_lhs (enum rs6000_gen_builtins fn_code,
+				      tree fndecl)
+{
+  if (TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node)
+    return true;
+
+  switch (fn_code)
+    {
+    case RS6000_BIF_STVX_V16QI:
+    case RS6000_BIF_STVX_V8HI:
+    case RS6000_BIF_STVX_V4SI:
+    case RS6000_BIF_STVX_V4SF:
+    case RS6000_BIF_STVX_V2DI:
+    case RS6000_BIF_STVX_V2DF:
+    case RS6000_BIF_STXVW4X_V16QI:
+    case RS6000_BIF_STXVW4X_V8HI:
+    case RS6000_BIF_STXVW4X_V4SF:
+    case RS6000_BIF_STXVW4X_V4SI:
+    case RS6000_BIF_STXVD2X_V2DF:
+    case RS6000_BIF_STXVD2X_V2DI:
+      return true;
+    default:
+      return false;
+    }
+}
+
 /* Check whether a builtin function is supported in this target
    configuration.  */
 bool
@@ -13056,6 +13089,1133 @@ rs6000_new_builtin_is_supported_p (enum rs6000_gen_builtins fncode)
   return true;
 }
 
+/* Expand the MMA built-ins early, so that we can convert the pass-by-reference
+   __vector_quad arguments into pass-by-value arguments, leading to more
+   efficient code generation.  */
+static bool
+rs6000_gimple_fold_new_mma_builtin (gimple_stmt_iterator *gsi,
+				    rs6000_gen_builtins fn_code)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  size_t fncode = (size_t) fn_code;
+
+  if (!bif_is_mma (rs6000_builtin_info_x[fncode]))
+    return false;
+
+  /* Each call that can be gimple-expanded has an associated built-in
+     function that it will expand into.  If this one doesn't, we have
+     already expanded it!  */
+  if (rs6000_builtin_info_x[fncode].assoc_bif == RS6000_BIF_NONE)
+    return false;
+
+  bifdata *bd = &rs6000_builtin_info_x[fncode];
+  unsigned nopnds = bd->nargs;
+  gimple_seq new_seq = NULL;
+  gimple *new_call;
+  tree new_decl;
+
+  /* Compatibility built-ins; we used to call these
+     __builtin_mma_{dis,}assemble_pair, but now we call them
+     __builtin_vsx_{dis,}assemble_pair.  Handle the old versions.  */
+  if (fncode == RS6000_BIF_ASSEMBLE_PAIR)
+    fncode = RS6000_BIF_ASSEMBLE_PAIR_V;
+  else if (fncode == RS6000_BIF_DISASSEMBLE_PAIR)
+    fncode = RS6000_BIF_DISASSEMBLE_PAIR_V;
+
+  if (fncode == RS6000_BIF_DISASSEMBLE_ACC
+      || fncode == RS6000_BIF_DISASSEMBLE_PAIR_V)
+    {
+      /* This is an MMA disassemble built-in function.  */
+      push_gimplify_context (true);
+      unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
+      tree dst_ptr = gimple_call_arg (stmt, 0);
+      tree src_ptr = gimple_call_arg (stmt, 1);
+      tree src_type = TREE_TYPE (src_ptr);
+      tree src = create_tmp_reg_or_ssa_name (TREE_TYPE (src_type));
+      gimplify_assign (src, build_simple_mem_ref (src_ptr), &new_seq);
+
+      /* If we are not disassembling an accumulator/pair or our destination is
+	 another accumulator/pair, then just copy the entire thing as is.  */
+      if ((fncode == RS6000_BIF_DISASSEMBLE_ACC
+	   && TREE_TYPE (TREE_TYPE (dst_ptr)) == vector_quad_type_node)
+	  || (fncode == RS6000_BIF_DISASSEMBLE_PAIR_V
+	      && TREE_TYPE (TREE_TYPE (dst_ptr)) == vector_pair_type_node))
+	{
+	  tree dst = build_simple_mem_ref (build1 (VIEW_CONVERT_EXPR,
+						   src_type, dst_ptr));
+	  gimplify_assign (dst, src, &new_seq);
+	  pop_gimplify_context (NULL);
+	  gsi_replace_with_seq (gsi, new_seq, true);
+	  return true;
+	}
+
+      /* If we're disassembling an accumulator into a different type, we need
+	 to emit a xxmfacc instruction now, since we cannot do it later.  */
+      if (fncode == RS6000_BIF_DISASSEMBLE_ACC)
+	{
+	  new_decl = rs6000_builtin_decls_x[RS6000_BIF_XXMFACC_INTERNAL];
+	  new_call = gimple_build_call (new_decl, 1, src);
+	  src = create_tmp_reg_or_ssa_name (vector_quad_type_node);
+	  gimple_call_set_lhs (new_call, src);
+	  gimple_seq_add_stmt (&new_seq, new_call);
+	}
+
+      /* Copy the accumulator/pair vector by vector.  */
+      new_decl
+	= rs6000_builtin_decls_x[rs6000_builtin_info_x[fncode].assoc_bif];
+      tree dst_type = build_pointer_type_for_mode (unsigned_V16QI_type_node,
+						   ptr_mode, true);
+      tree dst_base = build1 (VIEW_CONVERT_EXPR, dst_type, dst_ptr);
+      for (unsigned i = 0; i < nvec; i++)
+	{
+	  unsigned index = WORDS_BIG_ENDIAN ? i : nvec - 1 - i;
+	  tree dst = build2 (MEM_REF, unsigned_V16QI_type_node, dst_base,
+			     build_int_cst (dst_type, index * 16));
+	  tree dstssa = create_tmp_reg_or_ssa_name (unsigned_V16QI_type_node);
+	  new_call = gimple_build_call (new_decl, 2, src,
+					build_int_cstu (uint16_type_node, i));
+	  gimple_call_set_lhs (new_call, dstssa);
+	  gimple_seq_add_stmt (&new_seq, new_call);
+	  gimplify_assign (dst, dstssa, &new_seq);
+	}
+      pop_gimplify_context (NULL);
+      gsi_replace_with_seq (gsi, new_seq, true);
+      return true;
+    }
+
+  /* Convert this built-in into an internal version that uses pass-by-value
+     arguments.  The internal built-in is found in the assoc_bif field.  */
+  new_decl = rs6000_builtin_decls_x[rs6000_builtin_info_x[fncode].assoc_bif];
+  tree lhs, op[MAX_MMA_OPERANDS];
+  tree acc = gimple_call_arg (stmt, 0);
+  push_gimplify_context (true);
+
+  if (bif_is_quad (*bd))
+    {
+      /* This built-in has a pass-by-reference accumulator input, so load it
+	 into a temporary accumulator for use as a pass-by-value input.  */
+      op[0] = create_tmp_reg_or_ssa_name (vector_quad_type_node);
+      for (unsigned i = 1; i < nopnds; i++)
+	op[i] = gimple_call_arg (stmt, i);
+      gimplify_assign (op[0], build_simple_mem_ref (acc), &new_seq);
+    }
+  else
+    {
+      /* This built-in does not use its pass-by-reference accumulator argument
+	 as an input argument, so remove it from the input list.  */
+      nopnds--;
+      for (unsigned i = 0; i < nopnds; i++)
+	op[i] = gimple_call_arg (stmt, i + 1);
+    }
+
+  switch (nopnds)
+    {
+    case 0:
+      new_call = gimple_build_call (new_decl, 0);
+      break;
+    case 1:
+      new_call = gimple_build_call (new_decl, 1, op[0]);
+      break;
+    case 2:
+      new_call = gimple_build_call (new_decl, 2, op[0], op[1]);
+      break;
+    case 3:
+      new_call = gimple_build_call (new_decl, 3, op[0], op[1], op[2]);
+      break;
+    case 4:
+      new_call = gimple_build_call (new_decl, 4, op[0], op[1], op[2], op[3]);
+      break;
+    case 5:
+      new_call = gimple_build_call (new_decl, 5, op[0], op[1], op[2], op[3],
+				    op[4]);
+      break;
+    case 6:
+      new_call = gimple_build_call (new_decl, 6, op[0], op[1], op[2], op[3],
+				    op[4], op[5]);
+      break;
+    case 7:
+      new_call = gimple_build_call (new_decl, 7, op[0], op[1], op[2], op[3],
+				    op[4], op[5], op[6]);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  if (fncode == RS6000_BIF_BUILD_PAIR || fncode == RS6000_BIF_ASSEMBLE_PAIR_V)
+    lhs = create_tmp_reg_or_ssa_name (vector_pair_type_node);
+  else
+    lhs = create_tmp_reg_or_ssa_name (vector_quad_type_node);
+  gimple_call_set_lhs (new_call, lhs);
+  gimple_seq_add_stmt (&new_seq, new_call);
+  gimplify_assign (build_simple_mem_ref (acc), lhs, &new_seq);
+  pop_gimplify_context (NULL);
+  gsi_replace_with_seq (gsi, new_seq, true);
+
+  return true;
+}
+
+/* Fold a machine-dependent built-in in GIMPLE.  (For folding into
+   a constant, use rs6000_fold_builtin.)  */
+static bool
+rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  tree fndecl = gimple_call_fndecl (stmt);
+  gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD);
+  enum rs6000_gen_builtins fn_code
+    = (enum rs6000_gen_builtins) DECL_MD_FUNCTION_CODE (fndecl);
+  tree arg0, arg1, lhs, temp;
+  enum tree_code bcode;
+  gimple *g;
+
+  size_t uns_fncode = (size_t) fn_code;
+  enum insn_code icode = rs6000_builtin_info_x[uns_fncode].icode;
+  const char *fn_name1 = rs6000_builtin_info_x[uns_fncode].bifname;
+  const char *fn_name2 = (icode != CODE_FOR_nothing)
+			  ? get_insn_name ((int) icode)
+			  : "nothing";
+
+  if (TARGET_DEBUG_BUILTIN)
+    fprintf (stderr, "rs6000_gimple_fold_new_builtin %d %s %s\n",
+	     fn_code, fn_name1, fn_name2);
+
+  if (!rs6000_fold_gimple)
+    return false;
+
+  /* Prevent gimple folding for code that does not have an LHS, unless it
+     is allowed per the rs6000_new_builtin_valid_without_lhs helper
+     function.  */
+  if (!gimple_call_lhs (stmt)
+      && !rs6000_new_builtin_valid_without_lhs (fn_code, fndecl))
+    return false;
+
+  /* Don't fold invalid builtins, let rs6000_expand_builtin diagnose it.  */
+  if (!rs6000_new_builtin_is_supported_p (fn_code))
+    return false;
+
+  if (rs6000_gimple_fold_new_mma_builtin (gsi, fn_code))
+    return true;
+
+  switch (fn_code)
+    {
+    /* Flavors of vec_add.  We deliberately don't expand
+       RS6000_BIF_VADDUQM as it gets lowered from V1TImode to
+       TImode, resulting in much poorer code generation.  */
+    case RS6000_BIF_VADDUBM:
+    case RS6000_BIF_VADDUHM:
+    case RS6000_BIF_VADDUWM:
+    case RS6000_BIF_VADDUDM:
+    case RS6000_BIF_VADDFP:
+    case RS6000_BIF_XVADDDP:
+    case RS6000_BIF_XVADDSP:
+      bcode = PLUS_EXPR;
+    do_binary:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (lhs)))
+	  && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (TREE_TYPE (lhs))))
+	{
+	  /* Ensure the binary operation is performed in a type
+	     that wraps if it is integral type.  */
+	  gimple_seq stmts = NULL;
+	  tree type = unsigned_type_for (TREE_TYPE (lhs));
+	  tree uarg0 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+				     type, arg0);
+	  tree uarg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+				     type, arg1);
+	  tree res = gimple_build (&stmts, gimple_location (stmt), bcode,
+				   type, uarg0, uarg1);
+	  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	  g = gimple_build_assign (lhs, VIEW_CONVERT_EXPR,
+				   build1 (VIEW_CONVERT_EXPR,
+					   TREE_TYPE (lhs), res));
+	  gsi_replace (gsi, g, true);
+	  return true;
+	}
+      g = gimple_build_assign (lhs, bcode, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_sub.  We deliberately don't expand
+       RS6000_BIF_VSUBUQM.  */
+    case RS6000_BIF_VSUBUBM:
+    case RS6000_BIF_VSUBUHM:
+    case RS6000_BIF_VSUBUWM:
+    case RS6000_BIF_VSUBUDM:
+    case RS6000_BIF_VSUBFP:
+    case RS6000_BIF_XVSUBDP:
+    case RS6000_BIF_XVSUBSP:
+      bcode = MINUS_EXPR;
+      goto do_binary;
+    case RS6000_BIF_XVMULSP:
+    case RS6000_BIF_XVMULDP:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, MULT_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Even element flavors of vec_mul (signed).  */
+    case RS6000_BIF_VMULESB:
+    case RS6000_BIF_VMULESH:
+    case RS6000_BIF_VMULESW:
+    /* Even element flavors of vec_mul (unsigned).  */
+    case RS6000_BIF_VMULEUB:
+    case RS6000_BIF_VMULEUH:
+    case RS6000_BIF_VMULEUW:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, VEC_WIDEN_MULT_EVEN_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Odd element flavors of vec_mul (signed).  */
+    case RS6000_BIF_VMULOSB:
+    case RS6000_BIF_VMULOSH:
+    case RS6000_BIF_VMULOSW:
+    /* Odd element flavors of vec_mul (unsigned).  */
+    case RS6000_BIF_VMULOUB:
+    case RS6000_BIF_VMULOUH:
+    case RS6000_BIF_VMULOUW:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, VEC_WIDEN_MULT_ODD_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_div (Integer).  */
+    case RS6000_BIF_DIV_V2DI:
+    case RS6000_BIF_UDIV_V2DI:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, TRUNC_DIV_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_div (Float).  */
+    case RS6000_BIF_XVDIVSP:
+    case RS6000_BIF_XVDIVDP:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, RDIV_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_and.  */
+    case RS6000_BIF_VAND_V16QI_UNS:
+    case RS6000_BIF_VAND_V16QI:
+    case RS6000_BIF_VAND_V8HI_UNS:
+    case RS6000_BIF_VAND_V8HI:
+    case RS6000_BIF_VAND_V4SI_UNS:
+    case RS6000_BIF_VAND_V4SI:
+    case RS6000_BIF_VAND_V2DI_UNS:
+    case RS6000_BIF_VAND_V2DI:
+    case RS6000_BIF_VAND_V4SF:
+    case RS6000_BIF_VAND_V2DF:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, BIT_AND_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_andc.  */
+    case RS6000_BIF_VANDC_V16QI_UNS:
+    case RS6000_BIF_VANDC_V16QI:
+    case RS6000_BIF_VANDC_V8HI_UNS:
+    case RS6000_BIF_VANDC_V8HI:
+    case RS6000_BIF_VANDC_V4SI_UNS:
+    case RS6000_BIF_VANDC_V4SI:
+    case RS6000_BIF_VANDC_V2DI_UNS:
+    case RS6000_BIF_VANDC_V2DI:
+    case RS6000_BIF_VANDC_V4SF:
+    case RS6000_BIF_VANDC_V2DF:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      temp = create_tmp_reg_or_ssa_name (TREE_TYPE (arg1));
+      g = gimple_build_assign (temp, BIT_NOT_EXPR, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+      g = gimple_build_assign (lhs, BIT_AND_EXPR, arg0, temp);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_nand.  */
+    case RS6000_BIF_NAND_V16QI_UNS:
+    case RS6000_BIF_NAND_V16QI:
+    case RS6000_BIF_NAND_V8HI_UNS:
+    case RS6000_BIF_NAND_V8HI:
+    case RS6000_BIF_NAND_V4SI_UNS:
+    case RS6000_BIF_NAND_V4SI:
+    case RS6000_BIF_NAND_V2DI_UNS:
+    case RS6000_BIF_NAND_V2DI:
+    case RS6000_BIF_NAND_V4SF:
+    case RS6000_BIF_NAND_V2DF:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      temp = create_tmp_reg_or_ssa_name (TREE_TYPE (arg1));
+      g = gimple_build_assign (temp, BIT_AND_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+      g = gimple_build_assign (lhs, BIT_NOT_EXPR, temp);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_or.  */
+    case RS6000_BIF_VOR_V16QI_UNS:
+    case RS6000_BIF_VOR_V16QI:
+    case RS6000_BIF_VOR_V8HI_UNS:
+    case RS6000_BIF_VOR_V8HI:
+    case RS6000_BIF_VOR_V4SI_UNS:
+    case RS6000_BIF_VOR_V4SI:
+    case RS6000_BIF_VOR_V2DI_UNS:
+    case RS6000_BIF_VOR_V2DI:
+    case RS6000_BIF_VOR_V4SF:
+    case RS6000_BIF_VOR_V2DF:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, BIT_IOR_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_orc.  */
+    case RS6000_BIF_ORC_V16QI_UNS:
+    case RS6000_BIF_ORC_V16QI:
+    case RS6000_BIF_ORC_V8HI_UNS:
+    case RS6000_BIF_ORC_V8HI:
+    case RS6000_BIF_ORC_V4SI_UNS:
+    case RS6000_BIF_ORC_V4SI:
+    case RS6000_BIF_ORC_V2DI_UNS:
+    case RS6000_BIF_ORC_V2DI:
+    case RS6000_BIF_ORC_V4SF:
+    case RS6000_BIF_ORC_V2DF:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      temp = create_tmp_reg_or_ssa_name (TREE_TYPE (arg1));
+      g = gimple_build_assign (temp, BIT_NOT_EXPR, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+      g = gimple_build_assign (lhs, BIT_IOR_EXPR, arg0, temp);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_xor.  */
+    case RS6000_BIF_VXOR_V16QI_UNS:
+    case RS6000_BIF_VXOR_V16QI:
+    case RS6000_BIF_VXOR_V8HI_UNS:
+    case RS6000_BIF_VXOR_V8HI:
+    case RS6000_BIF_VXOR_V4SI_UNS:
+    case RS6000_BIF_VXOR_V4SI:
+    case RS6000_BIF_VXOR_V2DI_UNS:
+    case RS6000_BIF_VXOR_V2DI:
+    case RS6000_BIF_VXOR_V4SF:
+    case RS6000_BIF_VXOR_V2DF:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, BIT_XOR_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_nor.  */
+    case RS6000_BIF_VNOR_V16QI_UNS:
+    case RS6000_BIF_VNOR_V16QI:
+    case RS6000_BIF_VNOR_V8HI_UNS:
+    case RS6000_BIF_VNOR_V8HI:
+    case RS6000_BIF_VNOR_V4SI_UNS:
+    case RS6000_BIF_VNOR_V4SI:
+    case RS6000_BIF_VNOR_V2DI_UNS:
+    case RS6000_BIF_VNOR_V2DI:
+    case RS6000_BIF_VNOR_V4SF:
+    case RS6000_BIF_VNOR_V2DF:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      temp = create_tmp_reg_or_ssa_name (TREE_TYPE (arg1));
+      g = gimple_build_assign (temp, BIT_IOR_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+      g = gimple_build_assign (lhs, BIT_NOT_EXPR, temp);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_abs.  */
+    case RS6000_BIF_ABS_V16QI:
+    case RS6000_BIF_ABS_V8HI:
+    case RS6000_BIF_ABS_V4SI:
+    case RS6000_BIF_ABS_V4SF:
+    case RS6000_BIF_ABS_V2DI:
+    case RS6000_BIF_XVABSDP:
+    case RS6000_BIF_XVABSSP:
+      arg0 = gimple_call_arg (stmt, 0);
+      if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (arg0)))
+	  && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (TREE_TYPE (arg0))))
+	return false;
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, ABS_EXPR, arg0);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_min.  */
+    case RS6000_BIF_XVMINDP:
+    case RS6000_BIF_XVMINSP:
+    case RS6000_BIF_VMINSD:
+    case RS6000_BIF_VMINUD:
+    case RS6000_BIF_VMINSB:
+    case RS6000_BIF_VMINSH:
+    case RS6000_BIF_VMINSW:
+    case RS6000_BIF_VMINUB:
+    case RS6000_BIF_VMINUH:
+    case RS6000_BIF_VMINUW:
+    case RS6000_BIF_VMINFP:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, MIN_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_max.  */
+    case RS6000_BIF_XVMAXDP:
+    case RS6000_BIF_XVMAXSP:
+    case RS6000_BIF_VMAXSD:
+    case RS6000_BIF_VMAXUD:
+    case RS6000_BIF_VMAXSB:
+    case RS6000_BIF_VMAXSH:
+    case RS6000_BIF_VMAXSW:
+    case RS6000_BIF_VMAXUB:
+    case RS6000_BIF_VMAXUH:
+    case RS6000_BIF_VMAXUW:
+    case RS6000_BIF_VMAXFP:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, MAX_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_eqv.  */
+    case RS6000_BIF_EQV_V16QI:
+    case RS6000_BIF_EQV_V8HI:
+    case RS6000_BIF_EQV_V4SI:
+    case RS6000_BIF_EQV_V4SF:
+    case RS6000_BIF_EQV_V2DF:
+    case RS6000_BIF_EQV_V2DI:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      temp = create_tmp_reg_or_ssa_name (TREE_TYPE (arg1));
+      g = gimple_build_assign (temp, BIT_XOR_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+      g = gimple_build_assign (lhs, BIT_NOT_EXPR, temp);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vec_rotate_left.  */
+    case RS6000_BIF_VRLB:
+    case RS6000_BIF_VRLH:
+    case RS6000_BIF_VRLW:
+    case RS6000_BIF_VRLD:
+      arg0 = gimple_call_arg (stmt, 0);
+      arg1 = gimple_call_arg (stmt, 1);
+      lhs = gimple_call_lhs (stmt);
+      g = gimple_build_assign (lhs, LROTATE_EXPR, arg0, arg1);
+      gimple_set_location (g, gimple_location (stmt));
+      gsi_replace (gsi, g, true);
+      return true;
+    /* Flavors of vector shift right algebraic.
+       vec_sra{b,h,w} -> vsra{b,h,w}.  */
+    case RS6000_BIF_VSRAB:
+    case RS6000_BIF_VSRAH:
+    case RS6000_BIF_VSRAW:
+    case RS6000_BIF_VSRAD:
+      {
+	arg0 = gimple_call_arg (stmt, 0);
+	arg1 = gimple_call_arg (stmt, 1);
+	lhs = gimple_call_lhs (stmt);
+	tree arg1_type = TREE_TYPE (arg1);
+	tree unsigned_arg1_type = unsigned_type_for (TREE_TYPE (arg1));
+	tree unsigned_element_type = unsigned_type_for (TREE_TYPE (arg1_type));
+	location_t loc = gimple_location (stmt);
+	/* Force arg1 into the valid range for the arg0 type.  */
+	/* Build a vector consisting of the max valid bit-size values.  */
+	int n_elts = VECTOR_CST_NELTS (arg1);
+	tree element_size = build_int_cst (unsigned_element_type,
+					   128 / n_elts);
+	tree_vector_builder elts (unsigned_arg1_type, n_elts, 1);
+	for (int i = 0; i < n_elts; i++)
+	  elts.safe_push (element_size);
+	tree modulo_tree = elts.build ();
+	/* Modulo the provided shift value against that vector.  */
+	gimple_seq stmts = NULL;
+	tree unsigned_arg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+					   unsigned_arg1_type, arg1);
+	tree new_arg1 = gimple_build (&stmts, loc, TRUNC_MOD_EXPR,
+				      unsigned_arg1_type, unsigned_arg1,
+				      modulo_tree);
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	/* And finally, do the shift.  */
+	g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, new_arg1);
+	gimple_set_location (g, loc);
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+    /* Flavors of vector shift left.
+       builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}.  */
+    case RS6000_BIF_VSLB:
+    case RS6000_BIF_VSLH:
+    case RS6000_BIF_VSLW:
+    case RS6000_BIF_VSLD:
+      {
+	location_t loc;
+	gimple_seq stmts = NULL;
+	arg0 = gimple_call_arg (stmt, 0);
+	tree arg0_type = TREE_TYPE (arg0);
+	if (INTEGRAL_TYPE_P (TREE_TYPE (arg0_type))
+	    && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (arg0_type)))
+	  return false;
+	arg1 = gimple_call_arg (stmt, 1);
+	tree arg1_type = TREE_TYPE (arg1);
+	tree unsigned_arg1_type = unsigned_type_for (TREE_TYPE (arg1));
+	tree unsigned_element_type = unsigned_type_for (TREE_TYPE (arg1_type));
+	loc = gimple_location (stmt);
+	lhs = gimple_call_lhs (stmt);
+	/* Force arg1 into the valid range for the arg0 type.  */
+	/* Build a vector consisting of the max valid bit-size values.  */
+	int n_elts = VECTOR_CST_NELTS (arg1);
+	int tree_size_in_bits = TREE_INT_CST_LOW (size_in_bytes (arg1_type))
+				* BITS_PER_UNIT;
+	tree element_size = build_int_cst (unsigned_element_type,
+					   tree_size_in_bits / n_elts);
+	tree_vector_builder elts (unsigned_type_for (arg1_type), n_elts, 1);
+	for (int i = 0; i < n_elts; i++)
+	  elts.safe_push (element_size);
+	tree modulo_tree = elts.build ();
+	/* Modulo the provided shift value against that vector.  */
+	tree unsigned_arg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+					   unsigned_arg1_type, arg1);
+	tree new_arg1 = gimple_build (&stmts, loc, TRUNC_MOD_EXPR,
+				      unsigned_arg1_type, unsigned_arg1,
+				      modulo_tree);
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	/* And finally, do the shift.  */
+	g = gimple_build_assign (lhs, LSHIFT_EXPR, arg0, new_arg1);
+	gimple_set_location (g, gimple_location (stmt));
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+    /* Flavors of vector shift right.  */
+    case RS6000_BIF_VSRB:
+    case RS6000_BIF_VSRH:
+    case RS6000_BIF_VSRW:
+    case RS6000_BIF_VSRD:
+      {
+	arg0 = gimple_call_arg (stmt, 0);
+	arg1 = gimple_call_arg (stmt, 1);
+	lhs = gimple_call_lhs (stmt);
+	tree arg1_type = TREE_TYPE (arg1);
+	tree unsigned_arg1_type = unsigned_type_for (TREE_TYPE (arg1));
+	tree unsigned_element_type = unsigned_type_for (TREE_TYPE (arg1_type));
+	location_t loc = gimple_location (stmt);
+	gimple_seq stmts = NULL;
+	/* Convert arg0 to unsigned.  */
+	tree arg0_unsigned
+	  = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+			  unsigned_type_for (TREE_TYPE (arg0)), arg0);
+	/* Force arg1 into the valid range for the arg0 type.  */
+	/* Build a vector consisting of the max valid bit-size values.  */
+	int n_elts = VECTOR_CST_NELTS (arg1);
+	tree element_size = build_int_cst (unsigned_element_type,
+					   128 / n_elts);
+	tree_vector_builder elts (unsigned_arg1_type, n_elts, 1);
+	for (int i = 0; i < n_elts; i++)
+	  elts.safe_push (element_size);
+	tree modulo_tree = elts.build ();
+	/* Modulo the provided shift value against that vector.  */
+	tree unsigned_arg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+					   unsigned_arg1_type, arg1);
+	tree new_arg1 = gimple_build (&stmts, loc, TRUNC_MOD_EXPR,
+				      unsigned_arg1_type, unsigned_arg1,
+				      modulo_tree);
+	/* Do the shift.  */
+	tree res
+	  = gimple_build (&stmts, RSHIFT_EXPR,
+			  TREE_TYPE (arg0_unsigned), arg0_unsigned, new_arg1);
+	/* Convert result back to the lhs type.  */
+	res = gimple_build (&stmts, VIEW_CONVERT_EXPR, TREE_TYPE (lhs), res);
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	replace_call_with_value (gsi, res);
+	return true;
+      }
+    /* Vector loads.  */
+    case RS6000_BIF_LVX_V16QI:
+    case RS6000_BIF_LVX_V8HI:
+    case RS6000_BIF_LVX_V4SI:
+    case RS6000_BIF_LVX_V4SF:
+    case RS6000_BIF_LVX_V2DI:
+    case RS6000_BIF_LVX_V2DF:
+    case RS6000_BIF_LVX_V1TI:
+      {
+	arg0 = gimple_call_arg (stmt, 0);  // offset
+	arg1 = gimple_call_arg (stmt, 1);  // address
+	lhs = gimple_call_lhs (stmt);
+	location_t loc = gimple_location (stmt);
+	/* Since arg1 may be cast to a different type, just use ptr_type_node
+	   here instead of trying to enforce TBAA on pointer types.  */
+	tree arg1_type = ptr_type_node;
+	tree lhs_type = TREE_TYPE (lhs);
+	/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
+	   the tree using the value from arg0.  The resulting type will match
+	   the type of arg1.  */
+	gimple_seq stmts = NULL;
+	tree temp_offset = gimple_convert (&stmts, loc, sizetype, arg0);
+	tree temp_addr = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
+				       arg1_type, arg1, temp_offset);
+	/* Mask off the low four bits of the address, as lvx does.  */
+	tree aligned_addr = gimple_build (&stmts, loc, BIT_AND_EXPR,
+					  arg1_type, temp_addr,
+					  build_int_cst (arg1_type, -16));
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	if (!is_gimple_mem_ref_addr (aligned_addr))
+	  {
+	    tree t = make_ssa_name (TREE_TYPE (aligned_addr));
+	    gimple *g = gimple_build_assign (t, aligned_addr);
+	    gsi_insert_before (gsi, g, GSI_SAME_STMT);
+	    aligned_addr = t;
+	  }
+	/* Use the build2 helper to set up the mem_ref.  The MEM_REF could also
+	   take an offset, but since we've already incorporated the offset
+	   above, here we just pass in a zero.  */
+	gimple *g
+	  = gimple_build_assign (lhs, build2 (MEM_REF, lhs_type, aligned_addr,
+					      build_int_cst (arg1_type, 0)));
+	gimple_set_location (g, loc);
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+    /* Vector stores.  */
+    case RS6000_BIF_STVX_V16QI:
+    case RS6000_BIF_STVX_V8HI:
+    case RS6000_BIF_STVX_V4SI:
+    case RS6000_BIF_STVX_V4SF:
+    case RS6000_BIF_STVX_V2DI:
+    case RS6000_BIF_STVX_V2DF:
+      {
+	arg0 = gimple_call_arg (stmt, 0); /* Value to be stored.  */
+	arg1 = gimple_call_arg (stmt, 1); /* Offset.  */
+	tree arg2 = gimple_call_arg (stmt, 2); /* Store-to address.  */
+	location_t loc = gimple_location (stmt);
+	tree arg0_type = TREE_TYPE (arg0);
+	/* Use ptr_type_node (no TBAA) for the arg2_type.
+	   FIXME: (Richard)  "A proper fix would be to transition this type as
+	   seen from the frontend to GIMPLE, for example in a similar way we
+	   do for MEM_REFs by piggy-backing that on an extra argument, a
+	   constant zero pointer of the alias pointer type to use (which would
+	   also serve as a type indicator of the store itself).  I'd use a
+	   target specific internal function for this (not sure if we can have
+	   those target specific, but I guess if it's folded away then that's
+	   fine) and get away with the overload set."  */
+	tree arg2_type = ptr_type_node;
+	/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
+	   the tree using the value from arg1.  The resulting type will match
+	   the type of arg2.  */
+	gimple_seq stmts = NULL;
+	tree temp_offset = gimple_convert (&stmts, loc, sizetype, arg1);
+	tree temp_addr = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
+				       arg2_type, arg2, temp_offset);
+	/* Mask off the low four bits of the address, as stvx does.  */
+	tree aligned_addr = gimple_build (&stmts, loc, BIT_AND_EXPR,
+					  arg2_type, temp_addr,
+					  build_int_cst (arg2_type, -16));
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	if (!is_gimple_mem_ref_addr (aligned_addr))
+	  {
+	    tree t = make_ssa_name (TREE_TYPE (aligned_addr));
+	    gimple *g = gimple_build_assign (t, aligned_addr);
+	    gsi_insert_before (gsi, g, GSI_SAME_STMT);
+	    aligned_addr = t;
+	  }
+	/* The desired gimple result should be similar to:
+	   MEM[(__vector floatD.1407 *)_1] = vf1D.2697;  */
+	gimple *g
+	  = gimple_build_assign (build2 (MEM_REF, arg0_type, aligned_addr,
+					 build_int_cst (arg2_type, 0)), arg0);
+	gimple_set_location (g, loc);
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+
+    /* Unaligned vector loads.  */
+    case RS6000_BIF_LXVW4X_V16QI:
+    case RS6000_BIF_LXVW4X_V8HI:
+    case RS6000_BIF_LXVW4X_V4SF:
+    case RS6000_BIF_LXVW4X_V4SI:
+    case RS6000_BIF_LXVD2X_V2DF:
+    case RS6000_BIF_LXVD2X_V2DI:
+      {
+	arg0 = gimple_call_arg (stmt, 0);  // offset
+	arg1 = gimple_call_arg (stmt, 1);  // address
+	lhs = gimple_call_lhs (stmt);
+	location_t loc = gimple_location (stmt);
+	/* Since arg1 may be cast to a different type, just use ptr_type_node
+	   here instead of trying to enforce TBAA on pointer types.  */
+	tree arg1_type = ptr_type_node;
+	tree lhs_type = TREE_TYPE (lhs);
+	/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
+	   required alignment (on Power) is 4 bytes regardless of data type.  */
+	tree align_ltype = build_aligned_type (lhs_type, 4);
+	/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
+	   the tree using the value from arg0.  The resulting type will match
+	   the type of arg1.  */
+	gimple_seq stmts = NULL;
+	tree temp_offset = gimple_convert (&stmts, loc, sizetype, arg0);
+	tree temp_addr = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
+				       arg1_type, arg1, temp_offset);
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	if (!is_gimple_mem_ref_addr (temp_addr))
+	  {
+	    tree t = make_ssa_name (TREE_TYPE (temp_addr));
+	    gimple *g = gimple_build_assign (t, temp_addr);
+	    gsi_insert_before (gsi, g, GSI_SAME_STMT);
+	    temp_addr = t;
+	  }
+	/* Use the build2 helper to set up the mem_ref.  The MEM_REF could also
+	   take an offset, but since we've already incorporated the offset
+	   above, here we just pass in a zero.  */
+	gimple *g;
+	g = gimple_build_assign (lhs, build2 (MEM_REF, align_ltype, temp_addr,
+					      build_int_cst (arg1_type, 0)));
+	gimple_set_location (g, loc);
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+
+    /* Unaligned vector stores.  */
+    case RS6000_BIF_STXVW4X_V16QI:
+    case RS6000_BIF_STXVW4X_V8HI:
+    case RS6000_BIF_STXVW4X_V4SF:
+    case RS6000_BIF_STXVW4X_V4SI:
+    case RS6000_BIF_STXVD2X_V2DF:
+    case RS6000_BIF_STXVD2X_V2DI:
+      {
+	arg0 = gimple_call_arg (stmt, 0); /* Value to be stored.  */
+	arg1 = gimple_call_arg (stmt, 1); /* Offset.  */
+	tree arg2 = gimple_call_arg (stmt, 2); /* Store-to address.  */
+	location_t loc = gimple_location (stmt);
+	tree arg0_type = TREE_TYPE (arg0);
+	/* Use ptr_type_node (no TBAA) for the arg2_type.  */
+	tree arg2_type = ptr_type_node;
+	/* In GIMPLE the type of the MEM_REF specifies the alignment.  The
+	   required alignment (on Power) is 4 bytes regardless of data type.  */
+	tree align_stype = build_aligned_type (arg0_type, 4);
+	/* POINTER_PLUS_EXPR wants the offset to be of type 'sizetype'.  Create
+	   the tree using the value from arg1.  */
+	gimple_seq stmts = NULL;
+	tree temp_offset = gimple_convert (&stmts, loc, sizetype, arg1);
+	tree temp_addr = gimple_build (&stmts, loc, POINTER_PLUS_EXPR,
+				       arg2_type, arg2, temp_offset);
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	if (!is_gimple_mem_ref_addr (temp_addr))
+	  {
+	    tree t = make_ssa_name (TREE_TYPE (temp_addr));
+	    gimple *g = gimple_build_assign (t, temp_addr);
+	    gsi_insert_before (gsi, g, GSI_SAME_STMT);
+	    temp_addr = t;
+	  }
+	gimple *g;
+	g = gimple_build_assign (build2 (MEM_REF, align_stype, temp_addr,
+					 build_int_cst (arg2_type, 0)), arg0);
+	gimple_set_location (g, loc);
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+
+    /* Vector fused multiply-add (FMA).  */
+    case RS6000_BIF_VMADDFP:
+    case RS6000_BIF_XVMADDDP:
+    case RS6000_BIF_XVMADDSP:
+    case RS6000_BIF_VMLADDUHM:
+      {
+	arg0 = gimple_call_arg (stmt, 0);
+	arg1 = gimple_call_arg (stmt, 1);
+	tree arg2 = gimple_call_arg (stmt, 2);
+	lhs = gimple_call_lhs (stmt);
+	gcall *g = gimple_build_call_internal (IFN_FMA, 3, arg0, arg1, arg2);
+	gimple_call_set_lhs (g, lhs);
+	gimple_call_set_nothrow (g, true);
+	gimple_set_location (g, gimple_location (stmt));
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+
+    /* Vector compares: EQ, NE, GE, GT, LE.  */
+    case RS6000_BIF_VCMPEQUB:
+    case RS6000_BIF_VCMPEQUH:
+    case RS6000_BIF_VCMPEQUW:
+    case RS6000_BIF_VCMPEQUD:
+    case RS6000_BIF_VCMPEQUT:
+      fold_compare_helper (gsi, EQ_EXPR, stmt);
+      return true;
+
+    case RS6000_BIF_VCMPNEB:
+    case RS6000_BIF_VCMPNEH:
+    case RS6000_BIF_VCMPNEW:
+    case RS6000_BIF_VCMPNET:
+      fold_compare_helper (gsi, NE_EXPR, stmt);
+      return true;
+
+    case RS6000_BIF_CMPGE_16QI:
+    case RS6000_BIF_CMPGE_U16QI:
+    case RS6000_BIF_CMPGE_8HI:
+    case RS6000_BIF_CMPGE_U8HI:
+    case RS6000_BIF_CMPGE_4SI:
+    case RS6000_BIF_CMPGE_U4SI:
+    case RS6000_BIF_CMPGE_2DI:
+    case RS6000_BIF_CMPGE_U2DI:
+    case RS6000_BIF_CMPGE_1TI:
+    case RS6000_BIF_CMPGE_U1TI:
+      fold_compare_helper (gsi, GE_EXPR, stmt);
+      return true;
+
+    case RS6000_BIF_VCMPGTSB:
+    case RS6000_BIF_VCMPGTUB:
+    case RS6000_BIF_VCMPGTSH:
+    case RS6000_BIF_VCMPGTUH:
+    case RS6000_BIF_VCMPGTSW:
+    case RS6000_BIF_VCMPGTUW:
+    case RS6000_BIF_VCMPGTUD:
+    case RS6000_BIF_VCMPGTSD:
+    case RS6000_BIF_VCMPGTUT:
+    case RS6000_BIF_VCMPGTST:
+      fold_compare_helper (gsi, GT_EXPR, stmt);
+      return true;
+
+    case RS6000_BIF_CMPLE_16QI:
+    case RS6000_BIF_CMPLE_U16QI:
+    case RS6000_BIF_CMPLE_8HI:
+    case RS6000_BIF_CMPLE_U8HI:
+    case RS6000_BIF_CMPLE_4SI:
+    case RS6000_BIF_CMPLE_U4SI:
+    case RS6000_BIF_CMPLE_2DI:
+    case RS6000_BIF_CMPLE_U2DI:
+    case RS6000_BIF_CMPLE_1TI:
+    case RS6000_BIF_CMPLE_U1TI:
+      fold_compare_helper (gsi, LE_EXPR, stmt);
+      return true;
+
+    /* Flavors of vec_splat_[us]{8,16,32}.  */
+    case RS6000_BIF_VSPLTISB:
+    case RS6000_BIF_VSPLTISH:
+    case RS6000_BIF_VSPLTISW:
+      {
+	arg0 = gimple_call_arg (stmt, 0);
+	lhs = gimple_call_lhs (stmt);
+
+	/* Only fold the vec_splat_*() if the lower bits of arg0 are a
+	   5-bit signed constant in the range -16 to +15.  */
+	if (TREE_CODE (arg0) != INTEGER_CST
+	    || !IN_RANGE (TREE_INT_CST_LOW (arg0), -16, 15))
+	  return false;
+	gimple_seq stmts = NULL;
+	location_t loc = gimple_location (stmt);
+	tree splat_value = gimple_convert (&stmts, loc,
+					   TREE_TYPE (TREE_TYPE (lhs)), arg0);
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	tree splat_tree = build_vector_from_val (TREE_TYPE (lhs), splat_value);
+	g = gimple_build_assign (lhs, splat_tree);
+	gimple_set_location (g, gimple_location (stmt));
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+
+    /* Flavors of vec_splat.  */
+    /* a = vec_splat (b, 0x3) becomes a = { b[3], b[3], b[3], ... };  */
+    case RS6000_BIF_VSPLTB:
+    case RS6000_BIF_VSPLTH:
+    case RS6000_BIF_VSPLTW:
+    case RS6000_BIF_XXSPLTD_V2DI:
+    case RS6000_BIF_XXSPLTD_V2DF:
+      {
+	arg0 = gimple_call_arg (stmt, 0); /* input vector.  */
+	arg1 = gimple_call_arg (stmt, 1); /* index into arg0.  */
+	/* Only fold the vec_splat () if arg1 is both a constant value
+	   and a valid index into the arg0 vector.  */
+	unsigned int n_elts = VECTOR_CST_NELTS (arg0);
+	if (TREE_CODE (arg1) != INTEGER_CST
+	    || TREE_INT_CST_LOW (arg1) > (n_elts - 1))
+	  return false;
+	lhs = gimple_call_lhs (stmt);
+	tree lhs_type = TREE_TYPE (lhs);
+	tree arg0_type = TREE_TYPE (arg0);
+	tree splat;
+	if (TREE_CODE (arg0) == VECTOR_CST)
+	  splat = VECTOR_CST_ELT (arg0, TREE_INT_CST_LOW (arg1));
+	else
+	  {
+	    /* Determine (in bits) the length and start location of the
+	       splat value for a call to the tree_vec_extract helper.  */
+	    int splat_elem_size = TREE_INT_CST_LOW (size_in_bytes (arg0_type))
+				  * BITS_PER_UNIT / n_elts;
+	    int splat_start_bit = TREE_INT_CST_LOW (arg1) * splat_elem_size;
+	    tree len = build_int_cst (bitsizetype, splat_elem_size);
+	    tree start = build_int_cst (bitsizetype, splat_start_bit);
+	    splat = tree_vec_extract (gsi, TREE_TYPE (lhs_type), arg0,
+				      len, start);
+	  }
+	/* And finally, build the new vector.  */
+	tree splat_tree = build_vector_from_val (lhs_type, splat);
+	g = gimple_build_assign (lhs, splat_tree);
+	gimple_set_location (g, gimple_location (stmt));
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+
+    /* vec_mergel (integrals).  */
+    case RS6000_BIF_VMRGLH:
+    case RS6000_BIF_VMRGLW:
+    case RS6000_BIF_XXMRGLW_4SI:
+    case RS6000_BIF_VMRGLB:
+    case RS6000_BIF_VEC_MERGEL_V2DI:
+    case RS6000_BIF_XXMRGLW_4SF:
+    case RS6000_BIF_VEC_MERGEL_V2DF:
+      fold_mergehl_helper (gsi, stmt, 1);
+      return true;
+    /* vec_mergeh (integrals).  */
+    case RS6000_BIF_VMRGHH:
+    case RS6000_BIF_VMRGHW:
+    case RS6000_BIF_XXMRGHW_4SI:
+    case RS6000_BIF_VMRGHB:
+    case RS6000_BIF_VEC_MERGEH_V2DI:
+    case RS6000_BIF_XXMRGHW_4SF:
+    case RS6000_BIF_VEC_MERGEH_V2DF:
+      fold_mergehl_helper (gsi, stmt, 0);
+      return true;
+
+    /* Flavors of vec_mergee.  */
+    case RS6000_BIF_VMRGEW_V4SI:
+    case RS6000_BIF_VMRGEW_V2DI:
+    case RS6000_BIF_VMRGEW_V4SF:
+    case RS6000_BIF_VMRGEW_V2DF:
+      fold_mergeeo_helper (gsi, stmt, 0);
+      return true;
+    /* Flavors of vec_mergeo.  */
+    case RS6000_BIF_VMRGOW_V4SI:
+    case RS6000_BIF_VMRGOW_V2DI:
+    case RS6000_BIF_VMRGOW_V4SF:
+    case RS6000_BIF_VMRGOW_V2DF:
+      fold_mergeeo_helper (gsi, stmt, 1);
+      return true;
+
+    /* d = vec_pack (a, b).  */
+    case RS6000_BIF_VPKUDUM:
+    case RS6000_BIF_VPKUHUM:
+    case RS6000_BIF_VPKUWUM:
+      {
+	arg0 = gimple_call_arg (stmt, 0);
+	arg1 = gimple_call_arg (stmt, 1);
+	lhs = gimple_call_lhs (stmt);
+	gimple *g = gimple_build_assign (lhs, VEC_PACK_TRUNC_EXPR, arg0, arg1);
+	gimple_set_location (g, gimple_location (stmt));
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+
+    /* d = vec_unpackh (a).  */
+    /* Note that the UNPACK_{HI,LO}_EXPR used in the gimple_build_assign
+       call in this code is sensitive to endianness, and needs to be
+       inverted to handle both LE and BE targets.  */
+    case RS6000_BIF_VUPKHSB:
+    case RS6000_BIF_VUPKHSH:
+    case RS6000_BIF_VUPKHSW:
+      {
+	arg0 = gimple_call_arg (stmt, 0);
+	lhs = gimple_call_lhs (stmt);
+	if (BYTES_BIG_ENDIAN)
+	  g = gimple_build_assign (lhs, VEC_UNPACK_HI_EXPR, arg0);
+	else
+	  g = gimple_build_assign (lhs, VEC_UNPACK_LO_EXPR, arg0);
+	gimple_set_location (g, gimple_location (stmt));
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+    /* d = vec_unpackl (a).  */
+    case RS6000_BIF_VUPKLSB:
+    case RS6000_BIF_VUPKLSH:
+    case RS6000_BIF_VUPKLSW:
+      {
+	arg0 = gimple_call_arg (stmt, 0);
+	lhs = gimple_call_lhs (stmt);
+	if (BYTES_BIG_ENDIAN)
+	  g = gimple_build_assign (lhs, VEC_UNPACK_LO_EXPR, arg0);
+	else
+	  g = gimple_build_assign (lhs, VEC_UNPACK_HI_EXPR, arg0);
+	gimple_set_location (g, gimple_location (stmt));
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+    /* There is no gimple type corresponding to pixel, so just return.  */
+    case RS6000_BIF_VUPKHPX:
+    case RS6000_BIF_VUPKLPX:
+      return false;
+
+    /* vec_perm.  */
+    case RS6000_BIF_VPERM_16QI:
+    case RS6000_BIF_VPERM_8HI:
+    case RS6000_BIF_VPERM_4SI:
+    case RS6000_BIF_VPERM_2DI:
+    case RS6000_BIF_VPERM_4SF:
+    case RS6000_BIF_VPERM_2DF:
+    case RS6000_BIF_VPERM_16QI_UNS:
+    case RS6000_BIF_VPERM_8HI_UNS:
+    case RS6000_BIF_VPERM_4SI_UNS:
+    case RS6000_BIF_VPERM_2DI_UNS:
+      {
+	arg0 = gimple_call_arg (stmt, 0);
+	arg1 = gimple_call_arg (stmt, 1);
+	tree permute = gimple_call_arg (stmt, 2);
+	lhs = gimple_call_lhs (stmt);
+	location_t loc = gimple_location (stmt);
+	gimple_seq stmts = NULL;
+	// Convert arg0 and arg1 to match the type of the permute
+	// for the VEC_PERM_EXPR operation.
+	tree permute_type = TREE_TYPE (permute);
+	tree arg0_ptype = gimple_build (&stmts, loc, VIEW_CONVERT_EXPR,
+					permute_type, arg0);
+	tree arg1_ptype = gimple_build (&stmts, loc, VIEW_CONVERT_EXPR,
+					permute_type, arg1);
+	tree lhs_ptype = gimple_build (&stmts, loc, VEC_PERM_EXPR,
+				      permute_type, arg0_ptype, arg1_ptype,
+				      permute);
+	// Convert the result back to the desired lhs type upon completion.
+	tree temp = gimple_build (&stmts, loc, VIEW_CONVERT_EXPR,
+				  TREE_TYPE (lhs), lhs_ptype);
+	gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	g = gimple_build_assign (lhs, temp);
+	gimple_set_location (g, loc);
+	gsi_replace (gsi, g, true);
+	return true;
+      }
+
+    default:
+      if (TARGET_DEBUG_BUILTIN)
+	fprintf (stderr, "gimple builtin intrinsic not matched:%d %s %s\n",
+		 fn_code, fn_name1, fn_name2);
+      break;
+    }
+
+  return false;
+}
+
 /* Expand an expression EXP that calls a built-in function,
    with result going to TARGET if that's convenient
    (and in mode MODE if that's convenient).
-- 
2.27.0

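To make the effect of the folding above concrete, here is a minimal
sketch (illustrative only, not part of the patch) of what happens for
one of the simpler cases, vec_nand: the built-in call becomes ordinary
bit operations that later GIMPLE passes can optimize like any other
code.

  /* Assumes a power8-vector target; vec_nand is from altivec.h.  */
  vector unsigned int
  nand (vector unsigned int a, vector unsigned int b)
  {
    return vec_nand (a, b);
  }

  /* After folding (the RS6000_BIF_NAND_* cases above), roughly:
       temp = a & b;      built with BIT_AND_EXPR
       result = ~temp;    built with BIT_NOT_EXPR  */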

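The shift cases need the extra modulo statements because the hardware
shifts each element by its count modulo the element size, while a
GIMPLE shift by an out-of-range count is undefined.  Another
illustrative sketch, assuming 32-bit elements:

  vector unsigned int
  shl (vector unsigned int x, vector unsigned int n)
  {
    /* vslw semantics: each element is shifted by n[i] % 32.  */
    return vec_sl (x, n);
  }

  /* Folded GIMPLE, per the VSLB/VSLH/VSLW/VSLD case above, roughly:
       n2 = n % { 32, 32, 32, 32 };   built with TRUNC_MOD_EXPR
       r  = x << n2;                  built with LSHIFT_EXPR  */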

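Finally, the aligned load/store folding: lvx and stvx ignore the low
four bits of the effective address, so the folded GIMPLE masks the
computed address with -16 before building the MEM_REF.  A sketch,
again illustrative only:

  /* Assumes a VMX target; vec_ld maps to the LVX built-ins above.  */
  vector unsigned char
  load16 (long off, const unsigned char *p)
  {
    /* Folds to roughly: *(vector unsigned char *)((p + off) & -16).  */
    return vec_ld (off, p);
  }
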
* [PATCH 21/34] rs6000: Handle some recent MMA builtin changes
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (19 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 20/34] rs6000: Handle gimple folding of target built-ins Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 22/34] rs6000: Support for vectorizing built-in functions Bill Schmidt
                   ` (12 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-07-27  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-builtin-new.def (ASSEMBLE_ACC): Add mmaint
	flag.
	(ASSEMBLE_PAIR): Likewise.
	(BUILD_ACC): Likewise.
	(DISASSEMBLE_ACC): Likewise.
	(DISASSEMBLE_PAIR): Likewise.
	(PMXVBF16GER2): Likewise.
	(PMXVBF16GER2NN): Likewise.
	(PMXVBF16GER2NP): Likewise.
	(PMXVBF16GER2PN): Likewise.
	(PMXVBF16GER2PP): Likewise.
	(PMXVF16GER2): Likewise.
	(PMXVF16GER2NN): Likewise.
	(PMXVF16GER2NP): Likewise.
	(PMXVF16GER2PN): Likewise.
	(PMXVF16GER2PP): Likewise.
	(PMXVF32GER): Likewise.
	(PMXVF32GERNN): Likewise.
	(PMXVF32GERNP): Likewise.
	(PMXVF32GERPN): Likewise.
	(PMXVF32GERPP): Likewise.
	(PMXVF64GER): Likewise.
	(PMXVF64GERNN): Likewise.
	(PMXVF64GERNP): Likewise.
	(PMXVF64GERPN): Likewise.
	(PMXVF64GERPP): Likewise.
	(PMXVI16GER2): Likewise.
	(PMXVI16GER2PP): Likewise.
	(PMXVI16GER2S): Likewise.
	(PMXVI16GER2SPP): Likewise.
	(PMXVI4GER8): Likewise.
	(PMXVI4GER8PP): Likewise.
	(PMXVI8GER4): Likewise.
	(PMXVI8GER4PP): Likewise.
	(PMXVI8GER4SPP): Likewise.
	(XVBF16GER2): Likewise.
	(XVBF16GER2NN): Likewise.
	(XVBF16GER2NP): Likewise.
	(XVBF16GER2PN): Likewise.
	(XVBF16GER2PP): Likewise.
	(XVF16GER2): Likewise.
	(XVF16GER2NN): Likewise.
	(XVF16GER2NP): Likewise.
	(XVF16GER2PN): Likewise.
	(XVF16GER2PP): Likewise.
	(XVF32GER): Likewise.
	(XVF32GERNN): Likewise.
	(XVF32GERNP): Likewise.
	(XVF32GERPN): Likewise.
	(XVF32GERPP): Likewise.
	(XVF64GER): Likewise.
	(XVF64GERNN): Likewise.
	(XVF64GERNP): Likewise.
	(XVF64GERPN): Likewise.
	(XVF64GERPP): Likewise.
	(XVI16GER2): Likewise.
	(XVI16GER2PP): Likewise.
	(XVI16GER2S): Likewise.
	(XVI16GER2SPP): Likewise.
	(XVI4GER8): Likewise.
	(XVI4GER8PP): Likewise.
	(XVI8GER4): Likewise.
	(XVI8GER4PP): Likewise.
	(XVI8GER4SPP): Likewise.
	(XXMFACC): Likewise.
	(XXMTACC): Likewise.
	(XXSETACCZ): Likewise.
	(ASSEMBLE_PAIR_V): Likewise.
	(BUILD_PAIR): Likewise.
	(DISASSEMBLE_PAIR_V): Likewise.
	(LXVP): New.
	(STXVP): New.
	* config/rs6000/rs6000-call.c
	(rs6000_gimple_fold_new_mma_builtin): Handle RS6000_BIF_LXVP and
	RS6000_BIF_STXVP.
	* config/rs6000/rs6000-gen-builtins.c (attrinfo): Add ismmaint.
	(parse_bif_attrs): Handle ismmaint.
	(write_decls): Add bif_mmaint_bit and bif_is_mmaint.
	(write_bif_static_init): Handle ismmaint.
---
 gcc/config/rs6000/rs6000-builtin-new.def | 145 ++++++++++++-----------
 gcc/config/rs6000/rs6000-call.c          |  32 ++++-
 gcc/config/rs6000/rs6000-gen-builtins.c  |  38 +++---
 3 files changed, 129 insertions(+), 86 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 91dce7fbc91..c1bf545c408 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -129,6 +129,7 @@
 ;   mma      Needs special handling for MMA
 ;   quad     MMA instruction using a register quad as an input operand
 ;   pair     MMA instruction using a register pair as an input operand
+;   mmaint   MMA instruction expanding to internal call at GIMPLE time
 ;   no32bit  Not valid for TARGET_32BIT
 ;   32bit    Requires different handling for TARGET_32BIT
 ;   cpu      This is a "cpu_is" or "cpu_supports" builtin
@@ -3584,415 +3585,421 @@
 
 [mma]
   void __builtin_mma_assemble_acc (v512 *, vuc, vuc, vuc, vuc);
-    ASSEMBLE_ACC nothing {mma}
+    ASSEMBLE_ACC nothing {mma,mmaint}
 
   v512 __builtin_mma_assemble_acc_internal (vuc, vuc, vuc, vuc);
     ASSEMBLE_ACC_INTERNAL mma_assemble_acc {mma}
 
   void __builtin_mma_assemble_pair (v256 *, vuc, vuc);
-    ASSEMBLE_PAIR nothing {mma}
+    ASSEMBLE_PAIR nothing {mma,mmaint}
 
   v256 __builtin_mma_assemble_pair_internal (vuc, vuc);
     ASSEMBLE_PAIR_INTERNAL vsx_assemble_pair {mma}
 
   void __builtin_mma_build_acc (v512 *, vuc, vuc, vuc, vuc);
-    BUILD_ACC nothing {mma}
+    BUILD_ACC nothing {mma,mmaint}
 
   v512 __builtin_mma_build_acc_internal (vuc, vuc, vuc, vuc);
     BUILD_ACC_INTERNAL mma_assemble_acc {mma}
 
   void __builtin_mma_disassemble_acc (void *, v512 *);
-    DISASSEMBLE_ACC nothing {mma,quad}
+    DISASSEMBLE_ACC nothing {mma,quad,mmaint}
 
   vuc __builtin_mma_disassemble_acc_internal (v512, const int<2>);
     DISASSEMBLE_ACC_INTERNAL mma_disassemble_acc {mma}
 
   void __builtin_mma_disassemble_pair (void *, v256 *);
-    DISASSEMBLE_PAIR nothing {mma,pair}
+    DISASSEMBLE_PAIR nothing {mma,pair,mmaint}
 
   vuc __builtin_mma_disassemble_pair_internal (v256, const int<2>);
     DISASSEMBLE_PAIR_INTERNAL vsx_disassemble_pair {mma}
 
   void __builtin_mma_pmxvbf16ger2 (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVBF16GER2 nothing {mma}
+    PMXVBF16GER2 nothing {mma,mmaint}
 
   v512 __builtin_mma_pmxvbf16ger2_internal (vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVBF16GER2_INTERNAL mma_pmxvbf16ger2 {mma}
 
   void __builtin_mma_pmxvbf16ger2nn (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVBF16GER2NN nothing {mma,quad}
+    PMXVBF16GER2NN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvbf16ger2nn_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVBF16GER2NN_INTERNAL mma_pmxvbf16ger2nn {mma,quad}
 
   void __builtin_mma_pmxvbf16ger2np (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVBF16GER2NP nothing {mma,quad}
+    PMXVBF16GER2NP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvbf16ger2np_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVBF16GER2NP_INTERNAL mma_pmxvbf16ger2np {mma,quad}
 
   void __builtin_mma_pmxvbf16ger2pn (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVBF16GER2PN nothing {mma,quad}
+    PMXVBF16GER2PN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvbf16ger2pn_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVBF16GER2PN_INTERNAL mma_pmxvbf16ger2pn {mma,quad}
 
   void __builtin_mma_pmxvbf16ger2pp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVBF16GER2PP nothing {mma,quad}
+    PMXVBF16GER2PP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvbf16ger2pp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVBF16GER2PP_INTERNAL mma_pmxvbf16ger2pp {mma,quad}
 
   void __builtin_mma_pmxvf16ger2 (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVF16GER2 nothing {mma}
+    PMXVF16GER2 nothing {mma,mmaint}
 
   v512 __builtin_mma_pmxvf16ger2_internal (vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVF16GER2_INTERNAL mma_pmxvf16ger2 {mma}
 
   void __builtin_mma_pmxvf16ger2nn (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVF16GER2NN nothing {mma,quad}
+    PMXVF16GER2NN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvf16ger2nn_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVF16GER2NN_INTERNAL mma_pmxvf16ger2nn {mma,quad}
 
   void __builtin_mma_pmxvf16ger2np (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVF16GER2NP nothing {mma,quad}
+    PMXVF16GER2NP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvf16ger2np_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVF16GER2NP_INTERNAL mma_pmxvf16ger2np {mma,quad}
 
   void __builtin_mma_pmxvf16ger2pn (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVF16GER2PN nothing {mma,quad}
+    PMXVF16GER2PN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvf16ger2pn_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVF16GER2PN_INTERNAL mma_pmxvf16ger2pn {mma,quad}
 
   void __builtin_mma_pmxvf16ger2pp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVF16GER2PP nothing {mma,quad}
+    PMXVF16GER2PP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvf16ger2pp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVF16GER2PP_INTERNAL mma_pmxvf16ger2pp {mma,quad}
 
   void __builtin_mma_pmxvf32ger (v512 *, vuc, vuc, const int<4>, const int<4>);
-    PMXVF32GER nothing {mma}
+    PMXVF32GER nothing {mma,mmaint}
 
   v512 __builtin_mma_pmxvf32ger_internal (vuc, vuc, const int<4>, const int<4>);
     PMXVF32GER_INTERNAL mma_pmxvf32ger {mma}
 
   void __builtin_mma_pmxvf32gernn (v512 *, vuc, vuc, const int<4>, const int<4>);
-    PMXVF32GERNN nothing {mma,quad}
+    PMXVF32GERNN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvf32gernn_internal (v512, vuc, vuc, const int<4>, const int<4>);
     PMXVF32GERNN_INTERNAL mma_pmxvf32gernn {mma,quad}
 
   void __builtin_mma_pmxvf32gernp (v512 *, vuc, vuc, const int<4>, const int<4>);
-    PMXVF32GERNP nothing {mma,quad}
+    PMXVF32GERNP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvf32gernp_internal (v512, vuc, vuc, const int<4>, const int<4>);
     PMXVF32GERNP_INTERNAL mma_pmxvf32gernp {mma,quad}
 
   void __builtin_mma_pmxvf32gerpn (v512 *, vuc, vuc, const int<4>, const int<4>);
-    PMXVF32GERPN nothing {mma,quad}
+    PMXVF32GERPN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvf32gerpn_internal (v512, vuc, vuc, const int<4>, const int<4>);
     PMXVF32GERPN_INTERNAL mma_pmxvf32gerpn {mma,quad}
 
   void __builtin_mma_pmxvf32gerpp (v512 *, vuc, vuc, const int<4>, const int<4>);
-    PMXVF32GERPP nothing {mma,quad}
+    PMXVF32GERPP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvf32gerpp_internal (v512, vuc, vuc, const int<4>, const int<4>);
     PMXVF32GERPP_INTERNAL mma_pmxvf32gerpp {mma,quad}
 
   void __builtin_mma_pmxvf64ger (v512 *, v256, vuc, const int<4>, const int<2>);
-    PMXVF64GER nothing {mma,pair}
+    PMXVF64GER nothing {mma,pair,mmaint}
 
   v512 __builtin_mma_pmxvf64ger_internal (v256, vuc, const int<4>, const int<2>);
     PMXVF64GER_INTERNAL mma_pmxvf64ger {mma,pair}
 
   void __builtin_mma_pmxvf64gernn (v512 *, v256, vuc, const int<4>, const int<2>);
-    PMXVF64GERNN nothing {mma,pair,quad}
+    PMXVF64GERNN nothing {mma,pair,quad,mmaint}
 
   v512 __builtin_mma_pmxvf64gernn_internal (v512, v256, vuc, const int<4>, const int<2>);
     PMXVF64GERNN_INTERNAL mma_pmxvf64gernn {mma,pair,quad}
 
   void __builtin_mma_pmxvf64gernp (v512 *, v256, vuc, const int<4>, const int<2>);
-    PMXVF64GERNP nothing {mma,pair,quad}
+    PMXVF64GERNP nothing {mma,pair,quad,mmaint}
 
   v512 __builtin_mma_pmxvf64gernp_internal (v512, v256, vuc, const int<4>, const int<2>);
     PMXVF64GERNP_INTERNAL mma_pmxvf64gernp {mma,pair,quad}
 
   void __builtin_mma_pmxvf64gerpn (v512 *, v256, vuc, const int<4>, const int<2>);
-    PMXVF64GERPN nothing {mma,pair,quad}
+    PMXVF64GERPN nothing {mma,pair,quad,mmaint}
 
   v512 __builtin_mma_pmxvf64gerpn_internal (v512, v256, vuc, const int<4>, const int<2>);
     PMXVF64GERPN_INTERNAL mma_pmxvf64gerpn {mma,pair,quad}
 
   void __builtin_mma_pmxvf64gerpp (v512 *, v256, vuc, const int<4>, const int<2>);
-    PMXVF64GERPP nothing {mma,pair,quad}
+    PMXVF64GERPP nothing {mma,pair,quad,mmaint}
 
   v512 __builtin_mma_pmxvf64gerpp_internal (v512, v256, vuc, const int<4>, const int<2>);
     PMXVF64GERPP_INTERNAL mma_pmxvf64gerpp {mma,pair,quad}
 
   void __builtin_mma_pmxvi16ger2 (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVI16GER2 nothing {mma}
+    PMXVI16GER2 nothing {mma,mmaint}
 
   v512 __builtin_mma_pmxvi16ger2_internal (vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVI16GER2_INTERNAL mma_pmxvi16ger2 {mma}
 
   void __builtin_mma_pmxvi16ger2pp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVI16GER2PP nothing {mma,quad}
+    PMXVI16GER2PP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvi16ger2pp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVI16GER2PP_INTERNAL mma_pmxvi16ger2pp {mma,quad}
 
   void __builtin_mma_pmxvi16ger2s (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVI16GER2S nothing {mma}
+    PMXVI16GER2S nothing {mma,mmaint}
 
   v512 __builtin_mma_pmxvi16ger2s_internal (vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVI16GER2S_INTERNAL mma_pmxvi16ger2s {mma}
 
   void __builtin_mma_pmxvi16ger2spp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<2>);
-    PMXVI16GER2SPP nothing {mma,quad}
+    PMXVI16GER2SPP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvi16ger2spp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<2>);
     PMXVI16GER2SPP_INTERNAL mma_pmxvi16ger2spp {mma,quad}
 
   void __builtin_mma_pmxvi4ger8 (v512 *, vuc, vuc, const int<4>, const int<4>, const int<8>);
-    PMXVI4GER8 nothing {mma}
+    PMXVI4GER8 nothing {mma,mmaint}
 
   v512 __builtin_mma_pmxvi4ger8_internal (vuc, vuc, const int<4>, const int<4>, const int<8>);
     PMXVI4GER8_INTERNAL mma_pmxvi4ger8 {mma}
 
   void __builtin_mma_pmxvi4ger8pp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<4>);
-    PMXVI4GER8PP nothing {mma,quad}
+    PMXVI4GER8PP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvi4ger8pp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<4>);
     PMXVI4GER8PP_INTERNAL mma_pmxvi4ger8pp {mma,quad}
 
   void __builtin_mma_pmxvi8ger4 (v512 *, vuc, vuc, const int<4>, const int<4>, const int<4>);
-    PMXVI8GER4 nothing {mma}
+    PMXVI8GER4 nothing {mma,mmaint}
 
   v512 __builtin_mma_pmxvi8ger4_internal (vuc, vuc, const int<4>, const int<4>, const int<4>);
     PMXVI8GER4_INTERNAL mma_pmxvi8ger4 {mma}
 
   void __builtin_mma_pmxvi8ger4pp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<4>);
-    PMXVI8GER4PP nothing {mma,quad}
+    PMXVI8GER4PP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvi8ger4pp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<4>);
     PMXVI8GER4PP_INTERNAL mma_pmxvi8ger4pp {mma,quad}
 
   void __builtin_mma_pmxvi8ger4spp (v512 *, vuc, vuc, const int<4>, const int<4>, const int<4>);
-    PMXVI8GER4SPP nothing {mma,quad}
+    PMXVI8GER4SPP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_pmxvi8ger4spp_internal (v512, vuc, vuc, const int<4>, const int<4>, const int<4>);
     PMXVI8GER4SPP_INTERNAL mma_pmxvi8ger4spp {mma,quad}
 
   void __builtin_mma_xvbf16ger2 (v512 *, vuc, vuc);
-    XVBF16GER2 nothing {mma}
+    XVBF16GER2 nothing {mma,mmaint}
 
   v512 __builtin_mma_xvbf16ger2_internal (vuc, vuc);
     XVBF16GER2_INTERNAL mma_xvbf16ger2 {mma}
 
   void __builtin_mma_xvbf16ger2nn (v512 *, vuc, vuc);
-    XVBF16GER2NN nothing {mma,quad}
+    XVBF16GER2NN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvbf16ger2nn_internal (v512, vuc, vuc);
     XVBF16GER2NN_INTERNAL mma_xvbf16ger2nn {mma,quad}
 
   void __builtin_mma_xvbf16ger2np (v512 *, vuc, vuc);
-    XVBF16GER2NP nothing {mma,quad}
+    XVBF16GER2NP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvbf16ger2np_internal (v512, vuc, vuc);
     XVBF16GER2NP_INTERNAL mma_xvbf16ger2np {mma,quad}
 
   void __builtin_mma_xvbf16ger2pn (v512 *, vuc, vuc);
-    XVBF16GER2PN nothing {mma,quad}
+    XVBF16GER2PN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvbf16ger2pn_internal (v512, vuc, vuc);
     XVBF16GER2PN_INTERNAL mma_xvbf16ger2pn {mma,quad}
 
   void __builtin_mma_xvbf16ger2pp (v512 *, vuc, vuc);
-    XVBF16GER2PP nothing {mma,quad}
+    XVBF16GER2PP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvbf16ger2pp_internal (v512, vuc, vuc);
     XVBF16GER2PP_INTERNAL mma_xvbf16ger2pp {mma,quad}
 
   void __builtin_mma_xvf16ger2 (v512 *, vuc, vuc);
-    XVF16GER2 nothing {mma}
+    XVF16GER2 nothing {mma,mmaint}
 
   v512 __builtin_mma_xvf16ger2_internal (vuc, vuc);
     XVF16GER2_INTERNAL mma_xvf16ger2 {mma}
 
   void __builtin_mma_xvf16ger2nn (v512 *, vuc, vuc);
-    XVF16GER2NN nothing {mma,quad}
+    XVF16GER2NN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvf16ger2nn_internal (v512, vuc, vuc);
     XVF16GER2NN_INTERNAL mma_xvf16ger2nn {mma,quad}
 
   void __builtin_mma_xvf16ger2np (v512 *, vuc, vuc);
-    XVF16GER2NP nothing {mma,quad}
+    XVF16GER2NP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvf16ger2np_internal (v512, vuc, vuc);
     XVF16GER2NP_INTERNAL mma_xvf16ger2np {mma,quad}
 
   void __builtin_mma_xvf16ger2pn (v512 *, vuc, vuc);
-    XVF16GER2PN nothing {mma,quad}
+    XVF16GER2PN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvf16ger2pn_internal (v512, vuc, vuc);
     XVF16GER2PN_INTERNAL mma_xvf16ger2pn {mma,quad}
 
   void __builtin_mma_xvf16ger2pp (v512 *, vuc, vuc);
-    XVF16GER2PP nothing {mma,quad}
+    XVF16GER2PP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvf16ger2pp_internal (v512, vuc, vuc);
     XVF16GER2PP_INTERNAL mma_xvf16ger2pp {mma,quad}
 
   void __builtin_mma_xvf32ger (v512 *, vuc, vuc);
-    XVF32GER nothing {mma}
+    XVF32GER nothing {mma,mmaint}
 
   v512 __builtin_mma_xvf32ger_internal (vuc, vuc);
     XVF32GER_INTERNAL mma_xvf32ger {mma}
 
   void __builtin_mma_xvf32gernn (v512 *, vuc, vuc);
-    XVF32GERNN nothing {mma,quad}
+    XVF32GERNN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvf32gernn_internal (v512, vuc, vuc);
     XVF32GERNN_INTERNAL mma_xvf32gernn {mma,quad}
 
   void __builtin_mma_xvf32gernp (v512 *, vuc, vuc);
-    XVF32GERNP nothing {mma,quad}
+    XVF32GERNP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvf32gernp_internal (v512, vuc, vuc);
     XVF32GERNP_INTERNAL mma_xvf32gernp {mma,quad}
 
   void __builtin_mma_xvf32gerpn (v512 *, vuc, vuc);
-    XVF32GERPN nothing {mma,quad}
+    XVF32GERPN nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvf32gerpn_internal (v512, vuc, vuc);
     XVF32GERPN_INTERNAL mma_xvf32gerpn {mma,quad}
 
   void __builtin_mma_xvf32gerpp (v512 *, vuc, vuc);
-    XVF32GERPP nothing {mma,quad}
+    XVF32GERPP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvf32gerpp_internal (v512, vuc, vuc);
     XVF32GERPP_INTERNAL mma_xvf32gerpp {mma,quad}
 
   void __builtin_mma_xvf64ger (v512 *, v256, vuc);
-    XVF64GER nothing {mma,pair}
+    XVF64GER nothing {mma,pair,mmaint}
 
   v512 __builtin_mma_xvf64ger_internal (v256, vuc);
     XVF64GER_INTERNAL mma_xvf64ger {mma,pair}
 
   void __builtin_mma_xvf64gernn (v512 *, v256, vuc);
-    XVF64GERNN nothing {mma,pair,quad}
+    XVF64GERNN nothing {mma,pair,quad,mmaint}
 
   v512 __builtin_mma_xvf64gernn_internal (v512, v256, vuc);
     XVF64GERNN_INTERNAL mma_xvf64gernn {mma,pair,quad}
 
   void __builtin_mma_xvf64gernp (v512 *, v256, vuc);
-    XVF64GERNP nothing {mma,pair,quad}
+    XVF64GERNP nothing {mma,pair,quad,mmaint}
 
   v512 __builtin_mma_xvf64gernp_internal (v512, v256, vuc);
     XVF64GERNP_INTERNAL mma_xvf64gernp {mma,pair,quad}
 
   void __builtin_mma_xvf64gerpn (v512 *, v256, vuc);
-    XVF64GERPN nothing {mma,pair,quad}
+    XVF64GERPN nothing {mma,pair,quad,mmaint}
 
   v512 __builtin_mma_xvf64gerpn_internal (v512, v256, vuc);
     XVF64GERPN_INTERNAL mma_xvf64gerpn {mma,pair,quad}
 
   void __builtin_mma_xvf64gerpp (v512 *, v256, vuc);
-    XVF64GERPP nothing {mma,pair,quad}
+    XVF64GERPP nothing {mma,pair,quad,mmaint}
 
   v512 __builtin_mma_xvf64gerpp_internal (v512, v256, vuc);
     XVF64GERPP_INTERNAL mma_xvf64gerpp {mma,pair,quad}
 
   void __builtin_mma_xvi16ger2 (v512 *, vuc, vuc);
-    XVI16GER2 nothing {mma}
+    XVI16GER2 nothing {mma,mmaint}
 
   v512 __builtin_mma_xvi16ger2_internal (vuc, vuc);
     XVI16GER2_INTERNAL mma_xvi16ger2 {mma}
 
   void __builtin_mma_xvi16ger2pp (v512 *, vuc, vuc);
-    XVI16GER2PP nothing {mma,quad}
+    XVI16GER2PP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvi16ger2pp_internal (v512, vuc, vuc);
     XVI16GER2PP_INTERNAL mma_xvi16ger2pp {mma,quad}
 
   void __builtin_mma_xvi16ger2s (v512 *, vuc, vuc);
-    XVI16GER2S nothing {mma}
+    XVI16GER2S nothing {mma,mmaint}
 
   v512 __builtin_mma_xvi16ger2s_internal (vuc, vuc);
     XVI16GER2S_INTERNAL mma_xvi16ger2s {mma}
 
   void __builtin_mma_xvi16ger2spp (v512 *, vuc, vuc);
-    XVI16GER2SPP nothing {mma,quad}
+    XVI16GER2SPP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvi16ger2spp_internal (v512, vuc, vuc);
     XVI16GER2SPP_INTERNAL mma_xvi16ger2spp {mma,quad}
 
   void __builtin_mma_xvi4ger8 (v512 *, vuc, vuc);
-    XVI4GER8 nothing {mma}
+    XVI4GER8 nothing {mma,mmaint}
 
   v512 __builtin_mma_xvi4ger8_internal (vuc, vuc);
     XVI4GER8_INTERNAL mma_xvi4ger8 {mma}
 
   void __builtin_mma_xvi4ger8pp (v512 *, vuc, vuc);
-    XVI4GER8PP nothing {mma,quad}
+    XVI4GER8PP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvi4ger8pp_internal (v512, vuc, vuc);
     XVI4GER8PP_INTERNAL mma_xvi4ger8pp {mma,quad}
 
   void __builtin_mma_xvi8ger4 (v512 *, vuc, vuc);
-    XVI8GER4 nothing {mma}
+    XVI8GER4 nothing {mma,mmaint}
 
   v512 __builtin_mma_xvi8ger4_internal (vuc, vuc);
     XVI8GER4_INTERNAL mma_xvi8ger4 {mma}
 
   void __builtin_mma_xvi8ger4pp (v512 *, vuc, vuc);
-    XVI8GER4PP nothing {mma,quad}
+    XVI8GER4PP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvi8ger4pp_internal (v512, vuc, vuc);
     XVI8GER4PP_INTERNAL mma_xvi8ger4pp {mma,quad}
 
   void __builtin_mma_xvi8ger4spp (v512 *, vuc, vuc);
-    XVI8GER4SPP nothing {mma,quad}
+    XVI8GER4SPP nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xvi8ger4spp_internal (v512, vuc, vuc);
     XVI8GER4SPP_INTERNAL mma_xvi8ger4spp {mma,quad}
 
   void __builtin_mma_xxmfacc (v512 *);
-    XXMFACC nothing {mma,quad}
+    XXMFACC nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xxmfacc_internal (v512);
     XXMFACC_INTERNAL mma_xxmfacc {mma,quad}
 
   void __builtin_mma_xxmtacc (v512 *);
-    XXMTACC nothing {mma,quad}
+    XXMTACC nothing {mma,quad,mmaint}
 
   v512 __builtin_mma_xxmtacc_internal (v512);
     XXMTACC_INTERNAL mma_xxmtacc {mma,quad}
 
   void __builtin_mma_xxsetaccz (v512 *);
-    XXSETACCZ nothing {mma}
+    XXSETACCZ nothing {mma,mmaint}
 
   v512 __builtin_mma_xxsetaccz_internal ();
     XXSETACCZ_INTERNAL mma_xxsetaccz {mma}
 
   void __builtin_vsx_assemble_pair (v256 *, vuc, vuc);
-    ASSEMBLE_PAIR_V nothing {mma}
+    ASSEMBLE_PAIR_V nothing {mma,mmaint}
 
   v256 __builtin_vsx_assemble_pair_internal (vuc, vuc);
     ASSEMBLE_PAIR_V_INTERNAL vsx_assemble_pair {mma}
 
   void __builtin_vsx_build_pair (v256 *, vuc, vuc);
-    BUILD_PAIR nothing {mma}
+    BUILD_PAIR nothing {mma,mmaint}
 
   v256 __builtin_vsx_build_pair_internal (vuc, vuc);
     BUILD_PAIR_INTERNAL vsx_assemble_pair {mma}
 
   void __builtin_vsx_disassemble_pair (void *, v256 *);
-    DISASSEMBLE_PAIR_V nothing {mma,pair}
+    DISASSEMBLE_PAIR_V nothing {mma,pair,mmaint}
 
   vuc __builtin_vsx_disassemble_pair_internal (v256, const int<2>);
     DISASSEMBLE_PAIR_V_INTERNAL vsx_disassemble_pair {mma}
+
+  v256 __builtin_vsx_lxvp (unsigned long, const v256 *);
+    LXVP nothing {mma}
+
+  void __builtin_vsx_stxvp (v256, unsigned long, const v256 *);
+    STXVP nothing {mma,pair}
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index cb2503351c4..f75b5c8176c 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -13104,8 +13104,10 @@ rs6000_gimple_fold_new_mma_builtin (gimple_stmt_iterator *gsi,
 
   /* Each call that can be gimple-expanded has an associated built-in
      function that it will expand into.  If this one doesn't, we have
-     already expanded it!  */
-  if (rs6000_builtin_info_x[fncode].assoc_bif == RS6000_BIF_NONE)
+     already expanded it!  Exceptions: lxvp and stxvp.  */
+  if (rs6000_builtin_info_x[fncode].assoc_bif == RS6000_BIF_NONE
+      && fncode != RS6000_BIF_LXVP
+      && fncode != RS6000_BIF_STXVP)
     return false;
 
   bifdata *bd = &rs6000_builtin_info_x[fncode];
@@ -13182,6 +13184,32 @@ rs6000_gimple_fold_new_mma_builtin (gimple_stmt_iterator *gsi,
       gsi_replace_with_seq (gsi, new_seq, true);
       return true;
     }
+  else if (fncode == RS6000_BIF_LXVP)
+    {
+      push_gimplify_context (true);
+      tree offset = gimple_call_arg (stmt, 0);
+      tree ptr = gimple_call_arg (stmt, 1);
+      tree lhs = gimple_call_lhs (stmt);
+      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
+					       TREE_TYPE (ptr), ptr, offset));
+      gimplify_assign (lhs, mem, &new_seq);
+      pop_gimplify_context (NULL);
+      gsi_replace_with_seq (gsi, new_seq, true);
+      return true;
+    }
+  else if (fncode == RS6000_BIF_STXVP)
+    {
+      push_gimplify_context (true);
+      tree src = gimple_call_arg (stmt, 0);
+      tree offset = gimple_call_arg (stmt, 1);
+      tree ptr = gimple_call_arg (stmt, 2);
+      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
+					       TREE_TYPE (ptr), ptr, offset));
+      gimplify_assign (mem, src, &new_seq);
+      pop_gimplify_context (NULL);
+      gsi_replace_with_seq (gsi, new_seq, true);
+      return true;
+    }
 
   /* Convert this built-in into an internal version that uses pass-by-value
      arguments.  The internal built-in is found in the assoc_bif field.  */
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index c401a44e104..5fc56eff6d1 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -84,6 +84,7 @@ along with GCC; see the file COPYING3.  If not see
      mma      Needs special handling for MMA instructions
      quad     MMA instruction using a register quad as an input operand
      pair     MMA instruction using a register pair as an input operand
+     mmaint   MMA instruction expanding to internal call at GIMPLE time
      no32bit  Not valid for TARGET_32BIT
      32bit    Requires different handling for TARGET_32BIT
      cpu      This is a "cpu_is" or "cpu_supports" builtin
@@ -369,6 +370,7 @@ struct attrinfo
   bool ismma;
   bool isquad;
   bool ispair;
+  bool ismmaint;
   bool isno32bit;
   bool is32bit;
   bool iscpu;
@@ -1337,6 +1339,8 @@ parse_bif_attrs (attrinfo *attrptr)
 	  attrptr->isquad = 1;
 	else if (!strcmp (attrname, "pair"))
 	  attrptr->ispair = 1;
+	else if (!strcmp (attrname, "mmaint"))
+	  attrptr->ismmaint = 1;
 	else if (!strcmp (attrname, "no32bit"))
 	  attrptr->isno32bit = 1;
 	else if (!strcmp (attrname, "32bit"))
@@ -1383,15 +1387,15 @@ parse_bif_attrs (attrinfo *attrptr)
   (*diag) ("attribute set: init = %d, set = %d, extract = %d, nosoft = %d, "
 	   "ldvec = %d, stvec = %d, reve = %d, pred = %d, htm = %d, "
 	   "htmspr = %d, htmcr = %d, mma = %d, quad = %d, pair = %d, "
-	   "no32bit = %d, 32bit = %d, cpu = %d, ldstmask = %d, lxvrse = %d, "
-	   "lxvrze = %d, endian = %d.\n",
+	   "mmaint = %d, no32bit = %d, 32bit = %d, cpu = %d, ldstmask = %d, "
+	   "lxvrse = %d, lxvrze = %d, endian = %d.\n",
 	   attrptr->isinit, attrptr->isset, attrptr->isextract,
 	   attrptr->isnosoft, attrptr->isldvec, attrptr->isstvec,
 	   attrptr->isreve, attrptr->ispred, attrptr->ishtm, attrptr->ishtmspr,
 	   attrptr->ishtmcr, attrptr->ismma, attrptr->isquad, attrptr->ispair,
-	   attrptr->isno32bit, attrptr->is32bit, attrptr->iscpu,
-	   attrptr->isldstmask, attrptr->islxvrse, attrptr->islxvrze,
-	   attrptr->isendian);
+	   attrptr->ismmaint, attrptr->isno32bit, attrptr->is32bit,
+	   attrptr->iscpu, attrptr->isldstmask, attrptr->islxvrse,
+	   attrptr->islxvrze, attrptr->isendian);
 #endif
 
   return PC_OK;
@@ -2196,13 +2200,14 @@ write_decls (void)
   fprintf (header_file, "#define bif_mma_bit\t\t(0x00000800)\n");
   fprintf (header_file, "#define bif_quad_bit\t\t(0x00001000)\n");
   fprintf (header_file, "#define bif_pair_bit\t\t(0x00002000)\n");
-  fprintf (header_file, "#define bif_no32bit_bit\t\t(0x00004000)\n");
-  fprintf (header_file, "#define bif_32bit_bit\t\t(0x00008000)\n");
-  fprintf (header_file, "#define bif_cpu_bit\t\t(0x00010000)\n");
-  fprintf (header_file, "#define bif_ldstmask_bit\t(0x00020000)\n");
-  fprintf (header_file, "#define bif_lxvrse_bit\t\t(0x00040000)\n");
-  fprintf (header_file, "#define bif_lxvrze_bit\t\t(0x00080000)\n");
-  fprintf (header_file, "#define bif_endian_bit\t\t(0x00100000)\n");
+  fprintf (header_file, "#define bif_mmaint_bit\t\t(0x00004000)\n");
+  fprintf (header_file, "#define bif_no32bit_bit\t\t(0x00008000)\n");
+  fprintf (header_file, "#define bif_32bit_bit\t\t(0x00010000)\n");
+  fprintf (header_file, "#define bif_cpu_bit\t\t(0x00020000)\n");
+  fprintf (header_file, "#define bif_ldstmask_bit\t(0x00040000)\n");
+  fprintf (header_file, "#define bif_lxvrse_bit\t\t(0x00080000)\n");
+  fprintf (header_file, "#define bif_lxvrze_bit\t\t(0x00100000)\n");
+  fprintf (header_file, "#define bif_endian_bit\t\t(0x00200000)\n");
   fprintf (header_file, "\n");
   fprintf (header_file,
 	   "#define bif_is_init(x)\t\t((x).bifattrs & bif_init_bit)\n");
@@ -2232,6 +2237,8 @@ write_decls (void)
 	   "#define bif_is_quad(x)\t\t((x).bifattrs & bif_quad_bit)\n");
   fprintf (header_file,
 	   "#define bif_is_pair(x)\t\t((x).bifattrs & bif_pair_bit)\n");
+  fprintf (header_file,
+	   "#define bif_is_mmaint(x)\t\t((x).bifattrs & bif_mmaint_bit)\n");
   fprintf (header_file,
 	   "#define bif_is_no32bit(x)\t((x).bifattrs & bif_no32bit_bit)\n");
   fprintf (header_file,
@@ -2464,6 +2471,8 @@ write_bif_static_init (void)
 	fprintf (init_file, " | bif_quad_bit");
       if (bifp->attrs.ispair)
 	fprintf (init_file, " | bif_pair_bit");
+      if (bifp->attrs.ismmaint)
+	fprintf (init_file, " | bif_mmaint_bit");
       if (bifp->attrs.isno32bit)
 	fprintf (init_file, " | bif_no32bit_bit");
       if (bifp->attrs.is32bit)
@@ -2510,10 +2519,9 @@ write_bif_static_init (void)
 		: (bifp->kind == FNK_PURE ? "= pure"
 		   : (bifp->kind == FNK_FPMATH ? "= fp, const"
 		      : ""))));
-      bool no_icode = !strcmp (bifp->patname, "nothing");
       fprintf (init_file, "      /* assoc_bif */\tRS6000_BIF_%s%s\n",
-	       bifp->attrs.ismma && no_icode ? bifp->idname : "NONE",
-	       bifp->attrs.ismma && no_icode ? "_INTERNAL" : "");
+	       bifp->attrs.ismmaint ? bifp->idname : "NONE",
+	       bifp->attrs.ismmaint ? "_INTERNAL" : "");
       fprintf (init_file, "    },\n");
     }
   fprintf (init_file, "  };\n\n");
-- 
2.27.0



* [PATCH 22/34] rs6000: Support for vectorizing built-in functions
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (20 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 21/34] rs6000: Handle some recent MMA builtin changes Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 23/34] rs6000: Builtin expansion, part 1 Bill Schmidt
                   ` (11 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-03-05  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000.c (rs6000-builtins.h): New include.
	(rs6000_new_builtin_vectorized_function): New function.
	(rs6000_new_builtin_md_vectorized_function): Likewise.
	(rs6000_builtin_vectorized_function): Call
	rs6000_new_builtin_vectorized_function.
	(rs6000_builtin_md_vectorized_function): Call
	rs6000_new_builtin_md_vectorized_function.
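
For illustration (not part of the patch): these hooks let the
vectorizer replace scalar math calls with the corresponding Power
vector built-ins.  A minimal sketch, assuming -O3 -mvsx, where the
vectorizer asks for a vector form of floor and receives the decl
behind RS6000_BIF_XVRDPIM:

  /* Hypothetical example: each __builtin_floor call below can be
     vectorized to a V2DF xvrdpim via the new hook.  */
  void
  floor_all (double *restrict out, const double *restrict in, int n)
  {
    for (int i = 0; i < n; i++)
      out[i] = __builtin_floor (in[i]);
  }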
---
 gcc/config/rs6000/rs6000.c | 200 +++++++++++++++++++++++++++++++++++++
 1 file changed, 200 insertions(+)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 279f00cc648..ceba25d028c 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -78,6 +78,7 @@
 #include "case-cfn-macros.h"
 #include "ppc-auxv.h"
 #include "rs6000-internal.h"
+#include "rs6000-builtins.h"
 #include "opts.h"
 
 /* This file should be included last.  */
@@ -5489,6 +5490,198 @@ rs6000_loop_unroll_adjust (unsigned nunroll, struct loop *loop)
   return nunroll;
 }
 
+/* Returns a function decl for a vectorized version of the builtin function
+   with builtin function code FN and the result vector type TYPE, or NULL_TREE
+   if it is not available.  */
+
+static tree
+rs6000_new_builtin_vectorized_function (unsigned int fn, tree type_out,
+					tree type_in)
+{
+  machine_mode in_mode, out_mode;
+  int in_n, out_n;
+
+  if (TARGET_DEBUG_BUILTIN)
+    fprintf (stderr, "rs6000_new_builtin_vectorized_function (%s, %s, %s)\n",
+	     combined_fn_name (combined_fn (fn)),
+	     GET_MODE_NAME (TYPE_MODE (type_out)),
+	     GET_MODE_NAME (TYPE_MODE (type_in)));
+
+  if (TREE_CODE (type_out) != VECTOR_TYPE
+      || TREE_CODE (type_in) != VECTOR_TYPE)
+    return NULL_TREE;
+
+  out_mode = TYPE_MODE (TREE_TYPE (type_out));
+  out_n = TYPE_VECTOR_SUBPARTS (type_out);
+  in_mode = TYPE_MODE (TREE_TYPE (type_in));
+  in_n = TYPE_VECTOR_SUBPARTS (type_in);
+
+  switch (fn)
+    {
+    CASE_CFN_COPYSIGN:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+	  && out_mode == DFmode && out_n == 2
+	  && in_mode == DFmode && in_n == 2)
+	return rs6000_builtin_decls_x[RS6000_BIF_CPSGNDP];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_CPSGNSP];
+      if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_COPYSIGN_V4SF];
+      break;
+    CASE_CFN_CEIL:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+	  && out_mode == DFmode && out_n == 2
+	  && in_mode == DFmode && in_n == 2)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVRDPIP];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVRSPIP];
+      if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_VRFIP];
+      break;
+    CASE_CFN_FLOOR:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+	  && out_mode == DFmode && out_n == 2
+	  && in_mode == DFmode && in_n == 2)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVRDPIM];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVRSPIM];
+      if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_VRFIM];
+      break;
+    CASE_CFN_FMA:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+	  && out_mode == DFmode && out_n == 2
+	  && in_mode == DFmode && in_n == 2)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVMADDDP];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVMADDSP];
+      if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_VMADDFP];
+      break;
+    CASE_CFN_TRUNC:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+	  && out_mode == DFmode && out_n == 2
+	  && in_mode == DFmode && in_n == 2)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVRDPIZ];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVRSPIZ];
+      if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_VRFIZ];
+      break;
+    CASE_CFN_NEARBYINT:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+	  && flag_unsafe_math_optimizations
+	  && out_mode == DFmode && out_n == 2
+	  && in_mode == DFmode && in_n == 2)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVRDPI];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+	  && flag_unsafe_math_optimizations
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVRSPI];
+      break;
+    CASE_CFN_RINT:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+	  && !flag_trapping_math
+	  && out_mode == DFmode && out_n == 2
+	  && in_mode == DFmode && in_n == 2)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVRDPIC];
+      if (VECTOR_UNIT_VSX_P (V4SFmode)
+	  && !flag_trapping_math
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_XVRSPIC];
+      break;
+    default:
+      break;
+    }
+
+  /* Generate calls to libmass if appropriate.  */
+  if (rs6000_veclib_handler)
+    return rs6000_veclib_handler (combined_fn (fn), type_out, type_in);
+
+  return NULL_TREE;
+}
+
+/* Implement TARGET_VECTORIZE_BUILTIN_MD_VECTORIZED_FUNCTION.  */
+
+static tree
+rs6000_new_builtin_md_vectorized_function (tree fndecl, tree type_out,
+					   tree type_in)
+{
+  machine_mode in_mode, out_mode;
+  int in_n, out_n;
+
+  if (TARGET_DEBUG_BUILTIN)
+    fprintf (stderr,
+	     "rs6000_new_builtin_md_vectorized_function (%s, %s, %s)\n",
+	     IDENTIFIER_POINTER (DECL_NAME (fndecl)),
+	     GET_MODE_NAME (TYPE_MODE (type_out)),
+	     GET_MODE_NAME (TYPE_MODE (type_in)));
+
+  if (TREE_CODE (type_out) != VECTOR_TYPE
+      || TREE_CODE (type_in) != VECTOR_TYPE)
+    return NULL_TREE;
+
+  out_mode = TYPE_MODE (TREE_TYPE (type_out));
+  out_n = TYPE_VECTOR_SUBPARTS (type_out);
+  in_mode = TYPE_MODE (TREE_TYPE (type_in));
+  in_n = TYPE_VECTOR_SUBPARTS (type_in);
+
+  enum rs6000_gen_builtins fn
+    = (enum rs6000_gen_builtins) DECL_MD_FUNCTION_CODE (fndecl);
+  switch (fn)
+    {
+    case RS6000_BIF_RSQRTF:
+      if (VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_VRSQRTFP];
+      break;
+    case RS6000_BIF_RSQRT:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+	  && out_mode == DFmode && out_n == 2
+	  && in_mode == DFmode && in_n == 2)
+	return rs6000_builtin_decls_x[RS6000_BIF_RSQRT_2DF];
+      break;
+    case RS6000_BIF_RECIPF:
+      if (VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)
+	  && out_mode == SFmode && out_n == 4
+	  && in_mode == SFmode && in_n == 4)
+	return rs6000_builtin_decls_x[RS6000_BIF_VRECIPFP];
+      break;
+    case RS6000_BIF_RECIP:
+      if (VECTOR_UNIT_VSX_P (V2DFmode)
+	  && out_mode == DFmode && out_n == 2
+	  && in_mode == DFmode && in_n == 2)
+	return rs6000_builtin_decls_x[RS6000_BIF_RECIP_V2DF];
+      break;
+    default:
+      break;
+    }
+  return NULL_TREE;
+}
+
 /* Handler for the Mathematical Acceleration Subsystem (mass) interface to a
    library with vectorized intrinsics.  */
 
@@ -5608,6 +5801,9 @@ rs6000_builtin_vectorized_function (unsigned int fn, tree type_out,
   machine_mode in_mode, out_mode;
   int in_n, out_n;
 
+  if (new_builtins_are_live)
+    return rs6000_new_builtin_vectorized_function (fn, type_out, type_in);
+
   if (TARGET_DEBUG_BUILTIN)
     fprintf (stderr, "rs6000_builtin_vectorized_function (%s, %s, %s)\n",
 	     combined_fn_name (combined_fn (fn)),
@@ -5739,6 +5935,10 @@ rs6000_builtin_md_vectorized_function (tree fndecl, tree type_out,
   machine_mode in_mode, out_mode;
   int in_n, out_n;
 
+  if (new_builtins_are_live)
+    return rs6000_new_builtin_md_vectorized_function (fndecl, type_out,
+						      type_in);
+
   if (TARGET_DEBUG_BUILTIN)
     fprintf (stderr, "rs6000_builtin_md_vectorized_function (%s, %s, %s)\n",
 	     IDENTIFIER_POINTER (DECL_NAME (fndecl)),
-- 
2.27.0



* [PATCH 23/34] rs6000: Builtin expansion, part 1
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (21 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 22/34] rs6000: Support for vectorizing built-in functions Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 24/34] rs6000: Builtin expansion, part 2 Bill Schmidt
                   ` (10 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-17  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): New
	forward decl.
	(rs6000_invalid_new_builtin): New stub function.
	(rs6000_expand_builtin): Call rs6000_expand_new_builtin.
	(rs6000_expand_ldst_mask): New stub function.
	(new_cpu_expand_builtin): Likewise.
	(elemrev_icode): Likewise.
	(ldv_expand_builtin): Likewise.
	(lxvrse_expand_builtin): Likewise.
	(lxvrze_expand_builtin): Likewise.
	(stv_expand_builtin): Likewise.
	(new_mma_expand_builtin): Likewise.
	(new_htm_expand_builtin): Likewise.
	(rs6000_expand_new_builtin): New function.
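
For illustration (not part of the patch): the new expander is fully
table-driven, so a built-in call in user code reaches
rs6000_expand_new_builtin with fcode indexing its rs6000_builtin_info_x
entry, which supplies the icode, the argument count, and the attribute
bits tested by the dispatch code.  A hypothetical source-level example,
assuming -maltivec:

  #include <altivec.h>

  /* vec_add resolves through the overload machinery to
     __builtin_altivec_vadduwm, whose expansion ends in the generic
     GEN_FCN switch at the bottom of rs6000_expand_new_builtin.  */
  vector int
  add_ints (vector int a, vector int b)
  {
    return vec_add (a, b);
  }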
---
 gcc/config/rs6000/rs6000-call.c | 526 ++++++++++++++++++++++++++++++++
 1 file changed, 526 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index f75b5c8176c..4719d074455 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -190,6 +190,7 @@ static tree builtin_function_type (machine_mode, machine_mode,
 static void rs6000_common_init_builtins (void);
 static void htm_init_builtins (void);
 static void mma_init_builtins (void);
+static rtx rs6000_expand_new_builtin (tree, rtx, rtx, machine_mode, int);
 static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);
 
 
@@ -11664,6 +11665,14 @@ rs6000_invalid_builtin (enum rs6000_builtins fncode)
     error ("%qs is not supported with the current options", name);
 }
 
+/* Raise an error message for a builtin function that is called without the
+   appropriate target options being set.  */
+
+static void
+rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
+{
+}
+
 /* Target hook for early folding of built-ins, shamelessly stolen
    from ia64.c.  */
 
@@ -14255,6 +14264,9 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
 		       machine_mode mode ATTRIBUTE_UNUSED,
 		       int ignore ATTRIBUTE_UNUSED)
 {
+  if (new_builtins_are_live)
+    return rs6000_expand_new_builtin (exp, target, subtarget, mode, ignore);
+
   tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
   enum rs6000_builtins fcode
     = (enum rs6000_builtins) DECL_MD_FUNCTION_CODE (fndecl);
@@ -14547,6 +14559,520 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
   gcc_unreachable ();
 }
 
+/* Expand ALTIVEC_BUILTIN_MASK_FOR_LOAD.  */
+rtx
+rs6000_expand_ldst_mask (rtx target, tree arg0)
+{
+  return target;
+}
+
+/* Expand the CPU builtin in FCODE and store the result in TARGET.  */
+static rtx
+new_cpu_expand_builtin (enum rs6000_gen_builtins fcode,
+			tree exp ATTRIBUTE_UNUSED, rtx target)
+{
+  return target;
+}
+
+static insn_code
+elemrev_icode (rs6000_gen_builtins fcode)
+{
+  return (insn_code) 0;
+}
+
+static rtx
+ldv_expand_builtin (rtx target, insn_code icode, rtx *op, machine_mode tmode)
+{
+  return target;
+}
+
+static rtx
+lxvrse_expand_builtin (rtx target, insn_code icode, rtx *op,
+		       machine_mode tmode, machine_mode smode)
+{
+  return target;
+}
+
+static rtx
+lxvrze_expand_builtin (rtx target, insn_code icode, rtx *op,
+		       machine_mode tmode, machine_mode smode)
+{
+  return target;
+}
+
+static rtx
+stv_expand_builtin (insn_code icode, rtx *op,
+		    machine_mode tmode, machine_mode smode)
+{
+  return NULL_RTX;
+}
+
+/* Expand the MMA built-in in EXP.  */
+static rtx
+new_mma_expand_builtin (tree exp, rtx target, insn_code icode,
+			rs6000_gen_builtins fcode)
+{
+  return target;
+}
+
+/* Expand the HTM builtin in EXP and store the result in TARGET.  */
+static rtx
+new_htm_expand_builtin (bifdata *bifaddr, rs6000_gen_builtins fcode,
+			tree exp, rtx target)
+{
+  return const0_rtx;
+}
+
+/* Expand an expression EXP that calls a built-in function,
+   with result going to TARGET if that's convenient
+   (and in mode MODE if that's convenient).
+   SUBTARGET may be used as the target for computing one of EXP's operands.
+   IGNORE is nonzero if the value is to be ignored.
+   Use the new builtin infrastructure.  */
+static rtx
+rs6000_expand_new_builtin (tree exp, rtx target,
+			   rtx subtarget ATTRIBUTE_UNUSED,
+			   machine_mode ignore_mode ATTRIBUTE_UNUSED,
+			   int ignore ATTRIBUTE_UNUSED)
+{
+  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
+  enum rs6000_gen_builtins fcode
+    = (enum rs6000_gen_builtins) DECL_MD_FUNCTION_CODE (fndecl);
+  size_t uns_fcode = (size_t)fcode;
+  enum insn_code icode = rs6000_builtin_info_x[uns_fcode].icode;
+
+  /* We have two different modes (KFmode, TFmode) that are the IEEE 128-bit
+     floating point type, depending on whether long double is the IBM extended
+     double (KFmode) or long double is IEEE 128-bit (TFmode).  It is simpler if
+     we only define one variant of the built-in function, and switch the code
+     when defining it, rather than defining two built-ins and using the
+     overload table in rs6000-c.c to switch between the two.  If we don't have
+     the proper assembler, don't do this switch because CODE_FOR_*kf* and
+     CODE_FOR_*tf* will be CODE_FOR_nothing.  */
+  if (FLOAT128_IEEE_P (TFmode))
+    switch (icode)
+      {
+      default:
+	break;
+
+      case CODE_FOR_sqrtkf2_odd:	icode = CODE_FOR_sqrttf2_odd;	break;
+      case CODE_FOR_trunckfdf2_odd:	icode = CODE_FOR_trunctfdf2_odd; break;
+      case CODE_FOR_addkf3_odd:		icode = CODE_FOR_addtf3_odd;	break;
+      case CODE_FOR_subkf3_odd:		icode = CODE_FOR_subtf3_odd;	break;
+      case CODE_FOR_mulkf3_odd:		icode = CODE_FOR_multf3_odd;	break;
+      case CODE_FOR_divkf3_odd:		icode = CODE_FOR_divtf3_odd;	break;
+      case CODE_FOR_fmakf4_odd:		icode = CODE_FOR_fmatf4_odd;	break;
+      case CODE_FOR_xsxexpqp_kf:	icode = CODE_FOR_xsxexpqp_tf;	break;
+      case CODE_FOR_xsxsigqp_kf:	icode = CODE_FOR_xsxsigqp_tf;	break;
+      case CODE_FOR_xststdcnegqp_kf:	icode = CODE_FOR_xststdcnegqp_tf; break;
+      case CODE_FOR_xsiexpqp_kf:	icode = CODE_FOR_xsiexpqp_tf;	break;
+      case CODE_FOR_xsiexpqpf_kf:	icode = CODE_FOR_xsiexpqpf_tf;	break;
+      case CODE_FOR_xststdcqp_kf:	icode = CODE_FOR_xststdcqp_tf;	break;
+
+      case CODE_FOR_xscmpexpqp_eq_kf:
+	icode = CODE_FOR_xscmpexpqp_eq_tf;
+	break;
+
+      case CODE_FOR_xscmpexpqp_lt_kf:
+	icode = CODE_FOR_xscmpexpqp_lt_tf;
+	break;
+
+      case CODE_FOR_xscmpexpqp_gt_kf:
+	icode = CODE_FOR_xscmpexpqp_gt_tf;
+	break;
+
+      case CODE_FOR_xscmpexpqp_unordered_kf:
+	icode = CODE_FOR_xscmpexpqp_unordered_tf;
+	break;
+      }
+
+  bifdata *bifaddr = &rs6000_builtin_info_x[uns_fcode];
+
+  /* In case of "#pragma target" changes, we initialize all builtins
+     but check for actual availability during expand time.  For
+     invalid builtins, generate a normal call.  */
+  switch (bifaddr->enable)
+    {
+    default:
+      gcc_unreachable ();
+    case ENB_ALWAYS:
+      break;
+    case ENB_P5:
+      if (!TARGET_POPCNTB)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_P6:
+      if (!TARGET_CMPB)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_ALTIVEC:
+      if (!TARGET_ALTIVEC)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_CELL:
+      if (!TARGET_ALTIVEC || rs6000_cpu != PROCESSOR_CELL)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_VSX:
+      if (!TARGET_VSX)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_P7:
+      if (!TARGET_POPCNTD)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_P7_64:
+      if (!TARGET_POPCNTD || !TARGET_POWERPC64)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_P8:
+      if (!TARGET_DIRECT_MOVE)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_P8V:
+      if (!TARGET_P8_VECTOR)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_P9:
+      if (!TARGET_MODULO)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_P9_64:
+      if (!TARGET_MODULO || !TARGET_POWERPC64)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_P9V:
+      if (!TARGET_P9_VECTOR)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_IEEE128_HW:
+      if (!TARGET_FLOAT128_HW)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_DFP:
+      if (!TARGET_DFP)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_CRYPTO:
+      if (!TARGET_CRYPTO)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_HTM:
+      if (!TARGET_HTM)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_P10:
+      if (!TARGET_POWER10)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_P10_64:
+      if (!TARGET_POWER10 || !TARGET_POWERPC64)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    case ENB_MMA:
+      if (!TARGET_MMA)
+	{
+	  rs6000_invalid_new_builtin (fcode);
+	  return expand_call (exp, target, ignore);
+	}
+      break;
+    };
+
+  if (bif_is_nosoft (*bifaddr)
+      && rs6000_isa_flags & OPTION_MASK_SOFT_FLOAT)
+    {
+      error ("%<%s%> not supported with %<-msoft-float%>",
+	     bifaddr->bifname);
+      return const0_rtx;
+    }
+
+  if (bif_is_no32bit (*bifaddr) && TARGET_32BIT)
+    fatal_error (input_location,
+		 "%<%s%> is not supported in 32-bit mode",
+		 bifaddr->bifname);
+
+  if (bif_is_cpu (*bifaddr))
+    return new_cpu_expand_builtin (fcode, exp, target);
+
+  if (bif_is_init (*bifaddr))
+    return altivec_expand_vec_init_builtin (TREE_TYPE (exp), exp, target);
+
+  if (bif_is_set (*bifaddr))
+    return altivec_expand_vec_set_builtin (exp);
+
+  if (bif_is_extract (*bifaddr))
+    return altivec_expand_vec_ext_builtin (exp, target);
+
+  if (bif_is_predicate (*bifaddr))
+    return altivec_expand_predicate_builtin (icode, exp, target);
+
+  if (bif_is_htm (*bifaddr))
+    return new_htm_expand_builtin (bifaddr, fcode, exp, target);
+
+  rtx pat;
+  const int MAX_BUILTIN_ARGS = 6;
+  tree arg[MAX_BUILTIN_ARGS];
+  rtx op[MAX_BUILTIN_ARGS];
+  machine_mode mode[MAX_BUILTIN_ARGS + 1];
+  bool void_func = TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node;
+  int k;
+
+  int nargs = bifaddr->nargs;
+  gcc_assert (nargs <= MAX_BUILTIN_ARGS);
+
+  if (void_func)
+    k = 0;
+  else
+    {
+      k = 1;
+      mode[0] = insn_data[icode].operand[0].mode;
+    }
+
+  for (int i = 0; i < nargs; i++)
+    {
+      arg[i] = CALL_EXPR_ARG (exp, i);
+      if (arg[i] == error_mark_node)
+	return const0_rtx;
+      STRIP_NOPS (arg[i]);
+      op[i] = expand_normal (arg[i]);
+      /* We have a couple of pesky patterns that don't specify the mode...  */
+      if (!insn_data[icode].operand[i+k].mode)
+	mode[i+k] = TARGET_64BIT ? Pmode : SImode;
+      else
+	mode[i+k] = insn_data[icode].operand[i+k].mode;
+    }
+
+  /* Check for restricted constant arguments.  */
+  for (int i = 0; i < 2; i++)
+    {
+      switch (bifaddr->restr[i])
+	{
+	default:
+	case RES_NONE:
+	  break;
+	case RES_BITS:
+	  {
+	    size_t mask = (1 << bifaddr->restr_val1[i]) - 1;
+	    tree restr_arg = arg[bifaddr->restr_opnd[i] - 1];
+	    STRIP_NOPS (restr_arg);
+	    if (TREE_CODE (restr_arg) != INTEGER_CST
+		|| TREE_INT_CST_LOW (restr_arg) & ~mask)
+	      {
+		error ("argument %d must be a %d-bit unsigned literal",
+		       bifaddr->restr_opnd[i], bifaddr->restr_val1[i]);
+		return CONST0_RTX (mode[0]);
+	      }
+	    break;
+	  }
+	case RES_RANGE:
+	  {
+	    tree restr_arg = arg[bifaddr->restr_opnd[i] - 1];
+	    STRIP_NOPS (restr_arg);
+	    if (TREE_CODE (restr_arg) != INTEGER_CST
+		|| !IN_RANGE (tree_to_shwi (restr_arg),
+			      bifaddr->restr_val1[i],
+			      bifaddr->restr_val2[i]))
+	      {
+		error ("argument %d must be a literal between %d and %d,"
+		       " inclusive",
+		       bifaddr->restr_opnd[i], bifaddr->restr_val1[i],
+		       bifaddr->restr_val2[i]);
+		return CONST0_RTX (mode[0]);
+	      }
+	    break;
+	  }
+	case RES_VAR_RANGE:
+	  {
+	    tree restr_arg = arg[bifaddr->restr_opnd[i] - 1];
+	    STRIP_NOPS (restr_arg);
+	    if (TREE_CODE (restr_arg) == INTEGER_CST
+		&& !IN_RANGE (tree_to_shwi (restr_arg),
+			      bifaddr->restr_val1[i],
+			      bifaddr->restr_val2[i]))
+	      {
+		error ("argument %d must be a variable or a literal "
+		       "between %d and %d, inclusive",
+		       bifaddr->restr_opnd[i], bifaddr->restr_val1[i],
+		       bifaddr->restr_val2[i]);
+		return CONST0_RTX (mode[0]);
+	      }
+	    break;
+	  }
+	case RES_VALUES:
+	  {
+	    tree restr_arg = arg[bifaddr->restr_opnd[i] - 1];
+	    STRIP_NOPS (restr_arg);
+	    if (TREE_CODE (restr_arg) != INTEGER_CST
+		|| (tree_to_shwi (restr_arg) != bifaddr->restr_val1[i]
+		    && tree_to_shwi (restr_arg) != bifaddr->restr_val2[i]))
+	      {
+		error ("argument %d must be either a literal %d or a "
+		       "literal %d",
+		       bifaddr->restr_opnd[i], bifaddr->restr_val1[i],
+		       bifaddr->restr_val2[i]);
+		return CONST0_RTX (mode[0]);
+	      }
+	    break;
+	  }
+	}
+    }
+
+  if (bif_is_ldstmask (*bifaddr))
+    return rs6000_expand_ldst_mask (target, arg[0]);
+
+  if (bif_is_stvec (*bifaddr))
+    {
+      if (bif_is_reve (*bifaddr))
+	icode = elemrev_icode (fcode);
+      return stv_expand_builtin (icode, op, mode[0], mode[1]);
+    }
+
+  if (bif_is_ldvec (*bifaddr))
+    {
+      if (bif_is_reve (*bifaddr))
+	icode = elemrev_icode (fcode);
+      return ldv_expand_builtin (target, icode, op, mode[0]);
+    }
+
+  if (bif_is_lxvrse (*bifaddr))
+    return lxvrse_expand_builtin (target, icode, op, mode[0], mode[1]);
+
+  if (bif_is_lxvrze (*bifaddr))
+    return lxvrze_expand_builtin (target, icode, op, mode[0], mode[1]);
+
+  if (bif_is_mma (*bifaddr))
+    return new_mma_expand_builtin (exp, target, icode, fcode);
+
+  if (fcode == RS6000_BIF_PACK_IF
+      && TARGET_LONG_DOUBLE_128 && !TARGET_IEEEQUAD)
+    {
+      icode = CODE_FOR_packtf;
+      fcode = RS6000_BIF_PACK_TF;
+      uns_fcode = (size_t)fcode;
+    }
+  else if (fcode == RS6000_BIF_UNPACK_IF
+	   && TARGET_LONG_DOUBLE_128 && !TARGET_IEEEQUAD)
+    {
+      icode = CODE_FOR_unpacktf;
+      fcode = RS6000_BIF_UNPACK_TF;
+      uns_fcode = (size_t)fcode;
+    }
+
+  if (TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node)
+    target = NULL_RTX;
+  else if (target == 0
+	   || GET_MODE (target) != mode[0]
+	   || !(*insn_data[icode].operand[0].predicate) (target, mode[0]))
+    target = gen_reg_rtx (mode[0]);
+
+  for (int i = 0; i < nargs; i++)
+    if (! (*insn_data[icode].operand[i+k].predicate) (op[i], mode[i+k]))
+      op[i] = copy_to_mode_reg (mode[i+k], op[i]);
+
+  switch (nargs)
+    {
+    default:
+      gcc_assert (MAX_BUILTIN_ARGS == 6);
+      gcc_unreachable ();
+    case 0:
+      pat = (void_func
+	     ? GEN_FCN (icode) ()
+	     : GEN_FCN (icode) (target));
+      break;
+    case 1:
+      pat = (void_func
+	     ? GEN_FCN (icode) (op[0])
+	     : GEN_FCN (icode) (target, op[0]));
+      break;
+    case 2:
+      pat = (void_func
+	     ? GEN_FCN (icode) (op[0], op[1])
+	     : GEN_FCN (icode) (target, op[0], op[1]));
+      break;
+    case 3:
+      pat = (void_func
+	     ? GEN_FCN (icode) (op[0], op[1], op[2])
+	     : GEN_FCN (icode) (target, op[0], op[1], op[2]));
+      break;
+    case 4:
+      pat = (void_func
+	     ? GEN_FCN (icode) (op[0], op[1], op[2], op[3])
+	     : GEN_FCN (icode) (target, op[0], op[1], op[2], op[3]));
+      break;
+    case 5:
+      pat = (void_func
+	     ? GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4])
+	     : GEN_FCN (icode) (target, op[0], op[1], op[2], op[3], op[4]));
+      break;
+    case 6:
+      pat = (void_func
+	     ? GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4], op[5])
+	     : GEN_FCN (icode) (target, op[0], op[1],
+				op[2], op[3], op[4], op[5]));
+      break;
+    }
+
+  if (!pat)
+    return 0;
+  emit_insn (pat);
+
+  return target;
+}
+
 /* Create a builtin vector type with a name.  Taking care not to give
    the canonical type a name.  */
 
-- 
2.27.0



* [PATCH 24/34] rs6000: Builtin expansion, part 2
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (22 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 23/34] rs6000: Builtin expansion, part 1 Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 25/34] rs6000: Builtin expansion, part 3 Bill Schmidt
                   ` (9 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-03-05  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (rs6000_invalid_new_builtin):
	Implement.
	(rs6000_expand_ldst_mask): Likewise.
	(rs6000_init_builtins): Initialize altivec_builtin_mask_for_load.
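
For illustration (not part of the patch): with
rs6000_invalid_new_builtin implemented, calling a built-in without its
prerequisite option now produces a targeted diagnostic at expansion
time.  A hypothetical example, compiled without -mhtm:

  /* Expected diagnostic (sketch):
     error: '__builtin_tbegin' requires the '-mhtm' option  */
  unsigned int
  start_txn (void)
  {
    return __builtin_tbegin (0);
  }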
---
 gcc/config/rs6000/rs6000-call.c | 101 +++++++++++++++++++++++++++++++-
 1 file changed, 100 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 4719d074455..13a24dd9713 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11671,6 +11671,75 @@ rs6000_invalid_builtin (enum rs6000_builtins fncode)
 static void
 rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
 {
+  size_t uns_fncode = (size_t) fncode;
+  const char *name = rs6000_builtin_info_x[uns_fncode].bifname;
+
+  switch (rs6000_builtin_info_x[uns_fncode].enable)
+    {
+    case ENB_P5:
+      error ("%qs requires the %qs option", name, "-mcpu=power5");
+      break;
+    case ENB_P6:
+      error ("%qs requires the %qs option", name, "-mcpu=power6");
+      break;
+    case ENB_ALTIVEC:
+      error ("%qs requires the %qs option", name, "-maltivec");
+      break;
+    case ENB_CELL:
+      error ("%qs is only valid for the cell processor", name);
+      break;
+    case ENB_VSX:
+      error ("%qs requires the %qs option", name, "-mvsx");
+      break;
+    case ENB_P7:
+      error ("%qs requires the %qs option", name, "-mcpu=power7");
+      break;
+    case ENB_P7_64:
+      error ("%qs requires the %qs option and either the %qs or %qs option",
+	     name, "-mcpu=power7", "-m64", "-mpowerpc64");
+      break;
+    case ENB_P8:
+      error ("%qs requires the %qs option", name, "-mcpu=power8");
+      break;
+    case ENB_P8V:
+      error ("%qs requires the %qs option", name, "-mpower8-vector");
+      break;
+    case ENB_P9:
+      error ("%qs requires the %qs option", name, "-mcpu=power9");
+      break;
+    case ENB_P9_64:
+      error ("%qs requires the %qs option and either the %qs or %qs option",
+	     name, "-mcpu=power9", "-m64", "-mpowerpc64");
+      break;
+    case ENB_P9V:
+      error ("%qs requires the %qs option", name, "-mpower9-vector");
+      break;
+    case ENB_IEEE128_HW:
+      error ("%qs requires ISA 3.0 IEEE 128-bit floating point", name);
+      break;
+    case ENB_DFP:
+      error ("%qs requires the %qs option", name, "-mhard-dfp");
+      break;
+    case ENB_CRYPTO:
+      error ("%qs requires the %qs option", name, "-mcrypto");
+      break;
+    case ENB_HTM:
+      error ("%qs requires the %qs option", name, "-mhtm");
+      break;
+    case ENB_P10:
+      error ("%qs requires the %qs option", name, "-mcpu=power10");
+      break;
+    case ENB_P10_64:
+      error ("%qs requires the %qs option and either the %qs or %qs option",
+	     name, "-mcpu=power10", "-m64", "-mpowerpc64");
+      break;
+    case ENB_MMA:
+      error ("%qs requires the %qs option", name, "-mmma");
+      break;
+    default:
+    case ENB_ALWAYS:
+      gcc_unreachable ();
+    };
 }
 
 /* Target hook for early folding of built-ins, shamelessly stolen
@@ -14563,7 +14632,33 @@ rs6000_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
 rtx
 rs6000_expand_ldst_mask (rtx target, tree arg0)
 {
-  return target;
+  int icode2 = (BYTES_BIG_ENDIAN ? (int) CODE_FOR_altivec_lvsr_direct
+		: (int) CODE_FOR_altivec_lvsl_direct);
+  machine_mode tmode = insn_data[icode2].operand[0].mode;
+  machine_mode mode = insn_data[icode2].operand[1].mode;
+  rtx op, addr, pat;
+
+  gcc_assert (TARGET_ALTIVEC);
+
+  gcc_assert (POINTER_TYPE_P (TREE_TYPE (arg0)));
+  op = expand_expr (arg0, NULL_RTX, Pmode, EXPAND_NORMAL);
+  addr = memory_address (mode, op);
+  /* We need to negate the address.  */
+  op = gen_reg_rtx (GET_MODE (addr));
+  emit_insn (gen_rtx_SET (op, gen_rtx_NEG (GET_MODE (addr), addr)));
+  op = gen_rtx_MEM (mode, op);
+
+  if (target == 0
+      || GET_MODE (target) != tmode
+      || ! (*insn_data[icode2].operand[0].predicate) (target, tmode))
+    target = gen_reg_rtx (tmode);
+
+  pat = GEN_FCN (icode2) (target, op);
+  if (!pat)
+    return 0;
+  emit_insn (pat);
+
+  return target;
 }
 
 /* Expand the CPU builtin in FCODE and store the result in TARGET.  */
@@ -15463,6 +15558,10 @@ rs6000_init_builtins (void)
   /* Execute the autogenerated initialization code for builtins.  */
   rs6000_autoinit_builtins ();
 
+  if (new_builtins_are_live)
+    altivec_builtin_mask_for_load
+      = rs6000_builtin_decls_x[RS6000_BIF_MASK_FOR_LOAD];
+
   if (new_builtins_are_live)
     {
 #ifdef SUBTARGET_INIT_BUILTINS
-- 
2.27.0



* [PATCH 25/34] rs6000: Builtin expansion, part 3
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (23 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 24/34] rs6000: Builtin expansion, part 2 Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 26/34] rs6000: Builtin expansion, part 4 Bill Schmidt
                   ` (8 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-03-05  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (new_cpu_expand_builtin):
	Implement.
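
For illustration (not part of the patch): new_cpu_expand_builtin turns
__builtin_cpu_is and __builtin_cpu_supports into a load from the thread
control block plus a compare, with no runtime library call.  A
hypothetical example, assuming glibc 2.23 or newer:

  /* Each test becomes a TCB load and a compare (cpu_is), or a TCB
     load, an AND with the hwcap mask, and a compare (cpu_supports).  */
  int
  use_p9_paths (void)
  {
    return __builtin_cpu_is ("power9")
           || __builtin_cpu_supports ("arch_3_00");
  }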
---
 gcc/config/rs6000/rs6000-call.c | 100 ++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 13a24dd9713..4f5aed137fb 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -14666,6 +14666,106 @@ static rtx
 new_cpu_expand_builtin (enum rs6000_gen_builtins fcode,
 			tree exp ATTRIBUTE_UNUSED, rtx target)
 {
+  /* __builtin_cpu_init () is a nop, so expand to nothing.  */
+  if (fcode == RS6000_BIF_CPU_INIT)
+    return const0_rtx;
+
+  if (target == 0 || GET_MODE (target) != SImode)
+    target = gen_reg_rtx (SImode);
+
+#ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
+  tree arg = TREE_OPERAND (CALL_EXPR_ARG (exp, 0), 0);
+  /* Target clones creates an ARRAY_REF instead of a STRING_CST; convert
+     it back to a STRING_CST.  */
+  if (TREE_CODE (arg) == ARRAY_REF
+      && TREE_CODE (TREE_OPERAND (arg, 0)) == STRING_CST
+      && TREE_CODE (TREE_OPERAND (arg, 1)) == INTEGER_CST
+      && compare_tree_int (TREE_OPERAND (arg, 1), 0) == 0)
+    arg = TREE_OPERAND (arg, 0);
+
+  if (TREE_CODE (arg) != STRING_CST)
+    {
+      error ("builtin %qs only accepts a string argument",
+	     rs6000_builtin_info_x[(size_t) fcode].bifname);
+      return const0_rtx;
+    }
+
+  if (fcode == RS6000_BIF_CPU_IS)
+    {
+      const char *cpu = TREE_STRING_POINTER (arg);
+      rtx cpuid = NULL_RTX;
+      for (size_t i = 0; i < ARRAY_SIZE (cpu_is_info); i++)
+	if (strcmp (cpu, cpu_is_info[i].cpu) == 0)
+	  {
+	    /* The CPUID value in the TCB is offset by _DL_FIRST_PLATFORM.  */
+	    cpuid = GEN_INT (cpu_is_info[i].cpuid + _DL_FIRST_PLATFORM);
+	    break;
+	  }
+      if (cpuid == NULL_RTX)
+	{
+	  /* Invalid CPU argument.  */
+	  error ("cpu %qs is an invalid argument to builtin %qs",
+		 cpu, rs6000_builtin_info_x[(size_t) fcode].bifname);
+	  return const0_rtx;
+	}
+
+      rtx platform = gen_reg_rtx (SImode);
+      rtx tcbmem = gen_const_mem (SImode,
+				  gen_rtx_PLUS (Pmode,
+						gen_rtx_REG (Pmode, TLS_REGNUM),
+						GEN_INT (TCB_PLATFORM_OFFSET)));
+      emit_move_insn (platform, tcbmem);
+      emit_insn (gen_eqsi3 (target, platform, cpuid));
+    }
+  else if (fcode == RS6000_BIF_CPU_SUPPORTS)
+    {
+      const char *hwcap = TREE_STRING_POINTER (arg);
+      rtx mask = NULL_RTX;
+      int hwcap_offset;
+      for (size_t i = 0; i < ARRAY_SIZE (cpu_supports_info); i++)
+	if (strcmp (hwcap, cpu_supports_info[i].hwcap) == 0)
+	  {
+	    mask = GEN_INT (cpu_supports_info[i].mask);
+	    hwcap_offset = TCB_HWCAP_OFFSET (cpu_supports_info[i].id);
+	    break;
+	  }
+      if (mask == NULL_RTX)
+	{
+	  /* Invalid HWCAP argument.  */
+	  error ("%s %qs is an invalid argument to builtin %qs",
+		 "hwcap", hwcap,
+		 rs6000_builtin_info_x[(size_t) fcode].bifname);
+	  return const0_rtx;
+	}
+
+      rtx tcb_hwcap = gen_reg_rtx (SImode);
+      rtx tcbmem = gen_const_mem (SImode,
+				  gen_rtx_PLUS (Pmode,
+						gen_rtx_REG (Pmode, TLS_REGNUM),
+						GEN_INT (hwcap_offset)));
+      emit_move_insn (tcb_hwcap, tcbmem);
+      rtx scratch1 = gen_reg_rtx (SImode);
+      emit_insn (gen_rtx_SET (scratch1, gen_rtx_AND (SImode, tcb_hwcap, mask)));
+      rtx scratch2 = gen_reg_rtx (SImode);
+      emit_insn (gen_eqsi3 (scratch2, scratch1, const0_rtx));
+      emit_insn (gen_rtx_SET (target, gen_rtx_XOR (SImode, scratch2, const1_rtx)));
+    }
+  else
+    gcc_unreachable ();
+
+  /* Record that we have expanded a CPU builtin, so that we can later
+     emit a reference to the special symbol exported by LIBC to ensure we
+     do not link against an old LIBC that doesn't support this feature.  */
+  cpu_builtin_p = true;
+
+#else
+  warning (0, "builtin %qs needs GLIBC (2.23 and newer) that exports hardware "
+	   "capability bits", rs6000_builtin_info_x[(size_t) fcode].bifname);
+
+  /* For old LIBCs, always return FALSE.  */
+  emit_move_insn (target, GEN_INT (0));
+#endif /* TARGET_LIBC_PROVIDES_HWCAP_IN_TCB */
+
   return target;
 }
 
-- 
2.27.0



* [PATCH 26/34] rs6000: Builtin expansion, part 4
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (24 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 25/34] rs6000: Builtin expansion, part 3 Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 27/34] rs6000: Builtin expansion, part 5 Bill Schmidt
                   ` (7 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-07-28  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (elemrev_icode): Implement.
	(ldv_expand_builtin): Likewise.
	(lxvrse_expand_builtin): Likewise.
	(lxvrze_expand_builtin): Likewise.
	(stv_expand_builtin): Likewise.
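
For illustration (not part of the patch): the LD_ELEMREV and ST_ELEMREV
cases handled by elemrev_icode sit behind the vec_xl and vec_xst
intrinsics.  A hypothetical example, assuming -mvsx:

  #include <altivec.h>

  /* On little endian these map to the *_elemrev patterns chosen by
     elemrev_icode; on big endian, to plain vsx_load/vsx_store.  */
  void
  copy_v4si (int *dst, const int *src)
  {
    vector int v = vec_xl (0, src);
    vec_xst (v, 0, dst);
  }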
---
 gcc/config/rs6000/rs6000-call.c | 217 ++++++++++++++++++++++++++++++++
 1 file changed, 217 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 4f5aed137fb..89984d65a46 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -14772,12 +14772,114 @@ new_cpu_expand_builtin (enum rs6000_gen_builtins fcode,
 static insn_code
 elemrev_icode (rs6000_gen_builtins fcode)
 {
+  switch (fcode)
+    {
+    default:
+      gcc_unreachable ();
+    case RS6000_BIF_ST_ELEMREV_V1TI:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v1ti
+	      : CODE_FOR_vsx_st_elemrev_v1ti);
+    case RS6000_BIF_ST_ELEMREV_V2DF:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v2df
+	      : CODE_FOR_vsx_st_elemrev_v2df);
+    case RS6000_BIF_ST_ELEMREV_V2DI:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v2di
+	      : CODE_FOR_vsx_st_elemrev_v2di);
+    case RS6000_BIF_ST_ELEMREV_V4SF:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v4sf
+	      : CODE_FOR_vsx_st_elemrev_v4sf);
+    case RS6000_BIF_ST_ELEMREV_V4SI:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v4si
+	      : CODE_FOR_vsx_st_elemrev_v4si);
+    case RS6000_BIF_ST_ELEMREV_V8HI:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v8hi
+	      : CODE_FOR_vsx_st_elemrev_v8hi);
+    case RS6000_BIF_ST_ELEMREV_V16QI:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_store_v16qi
+	      : CODE_FOR_vsx_st_elemrev_v16qi);
+    case RS6000_BIF_LD_ELEMREV_V2DF:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v2df
+	      : CODE_FOR_vsx_ld_elemrev_v2df);
+    case RS6000_BIF_LD_ELEMREV_V1TI:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v1ti
+	      : CODE_FOR_vsx_ld_elemrev_v1ti);
+    case RS6000_BIF_LD_ELEMREV_V2DI:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v2di
+	      : CODE_FOR_vsx_ld_elemrev_v2di);
+    case RS6000_BIF_LD_ELEMREV_V4SF:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v4sf
+	      : CODE_FOR_vsx_ld_elemrev_v4sf);
+    case RS6000_BIF_LD_ELEMREV_V4SI:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v4si
+	      : CODE_FOR_vsx_ld_elemrev_v4si);
+    case RS6000_BIF_LD_ELEMREV_V8HI:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v8hi
+	      : CODE_FOR_vsx_ld_elemrev_v8hi);
+    case RS6000_BIF_LD_ELEMREV_V16QI:
+      return (BYTES_BIG_ENDIAN ? CODE_FOR_vsx_load_v16qi
+	      : CODE_FOR_vsx_ld_elemrev_v16qi);
+    }
+  gcc_unreachable ();
   return (insn_code) 0;
 }
 
 static rtx
 ldv_expand_builtin (rtx target, insn_code icode, rtx *op, machine_mode tmode)
 {
+  rtx pat, addr;
+  bool blk = (icode == CODE_FOR_altivec_lvlx
+	      || icode == CODE_FOR_altivec_lvlxl
+	      || icode == CODE_FOR_altivec_lvrx
+	      || icode == CODE_FOR_altivec_lvrxl);
+
+  if (target == 0
+      || GET_MODE (target) != tmode
+      || !(*insn_data[icode].operand[0].predicate) (target, tmode))
+    target = gen_reg_rtx (tmode);
+
+  op[1] = copy_to_mode_reg (Pmode, op[1]);
+
+  /* For LVX, express the RTL accurately by ANDing the address with -16.
+     LVXL and LVE*X expand to use UNSPECs to hide their special behavior,
+     so the raw address is fine.  */
+  if (icode == CODE_FOR_altivec_lvx_v1ti
+      || icode == CODE_FOR_altivec_lvx_v2df
+      || icode == CODE_FOR_altivec_lvx_v2di
+      || icode == CODE_FOR_altivec_lvx_v4sf
+      || icode == CODE_FOR_altivec_lvx_v4si
+      || icode == CODE_FOR_altivec_lvx_v8hi
+      || icode == CODE_FOR_altivec_lvx_v16qi)
+    {
+      rtx rawaddr;
+      if (op[0] == const0_rtx)
+	rawaddr = op[1];
+      else
+	{
+	  op[0] = copy_to_mode_reg (Pmode, op[0]);
+	  rawaddr = gen_rtx_PLUS (Pmode, op[1], op[0]);
+	}
+      addr = gen_rtx_AND (Pmode, rawaddr, gen_rtx_CONST_INT (Pmode, -16));
+      addr = gen_rtx_MEM (blk ? BLKmode : tmode, addr);
+
+      emit_insn (gen_rtx_SET (target, addr));
+    }
+  else
+    {
+      if (op[0] == const0_rtx)
+	addr = gen_rtx_MEM (blk ? BLKmode : tmode, op[1]);
+      else
+	{
+	  op[0] = copy_to_mode_reg (Pmode, op[0]);
+	  addr = gen_rtx_MEM (blk ? BLKmode : tmode,
+			      gen_rtx_PLUS (Pmode, op[1], op[0]));
+	}
+
+      pat = GEN_FCN (icode) (target, addr);
+      if (!pat)
+	return 0;
+      emit_insn (pat);
+    }
+
   return target;
 }
 
@@ -14785,6 +14887,42 @@ static rtx
 lxvrse_expand_builtin (rtx target, insn_code icode, rtx *op,
 		       machine_mode tmode, machine_mode smode)
 {
+  rtx pat, addr;
+  op[1] = copy_to_mode_reg (Pmode, op[1]);
+
+  if (op[0] == const0_rtx)
+    addr = gen_rtx_MEM (tmode, op[1]);
+  else
+    {
+      op[0] = copy_to_mode_reg (Pmode, op[0]);
+      addr = gen_rtx_MEM (smode,
+			  gen_rtx_PLUS (Pmode, op[1], op[0]));
+    }
+
+  rtx discratch = gen_reg_rtx (DImode);
+  rtx tiscratch = gen_reg_rtx (TImode);
+
+  /* Emit the lxvr*x insn.  */
+  pat = GEN_FCN (icode) (tiscratch, addr);
+  if (!pat)
+    return 0;
+  emit_insn (pat);
+
+  /* Emit a sign extension from QI, HI, or SI to double (DI).  */
+  rtx scratch = gen_lowpart (smode, tiscratch);
+  if (icode == CODE_FOR_vsx_lxvrbx)
+    emit_insn (gen_extendqidi2 (discratch, scratch));
+  else if (icode == CODE_FOR_vsx_lxvrhx)
+    emit_insn (gen_extendhidi2 (discratch, scratch));
+  else if (icode == CODE_FOR_vsx_lxvrwx)
+    emit_insn (gen_extendsidi2 (discratch, scratch));
+  /* Assign discratch directly if scratch is already DI.  */
+  if (icode == CODE_FOR_vsx_lxvrdx)
+    discratch = scratch;
+
+  /* Emit the sign extension from DI (double) to TI (quad).  */
+  emit_insn (gen_extendditi2 (target, discratch));
+
   return target;
 }
 
@@ -14792,6 +14930,22 @@ static rtx
 lxvrze_expand_builtin (rtx target, insn_code icode, rtx *op,
 		       machine_mode tmode, machine_mode smode)
 {
+  rtx pat, addr;
+  op[1] = copy_to_mode_reg (Pmode, op[1]);
+
+  if (op[0] == const0_rtx)
+    addr = gen_rtx_MEM (tmode, op[1]);
+  else
+    {
+      op[0] = copy_to_mode_reg (Pmode, op[0]);
+      addr = gen_rtx_MEM (smode,
+			  gen_rtx_PLUS (Pmode, op[1], op[0]));
+    }
+
+  pat = GEN_FCN (icode) (target, addr);
+  if (!pat)
+    return 0;
+  emit_insn (pat);
   return target;
 }
 
@@ -14799,6 +14953,69 @@ static rtx
 stv_expand_builtin (insn_code icode, rtx *op,
 		    machine_mode tmode, machine_mode smode)
 {
+  rtx pat, addr, rawaddr, truncrtx;
+  op[2] = copy_to_mode_reg (Pmode, op[2]);
+
+  /* For STVX, express the RTL accurately by ANDing the address with -16.
+     STVXL and STVE*X expand to use UNSPECs to hide their special behavior,
+     so the raw address is fine.  */
+  if (icode == CODE_FOR_altivec_stvx_v2df
+      || icode == CODE_FOR_altivec_stvx_v2di
+      || icode == CODE_FOR_altivec_stvx_v4sf
+      || icode == CODE_FOR_altivec_stvx_v4si
+      || icode == CODE_FOR_altivec_stvx_v8hi
+      || icode == CODE_FOR_altivec_stvx_v16qi)
+    {
+      if (op[1] == const0_rtx)
+	rawaddr = op[2];
+      else
+	{
+	  op[1] = copy_to_mode_reg (Pmode, op[1]);
+	  rawaddr = gen_rtx_PLUS (Pmode, op[2], op[1]);
+	}
+
+      addr = gen_rtx_AND (Pmode, rawaddr, gen_rtx_CONST_INT (Pmode, -16));
+      addr = gen_rtx_MEM (tmode, addr);
+      op[0] = copy_to_mode_reg (tmode, op[0]);
+      emit_insn (gen_rtx_SET (addr, op[0]));
+    }
+  else if (icode == CODE_FOR_vsx_stxvrbx
+	   || icode == CODE_FOR_vsx_stxvrhx
+	   || icode == CODE_FOR_vsx_stxvrwx
+	   || icode == CODE_FOR_vsx_stxvrdx)
+    {
+      truncrtx = gen_rtx_TRUNCATE (tmode, op[0]);
+      op[0] = copy_to_mode_reg (E_TImode, truncrtx);
+
+      if (op[1] == const0_rtx)
+	addr = gen_rtx_MEM (Pmode, op[2]);
+      else
+	{
+	  op[1] = copy_to_mode_reg (Pmode, op[1]);
+	  addr = gen_rtx_MEM (tmode, gen_rtx_PLUS (Pmode, op[2], op[1]));
+	}
+      pat = GEN_FCN (icode) (addr, op[0]);
+      if (pat)
+	emit_insn (pat);
+    }
+  else
+    {
+      if (! (*insn_data[icode].operand[1].predicate) (op[0], smode))
+	op[0] = copy_to_mode_reg (smode, op[0]);
+
+      if (op[1] == const0_rtx)
+	addr = gen_rtx_MEM (tmode, op[2]);
+      else
+	{
+	  op[1] = copy_to_mode_reg (Pmode, op[1]);
+	  addr = gen_rtx_MEM (tmode, gen_rtx_PLUS (Pmode, op[2], op[1]));
+	}
+
+      pat = GEN_FCN (icode) (addr, op[0]);
+      if (pat)
+	emit_insn (pat);
+    }
+
   return NULL_RTX;
 }
 
-- 
2.27.0



* [PATCH 27/34] rs6000: Builtin expansion, part 5
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (25 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 26/34] rs6000: Builtin expansion, part 4 Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 28/34] rs6000: Builtin expansion, part 6 Bill Schmidt
                   ` (6 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-06-17  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (new_mma_expand_builtin):
	Implement.
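
For illustration (not part of the patch): an MMA built-in whose
accumulator is both read and written, so its internal form carries the
"0" constraint that new_mma_expand_builtin uses to tie the target
register to the input accumulator.  A hypothetical example, assuming
-mcpu=power10 -mmma:

  #include <altivec.h>

  /* acc is an inout operand; the expander forces the output register
     to be the same as the input accumulator register.  */
  void
  rank1_update (__vector_quad *acc, vector unsigned char a,
                vector unsigned char b)
  {
    __builtin_mma_xvf32gerpp (acc, a, b);
  }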
---
 gcc/config/rs6000/rs6000-call.c | 103 ++++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 89984d65a46..f37ee9b25ab 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -15024,6 +15024,109 @@ static rtx
 new_mma_expand_builtin (tree exp, rtx target, insn_code icode,
 			rs6000_gen_builtins fcode)
 {
+  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
+  tree arg;
+  call_expr_arg_iterator iter;
+  const struct insn_operand_data *insn_op;
+  rtx op[MAX_MMA_OPERANDS];
+  unsigned nopnds = 0;
+  bool void_func = TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node;
+  machine_mode tmode = VOIDmode;
+
+  if (!void_func)
+    {
+      tmode = insn_data[icode].operand[0].mode;
+      if (!target
+	  || GET_MODE (target) != tmode
+	  || !(*insn_data[icode].operand[0].predicate) (target, tmode))
+	target = gen_reg_rtx (tmode);
+      op[nopnds++] = target;
+    }
+  else
+    target = const0_rtx;
+
+  FOR_EACH_CALL_EXPR_ARG (arg, iter, exp)
+    {
+      if (arg == error_mark_node)
+	return const0_rtx;
+
+      rtx opnd;
+      insn_op = &insn_data[icode].operand[nopnds];
+      if (TREE_CODE (arg) == ADDR_EXPR
+	  && MEM_P (DECL_RTL (TREE_OPERAND (arg, 0))))
+	opnd = DECL_RTL (TREE_OPERAND (arg, 0));
+      else
+	opnd = expand_normal (arg);
+
+      if (!(*insn_op->predicate) (opnd, insn_op->mode))
+	{
+	  if (!strcmp (insn_op->constraint, "n"))
+	    {
+	      if (!CONST_INT_P (opnd))
+		error ("argument %d must be an unsigned literal", nopnds);
+	      else
+		error ("argument %d is an unsigned literal that is "
+		       "out of range", nopnds);
+	      return const0_rtx;
+	    }
+	  opnd = copy_to_mode_reg (insn_op->mode, opnd);
+	}
+
+      /* Some MMA instructions have INOUT accumulator operands, so force
+	 their target register to be the same as their input register.  */
+      if (!void_func
+	  && nopnds == 1
+	  && !strcmp (insn_op->constraint, "0")
+	  && insn_op->mode == tmode
+	  && REG_P (opnd)
+	  && (*insn_data[icode].operand[0].predicate) (opnd, tmode))
+	target = op[0] = opnd;
+
+      op[nopnds++] = opnd;
+    }
+
+  rtx pat;
+  switch (nopnds)
+    {
+    case 1:
+      pat = GEN_FCN (icode) (op[0]);
+      break;
+    case 2:
+      pat = GEN_FCN (icode) (op[0], op[1]);
+      break;
+    case 3:
+      /* The ASSEMBLE builtin source operands are reversed in little-endian
+	 mode, so reorder them.  */
+      if (fcode == RS6000_BIF_ASSEMBLE_PAIR_V_INTERNAL && !WORDS_BIG_ENDIAN)
+	std::swap (op[1], op[2]);
+      pat = GEN_FCN (icode) (op[0], op[1], op[2]);
+      break;
+    case 4:
+      pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3]);
+      break;
+    case 5:
+      /* The ASSEMBLE builtin source operands are reversed in little-endian
+	 mode, so reorder them.  */
+      if (fcode == RS6000_BIF_ASSEMBLE_ACC_INTERNAL && !WORDS_BIG_ENDIAN)
+	{
+	  std::swap (op[1], op[4]);
+	  std::swap (op[2], op[3]);
+	}
+      pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4]);
+      break;
+    case 6:
+      pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4], op[5]);
+      break;
+    case 7:
+      pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4], op[5], op[6]);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  if (!pat)
+    return NULL_RTX;
+  emit_insn (pat);
+
   return target;
 }
 
-- 
2.27.0



* [PATCH 28/34] rs6000: Builtin expansion, part 6
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (26 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 27/34] rs6000: Builtin expansion, part 5 Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 29/34] rs6000: Update rs6000_builtin_decl Bill Schmidt
                   ` (5 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-07-28  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (new_htm_spr_num): New function.
	(new_htm_expand_builtin): Implement.
	(rs6000_expand_new_builtin): Handle 32-bit and endian cases.
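
For illustration (not part of the patch): __builtin_tbegin is an htmcr
built-in, so new_htm_expand_builtin emits the CR0.EQ extraction added
here to turn the tbegin. condition into a boolean result.  A
hypothetical example, assuming -mhtm:

  /* tbegin. sets CR0; the expander places the complement of the EQ
     bit in the result, so a nonzero value means the transaction
     started successfully.  */
  int
  try_transaction (void)
  {
    if (__builtin_tbegin (0))
      {
        __builtin_tend (0);
        return 1;
      }
    return 0;
  }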
---
 gcc/config/rs6000/rs6000-call.c | 202 ++++++++++++++++++++++++++++++++
 1 file changed, 202 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index f37ee9b25ab..eaf62d734f1 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -15130,11 +15130,171 @@ new_mma_expand_builtin (tree exp, rtx target, insn_code icode,
   return target;
 }
 
+/* Return the appropriate SPR number associated with the given builtin.  */
+static inline HOST_WIDE_INT
+new_htm_spr_num (enum rs6000_gen_builtins code)
+{
+  if (code == RS6000_BIF_GET_TFHAR
+      || code == RS6000_BIF_SET_TFHAR)
+    return TFHAR_SPR;
+  else if (code == RS6000_BIF_GET_TFIAR
+	   || code == RS6000_BIF_SET_TFIAR)
+    return TFIAR_SPR;
+  else if (code == RS6000_BIF_GET_TEXASR
+	   || code == RS6000_BIF_SET_TEXASR)
+    return TEXASR_SPR;
+  gcc_assert (code == RS6000_BIF_GET_TEXASRU
+	      || code == RS6000_BIF_SET_TEXASRU);
+  return TEXASRU_SPR;
+}
+
 /* Expand the HTM builtin in EXP and store the result in TARGET.  */
 static rtx
 new_htm_expand_builtin (bifdata *bifaddr, rs6000_gen_builtins fcode,
 			tree exp, rtx target)
 {
+  tree fndecl = TREE_OPERAND (CALL_EXPR_FN (exp), 0);
+  bool nonvoid = TREE_TYPE (TREE_TYPE (fndecl)) != void_type_node;
+
+  if (!TARGET_POWERPC64
+      && (fcode == RS6000_BIF_TABORTDC
+	  || fcode == RS6000_BIF_TABORTDCI))
+    {
+      error ("builtin %qs is only valid in 64-bit mode", bifaddr->bifname);
+      return const0_rtx;
+    }
+
+  rtx op[MAX_HTM_OPERANDS], pat;
+  int nopnds = 0;
+  tree arg;
+  call_expr_arg_iterator iter;
+  insn_code icode = bifaddr->icode;
+  bool uses_spr = bif_is_htmspr (*bifaddr);
+  rtx cr = NULL_RTX;
+
+  if (uses_spr)
+    icode = rs6000_htm_spr_icode (nonvoid);
+  const insn_operand_data *insn_op = &insn_data[icode].operand[0];
+
+  if (nonvoid)
+    {
+      machine_mode tmode = (uses_spr) ? insn_op->mode : E_SImode;
+      if (!target
+	  || GET_MODE (target) != tmode
+	  || (uses_spr && !(*insn_op->predicate) (target, tmode)))
+	target = gen_reg_rtx (tmode);
+      if (uses_spr)
+	op[nopnds++] = target;
+    }
+
+  FOR_EACH_CALL_EXPR_ARG (arg, iter, exp)
+    {
+      if (arg == error_mark_node || nopnds >= MAX_HTM_OPERANDS)
+	return const0_rtx;
+
+      insn_op = &insn_data[icode].operand[nopnds];
+      op[nopnds] = expand_normal (arg);
+
+      if (!(*insn_op->predicate) (op[nopnds], insn_op->mode))
+	{
+	  if (!strcmp (insn_op->constraint, "n"))
+	    {
+	      int arg_num = (nonvoid) ? nopnds : nopnds + 1;
+	      if (!CONST_INT_P (op[nopnds]))
+		error ("argument %d must be an unsigned literal", arg_num);
+	      else
+		error ("argument %d is an unsigned literal that is "
+		       "out of range", arg_num);
+	      return const0_rtx;
+	    }
+	  op[nopnds] = copy_to_mode_reg (insn_op->mode, op[nopnds]);
+	}
+
+      nopnds++;
+    }
+
+  /* Handle the builtins for extended mnemonics.  These accept
+     no arguments, but map to builtins that take arguments.  */
+  switch (fcode)
+    {
+    case RS6000_BIF_TENDALL:  /* Alias for: tend. 1  */
+    case RS6000_BIF_TRESUME:  /* Alias for: tsr. 1  */
+      op[nopnds++] = GEN_INT (1);
+      break;
+    case RS6000_BIF_TSUSPEND: /* Alias for: tsr. 0  */
+      op[nopnds++] = GEN_INT (0);
+      break;
+    default:
+      break;
+    }
+
+  /* If this builtin accesses SPRs, then pass in the appropriate
+     SPR number and SPR regno as the last two operands.  */
+  if (uses_spr)
+    {
+      machine_mode mode = (TARGET_POWERPC64) ? DImode : SImode;
+      op[nopnds++] = gen_rtx_CONST_INT (mode, new_htm_spr_num (fcode));
+    }
+  /* If this builtin accesses a CR, then pass in a scratch
+     CR as the last operand.  */
+  else if (bif_is_htmcr (*bifaddr))
+    {
+      cr = gen_reg_rtx (CCmode);
+      op[nopnds++] = cr;
+    }
+
+  switch (nopnds)
+    {
+    case 1:
+      pat = GEN_FCN (icode) (op[0]);
+      break;
+    case 2:
+      pat = GEN_FCN (icode) (op[0], op[1]);
+      break;
+    case 3:
+      pat = GEN_FCN (icode) (op[0], op[1], op[2]);
+      break;
+    case 4:
+      pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3]);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  if (!pat)
+    return NULL_RTX;
+  emit_insn (pat);
+
+  if (bif_is_htmcr (*bifaddr))
+    {
+      if (fcode == RS6000_BIF_TBEGIN)
+	{
+	  /* Emit code to set TARGET to true or false depending on
+	     whether the tbegin. instruction succeeded or failed
+	     to start a transaction.  We do this by placing the 1's
+	     complement of CR's EQ bit into TARGET.  */
+	  rtx scratch = gen_reg_rtx (SImode);
+	  emit_insn (gen_rtx_SET (scratch,
+				  gen_rtx_EQ (SImode, cr,
+					      const0_rtx)));
+	  emit_insn (gen_rtx_SET (target,
+				  gen_rtx_XOR (SImode, scratch,
+					       GEN_INT (1))));
+	}
+      else
+	{
+	  /* Emit code to copy the 4-bit condition register field
+	     CR into the least significant end of register TARGET.  */
+	  rtx scratch1 = gen_reg_rtx (SImode);
+	  rtx scratch2 = gen_reg_rtx (SImode);
+	  rtx subreg = simplify_gen_subreg (CCmode, scratch1, SImode, 0);
+	  emit_insn (gen_movcc (subreg, cr));
+	  emit_insn (gen_lshrsi3 (scratch2, scratch1, GEN_INT (28)));
+	  emit_insn (gen_andsi3 (target, scratch2, GEN_INT (0xf)));
+	}
+    }
+
+  if (nonvoid)
+    return target;
   return const0_rtx;
 }
 
@@ -15378,6 +15538,48 @@ rs6000_expand_new_builtin (tree exp, rtx target,
   if (bif_is_htm (*bifaddr))
     return new_htm_expand_builtin (bifaddr, fcode, exp, target);
 
+  if (bif_is_32bit (*bifaddr) && TARGET_32BIT)
+    {
+      if (fcode == RS6000_BIF_MFTB)
+	icode = CODE_FOR_rs6000_mftb_si;
+      else
+	gcc_unreachable ();
+    }
+
+  if (bif_is_endian (*bifaddr) && BYTES_BIG_ENDIAN)
+    {
+      if (fcode == RS6000_BIF_LD_ELEMREV_V1TI)
+	icode = CODE_FOR_vsx_load_v1ti;
+      else if (fcode == RS6000_BIF_LD_ELEMREV_V2DF)
+	icode = CODE_FOR_vsx_load_v2df;
+      else if (fcode == RS6000_BIF_LD_ELEMREV_V2DI)
+	icode = CODE_FOR_vsx_load_v2di;
+      else if (fcode == RS6000_BIF_LD_ELEMREV_V4SF)
+	icode = CODE_FOR_vsx_load_v4sf;
+      else if (fcode == RS6000_BIF_LD_ELEMREV_V4SI)
+	icode = CODE_FOR_vsx_load_v4si;
+      else if (fcode == RS6000_BIF_LD_ELEMREV_V8HI)
+	icode = CODE_FOR_vsx_load_v8hi;
+      else if (fcode == RS6000_BIF_LD_ELEMREV_V16QI)
+	icode = CODE_FOR_vsx_load_v16qi;
+      else if (fcode == RS6000_BIF_ST_ELEMREV_V1TI)
+	icode = CODE_FOR_vsx_store_v1ti;
+      else if (fcode == RS6000_BIF_ST_ELEMREV_V2DF)
+	icode = CODE_FOR_vsx_store_v2df;
+      else if (fcode == RS6000_BIF_ST_ELEMREV_V2DI)
+	icode = CODE_FOR_vsx_store_v2di;
+      else if (fcode == RS6000_BIF_ST_ELEMREV_V4SF)
+	icode = CODE_FOR_vsx_store_v4sf;
+      else if (fcode == RS6000_BIF_ST_ELEMREV_V4SI)
+	icode = CODE_FOR_vsx_store_v4si;
+      else if (fcode == RS6000_BIF_ST_ELEMREV_V8HI)
+	icode = CODE_FOR_vsx_store_v8hi;
+      else if (fcode == RS6000_BIF_ST_ELEMREV_V16QI)
+	icode = CODE_FOR_vsx_store_v16qi;
+      else
+	gcc_unreachable ();
+    }
+
   rtx pat;
   const int MAX_BUILTIN_ARGS = 6;
   tree arg[MAX_BUILTIN_ARGS];
-- 
2.27.0



* [PATCH 29/34] rs6000: Update rs6000_builtin_decl
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (27 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 28/34] rs6000: Builtin expansion, part 6 Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 30/34] rs6000: Miscellaneous uses of rs6000_builtin_decls_x Bill Schmidt
                   ` (4 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

Create a new version of this function that uses the new infrastructure.
In particular, it now checks whether a built-in is supported via the new
rs6000_new_builtin_is_supported_p test.
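
As a rough sketch, the new support check boils down to mapping each
built-in's enablement class to the corresponding target flag, along
these lines (illustrative only -- the real logic lives in
rs6000_new_builtin_is_supported_p elsewhere in the series, and the
ENB_ALWAYS name for the ungated class is assumed here):

  /* Sketch: map a built-in's enablement class to a target test.  */
  static bool
  new_builtin_supported_sketch (rs6000_gen_builtins fcode)
  {
    switch (rs6000_builtin_info_x[fcode].enable)
      {
      case ENB_ALWAYS:   /* Assumed name for "always enabled".  */
        return true;
      case ENB_ALTIVEC:
        return TARGET_ALTIVEC;
      case ENB_VSX:
        return TARGET_VSX;
      case ENB_HTM:
        return TARGET_HTM;
      /* ... one case per remaining ENB_* class ... */
      default:
        return false;
      }
  }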

2021-07-28  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (rs6000_new_builtin_decl): New
	function.
	(rs6000_builtin_decl): Call it.
---
 gcc/config/rs6000/rs6000-call.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index eaf62d734f1..d14d58be7d7 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -16320,11 +16320,31 @@ rs6000_init_builtins (void)
     }
 }
 
+static tree
+rs6000_new_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
+{
+  rs6000_gen_builtins fcode = (rs6000_gen_builtins) code;
+
+  if (fcode >= RS6000_OVLD_MAX)
+    return error_mark_node;
+
+  if (!rs6000_new_builtin_is_supported_p (fcode))
+    {
+      rs6000_invalid_new_builtin (fcode);
+      return error_mark_node;
+    }
+
+  return rs6000_builtin_decls_x[code];
+}
+
 /* Returns the rs6000 builtin decl for CODE.  */
 
 tree
 rs6000_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
 {
+  if (new_builtins_are_live)
+    return rs6000_new_builtin_decl (code, initialize_p);
+
   HOST_WIDE_INT fnmask;
 
   if (code >= RS6000_BUILTIN_COUNT)
-- 
2.27.0



* [PATCH 30/34] rs6000: Miscellaneous uses of rs6000_builtin_decls_x
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (28 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 29/34] rs6000: Update rs6000_builtin_decl Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 31/34] rs6000: Debug support Bill Schmidt
                   ` (3 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

There are a few leftover places where we use the old rs6000_builtin_decls
array, but we need to use rs6000_builtin_decls_x instead when the new
builtins infrastructure is in play.
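
Each such site now selects between the two decl tables with the same
guarded idiom, as in the hunks below:

  tree mffs = (new_builtins_are_live
               ? rs6000_builtin_decls_x[RS6000_BIF_MFFS]
               : rs6000_builtin_decls[RS6000_BUILTIN_MFFS]);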

2021-07-28  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000.c (rs6000_builtin_reciprocal): Use
	rs6000_builtin_decls_x when appropriate.
	(add_condition_to_bb): Likewise.
	(rs6000_atomic_assign_expand_fenv): Likewise.
---
 gcc/config/rs6000/rs6000.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ceba25d028c..112453b2908 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -22453,12 +22453,16 @@ rs6000_builtin_reciprocal (tree fndecl)
       if (!RS6000_RECIP_AUTO_RSQRTE_P (V2DFmode))
 	return NULL_TREE;
 
+      if (new_builtins_are_live)
+	return rs6000_builtin_decls_x[RS6000_BIF_RSQRT_2DF];
       return rs6000_builtin_decls[VSX_BUILTIN_RSQRT_2DF];
 
     case VSX_BUILTIN_XVSQRTSP:
       if (!RS6000_RECIP_AUTO_RSQRTE_P (V4SFmode))
 	return NULL_TREE;
 
+      if (new_builtins_are_live)
+	return rs6000_builtin_decls_x[RS6000_BIF_RSQRT_4SF];
       return rs6000_builtin_decls[VSX_BUILTIN_RSQRT_4SF];
 
     default:
@@ -25047,7 +25051,10 @@ add_condition_to_bb (tree function_decl, tree version_decl,
 
   tree bool_zero = build_int_cst (bool_int_type_node, 0);
   tree cond_var = create_tmp_var (bool_int_type_node);
-  tree predicate_decl = rs6000_builtin_decls [(int) RS6000_BUILTIN_CPU_SUPPORTS];
+  tree predicate_decl
+    = (new_builtins_are_live
+       ? rs6000_builtin_decls_x[(int) RS6000_BIF_CPU_SUPPORTS]
+       : rs6000_builtin_decls [(int) RS6000_BUILTIN_CPU_SUPPORTS]);
   const char *arg_str = rs6000_clone_map[clone_isa].name;
   tree predicate_arg = build_string_literal (strlen (arg_str) + 1, arg_str);
   gimple *call_cond_stmt = gimple_build_call (predicate_decl, 1, predicate_arg);
@@ -27687,8 +27694,14 @@ rs6000_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
       return;
     }
 
-  tree mffs = rs6000_builtin_decls[RS6000_BUILTIN_MFFS];
-  tree mtfsf = rs6000_builtin_decls[RS6000_BUILTIN_MTFSF];
+  tree mffs
+    = (new_builtins_are_live
+       ? rs6000_builtin_decls_x[RS6000_BIF_MFFS]
+       : rs6000_builtin_decls[RS6000_BUILTIN_MFFS]);
+  tree mtfsf
+    = (new_builtins_are_live
+       ? rs6000_builtin_decls_x[RS6000_BIF_MTFSF]
+       : rs6000_builtin_decls[RS6000_BUILTIN_MTFSF]);
   tree call_mffs = build_call_expr (mffs, 0);
 
   /* Generates the equivalent of feholdexcept (&fenv_var)
-- 
2.27.0



* [PATCH 31/34] rs6000: Debug support
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (29 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 30/34] rs6000: Miscellaneous uses of rs6000_builtin_decls_x Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 32/34] rs6000: Update altivec.h for automated interfaces Bill Schmidt
                   ` (2 subsequent siblings)
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm
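
With -mdebug=builtin (TARGET_DEBUG_BUILTIN), each built-in is now
dumped as a single parseable line of the form

  <return-type> <name> (<arg-types>); <attr-string> [<index>]

for example (a hypothetical entry; the index and attribute string
depend on the build):

  vsc __builtin_altivec_vavgsb (vsc, vsc); = const [  42]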

2021-07-28  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-call.c (rs6000_debug_type): New function.
	(def_builtin): Change debug formatting for easier parsing and
	include more information.
	(rs6000_init_builtins): Add dump of autogenerated builtins.
	(altivec_init_builtins): Dump __builtin_altivec_mask_for_load for
	completeness.
---
 gcc/config/rs6000/rs6000-call.c | 191 +++++++++++++++++++++++++++++++-
 1 file changed, 185 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index d14d58be7d7..8e2f76f1b5c 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -8880,6 +8880,106 @@ rs6000_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
 
 /* Builtins.  */
 
+/* Debug utility to translate a type node to a single token.  */
+static const char *
+rs6000_debug_type (tree type)
+{
+  if (type == void_type_node)
+    return "void";
+  else if (type == long_integer_type_node)
+    return "long";
+  else if (type == long_unsigned_type_node)
+    return "ulong";
+  else if (type == long_long_integer_type_node)
+    return "longlong";
+  else if (type == long_long_unsigned_type_node)
+    return "ulonglong";
+  else if (type == bool_V2DI_type_node)
+    return "vbll";
+  else if (type == bool_V4SI_type_node)
+    return "vbi";
+  else if (type == bool_V8HI_type_node)
+    return "vbs";
+  else if (type == bool_V16QI_type_node)
+    return "vbc";
+  else if (type == bool_int_type_node)
+    return "bool";
+  else if (type == dfloat64_type_node)
+    return "_Decimal64";
+  else if (type == double_type_node)
+    return "double";
+  else if (type == intDI_type_node)
+    return "sll";
+  else if (type == intHI_type_node)
+    return "ss";
+  else if (type == ibm128_float_type_node)
+    return "__ibm128";
+  else if (type == opaque_V4SI_type_node)
+    return "opaque";
+  else if (POINTER_TYPE_P (type))
+    return "void*";
+  else if (type == intQI_type_node || type == char_type_node)
+    return "sc";
+  else if (type == dfloat32_type_node)
+    return "_Decimal32";
+  else if (type == float_type_node)
+    return "float";
+  else if (type == intSI_type_node || type == integer_type_node)
+    return "si";
+  else if (type == dfloat128_type_node)
+    return "_Decimal128";
+  else if (type == long_double_type_node)
+    return "longdouble";
+  else if (type == intTI_type_node)
+    return "sq";
+  else if (type == unsigned_intDI_type_node)
+    return "ull";
+  else if (type == unsigned_intHI_type_node)
+    return "us";
+  else if (type == unsigned_intQI_type_node)
+    return "uc";
+  else if (type == unsigned_intSI_type_node)
+    return "ui";
+  else if (type == unsigned_intTI_type_node)
+    return "uq";
+  else if (type == unsigned_V1TI_type_node)
+    return "vuq";
+  else if (type == unsigned_V2DI_type_node)
+    return "vull";
+  else if (type == unsigned_V4SI_type_node)
+    return "vui";
+  else if (type == unsigned_V8HI_type_node)
+    return "vus";
+  else if (type == unsigned_V16QI_type_node)
+    return "vuc";
+  else if (type == V16QI_type_node)
+    return "vsc";
+  else if (type == V1TI_type_node)
+    return "vsq";
+  else if (type == V2DF_type_node)
+    return "vd";
+  else if (type == V2DI_type_node)
+    return "vsll";
+  else if (type == V4SF_type_node)
+    return "vf";
+  else if (type == V4SI_type_node)
+    return "vsi";
+  else if (type == V8HI_type_node)
+    return "vss";
+  else if (type == pixel_V8HI_type_node)
+    return "vp";
+  else if (type == pcvoid_type_node)
+    return "voidc*";
+  else if (type == float128_type_node)
+    return "_Float128";
+  else if (type == vector_pair_type_node)
+    return "__vector_pair";
+  else if (type == vector_quad_type_node)
+    return "__vector_quad";
+  else
+    return "unknown";
+}
+
 static void
 def_builtin (const char *name, tree type, enum rs6000_builtins code)
 {
@@ -8908,7 +9008,7 @@ def_builtin (const char *name, tree type, enum rs6000_builtins code)
       /* const function, function only depends on the inputs.  */
       TREE_READONLY (t) = 1;
       TREE_NOTHROW (t) = 1;
-      attr_string = ", const";
+      attr_string = "= const";
     }
   else if ((classify & RS6000_BTC_PURE) != 0)
     {
@@ -8916,7 +9016,7 @@ def_builtin (const char *name, tree type, enum rs6000_builtins code)
 	 external state.  */
       DECL_PURE_P (t) = 1;
       TREE_NOTHROW (t) = 1;
-      attr_string = ", pure";
+      attr_string = "= pure";
     }
   else if ((classify & RS6000_BTC_FP) != 0)
     {
@@ -8930,12 +9030,12 @@ def_builtin (const char *name, tree type, enum rs6000_builtins code)
 	{
 	  DECL_PURE_P (t) = 1;
 	  DECL_IS_NOVOPS (t) = 1;
-	  attr_string = ", fp, pure";
+	  attr_string = "= fp, pure";
 	}
       else
 	{
 	  TREE_READONLY (t) = 1;
-	  attr_string = ", fp, const";
+	  attr_string = "= fp, const";
 	}
     }
   else if ((classify & (RS6000_BTC_QUAD | RS6000_BTC_PAIR)) != 0)
@@ -8945,8 +9045,20 @@ def_builtin (const char *name, tree type, enum rs6000_builtins code)
     gcc_unreachable ();
 
   if (TARGET_DEBUG_BUILTIN)
-    fprintf (stderr, "rs6000_builtin, code = %4d, %s%s\n",
-	     (int)code, name, attr_string);
+    {
+      tree t = TREE_TYPE (type);
+      fprintf (stderr, "%s %s (", rs6000_debug_type (t), name);
+      t = TYPE_ARG_TYPES (type);
+      while (t && TREE_VALUE (t) != void_type_node)
+	{
+	  fprintf (stderr, "%s",
+		   rs6000_debug_type (TREE_VALUE (t)));
+	  t = TREE_CHAIN (t);
+	  if (t && TREE_VALUE (t) != void_type_node)
+	    fprintf (stderr, ", ");
+	}
+      fprintf (stderr, "); %s [%4d]\n", attr_string, (int)code);
+    }
 }
 
 static const struct builtin_compatibility bdesc_compat[] =
@@ -16184,6 +16296,67 @@ rs6000_init_builtins (void)
     altivec_builtin_mask_for_load
       = rs6000_builtin_decls_x[RS6000_BIF_MASK_FOR_LOAD];
 
+  if (TARGET_DEBUG_BUILTIN)
+    {
+      fprintf (stderr, "\nAutogenerated built-in functions:\n\n");
+      for (int i = 1; i < (int) RS6000_BIF_MAX; i++)
+	{
+	  bif_enable e = rs6000_builtin_info_x[i].enable;
+	  if (e == ENB_P5 && !TARGET_POPCNTB)
+	    continue;
+	  if (e == ENB_P6 && !TARGET_CMPB)
+	    continue;
+	  if (e == ENB_ALTIVEC && !TARGET_ALTIVEC)
+	    continue;
+	  if (e == ENB_VSX && !TARGET_VSX)
+	    continue;
+	  if (e == ENB_P7 && !TARGET_POPCNTD)
+	    continue;
+	  if (e == ENB_P7_64 && (!TARGET_POPCNTD || !TARGET_POWERPC64))
+	    continue;
+	  if (e == ENB_P8 && !TARGET_DIRECT_MOVE)
+	    continue;
+	  if (e == ENB_P8V && !TARGET_P8_VECTOR)
+	    continue;
+	  if (e == ENB_P9 && !TARGET_MODULO)
+	    continue;
+	  if (e == ENB_P9_64 && (!TARGET_MODULO || !TARGET_POWERPC64))
+	    continue;
+	  if (e == ENB_P9V && !TARGET_P9_VECTOR)
+	    continue;
+	  if (e == ENB_IEEE128_HW && !TARGET_FLOAT128_HW)
+	    continue;
+	  if (e == ENB_DFP && !TARGET_DFP)
+	    continue;
+	  if (e == ENB_CRYPTO && !TARGET_CRYPTO)
+	    continue;
+	  if (e == ENB_HTM && !TARGET_HTM)
+	    continue;
+	  if (e == ENB_P10 && !TARGET_POWER10)
+	    continue;
+	  if (e == ENB_P10_64 && (!TARGET_POWER10 || !TARGET_POWERPC64))
+	    continue;
+	  if (e == ENB_MMA && !TARGET_MMA)
+	    continue;
+	  tree fntype = rs6000_builtin_info_x[i].fntype;
+	  tree t = TREE_TYPE (fntype);
+	  fprintf (stderr, "%s %s (", rs6000_debug_type (t),
+		   rs6000_builtin_info_x[i].bifname);
+	  t = TYPE_ARG_TYPES (fntype);
+	  while (t && TREE_VALUE (t) != void_type_node)
+	    {
+	      fprintf (stderr, "%s",
+		       rs6000_debug_type (TREE_VALUE (t)));
+	      t = TREE_CHAIN (t);
+	      if (t && TREE_VALUE (t) != void_type_node)
+		fprintf (stderr, ", ");
+	    }
+	  fprintf (stderr, "); %s [%4d]\n",
+		   rs6000_builtin_info_x[i].attr_string, (int) i);
+	}
+      fprintf (stderr, "\nEnd autogenerated built-in functions.\n\n\n");
+    }
+
   if (new_builtins_are_live)
     {
 #ifdef SUBTARGET_INIT_BUILTINS
@@ -16847,6 +17020,12 @@ altivec_init_builtins (void)
 			       ALTIVEC_BUILTIN_MASK_FOR_LOAD,
 			       BUILT_IN_MD, NULL, NULL_TREE);
   TREE_READONLY (decl) = 1;
+  if (TARGET_DEBUG_BUILTIN)
+    fprintf (stderr, "%s __builtin_altivec_mask_for_load (%s); [%4d]\n",
+	     rs6000_debug_type (TREE_TYPE (v16qi_ftype_pcvoid)),
+	     rs6000_debug_type (TREE_VALUE
+				(TYPE_ARG_TYPES (v16qi_ftype_pcvoid))),
+	     (int) ALTIVEC_BUILTIN_MASK_FOR_LOAD);
   /* Record the decl. Will be used by rs6000_builtin_mask_for_load.  */
   altivec_builtin_mask_for_load = decl;
 
-- 
2.27.0



* [PATCH 32/34] rs6000: Update altivec.h for automated interfaces
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (30 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 31/34] rs6000: Debug support Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 33/34] rs6000: Test case adjustments Bill Schmidt
  2021-07-29 13:31 ` [PATCH 34/34] rs6000: Enable the new builtin support Bill Schmidt
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm
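
Interfaces that were previously spelled out here as one-to-one #defines
now come from the generated rs6000-vecdefines.h, leaving altivec.h with
only deprecated names, synonyms, and special cases.  Illustratively
(assuming the generated header carries the former mappings):

  /* Presumably now provided by rs6000-vecdefines.h, not altivec.h.  */
  #define vec_add __builtin_vec_add
  #define vec_sub __builtin_vec_sub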

2021-07-28  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/altivec.h: Delete a number of #defines that are
	now superfluous; alphabetize; include rs6000-vecdefines.h; include
	some synonyms.
---
 gcc/config/rs6000/altivec.h | 519 +++---------------------------------
 1 file changed, 38 insertions(+), 481 deletions(-)

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 5b631c7ebaf..9dfa285ccd1 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -55,32 +55,36 @@
 #define __CR6_LT		2
 #define __CR6_LT_REV		3
 
-/* Synonyms.  */
+#include "rs6000-vecdefines.h"
+
+/* Deprecated interfaces.  */
+#define vec_lvx vec_ld
+#define vec_lvxl vec_ldl
+#define vec_stvx vec_st
+#define vec_stvxl vec_stl
 #define vec_vaddcuw vec_addc
 #define vec_vand vec_and
 #define vec_vandc vec_andc
-#define vec_vrfip vec_ceil
 #define vec_vcmpbfp vec_cmpb
 #define vec_vcmpgefp vec_cmpge
 #define vec_vctsxs vec_cts
 #define vec_vctuxs vec_ctu
 #define vec_vexptefp vec_expte
-#define vec_vrfim vec_floor
-#define vec_lvx vec_ld
-#define vec_lvxl vec_ldl
 #define vec_vlogefp vec_loge
 #define vec_vmaddfp vec_madd
 #define vec_vmhaddshs vec_madds
-#define vec_vmladduhm vec_mladd
 #define vec_vmhraddshs vec_mradds
+#define vec_vmladduhm vec_mladd
 #define vec_vnmsubfp vec_nmsub
 #define vec_vnor vec_nor
 #define vec_vor vec_or
-#define vec_vpkpx vec_packpx
 #define vec_vperm vec_perm
-#define vec_permxor __builtin_vec_vpermxor
+#define vec_vpkpx vec_packpx
 #define vec_vrefp vec_re
+#define vec_vrfim vec_floor
 #define vec_vrfin vec_round
+#define vec_vrfip vec_ceil
+#define vec_vrfiz vec_trunc
 #define vec_vrsqrtefp vec_rsqrte
 #define vec_vsel vec_sel
 #define vec_vsldoi vec_sld
@@ -91,440 +95,53 @@
 #define vec_vspltisw vec_splat_s32
 #define vec_vsr vec_srl
 #define vec_vsro vec_sro
-#define vec_stvx vec_st
-#define vec_stvxl vec_stl
 #define vec_vsubcuw vec_subc
 #define vec_vsum2sws vec_sum2s
 #define vec_vsumsws vec_sums
-#define vec_vrfiz vec_trunc
 #define vec_vxor vec_xor
 
+/* For _ARCH_PWR8.  Always define to support #pragma GCC target.  */
+#define vec_vclz vec_cntlz
+#define vec_vgbbd vec_gb
+#define vec_vmrgew vec_mergee
+#define vec_vmrgow vec_mergeo
+#define vec_vpopcntu vec_popcnt
+#define vec_vrld vec_rl
+#define vec_vsld vec_sl
+#define vec_vsrd vec_sr
+#define vec_vsrad vec_sra
+
+/* For _ARCH_PWR9.  Always define to support #pragma GCC target.  */
+#define vec_extract_fp_from_shorth vec_extract_fp32_from_shorth
+#define vec_extract_fp_from_shortl vec_extract_fp32_from_shortl
+#define vec_vctz vec_cnttz
+
+/* Synonyms.  */
 /* Functions that are resolved by the backend to one of the
    typed builtins.  */
-#define vec_vaddfp __builtin_vec_vaddfp
-#define vec_addc __builtin_vec_addc
-#define vec_adde __builtin_vec_adde
-#define vec_addec __builtin_vec_addec
-#define vec_vaddsws __builtin_vec_vaddsws
-#define vec_vaddshs __builtin_vec_vaddshs
-#define vec_vaddsbs __builtin_vec_vaddsbs
-#define vec_vavgsw __builtin_vec_vavgsw
-#define vec_vavguw __builtin_vec_vavguw
-#define vec_vavgsh __builtin_vec_vavgsh
-#define vec_vavguh __builtin_vec_vavguh
-#define vec_vavgsb __builtin_vec_vavgsb
-#define vec_vavgub __builtin_vec_vavgub
-#define vec_ceil __builtin_vec_ceil
-#define vec_cmpb __builtin_vec_cmpb
-#define vec_vcmpeqfp __builtin_vec_vcmpeqfp
-#define vec_cmpge __builtin_vec_cmpge
-#define vec_vcmpgtfp __builtin_vec_vcmpgtfp
-#define vec_vcmpgtsw __builtin_vec_vcmpgtsw
-#define vec_vcmpgtuw __builtin_vec_vcmpgtuw
-#define vec_vcmpgtsh __builtin_vec_vcmpgtsh
-#define vec_vcmpgtuh __builtin_vec_vcmpgtuh
-#define vec_vcmpgtsb __builtin_vec_vcmpgtsb
-#define vec_vcmpgtub __builtin_vec_vcmpgtub
-#define vec_vcfsx __builtin_vec_vcfsx
-#define vec_vcfux __builtin_vec_vcfux
-#define vec_cts __builtin_vec_cts
-#define vec_ctu __builtin_vec_ctu
-#define vec_cpsgn __builtin_vec_copysign
-#define vec_double __builtin_vec_double
-#define vec_doublee __builtin_vec_doublee
-#define vec_doubleo __builtin_vec_doubleo
-#define vec_doublel __builtin_vec_doublel
-#define vec_doubleh __builtin_vec_doubleh
-#define vec_expte __builtin_vec_expte
-#define vec_float __builtin_vec_float
-#define vec_float2 __builtin_vec_float2
-#define vec_floate __builtin_vec_floate
-#define vec_floato __builtin_vec_floato
-#define vec_floor __builtin_vec_floor
-#define vec_loge __builtin_vec_loge
-#define vec_madd __builtin_vec_madd
-#define vec_madds __builtin_vec_madds
-#define vec_mtvscr __builtin_vec_mtvscr
-#define vec_reve __builtin_vec_vreve
-#define vec_vmaxfp __builtin_vec_vmaxfp
-#define vec_vmaxsw __builtin_vec_vmaxsw
-#define vec_vmaxsh __builtin_vec_vmaxsh
-#define vec_vmaxsb __builtin_vec_vmaxsb
-#define vec_vminfp __builtin_vec_vminfp
-#define vec_vminsw __builtin_vec_vminsw
-#define vec_vminsh __builtin_vec_vminsh
-#define vec_vminsb __builtin_vec_vminsb
-#define vec_mradds __builtin_vec_mradds
-#define vec_vmsumshm __builtin_vec_vmsumshm
-#define vec_vmsumuhm __builtin_vec_vmsumuhm
-#define vec_vmsummbm __builtin_vec_vmsummbm
-#define vec_vmsumubm __builtin_vec_vmsumubm
-#define vec_vmsumshs __builtin_vec_vmsumshs
-#define vec_vmsumuhs __builtin_vec_vmsumuhs
-#define vec_vmsumudm __builtin_vec_vmsumudm
-#define vec_vmulesb __builtin_vec_vmulesb
-#define vec_vmulesh __builtin_vec_vmulesh
-#define vec_vmuleuh __builtin_vec_vmuleuh
-#define vec_vmuleub __builtin_vec_vmuleub
-#define vec_vmulosh __builtin_vec_vmulosh
-#define vec_vmulouh __builtin_vec_vmulouh
-#define vec_vmulosb __builtin_vec_vmulosb
-#define vec_vmuloub __builtin_vec_vmuloub
-#define vec_nmsub __builtin_vec_nmsub
-#define vec_packpx __builtin_vec_packpx
-#define vec_vpkswss __builtin_vec_vpkswss
-#define vec_vpkuwus __builtin_vec_vpkuwus
-#define vec_vpkshss __builtin_vec_vpkshss
-#define vec_vpkuhus __builtin_vec_vpkuhus
-#define vec_vpkswus __builtin_vec_vpkswus
-#define vec_vpkshus __builtin_vec_vpkshus
-#define vec_re __builtin_vec_re
-#define vec_round __builtin_vec_round
-#define vec_recipdiv __builtin_vec_recipdiv
-#define vec_rlmi __builtin_vec_rlmi
-#define vec_vrlnm __builtin_vec_rlnm
 #define vec_rlnm(a,b,c) (__builtin_vec_rlnm((a),((c)<<8)|(b)))
-#define vec_rsqrt __builtin_vec_rsqrt
-#define vec_rsqrte __builtin_vec_rsqrte
-#define vec_signed __builtin_vec_vsigned
-#define vec_signed2 __builtin_vec_vsigned2
-#define vec_signede __builtin_vec_vsignede
-#define vec_signedo __builtin_vec_vsignedo
-#define vec_unsigned __builtin_vec_vunsigned
-#define vec_unsigned2 __builtin_vec_vunsigned2
-#define vec_unsignede __builtin_vec_vunsignede
-#define vec_unsignedo __builtin_vec_vunsignedo
-#define vec_vsubfp __builtin_vec_vsubfp
-#define vec_subc __builtin_vec_subc
-#define vec_sube __builtin_vec_sube
-#define vec_subec __builtin_vec_subec
-#define vec_vsubsws __builtin_vec_vsubsws
-#define vec_vsubshs __builtin_vec_vsubshs
-#define vec_vsubsbs __builtin_vec_vsubsbs
-#define vec_sum4s __builtin_vec_sum4s
-#define vec_vsum4shs __builtin_vec_vsum4shs
-#define vec_vsum4sbs __builtin_vec_vsum4sbs
-#define vec_vsum4ubs __builtin_vec_vsum4ubs
-#define vec_sum2s __builtin_vec_sum2s
-#define vec_sums __builtin_vec_sums
-#define vec_trunc __builtin_vec_trunc
-#define vec_vupkhpx __builtin_vec_vupkhpx
-#define vec_vupkhsh __builtin_vec_vupkhsh
-#define vec_vupkhsb __builtin_vec_vupkhsb
-#define vec_vupklpx __builtin_vec_vupklpx
-#define vec_vupklsh __builtin_vec_vupklsh
-#define vec_vupklsb __builtin_vec_vupklsb
-#define vec_abs __builtin_vec_abs
-#define vec_nabs __builtin_vec_nabs
-#define vec_abss __builtin_vec_abss
-#define vec_add __builtin_vec_add
-#define vec_adds __builtin_vec_adds
-#define vec_and __builtin_vec_and
-#define vec_andc __builtin_vec_andc
-#define vec_avg __builtin_vec_avg
-#define vec_cmpeq __builtin_vec_cmpeq
-#define vec_cmpne __builtin_vec_cmpne
-#define vec_cmpgt __builtin_vec_cmpgt
-#define vec_ctf __builtin_vec_ctf
-#define vec_dst __builtin_vec_dst
-#define vec_dstst __builtin_vec_dstst
-#define vec_dststt __builtin_vec_dststt
-#define vec_dstt __builtin_vec_dstt
-#define vec_ld __builtin_vec_ld
-#define vec_lde __builtin_vec_lde
-#define vec_ldl __builtin_vec_ldl
-#define vec_lvebx __builtin_vec_lvebx
-#define vec_lvehx __builtin_vec_lvehx
-#define vec_lvewx __builtin_vec_lvewx
-#define vec_xl_zext __builtin_vec_ze_lxvrx
-#define vec_xl_sext __builtin_vec_se_lxvrx
-#define vec_xst_trunc __builtin_vec_tr_stxvrx
-#define vec_neg __builtin_vec_neg
-#define vec_pmsum_be __builtin_vec_vpmsum
-#define vec_shasigma_be __builtin_crypto_vshasigma
-/* Cell only intrinsics.  */
-#ifdef __PPU__
-#define vec_lvlx __builtin_vec_lvlx
-#define vec_lvlxl __builtin_vec_lvlxl
-#define vec_lvrx __builtin_vec_lvrx
-#define vec_lvrxl __builtin_vec_lvrxl
-#endif
-#define vec_lvsl __builtin_vec_lvsl
-#define vec_lvsr __builtin_vec_lvsr
-#define vec_max __builtin_vec_max
-#define vec_mergee __builtin_vec_vmrgew
-#define vec_mergeh __builtin_vec_mergeh
-#define vec_mergel __builtin_vec_mergel
-#define vec_mergeo __builtin_vec_vmrgow
-#define vec_min __builtin_vec_min
-#define vec_mladd __builtin_vec_mladd
-#define vec_msum __builtin_vec_msum
-#define vec_msums __builtin_vec_msums
-#define vec_mul __builtin_vec_mul
-#define vec_mule __builtin_vec_mule
-#define vec_mulo __builtin_vec_mulo
-#define vec_nor __builtin_vec_nor
-#define vec_or __builtin_vec_or
-#define vec_pack __builtin_vec_pack
-#define vec_packs __builtin_vec_packs
-#define vec_packsu __builtin_vec_packsu
-#define vec_perm __builtin_vec_perm
-#define vec_rl __builtin_vec_rl
-#define vec_sel __builtin_vec_sel
-#define vec_sl __builtin_vec_sl
-#define vec_sld __builtin_vec_sld
-#define vec_sldw __builtin_vsx_xxsldwi
-#define vec_sll __builtin_vec_sll
-#define vec_slo __builtin_vec_slo
-#define vec_splat __builtin_vec_splat
-#define vec_sr __builtin_vec_sr
-#define vec_sra __builtin_vec_sra
-#define vec_srl __builtin_vec_srl
-#define vec_sro __builtin_vec_sro
-#define vec_st __builtin_vec_st
-#define vec_ste __builtin_vec_ste
-#define vec_stl __builtin_vec_stl
-#define vec_stvebx __builtin_vec_stvebx
-#define vec_stvehx __builtin_vec_stvehx
-#define vec_stvewx __builtin_vec_stvewx
-/* Cell only intrinsics.  */
-#ifdef __PPU__
-#define vec_stvlx __builtin_vec_stvlx
-#define vec_stvlxl __builtin_vec_stvlxl
-#define vec_stvrx __builtin_vec_stvrx
-#define vec_stvrxl __builtin_vec_stvrxl
-#endif
-#define vec_sub __builtin_vec_sub
-#define vec_subs __builtin_vec_subs
-#define vec_sum __builtin_vec_sum
-#define vec_unpackh __builtin_vec_unpackh
-#define vec_unpackl __builtin_vec_unpackl
-#define vec_vaddubm __builtin_vec_vaddubm
-#define vec_vaddubs __builtin_vec_vaddubs
-#define vec_vadduhm __builtin_vec_vadduhm
-#define vec_vadduhs __builtin_vec_vadduhs
-#define vec_vadduwm __builtin_vec_vadduwm
-#define vec_vadduws __builtin_vec_vadduws
-#define vec_vcmpequb __builtin_vec_vcmpequb
-#define vec_vcmpequh __builtin_vec_vcmpequh
-#define vec_vcmpequw __builtin_vec_vcmpequw
-#define vec_vmaxub __builtin_vec_vmaxub
-#define vec_vmaxuh __builtin_vec_vmaxuh
-#define vec_vmaxuw __builtin_vec_vmaxuw
-#define vec_vminub __builtin_vec_vminub
-#define vec_vminuh __builtin_vec_vminuh
-#define vec_vminuw __builtin_vec_vminuw
-#define vec_vmrghb __builtin_vec_vmrghb
-#define vec_vmrghh __builtin_vec_vmrghh
-#define vec_vmrghw __builtin_vec_vmrghw
-#define vec_vmrglb __builtin_vec_vmrglb
-#define vec_vmrglh __builtin_vec_vmrglh
-#define vec_vmrglw __builtin_vec_vmrglw
-#define vec_vpkuhum __builtin_vec_vpkuhum
-#define vec_vpkuwum __builtin_vec_vpkuwum
-#define vec_vrlb __builtin_vec_vrlb
-#define vec_vrlh __builtin_vec_vrlh
-#define vec_vrlw __builtin_vec_vrlw
-#define vec_vslb __builtin_vec_vslb
-#define vec_vslh __builtin_vec_vslh
-#define vec_vslw __builtin_vec_vslw
-#define vec_vspltb __builtin_vec_vspltb
-#define vec_vsplth __builtin_vec_vsplth
-#define vec_vspltw __builtin_vec_vspltw
-#define vec_vsrab __builtin_vec_vsrab
-#define vec_vsrah __builtin_vec_vsrah
-#define vec_vsraw __builtin_vec_vsraw
-#define vec_vsrb __builtin_vec_vsrb
-#define vec_vsrh __builtin_vec_vsrh
-#define vec_vsrw __builtin_vec_vsrw
-#define vec_vsububs __builtin_vec_vsububs
-#define vec_vsububm __builtin_vec_vsububm
-#define vec_vsubuhm __builtin_vec_vsubuhm
-#define vec_vsubuhs __builtin_vec_vsubuhs
-#define vec_vsubuwm __builtin_vec_vsubuwm
-#define vec_vsubuws __builtin_vec_vsubuws
-#define vec_xor __builtin_vec_xor
-
-#define vec_extract __builtin_vec_extract
-#define vec_insert __builtin_vec_insert
-#define vec_splats __builtin_vec_splats
-#define vec_promote __builtin_vec_promote
 
 #ifdef __VSX__
 /* VSX additions */
-#define vec_div __builtin_vec_div
-#define vec_mul __builtin_vec_mul
-#define vec_msub __builtin_vec_msub
-#define vec_nmadd __builtin_vec_nmadd
-#define vec_nearbyint __builtin_vec_nearbyint
-#define vec_rint __builtin_vec_rint
-#define vec_sqrt __builtin_vec_sqrt
 #define vec_vsx_ld __builtin_vec_vsx_ld
 #define vec_vsx_st __builtin_vec_vsx_st
-#define vec_xl __builtin_vec_vsx_ld
-#define vec_xl_be __builtin_vec_xl_be
-#define vec_xst __builtin_vec_vsx_st
-#define vec_xst_be __builtin_vec_xst_be
-
-/* Note, xxsldi and xxpermdi were added as __builtin_vsx_<xxx> functions
-   instead of __builtin_vec_<xxx>  */
-#define vec_xxsldwi __builtin_vsx_xxsldwi
-#define vec_xxpermdi __builtin_vsx_xxpermdi
-#endif
-
-#ifdef _ARCH_PWR8
-/* Vector additions added in ISA 2.07.  */
-#define vec_eqv __builtin_vec_eqv
-#define vec_nand __builtin_vec_nand
-#define vec_orc __builtin_vec_orc
-#define vec_vaddcuq __builtin_vec_vaddcuq
-#define vec_vaddudm __builtin_vec_vaddudm
-#define vec_vadduqm __builtin_vec_vadduqm
-#define vec_vbpermq __builtin_vec_vbpermq
-#define vec_bperm __builtin_vec_vbperm_api
-#define vec_vclz __builtin_vec_vclz
-#define vec_cntlz __builtin_vec_vclz
-#define vec_vclzb __builtin_vec_vclzb
-#define vec_vclzd __builtin_vec_vclzd
-#define vec_vclzh __builtin_vec_vclzh
-#define vec_vclzw __builtin_vec_vclzw
-#define vec_vaddecuq __builtin_vec_vaddecuq
-#define vec_vaddeuqm __builtin_vec_vaddeuqm
-#define vec_vsubecuq __builtin_vec_vsubecuq
-#define vec_vsubeuqm __builtin_vec_vsubeuqm
-#define vec_vgbbd __builtin_vec_vgbbd
-#define vec_gb __builtin_vec_vgbbd
-#define vec_vmaxsd __builtin_vec_vmaxsd
-#define vec_vmaxud __builtin_vec_vmaxud
-#define vec_vminsd __builtin_vec_vminsd
-#define vec_vminud __builtin_vec_vminud
-#define vec_vmrgew __builtin_vec_vmrgew
-#define vec_vmrgow __builtin_vec_vmrgow
-#define vec_vpksdss __builtin_vec_vpksdss
-#define vec_vpksdus __builtin_vec_vpksdus
-#define vec_vpkudum __builtin_vec_vpkudum
-#define vec_vpkudus __builtin_vec_vpkudus
-#define vec_vpopcnt __builtin_vec_vpopcnt
-#define vec_vpopcntb __builtin_vec_vpopcntb
-#define vec_vpopcntd __builtin_vec_vpopcntd
-#define vec_vpopcnth __builtin_vec_vpopcnth
-#define vec_vpopcntw __builtin_vec_vpopcntw
-#define vec_popcnt __builtin_vec_vpopcntu
-#define vec_vrld __builtin_vec_vrld
-#define vec_vsld __builtin_vec_vsld
-#define vec_vsrad __builtin_vec_vsrad
-#define vec_vsrd __builtin_vec_vsrd
-#define vec_vsubcuq __builtin_vec_vsubcuq
-#define vec_vsubudm __builtin_vec_vsubudm
-#define vec_vsubuqm __builtin_vec_vsubuqm
-#define vec_vupkhsw __builtin_vec_vupkhsw
-#define vec_vupklsw __builtin_vec_vupklsw
-#define vec_revb __builtin_vec_revb
-#define vec_sbox_be __builtin_crypto_vsbox_be
-#define vec_cipher_be __builtin_crypto_vcipher_be
-#define vec_cipherlast_be __builtin_crypto_vcipherlast_be
-#define vec_ncipher_be __builtin_crypto_vncipher_be
-#define vec_ncipherlast_be __builtin_crypto_vncipherlast_be
-#endif
-
-#ifdef __POWER9_VECTOR__
-/* Vector additions added in ISA 3.0.  */
-#define vec_first_match_index __builtin_vec_first_match_index
-#define vec_first_match_or_eos_index __builtin_vec_first_match_or_eos_index
-#define vec_first_mismatch_index __builtin_vec_first_mismatch_index
-#define vec_first_mismatch_or_eos_index __builtin_vec_first_mismatch_or_eos_index
-#define vec_pack_to_short_fp32 __builtin_vec_convert_4f32_8f16
-#define vec_parity_lsbb __builtin_vec_vparity_lsbb
-#define vec_vctz __builtin_vec_vctz
-#define vec_cnttz __builtin_vec_vctz
-#define vec_vctzb __builtin_vec_vctzb
-#define vec_vctzd __builtin_vec_vctzd
-#define vec_vctzh __builtin_vec_vctzh
-#define vec_vctzw __builtin_vec_vctzw
-#define vec_extract4b __builtin_vec_extract4b
-#define vec_insert4b __builtin_vec_insert4b
-#define vec_vprtyb __builtin_vec_vprtyb
-#define vec_vprtybd __builtin_vec_vprtybd
-#define vec_vprtybw __builtin_vec_vprtybw
-
-#ifdef _ARCH_PPC64
-#define vec_vprtybq __builtin_vec_vprtybq
-#endif
-
-#define vec_absd __builtin_vec_vadu
-#define vec_absdb __builtin_vec_vadub
-#define vec_absdh __builtin_vec_vaduh
-#define vec_absdw __builtin_vec_vaduw
-
-#define vec_slv __builtin_vec_vslv
-#define vec_srv __builtin_vec_vsrv
-
-#define vec_extract_exp __builtin_vec_extract_exp
-#define vec_extract_sig __builtin_vec_extract_sig
-#define vec_insert_exp __builtin_vec_insert_exp
-#define vec_test_data_class __builtin_vec_test_data_class
-
-#define vec_extract_fp_from_shorth __builtin_vec_vextract_fp_from_shorth
-#define vec_extract_fp_from_shortl __builtin_vec_vextract_fp_from_shortl
-#define vec_extract_fp32_from_shorth __builtin_vec_vextract_fp_from_shorth
-#define vec_extract_fp32_from_shortl __builtin_vec_vextract_fp_from_shortl
-
-#define scalar_extract_exp __builtin_vec_scalar_extract_exp
-#define scalar_extract_sig __builtin_vec_scalar_extract_sig
-#define scalar_insert_exp __builtin_vec_scalar_insert_exp
-#define scalar_test_data_class __builtin_vec_scalar_test_data_class
-#define scalar_test_neg __builtin_vec_scalar_test_neg
-
-#define scalar_cmp_exp_gt __builtin_vec_scalar_cmp_exp_gt
-#define scalar_cmp_exp_lt __builtin_vec_scalar_cmp_exp_lt
-#define scalar_cmp_exp_eq __builtin_vec_scalar_cmp_exp_eq
-#define scalar_cmp_exp_unordered __builtin_vec_scalar_cmp_exp_unordered
-
-#ifdef _ARCH_PPC64
-#define vec_xl_len __builtin_vec_lxvl
-#define vec_xst_len __builtin_vec_stxvl
-#define vec_xl_len_r __builtin_vec_xl_len_r
-#define vec_xst_len_r __builtin_vec_xst_len_r
-#endif
-
-#define vec_cmpnez __builtin_vec_vcmpnez
-
-#define vec_cntlz_lsbb __builtin_vec_vclzlsbb
-#define vec_cnttz_lsbb __builtin_vec_vctzlsbb
-
-#define vec_test_lsbb_all_ones __builtin_vec_xvtlsbb_all_ones
-#define vec_test_lsbb_all_zeros __builtin_vec_xvtlsbb_all_zeros
-
-#define vec_xlx __builtin_vec_vextulx
-#define vec_xrx __builtin_vec_vexturx
-#define vec_signexti  __builtin_vec_vsignexti
-#define vec_signextll __builtin_vec_vsignextll
+#define __builtin_vec_xl __builtin_vec_vsx_ld
+#define __builtin_vec_xst __builtin_vec_vsx_st
 
-#endif
-
-/* BCD builtins, map ABI builtin name to existing builtin name.  */
-#define __builtin_bcdadd     __builtin_vec_bcdadd
-#define __builtin_bcdadd_lt  __builtin_vec_bcdadd_lt
-#define __builtin_bcdadd_eq  __builtin_vec_bcdadd_eq
-#define __builtin_bcdadd_gt  __builtin_vec_bcdadd_gt
 #define __builtin_bcdadd_ofl __builtin_vec_bcdadd_ov
-#define __builtin_bcdadd_ov  __builtin_vec_bcdadd_ov
-#define __builtin_bcdsub     __builtin_vec_bcdsub
-#define __builtin_bcdsub_lt  __builtin_vec_bcdsub_lt
-#define __builtin_bcdsub_eq  __builtin_vec_bcdsub_eq
-#define __builtin_bcdsub_gt  __builtin_vec_bcdsub_gt
 #define __builtin_bcdsub_ofl __builtin_vec_bcdsub_ov
-#define __builtin_bcdsub_ov  __builtin_vec_bcdsub_ov
-#define __builtin_bcdinvalid __builtin_vec_bcdinvalid
-#define __builtin_bcdmul10   __builtin_vec_bcdmul10
-#define __builtin_bcddiv10   __builtin_vec_bcddiv10
-#define __builtin_bcd2dfp    __builtin_vec_denb2dfp
 #define __builtin_bcdcmpeq(a,b)   __builtin_vec_bcdsub_eq(a,b,0)
 #define __builtin_bcdcmpgt(a,b)   __builtin_vec_bcdsub_gt(a,b,0)
 #define __builtin_bcdcmplt(a,b)   __builtin_vec_bcdsub_lt(a,b,0)
 #define __builtin_bcdcmpge(a,b)   __builtin_vec_bcdsub_ge(a,b,0)
 #define __builtin_bcdcmple(a,b)   __builtin_vec_bcdsub_le(a,b,0)
+#endif
 
+/* For _ARCH_PWR10.  Always define to support #pragma GCC target.  */
+#define __builtin_vec_se_lxvrx __builtin_vec_xl_sext
+#define __builtin_vec_tr_stxvrx __builtin_vec_xst_trunc
+#define __builtin_vec_ze_lxvrx __builtin_vec_xl_zext
+#define __builtin_vsx_xxpermx __builtin_vec_xxpermx
 
 /* Predicates.
    For C++, we use templates in order to allow non-parenthesized arguments.
@@ -700,14 +317,9 @@ __altivec_scalar_pred(vec_any_nle,
 #define vec_any_nle(a1, a2) __builtin_vec_vcmpge_p (__CR6_LT_REV, (a2), (a1))
 #endif
 
-/* These do not accept vectors, so they do not have a __builtin_vec_*
-   counterpart.  */
+/* Miscellaneous definitions.  */
 #define vec_dss(x) __builtin_altivec_dss((x))
 #define vec_dssall() __builtin_altivec_dssall ()
-#define vec_mfvscr() ((__vector unsigned short) __builtin_altivec_mfvscr ())
-#define vec_splat_s8(x) __builtin_altivec_vspltisb ((x))
-#define vec_splat_s16(x) __builtin_altivec_vspltish ((x))
-#define vec_splat_s32(x) __builtin_altivec_vspltisw ((x))
 #define vec_splat_u8(x) ((__vector unsigned char) vec_splat_s8 ((x)))
 #define vec_splat_u16(x) ((__vector unsigned short) vec_splat_s16 ((x)))
 #define vec_splat_u32(x) ((__vector unsigned int) vec_splat_s32 ((x)))
@@ -716,59 +328,4 @@ __altivec_scalar_pred(vec_any_nle,
    to #define vec_step to __builtin_vec_step.  */
 #define vec_step(x) __builtin_vec_step (* (__typeof__ (x) *) 0)
 
-#ifdef _ARCH_PWR10
-#define vec_signextq  __builtin_vec_vsignextq
-#define vec_dive __builtin_vec_dive
-#define vec_mod  __builtin_vec_mod
-
-/* May modify these macro definitions if future capabilities overload
-   with support for different vector argument and result types.  */
-#define vec_cntlzm(a, b)	__builtin_altivec_vclzdm (a, b)
-#define vec_cnttzm(a, b)	__builtin_altivec_vctzdm (a, b)
-#define vec_pdep(a, b)	__builtin_altivec_vpdepd (a, b)
-#define vec_pext(a, b)	__builtin_altivec_vpextd (a, b)
-#define vec_cfuge(a, b)	__builtin_altivec_vcfuged (a, b)
-#define vec_genpcvm(a, b)	__builtin_vec_xxgenpcvm (a, b)
-
-/* Overloaded built-in functions for ISA 3.1.  */
-#define vec_extractl(a, b, c)	__builtin_vec_extractl (a, b, c)
-#define vec_extracth(a, b, c)	__builtin_vec_extracth (a, b, c)
-#define vec_insertl(a, b, c)   __builtin_vec_insertl (a, b, c)
-#define vec_inserth(a, b, c)   __builtin_vec_inserth (a, b, c)
-#define vec_replace_elt(a, b, c)       __builtin_vec_replace_elt (a, b, c)
-#define vec_replace_unaligned(a, b, c) __builtin_vec_replace_un (a, b, c)
-#define vec_sldb(a, b, c)      __builtin_vec_sldb (a, b, c)
-#define vec_srdb(a, b, c)      __builtin_vec_srdb (a, b, c)
-#define vec_splati(a)  __builtin_vec_xxspltiw (a)
-#define vec_splatid(a) __builtin_vec_xxspltid (a)
-#define vec_splati_ins(a, b, c)        __builtin_vec_xxsplti32dx (a, b, c)
-#define vec_blendv(a, b, c)    __builtin_vec_xxblend (a, b, c)
-#define vec_permx(a, b, c, d)  __builtin_vec_xxpermx (a, b, c, d)
-
-#define vec_gnb(a, b)	__builtin_vec_gnb (a, b)
-#define vec_clrl(a, b)	__builtin_vec_clrl (a, b)
-#define vec_clrr(a, b)	__builtin_vec_clrr (a, b)
-#define vec_ternarylogic(a, b, c, d)	__builtin_vec_xxeval (a, b, c, d)
-
-#define vec_strir(a)	__builtin_vec_strir (a)
-#define vec_stril(a)	__builtin_vec_stril (a)
-
-#define vec_strir_p(a)	__builtin_vec_strir_p (a)
-#define vec_stril_p(a)	__builtin_vec_stril_p (a)
-
-#define vec_mulh(a, b) __builtin_vec_mulh ((a), (b))
-#define vec_dive(a, b) __builtin_vec_dive ((a), (b))
-#define vec_mod(a, b) __builtin_vec_mod ((a), (b))
-
-/* VSX Mask Manipulation builtin. */
-#define vec_genbm __builtin_vec_mtvsrbm
-#define vec_genhm __builtin_vec_mtvsrhm
-#define vec_genwm __builtin_vec_mtvsrwm
-#define vec_gendm __builtin_vec_mtvsrdm
-#define vec_genqm __builtin_vec_mtvsrqm
-#define vec_cntm __builtin_vec_cntm
-#define vec_expandm __builtin_vec_vexpandm
-#define vec_extractm __builtin_vec_vextractm
-#endif
-
 #endif /* _ALTIVEC_H */
-- 
2.27.0



* [PATCH 33/34] rs6000: Test case adjustments
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (31 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 32/34] rs6000: Update altivec.h for automated interfaces Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  2021-07-29 13:31 ` [PATCH 34/34] rs6000: Enable the new builtin support Bill Schmidt
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm
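
Most of these adjustments retarget dg-error patterns: the new builtin
machinery diagnoses a missing feature by naming the specific option a
built-in requires, rather than the old blanket wording.  For example,
from cmpb-2.c below:

  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_p6_cmpb' requires the '-mcpu=power6' option" } */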

2021-07-19  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/testsuite/
	* gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust.
	* gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Adjust.
	* gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Adjust.
	* gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Adjust.
	* gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Adjust.
	* gcc.target/powerpc/bfp/scalar-test-neg-2.c: Adjust.
	* gcc.target/powerpc/bfp/scalar-test-neg-3.c: Adjust.
	* gcc.target/powerpc/bfp/scalar-test-neg-5.c: Adjust.
	* gcc.target/powerpc/byte-in-set-2.c: Adjust.
	* gcc.target/powerpc/cmpb-2.c: Adjust.
	* gcc.target/powerpc/cmpb32-2.c: Adjust.
	* gcc.target/powerpc/crypto-builtin-2.c: Adjust.
	* gcc.target/powerpc/fold-vec-splat-floatdouble.c: Adjust.
	* gcc.target/powerpc/fold-vec-splat-longlong.c: Adjust.
	* gcc.target/powerpc/fold-vec-splat-misc-invalid.c: Adjust.
	* gcc.target/powerpc/int_128bit-runnable.c: Adjust.
	* gcc.target/powerpc/p8vector-builtin-8.c: Adjust.
	* gcc.target/powerpc/pr80315-1.c: Adjust.
	* gcc.target/powerpc/pr80315-2.c: Adjust.
	* gcc.target/powerpc/pr80315-3.c: Adjust.
	* gcc.target/powerpc/pr80315-4.c: Adjust.
	* gcc.target/powerpc/pr88100.c: Adjust.
	* gcc.target/powerpc/pragma_misc9.c: Adjust.
	* gcc.target/powerpc/pragma_power8.c: Adjust.
	* gcc.target/powerpc/pragma_power9.c: Adjust.
	* gcc.target/powerpc/test_fpscr_drn_builtin_error.c: Adjust.
	* gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Adjust.
	* gcc.target/powerpc/test_mffsl.c: Adjust.
	* gcc.target/powerpc/vec-gnb-2.c: Adjust.
	* gcc.target/powerpc/vsu/vec-all-nez-7.c: Adjust.
	* gcc.target/powerpc/vsu/vec-any-eqz-7.c: Adjust.
	* gcc.target/powerpc/vsu/vec-cmpnez-7.c: Adjust.
	* gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c: Adjust.
	* gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c: Adjust.
	* gcc.target/powerpc/vsu/vec-xl-len-13.c: Adjust.
	* gcc.target/powerpc/vsu/vec-xst-len-12.c: Adjust.
---
 .../gcc.target/powerpc/bfp/scalar-extract-exp-2.c  |  2 +-
 .../gcc.target/powerpc/bfp/scalar-extract-sig-2.c  |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-2.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-5.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-8.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-2.c     |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-3.c     |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-5.c     |  2 +-
 gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb-2.c          |  2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb32-2.c        |  2 +-
 .../gcc.target/powerpc/crypto-builtin-2.c          | 14 +++++++-------
 .../powerpc/fold-vec-splat-floatdouble.c           |  4 ++--
 .../gcc.target/powerpc/fold-vec-splat-longlong.c   | 10 +++-------
 .../powerpc/fold-vec-splat-misc-invalid.c          |  8 ++++----
 .../gcc.target/powerpc/int_128bit-runnable.c       |  6 +++---
 .../gcc.target/powerpc/p8vector-builtin-8.c        |  1 +
 gcc/testsuite/gcc.target/powerpc/pr80315-1.c       |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-2.c       |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-3.c       |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-4.c       |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr88100.c         | 12 ++++++------
 gcc/testsuite/gcc.target/powerpc/pragma_misc9.c    |  2 +-
 gcc/testsuite/gcc.target/powerpc/pragma_power8.c   |  2 ++
 gcc/testsuite/gcc.target/powerpc/pragma_power9.c   |  3 +++
 .../powerpc/test_fpscr_drn_builtin_error.c         |  4 ++--
 .../powerpc/test_fpscr_rn_builtin_error.c          | 12 ++++++------
 gcc/testsuite/gcc.target/powerpc/test_mffsl.c      |  3 ++-
 gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c       |  2 +-
 .../gcc.target/powerpc/vsu/vec-all-nez-7.c         |  2 +-
 .../gcc.target/powerpc/vsu/vec-any-eqz-7.c         |  2 +-
 .../gcc.target/powerpc/vsu/vec-cmpnez-7.c          |  2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c      |  2 +-
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c      |  2 +-
 .../gcc.target/powerpc/vsu/vec-xl-len-13.c         |  2 +-
 .../gcc.target/powerpc/vsu/vec-xst-len-12.c        |  2 +-
 36 files changed, 65 insertions(+), 62 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
index 922180675fc..53b67c95cf9 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
@@ -14,7 +14,7 @@ get_exponent (double *p)
 {
   double source = *p;
 
-  return scalar_extract_exp (source);	/* { dg-error "'__builtin_vec_scalar_extract_exp' is not supported in this compiler configuration" } */
+  return scalar_extract_exp (source);	/* { dg-error "'__builtin_vsx_scalar_extract_exp' requires the" } */
 }
 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
index e24d4bd23fe..39ee74c94dc 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
@@ -12,5 +12,5 @@ get_significand (double *p)
 {
   double source = *p;
 
-  return __builtin_vec_scalar_extract_sig (source); /* { dg-error "'__builtin_vec_scalar_extract_sig' is not supported in this compiler configuration" } */
+  return __builtin_vec_scalar_extract_sig (source); /* { dg-error "'__builtin_vsx_scalar_extract_sig' requires the" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c
index feb943104da..efd69725905 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c
@@ -16,5 +16,5 @@ insert_exponent (unsigned long long int *significand_p,
   unsigned long long int significand = *significand_p;
   unsigned long long int exponent = *exponent_p;
 
-  return scalar_insert_exp (significand, exponent); /* { dg-error "'__builtin_vec_scalar_insert_exp' is not supported in this compiler configuration" } */
+  return scalar_insert_exp (significand, exponent); /* { dg-error "'__builtin_vsx_scalar_insert_exp' requires the" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c
index 0e5683d5d1a..f85966a6fdf 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c
@@ -16,5 +16,5 @@ insert_exponent (double *significand_p,
   double significand = *significand_p;
   unsigned long long int exponent = *exponent_p;
 
-  return scalar_insert_exp (significand, exponent); /* { dg-error "'__builtin_vec_scalar_insert_exp' is not supported in this compiler configuration" } */
+  return scalar_insert_exp (significand, exponent); /* { dg-error "'__builtin_vsx_scalar_insert_exp_dp' requires the" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-8.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-8.c
index bd68f770985..b1be8284b4e 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-8.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-8.c
@@ -16,5 +16,5 @@ insert_exponent (unsigned __int128 *significand_p, /* { dg-error "'__int128' is
   unsigned __int128 significand = *significand_p;  /* { dg-error "'__int128' is not supported on this target" } */
   unsigned long long int exponent = *exponent_p;
 
-  return scalar_insert_exp (significand, exponent); /* { dg-error "'__builtin_vec_scalar_insert_exp' is not supported in this compiler configuration" } */
+  return scalar_insert_exp (significand, exponent); /* { dg-error "'__builtin_vsx_scalar_insert_exp' requires the" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-2.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-2.c
index 7d2b4deefc3..46d743a899b 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-2.c
@@ -10,5 +10,5 @@ test_neg (float *p)
 {
   float source = *p;
 
-  return __builtin_vec_scalar_test_neg_sp (source); /* { dg-error "'__builtin_vsx_scalar_test_neg_sp' requires" } */
+  return __builtin_vec_scalar_test_neg (source); /* { dg-error "'__builtin_vsx_scalar_test_neg_sp' requires" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-3.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-3.c
index b503dfa8b56..bfc892b116e 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-3.c
@@ -10,5 +10,5 @@ test_neg (double *p)
 {
   double source = *p;
 
-  return __builtin_vec_scalar_test_neg_dp (source); /* { dg-error "'__builtin_vsx_scalar_test_neg_dp' requires" } */
+  return __builtin_vec_scalar_test_neg (source); /* { dg-error "'__builtin_vsx_scalar_test_neg_dp' requires" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c
index bab86040a7b..8c55c1cfb5c 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c
@@ -10,5 +10,5 @@ test_neg (__ieee128 *p)
 {
   __ieee128 source = *p;
 
-  return __builtin_vec_scalar_test_neg_qp (source); /* { dg-error "'__builtin_vsx_scalar_test_neg_qp' requires" } */
+  return __builtin_vec_scalar_test_neg (source); /* { dg-error "'__builtin_vsx_scalar_test_neg_qp' requires" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c b/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c
index 44cc7782760..4c676ba356d 100644
--- a/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c
@@ -10,5 +10,5 @@
 int
 test_byte_in_set (unsigned char b, unsigned long long set_members)
 {
-  return __builtin_byte_in_set (b, set_members); /* { dg-warning "implicit declaration of function" } */
+  return __builtin_byte_in_set (b, set_members); /* { dg-error "'__builtin_scalar_byte_in_set' requires the" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb-2.c b/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
index 113ab6a5f99..02b84d0731d 100644
--- a/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
@@ -8,7 +8,7 @@ void abort ();
 unsigned long long int
 do_compare (unsigned long long int a, unsigned long long int b)
 {
-  return __builtin_cmpb (a, b);	/* { dg-warning "implicit declaration of function '__builtin_cmpb'" } */
+  return __builtin_cmpb (a, b);	/* { dg-error "'__builtin_p6_cmpb' requires the '-mcpu=power6' option" } */
 }
 
 void
diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c b/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
index 37b54745e0e..d4264ab6e7d 100644
--- a/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
@@ -7,7 +7,7 @@ void abort ();
 unsigned int
 do_compare (unsigned int a, unsigned int b)
 {
-  return __builtin_cmpb (a, b);  /* { dg-warning "implicit declaration of function '__builtin_cmpb'" } */
+  return __builtin_cmpb (a, b);  /* { dg-error "'__builtin_p6_cmpb_32' requires the '-mcpu=power6' option" } */
 }
 
 void
diff --git a/gcc/testsuite/gcc.target/powerpc/crypto-builtin-2.c b/gcc/testsuite/gcc.target/powerpc/crypto-builtin-2.c
index 4066b1228dc..b3a6c737a3e 100644
--- a/gcc/testsuite/gcc.target/powerpc/crypto-builtin-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/crypto-builtin-2.c
@@ -5,21 +5,21 @@
 
 void use_builtins_d (__vector unsigned long long *p, __vector unsigned long long *q, __vector unsigned long long *r, __vector unsigned long long *s)
 {
-  p[0] = __builtin_crypto_vcipher (q[0], r[0]); /* { dg-error "'__builtin_crypto_vcipher' is not supported with the current options" } */
-  p[1] = __builtin_crypto_vcipherlast (q[1], r[1]); /* { dg-error "'__builtin_crypto_vcipherlast' is not supported with the current options" } */
-  p[2] = __builtin_crypto_vncipher (q[2], r[2]); /* { dg-error "'__builtin_crypto_vncipher' is not supported with the current options" } */
-  p[3] = __builtin_crypto_vncipherlast (q[3], r[3]); /* { dg-error "'__builtin_crypto_vncipherlast' is not supported with the current options" } */
+  p[0] = __builtin_crypto_vcipher (q[0], r[0]); /* { dg-error "'__builtin_crypto_vcipher' requires the '-mcrypto' option" } */
+  p[1] = __builtin_crypto_vcipherlast (q[1], r[1]); /* { dg-error "'__builtin_crypto_vcipherlast' requires the '-mcrypto' option" } */
+  p[2] = __builtin_crypto_vncipher (q[2], r[2]); /* { dg-error "'__builtin_crypto_vncipher' requires the '-mcrypto' option" } */
+  p[3] = __builtin_crypto_vncipherlast (q[3], r[3]); /* { dg-error "'__builtin_crypto_vncipherlast' requires the '-mcrypto' option" } */
   p[4] = __builtin_crypto_vpermxor (q[4], r[4], s[4]);
   p[5] = __builtin_crypto_vpmsumd (q[5], r[5]);
-  p[6] = __builtin_crypto_vshasigmad (q[6], 1, 15); /* { dg-error "'__builtin_crypto_vshasigmad' is not supported with the current options" } */
-  p[7] = __builtin_crypto_vsbox (q[7]); /* { dg-error "'__builtin_crypto_vsbox' is not supported with the current options" } */
+  p[6] = __builtin_crypto_vshasigmad (q[6], 1, 15); /* { dg-error "'__builtin_crypto_vshasigmad' requires the '-mcrypto' option" } */
+  p[7] = __builtin_crypto_vsbox (q[7]); /* { dg-error "'__builtin_crypto_vsbox' requires the '-mcrypto' option" } */
 }
 
 void use_builtins_w (__vector unsigned int *p, __vector unsigned int *q, __vector unsigned int *r, __vector unsigned int *s)
 {
   p[0] = __builtin_crypto_vpermxor (q[0], r[0], s[0]);
   p[1] = __builtin_crypto_vpmsumw (q[1], r[1]);
-  p[2] = __builtin_crypto_vshasigmaw (q[2], 1, 15); /* { dg-error "'__builtin_crypto_vshasigmaw' is not supported with the current options" } */
+  p[2] = __builtin_crypto_vshasigmaw (q[2], 1, 15); /* { dg-error "'__builtin_crypto_vshasigmaw' requires the '-mcrypto' option" } */
 }
 
 void use_builtins_h (__vector unsigned short *p, __vector unsigned short *q, __vector unsigned short *r, __vector unsigned short *s)
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-floatdouble.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-floatdouble.c
index 76619177388..b95fa324633 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-floatdouble.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-floatdouble.c
@@ -18,7 +18,7 @@ vector float test_fc ()
 vector double testd_00 (vector double x) { return vec_splat (x, 0b00000); }
 vector double testd_01 (vector double x) { return vec_splat (x, 0b00001); }
 vector double test_dc ()
-{ const vector double y = { 3.0, 5.0 }; return vec_splat (y, 0b00010); }
+{ const vector double y = { 3.0, 5.0 }; return vec_splat (y, 0b00001); }
 
 /* If the source vector is a known constant, we will generate a load or possibly
    XXSPLTIW.  */
@@ -28,5 +28,5 @@ vector double test_dc ()
 /* { dg-final { scan-assembler-times {\mvspltw\M|\mxxspltw\M} 3 } } */
 
 /* For double types, we will generate xxpermdi instructions.  */
-/* { dg-final { scan-assembler-times "xxpermdi" 3 } } */
+/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-longlong.c
index b95b987abce..3fa1f05d6f5 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-longlong.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-longlong.c
@@ -9,23 +9,19 @@
 
 vector bool long long testb_00 (vector bool long long x) { return vec_splat (x, 0b00000); }
 vector bool long long testb_01 (vector bool long long x) { return vec_splat (x, 0b00001); }
-vector bool long long testb_02 (vector bool long long x) { return vec_splat (x, 0b00010); }
 
 vector signed long long tests_00 (vector signed long long x) { return vec_splat (x, 0b00000); }
 vector signed long long tests_01 (vector signed long long x) { return vec_splat (x, 0b00001); }
-vector signed long long tests_02 (vector signed long long x) { return vec_splat (x, 0b00010); }
 
 vector unsigned long long testu_00 (vector unsigned long long x) { return vec_splat (x, 0b00000); }
 vector unsigned long long testu_01 (vector unsigned long long x) { return vec_splat (x, 0b00001); }
-vector unsigned long long testu_02 (vector unsigned long long x) { return vec_splat (x, 0b00010); }
 
 /* Similar test as above, but the source vector is a known constant. */
-vector bool long long test_bll () { const vector bool long long y = {12, 23}; return vec_splat (y, 0b00010); }
-vector signed long long test_sll () { const vector signed long long y = {34, 45}; return vec_splat (y, 0b00010); }
-vector unsigned long long test_ull () { const vector unsigned long long y = {56, 67}; return vec_splat (y, 0b00010); }
+vector bool long long test_bll () { const vector bool long long y = {12, 23}; return vec_splat (y, 0b00001); }
+vector signed long long test_sll () { const vector signed long long y = {34, 45}; return vec_splat (y, 0b00001); }
 
 /* Assorted load instructions for the initialization with known constants. */
-/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxv\M|\mplxv\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvd2x\M|\mlxv\M|\mplxv\M|\mxxspltib\M} 2 } } */
 
 /* xxpermdi for vec_splat of long long vectors.
  At the time of this writing, the number of xxpermdi instructions
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-misc-invalid.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-misc-invalid.c
index 20f5b05561e..263a1723d31 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-misc-invalid.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-splat-misc-invalid.c
@@ -10,24 +10,24 @@
 vector signed short
 testss_1 (unsigned int ui)
 {
-  return vec_splat_s16 (ui);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_s16 (ui);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
 vector unsigned short
 testss_2 (signed int si)
 {
-  return vec_splat_u16 (si);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_u16 (si);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
 vector signed char
 testsc_1 (unsigned int ui)
 {
-  return vec_splat_s8 (ui); /* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_s8 (ui); /* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
 vector unsigned char
 testsc_2 (signed int si)
 {
-  return vec_splat_u8 (si);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_u8 (si);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
index 1255ee9f0ab..1356793635a 100644
--- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
+++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
@@ -11,9 +11,9 @@
 /* { dg-final { scan-assembler-times {\mvrlq\M} 2 } } */
 /* { dg-final { scan-assembler-times {\mvrlqnm\M} 2 } } */
 /* { dg-final { scan-assembler-times {\mvrlqmi\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mvcmpequq\M} 16 } } */
-/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 16 } } */
-/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 16 } } */
+/* { dg-final { scan-assembler-times {\mvcmpequq\M} 24 } } */
+/* { dg-final { scan-assembler-times {\mvcmpgtsq\M} 26 } } */
+/* { dg-final { scan-assembler-times {\mvcmpgtuq\M} 26 } } */
 /* { dg-final { scan-assembler-times {\mvmuloud\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mvmulesd\M} 1 } } */
 /* { dg-final { scan-assembler-times {\mvmulosd\M} 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-8.c b/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-8.c
index 0cfbe68c3a4..1d09aad9fbf 100644
--- a/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-8.c
+++ b/gcc/testsuite/gcc.target/powerpc/p8vector-builtin-8.c
@@ -126,6 +126,7 @@ void foo (vector signed char *vscr,
 /* { dg-final { scan-assembler-times "vsubcuw" 4 } } */
 /* { dg-final { scan-assembler-times "vsubuwm" 4 } } */
 /* { dg-final { scan-assembler-times "vbpermq" 2 } } */
+/* { dg-final { scan-assembler-times "vbpermd" 0 } } */
 /* { dg-final { scan-assembler-times "xxleqv" 4 } } */
 /* { dg-final { scan-assembler-times "vgbbd" 1 } } */
 /* { dg-final { scan-assembler-times "xxlnand" 4 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-1.c b/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
index e2db0ff4b5f..f37f1f169a2 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
@@ -10,6 +10,6 @@ main()
   int mask;
 
   /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
-  res = __builtin_crypto_vshasigmaw (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
+  res = __builtin_crypto_vshasigmaw (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
   return 0;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
index 144b705c012..0819a0511b7 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
@@ -10,6 +10,6 @@ main ()
   int mask;
 
   /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
-  res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
+  res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
   return 0;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-3.c b/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
index 99a3e24eadd..cc2e46cf5cb 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
@@ -12,6 +12,6 @@ main ()
   int mask;
 
   /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
-  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
+  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
   return res;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-4.c b/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
index 7f5f6f75029..ac12910741b 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
@@ -12,6 +12,6 @@ main ()
   int mask;
 
   /* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
-  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be in the range \[0, 15\]} } */
+  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be a 4-bit unsigned literal} } */
   return res;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/pr88100.c b/gcc/testsuite/gcc.target/powerpc/pr88100.c
index 4452145ce95..764c897a497 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr88100.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr88100.c
@@ -10,35 +10,35 @@
 vector unsigned char
 splatu1 (void)
 {
-  return vec_splat_u8(0x100);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_u8(0x100);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
 vector unsigned short
 splatu2 (void)
 {
-  return vec_splat_u16(0x10000);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_u16(0x10000);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
 vector unsigned int
 splatu3 (void)
 {
-  return vec_splat_u32(0x10000000);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_u32(0x10000000);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
 vector signed char
 splats1 (void)
 {
-  return vec_splat_s8(0x100);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_s8(0x100);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
 vector signed short
 splats2 (void)
 {
-  return vec_splat_s16(0x10000);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_s16(0x10000);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
 
 vector signed int
 splats3 (void)
 {
-  return vec_splat_s32(0x10000000);/* { dg-error "argument 1 must be a 5-bit signed literal" } */
+  return vec_splat_s32(0x10000000);/* { dg-error "argument 1 must be a literal between -16 and 15, inclusive" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c b/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
index e03099bd084..61274463653 100644
--- a/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
+++ b/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
@@ -20,7 +20,7 @@ vector bool int
 test2 (vector signed int a, vector signed int b)
 {
   return vec_cmpnez (a, b);
-  /* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mcpu=power9' option" "" { target *-*-* } .-1 } */
+  /* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mpower9-vector' option" "" { target *-*-* } .-1 } */
 }
 
 #pragma GCC target ("cpu=power7")
diff --git a/gcc/testsuite/gcc.target/powerpc/pragma_power8.c b/gcc/testsuite/gcc.target/powerpc/pragma_power8.c
index c8d2cdd6c1a..cb0f30844d3 100644
--- a/gcc/testsuite/gcc.target/powerpc/pragma_power8.c
+++ b/gcc/testsuite/gcc.target/powerpc/pragma_power8.c
@@ -19,6 +19,7 @@ test1 (vector int a, vector int b)
 #pragma GCC target ("cpu=power7")
 /* Force a re-read of altivec.h with new cpu target. */
 #undef _ALTIVEC_H
+#undef _RS6000_VECDEFINES_H
 #include <altivec.h>
 #ifdef _ARCH_PWR7
 vector signed int
@@ -33,6 +34,7 @@ test2 (vector signed int a, vector signed int b)
 #pragma GCC target ("cpu=power8")
 /* Force a re-read of altivec.h with new cpu target. */
 #undef _ALTIVEC_H
+#undef _RS6000_VECDEFINES_H
 #include <altivec.h>
 #ifdef _ARCH_PWR8
 vector int
diff --git a/gcc/testsuite/gcc.target/powerpc/pragma_power9.c b/gcc/testsuite/gcc.target/powerpc/pragma_power9.c
index e33aad1aaf7..e05f1f4ddfa 100644
--- a/gcc/testsuite/gcc.target/powerpc/pragma_power9.c
+++ b/gcc/testsuite/gcc.target/powerpc/pragma_power9.c
@@ -17,6 +17,7 @@ test1 (vector int a, vector int b)
 
 #pragma GCC target ("cpu=power7")
 #undef _ALTIVEC_H
+#undef _RS6000_VECDEFINES_H
 #include <altivec.h>
 #ifdef _ARCH_PWR7
 vector signed int
@@ -30,6 +31,7 @@ test2 (vector signed int a, vector signed int b)
 
 #pragma GCC target ("cpu=power8")
 #undef _ALTIVEC_H
+#undef _RS6000_VECDEFINES_H
 #include <altivec.h>
 #ifdef _ARCH_PWR8
 vector int
@@ -50,6 +52,7 @@ test3b (vec_t a, vec_t b)
 
 #pragma GCC target ("cpu=power9,power9-vector")
 #undef _ALTIVEC_H
+#undef _RS6000_VECDEFINES_H
 #include <altivec.h>
 #ifdef _ARCH_PWR9
 vector bool int
diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
index 028ab0b6d66..4f9d9e08e8a 100644
--- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
+++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
@@ -9,8 +9,8 @@ int main ()
      __builtin_set_fpscr_drn() also support a variable as an argument but
      can't test variable value at compile time.  */
 
-  __builtin_set_fpscr_drn(-1);  /* { dg-error "Argument must be a value between 0 and 7" } */ 
-  __builtin_set_fpscr_drn(8);   /* { dg-error "Argument must be a value between 0 and 7" } */ 
+  __builtin_set_fpscr_drn(-1);  /* { dg-error "argument 1 must be a variable or a literal between 0 and 7, inclusive" } */ 
+  __builtin_set_fpscr_drn(8);   /* { dg-error "argument 1 must be a variable or a literal between 0 and 7, inclusive" } */ 
 
 }
 
diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
index aea65091b0c..10391b71008 100644
--- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
+++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
@@ -8,13 +8,13 @@ int main ()
      int arguments.  The builtins __builtin_set_fpscr_rn() also supports a
      variable as an argument but can't test variable value at compile time.  */
 
-  __builtin_mtfsb0(-1);  /* { dg-error "Argument must be a constant between 0 and 31" } */
-  __builtin_mtfsb0(32);  /* { dg-error "Argument must be a constant between 0 and 31" } */
+  __builtin_mtfsb0(-1);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
+  __builtin_mtfsb0(32);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
 
-  __builtin_mtfsb1(-1);  /* { dg-error "Argument must be a constant between 0 and 31" } */
-  __builtin_mtfsb1(32);  /* { dg-error "Argument must be a constant between 0 and 31" } */ 
+  __builtin_mtfsb1(-1);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */
+  __builtin_mtfsb1(32);  /* { dg-error "argument 1 must be a 5-bit unsigned literal" } */ 
 
-  __builtin_set_fpscr_rn(-1);  /* { dg-error "Argument must be a value between 0 and 3" } */ 
-  __builtin_set_fpscr_rn(4);   /* { dg-error "Argument must be a value between 0 and 3" } */ 
+  __builtin_set_fpscr_rn(-1);  /* { dg-error "argument 1 must be a variable or a literal between 0 and 3, inclusive" } */ 
+  __builtin_set_fpscr_rn(4);   /* { dg-error "argument 1 must be a variable or a literal between 0 and 3, inclusive" } */ 
 }
 
diff --git a/gcc/testsuite/gcc.target/powerpc/test_mffsl.c b/gcc/testsuite/gcc.target/powerpc/test_mffsl.c
index 41377efba1a..28c2b91988e 100644
--- a/gcc/testsuite/gcc.target/powerpc/test_mffsl.c
+++ b/gcc/testsuite/gcc.target/powerpc/test_mffsl.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target { powerpc*-*-* } } } */
-/* { dg-options "-O2 -std=c99" } */
+/* { dg-options "-O2 -std=c99 -mcpu=power9" } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
 
 #ifdef DEBUG
 #include <stdio.h>
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c b/gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c
index 895bb953b37..4e59cbffa17 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c
@@ -20,7 +20,7 @@ do_vec_gnb (vector unsigned __int128 source, int stride)
     case 5:
       return vec_gnb (source, 1);	/* { dg-error "between 2 and 7" } */
     case 6:
-      return vec_gnb (source, stride);	/* { dg-error "unsigned literal" } */
+      return vec_gnb (source, stride);	/* { dg-error "literal" } */
     case 7:
       return vec_gnb (source, 7);
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-all-nez-7.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-all-nez-7.c
index f53c6dca0a9..d1ef054b488 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-all-nez-7.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-all-nez-7.c
@@ -12,5 +12,5 @@ test_all_not_equal_and_not_zero (vector unsigned short *arg1_p,
   vector unsigned short arg_2 = *arg2_p;
 
   return __builtin_vec_vcmpnez_p (__CR6_LT, arg_1, arg_2);
-  /* { dg-error "'__builtin_altivec_vcmpnezh_p' requires the '-mcpu=power9' option" "" { target *-*-* } .-1 } */
+  /* { dg-error "'__builtin_altivec_vcmpnezh_p' requires the '-mpower9-vector' option" "" { target *-*-* } .-1 } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-any-eqz-7.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-any-eqz-7.c
index 757acd93110..b5cdea5fb3e 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-any-eqz-7.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-any-eqz-7.c
@@ -11,5 +11,5 @@ test_any_equal (vector unsigned int *arg1_p, vector unsigned int *arg2_p)
   vector unsigned int arg_2 = *arg2_p;
 
   return __builtin_vec_vcmpnez_p (__CR6_LT_REV, arg_1, arg_2);
-  /* { dg-error "'__builtin_altivec_vcmpnezw_p' requires the '-mcpu=power9' option" "" { target *-*-* } .-1 } */
+  /* { dg-error "'__builtin_altivec_vcmpnezw_p' requires the '-mpower9-vector' option" "" { target *-*-* } .-1 } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cmpnez-7.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cmpnez-7.c
index 811b32f1c32..320421e6028 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cmpnez-7.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cmpnez-7.c
@@ -10,5 +10,5 @@ fetch_data (vector unsigned int *arg1_p, vector unsigned int *arg2_p)
   vector unsigned int arg_1 = *arg1_p;
   vector unsigned int arg_2 = *arg2_p;
 
-  return __builtin_vec_vcmpnez (arg_1, arg_2);	/* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mcpu=power9' option" } */
+  return __builtin_vec_vcmpnez (arg_1, arg_2);	/* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mpower9-vector' option" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c
index 6ee066d1eff..251285536c2 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c
@@ -9,5 +9,5 @@ count_leading_zero_byte_bits (vector unsigned char *arg1_p)
 {
   vector unsigned char arg_1 = *arg1_p;
 
-  return __builtin_vec_vclzlsbb (arg_1);	/* { dg-error "'__builtin_altivec_vclzlsbb_v16qi' requires the '-mcpu=power9' option" } */
+  return __builtin_vec_vclzlsbb (arg_1);	/* { dg-error "'__builtin_altivec_vclzlsbb_v16qi' requires the '-mpower9-vector' option" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c
index ecd0add70d0..83ca92daced 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c
@@ -9,5 +9,5 @@ count_trailing_zero_byte_bits (vector unsigned char *arg1_p)
 {
   vector unsigned char arg_1 = *arg1_p;
 
-  return __builtin_vec_vctzlsbb (arg_1);	/* { dg-error "'__builtin_altivec_vctzlsbb_v16qi' requires the '-mcpu=power9' option" } */
+  return __builtin_vec_vctzlsbb (arg_1);	/* { dg-error "'__builtin_altivec_vctzlsbb_v16qi' requires the '-mpower9-vector' option" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-xl-len-13.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-xl-len-13.c
index 1cfed57d6a6..0f601fbbb50 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-xl-len-13.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-xl-len-13.c
@@ -13,5 +13,5 @@
 int
 fetch_data (float *address, size_t length)
 {
-  return __builtin_vec_lxvl (address, length);	/* { dg-warning "'__builtin_vec_lxvl'" } */
+  return __builtin_vec_lxvl (address, length);	/* { dg-error "'__builtin_vsx_lxvl' requires the" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-xst-len-12.c b/gcc/testsuite/gcc.target/powerpc/vsu/vec-xst-len-12.c
index 3a51132a5a2..f30d49cb4cc 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-xst-len-12.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-xst-len-12.c
@@ -13,5 +13,5 @@ store_data (vector double *datap, double *address, size_t length)
 {
   vector double data = *datap;
 
-  __builtin_vec_stxvl (data, address, length); /* { dg-error "'__builtin_vec_stxvl' is not supported in this compiler configuration" } */
+  __builtin_vec_stxvl (data, address, length); /* { dg-error "'__builtin_altivec_stxvl' requires the" } */
 }
-- 
2.27.0



* [PATCH 34/34] rs6000: Enable the new builtin support
  2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
                   ` (32 preceding siblings ...)
  2021-07-29 13:31 ` [PATCH 33/34] rs6000: Test case adjustments Bill Schmidt
@ 2021-07-29 13:31 ` Bill Schmidt
  33 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-07-29 13:31 UTC (permalink / raw)
  To: gcc-patches; +Cc: segher, dje.gcc, willschm

2021-03-05  Bill Schmidt  <wschmidt@linux.ibm.com>

gcc/
	* config/rs6000/rs6000-gen-builtins.c (write_init_file):
	Initialize new_builtins_are_live to 1.
---
 gcc/config/rs6000/rs6000-gen-builtins.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 5fc56eff6d1..b0f90de8cae 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -2764,7 +2764,7 @@ write_init_file (void)
   fprintf (init_file, "#include \"rs6000-builtins.h\"\n");
   fprintf (init_file, "\n");
 
-  fprintf (init_file, "int new_builtins_are_live = 0;\n\n");
+  fprintf (init_file, "int new_builtins_are_live = 1;\n\n");
 
   fprintf (init_file, "tree rs6000_builtin_decls_x[RS6000_OVLD_MAX];\n\n");
 
-- 
2.27.0



* Re: [PATCH 01/34] rs6000: Incorporate new builtins code into the build machinery
  2021-07-29 13:30 ` [PATCH 01/34] rs6000: Incorporate new builtins code into the build machinery Bill Schmidt
@ 2021-08-04 22:29   ` Segher Boessenkool
  2021-08-05 13:47     ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-04 22:29 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

Hi!

On Thu, Jul 29, 2021 at 08:30:48AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-gen-builtins.c (main): Close init_file
> 	last.

That easily fits on one line?

> +rs6000-gen-builtins: rs6000-gen-builtins.o rbtree.o
> +	$(LINKER_FOR_BUILD) $(BUILD_LINKERFLAGS) $(BUILD_LDFLAGS) -o $@ \
> +	    $(filter-out $(BUILD_LIBDEPS), $^) $(BUILD_LIBS)

I wonder what the difference is between BUILD_LINKERFLAGS and
BUILD_LDFLAGS?  Do you have any idea?

Okay for trunk.  Thanks!


Segher


* Re: [PATCH 02/34] rs6000: Add gengtype handling to the build machinery
  2021-07-29 13:30 ` [PATCH 02/34] rs6000: Add gengtype handling to " Bill Schmidt
@ 2021-08-04 22:52   ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-04 22:52 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:30:49AM -0500, Bill Schmidt wrote:
> 	* config.gcc (target_gtfiles): Add ./rs6000-builtins.h.
> 	* config/rs6000/t-rs6000 (EXTRA_GTYPE_DEPS): Set.

> --- a/gcc/config/rs6000/t-rs6000
> +++ b/gcc/config/rs6000/t-rs6000
> @@ -22,6 +22,7 @@ TM_H += $(srcdir)/config/rs6000/rs6000-builtin.def
>  TM_H += $(srcdir)/config/rs6000/rs6000-cpus.def
>  TM_H += $(srcdir)/config/rs6000/rs6000-modes.h
>  PASSES_EXTRA += $(srcdir)/config/rs6000/rs6000-passes.def
> +EXTRA_GTYPE_DEPS += $(srcdir)/config/rs6000/rs6000-builtin-new.def
>  
>  rs6000-pcrel-opt.o: $(srcdir)/config/rs6000/rs6000-pcrel-opt.c
>  	$(COMPILE) $<

Surprisingly I couldn't find docs or examples for EXTRA_GTYPE_DEPS.
But it looks like it will work.  Okay for trunk, thanks!


Segher


* Re: [PATCH 01/34] rs6000: Incorporate new builtins code into the build machinery
  2021-08-04 22:29   ` Segher Boessenkool
@ 2021-08-05 13:47     ` Bill Schmidt
  2021-08-05 16:04       ` Segher Boessenkool
  0 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-08-05 13:47 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

Hi Segher,

On 8/4/21 5:29 PM, Segher Boessenkool wrote:
> On Thu, Jul 29, 2021 at 08:30:48AM -0500, Bill Schmidt wrote:
> +rs6000-gen-builtins: rs6000-gen-builtins.o rbtree.o
>> +	$(LINKER_FOR_BUILD) $(BUILD_LINKERFLAGS) $(BUILD_LDFLAGS) -o $@ \
>> +	    $(filter-out $(BUILD_LIBDEPS), $^) $(BUILD_LIBS)
> I wonder what the difference is between BUILD_LINKERFLAGS and
> BUILD_LDFLAGS?  Do you have any idea?
>
I couldn't find evidence that BUILD_LINKERFLAGS ever has anything that 
BUILD_LDFLAGS doesn't, but I put that down to my ignorance of the 
cobwebbed corners of the build system.  There is probably some configure 
magic that can set it, and I suspect it has something to do with cross 
builds; but it might also just be a leftover artifact.  I decided I 
should use the same build rule as the other gen- programs to make sure 
cross builds work as expected. Certainly open to better ideas if you 
have them!

Thanks,
Bill



* Re: [PATCH 01/34] rs6000: Incorporate new builtins code into the build machinery
  2021-08-05 13:47     ` Bill Schmidt
@ 2021-08-05 16:04       ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-05 16:04 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Aug 05, 2021 at 08:47:54AM -0500, Bill Schmidt wrote:
> Hi Segher,
> 
> On 8/4/21 5:29 PM, Segher Boessenkool wrote:
> >On Thu, Jul 29, 2021 at 08:30:48AM -0500, Bill Schmidt wrote:
> >+rs6000-gen-builtins: rs6000-gen-builtins.o rbtree.o
> >>+	$(LINKER_FOR_BUILD) $(BUILD_LINKERFLAGS) $(BUILD_LDFLAGS) -o $@ \
> >>+	    $(filter-out $(BUILD_LIBDEPS), $^) $(BUILD_LIBS)
> >I wonder what the difference is between BUILD_LINKERFLAGS and
> >BUILD_LDFLAGS?  Do you have any idea?
> >
> I couldn't find evidence that BUILD_LINKERFLAGS ever has anything that 
> BUILD_LDFLAGS doesn't, but I put that down to my ignorance of the 
> cobwebbed corners of the build system.  There is probably some configure 
> magic that can set it, and I suspect it has something to do with cross 
> builds; but it might also just be a leftover artifact.  I decided I 
> should use the same build rule as the other gen- programs to make sure 
> cross builds work as expected. Certainly open to better ideas if you 
> have them!

Oh no, the patch is fine as is, I approved it...  I'm just terminally
nosy :-)  It isn't clear what (if any) difference there is between the
two vars.  I do know you are just copying existing practice here.


Segher


* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-07-29 13:30 ` [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file Bill Schmidt
@ 2021-08-07  0:01   ` Segher Boessenkool
  2021-08-08 16:53     ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-07  0:01 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

Hi!

On Thu, Jul 29, 2021 at 08:30:50AM -0500, Bill Schmidt wrote:
> +  const vsc __builtin_altivec_abss_v16qi (vsc);
> +    ABSS_V16QI altivec_abss_v16qi {}
> +
> +  const vsi __builtin_altivec_abss_v4si (vsi);
> +    ABSS_V4SI altivec_abss_v4si {}
> +
> +  const vss __builtin_altivec_abss_v8hi (vss);
> +    ABSS_V8HI altivec_abss_v8hi {}

Is there any ordering used here?  What is it, then?  Just alphabetical?

That order does not really allow breaking things up into groups, which
is the main tool to keep things manageable.

> +  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char);

That is a very long line, can you do something about that, or is that
forced by the file format?  Can you use just "char"?  "signed char" is a
very strange choice.

> +  pcvoid_type_node
> +    = build_pointer_type (build_qualified_type (void_type_node,
> +						TYPE_QUAL_CONST));

A const void?  Interesting.  You are building a pointer to a const void
here, not a const pointer to void.  Is that what you wanted?

(And yes I do realise this is just moved, not new code).

> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -2460,6 +2460,7 @@ enum rs6000_builtin_type_index
>    RS6000_BTI_const_str,		 /* pointer to const char * */
>    RS6000_BTI_vector_pair,	 /* unsigned 256-bit types (vector pair).  */
>    RS6000_BTI_vector_quad,	 /* unsigned 512-bit types (vector quad).  */
> +  RS6000_BTI_const_ptr_void,     /* const pointer to void */
>    RS6000_BTI_MAX
>  };

That is not what
  build_pointer_type (build_qualified_type (void_type_node, TYPE_QUAL_CONST));
builds though?

Okay for trunk, but please look at those things, especially the pcvoid
one!  Thanks,


Segher


* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-07  0:01   ` Segher Boessenkool
@ 2021-08-08 16:53     ` Bill Schmidt
  2021-08-08 20:27       ` Segher Boessenkool
  0 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-08-08 16:53 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

Hi Segher,

On 8/6/21 7:01 PM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Jul 29, 2021 at 08:30:50AM -0500, Bill Schmidt wrote:
>> +  const vsc __builtin_altivec_abss_v16qi (vsc);
>> +    ABSS_V16QI altivec_abss_v16qi {}
>> +
>> +  const vsi __builtin_altivec_abss_v4si (vsi);
>> +    ABSS_V4SI altivec_abss_v4si {}
>> +
>> +  const vss __builtin_altivec_abss_v8hi (vss);
>> +    ABSS_V8HI altivec_abss_v8hi {}
> Is there any ordering used here?  What is it, then?  Just alphabetical?
>
> That order does not really allow breaking things up into groups, which
> is the main tool to keep things manageable.


Yes, within each stanza, the ordering is alphabetical by built-in name.  
It seems to me that any other ordering is arbitrary and prone to 
requiring exceptions, so in the end you just end up with a mess where 
nobody knows where to put the next builtin added. That's certainly what 
happened with the old support.

>
>> +  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char, signed char);
> That is a very long line, can you do something about that, or is that
> forced by the file format?  Can you use just "char"?  "signed char" is a
> very strange choice.


Right now, long lines are there because the parser doesn't support 
breaking up the line.  I have an additional patch I put together 
recently that allows the use of escape-newline to break up these lines.  
I am planning to submit that once we get through the current patch set.

>
>> +  pcvoid_type_node
>> +    = build_pointer_type (build_qualified_type (void_type_node,
>> +						TYPE_QUAL_CONST));
> A const void?  Interesting.  You are building a pointer to a const void
> here, not a const pointer to void.  Is that what you wanted?
>
> (And yes I do realise this is just moved, not new code).


Sorry, I misdocumented this below.  I'll review and make sure this is 
correct everywhere.

Thanks for the review!
Bill

>
>> --- a/gcc/config/rs6000/rs6000.h
>> +++ b/gcc/config/rs6000/rs6000.h
>> @@ -2460,6 +2460,7 @@ enum rs6000_builtin_type_index
>>     RS6000_BTI_const_str,		 /* pointer to const char * */
>>     RS6000_BTI_vector_pair,	 /* unsigned 256-bit types (vector pair).  */
>>     RS6000_BTI_vector_quad,	 /* unsigned 512-bit types (vector quad).  */
>> +  RS6000_BTI_const_ptr_void,     /* const pointer to void */
>>     RS6000_BTI_MAX
>>   };
> That is not what
>    build_pointer_type (build_qualified_type (void_type_node, TYPE_QUAL_CONST));
> builds though?
>
> Okay for trunk, but please look at those things, especially the pcvoid
> one!  Thanks,
>
>
> Segher


* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-08 16:53     ` Bill Schmidt
@ 2021-08-08 20:27       ` Segher Boessenkool
  2021-08-08 20:53         ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-08 20:27 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

Hi!

On Sun, Aug 08, 2021 at 11:53:38AM -0500, Bill Schmidt wrote:
> On 8/6/21 7:01 PM, Segher Boessenkool wrote:
> >On Thu, Jul 29, 2021 at 08:30:50AM -0500, Bill Schmidt wrote:
> >>+  const vsc __builtin_altivec_abss_v16qi (vsc);
> >>+    ABSS_V16QI altivec_abss_v16qi {}
> >>+
> >>+  const vsi __builtin_altivec_abss_v4si (vsi);
> >>+    ABSS_V4SI altivec_abss_v4si {}
> >>+
> >>+  const vss __builtin_altivec_abss_v8hi (vss);
> >>+    ABSS_V8HI altivec_abss_v8hi {}
> >Is there any ordering used here?  What is it, then?  Just alphabetical?
> >
> >That order does not really allow breaking things up into groups, which
> >is the main tool to keep things manageable.
> 
> Yes, within each stanza, the ordering is alphabetical by built-in name.  
> It seems to me that any other ordering is arbitrary and prone to 
> requiring exceptions, so in the end you just end up with a mess where 
> nobody knows where to put the next builtin added. That's certainly what 
> happened with the old support.

Yeah, there is no great answer here :-(  You have thought about it in
any case, so let's see where this goes.

> >>+  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed 
> >>char, signed char, signed char, signed char, signed char, signed char, 
> >>signed char, signed char, signed char, signed char, signed char, signed 
> >>char, signed char, signed char);
> >That is a very long line, can you do something about that, or is that
> >forced by the file format?  Can you use just "char"?  "signed char" is a
> >very strange choice.
> 
> Right now, long lines are there because the parser doesn't support 
> breaking up the line.  I have an additional patch I put together 
> recently that allows the use of escape-newline to break up these lines.  
> I am planning to submit that once we get through the current patch set.

Okido.  What about the signed char though?

> >>+  pcvoid_type_node
> >>+    = build_pointer_type (build_qualified_type (void_type_node,
> >>+						TYPE_QUAL_CONST));
> >A const void?  Interesting.  You are building a pointer to a const void
> >here, not a const pointer to void.  Is that what you wanted?
> >
> >(And yes I do realise this is just moved, not new code).
> 
> 
> Sorry, I misdocumented this below.  I'll review and make sure this is 
> correct everywhere.

"const void" is meaningless, and maybe even invalid C.  I think the code
is wrong, not (just) the documentation!  This wants to be
  void *const
but it is
  const void *
as far as I can see?

As I said, this isn't new code, but it seems very wrong!
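
A minimal sketch of the difference, in plain C:

void
sketch (void)
{
  int x = 0;
  const void *p = &x;  /* Pointer to const void: the pointee is
			  read-only through p.  */
  void *const q = &x;  /* Const pointer to void: q itself cannot be
			  reseated.  */
  p = &q;              /* OK, p itself is assignable.  */
  /* q = &x;  error: assignment of read-only variable 'q'.  */
}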


Segher


* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-08 20:27       ` Segher Boessenkool
@ 2021-08-08 20:53         ` Bill Schmidt
  2021-08-09 18:05           ` Segher Boessenkool
  2021-08-09 19:18           ` Bill Schmidt
  0 siblings, 2 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-08-08 20:53 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

Hi...

On 8/8/21 3:27 PM, Segher Boessenkool wrote:
> Hi!
>
> On Sun, Aug 08, 2021 at 11:53:38AM -0500, Bill Schmidt wrote:
>> On 8/6/21 7:01 PM, Segher Boessenkool wrote:
>>> On Thu, Jul 29, 2021 at 08:30:50AM -0500, Bill Schmidt wrote:
>>>> +  const vsc __builtin_altivec_abss_v16qi (vsc);
>>>> +    ABSS_V16QI altivec_abss_v16qi {}
>>>> +
>>>> +  const vsi __builtin_altivec_abss_v4si (vsi);
>>>> +    ABSS_V4SI altivec_abss_v4si {}
>>>> +
>>>> +  const vss __builtin_altivec_abss_v8hi (vss);
>>>> +    ABSS_V8HI altivec_abss_v8hi {}
>>> Is there any ordering used here?  What is it, then?  Just alphabetical?
>>>
>>> That order does not really allow breaking things up into groups, which
>>> is the main tool to keep things manageable.
>> Yes, within each stanza, the ordering is alphabetical by built-in name.
>> It seems to me that any other ordering is arbitrary and prone to
>> requiring exceptions, so in the end you just end up with a mess where
>> nobody knows where to put the next builtin added. That's certainly what
>> happened with the old support.
> Yeah, there is no great answer here :-(  You have thought about it in
> any case, so let's see where this goes.
>
>>>> +  const vsc __builtin_vec_init_v16qi (signed char, signed char, signed
>>>> char, signed char, signed char, signed char, signed char, signed char,
>>>> signed char, signed char, signed char, signed char, signed char, signed
>>>> char, signed char, signed char);
>>> That is a very long line, can you do something about that, or is that
>>> forced by the file format?  Can you use just "char"?  "signed char" is a
>>> very strange choice.
>> Right now, long lines are there because the parser doesn't support
>> breaking up the line.  I have an additional patch I put together
>> recently that allows the use of escape-newline to break up these lines.
>> I am planning to submit that once we get through the current patch set.
> Okido.  What about the signed char though?


Sorry, forgot to address that.  There are two reasons to keep it as is:  
(a) It matches what we have in the old support, and (b) it makes 
explicit that we really mean signed.  We're trying to replace the old 
support without changing type signatures (except in places where there 
is a bug and we need to).

>
>>>> +  pcvoid_type_node
>>>> +    = build_pointer_type (build_qualified_type (void_type_node,
>>>> +						TYPE_QUAL_CONST));
>>> A const void?  Interesting.  You are building a pointer to a const void
>>> here, not a const pointer to void.  Is that what you wanted?
>>>
>>> (And yes I do realise this is just moved, not new code).
>>
>> Sorry, I misdocumented this below.  I'll review and make sure this is
>> correct everywhere.
> "const void" is meaningless, and maybe even invalid C.  I think the code
> is wrong, not (just) the documentation!  This wants to be
>    void *const
> but it is
>    const void *
> as far as I can see?
>
> As I said, this isn't new code, but it seems very wrong!


Yes, I'll look at it...offhand I do not know the answer and need to 
review it.

Thanks,
Bill

>
>
> Segher


* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-08 20:53         ` Bill Schmidt
@ 2021-08-09 18:05           ` Segher Boessenkool
  2021-08-09 19:18           ` Bill Schmidt
  1 sibling, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-09 18:05 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Sun, Aug 08, 2021 at 03:53:01PM -0500, Bill Schmidt wrote:
> On 8/8/21 3:27 PM, Segher Boessenkool wrote:
> >Okido.  What about the signed char though?
> 
> Sorry, forgot to address that.  There are two reasons to keep it as is:  
> (a) It matches what we have in the old support, and (b) it makes 
> explicit that we really mean signed.  We're trying to replace the old 
> support without changing type signatures (except in places where there 
> is a bug and we need to).

Sure, I understand that.  But signed char leads to implementation-defined
behaviour in parameter passing (it feels natural to pass 128 or 255 here,
but neither is standard!)  GCC just does mod 256 here, always, but does
every compiler that implements these intrinsics?
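
For instance (g here is only a hypothetical stand-in for one of the
signed char parameters of a builtin like __builtin_vec_init_v16qi):

extern void g (signed char);

void
f (void)
{
  g (255);  /* 255 does not fit in signed char; the conversion is
	       implementation-defined (C11 6.3.1.3).  GCC wraps modulo
	       256 and passes -1.  */
}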

I'm sure I worry too much, but :-)


Segher


* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-08 20:53         ` Bill Schmidt
  2021-08-09 18:05           ` Segher Boessenkool
@ 2021-08-09 19:18           ` Bill Schmidt
  2021-08-09 23:44             ` Segher Boessenkool
  1 sibling, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-08-09 19:18 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

Hi Segher,
>>>>> +  pcvoid_type_node
>>>>> +    = build_pointer_type (build_qualified_type (void_type_node,
>>>>> +						TYPE_QUAL_CONST));
>>>> A const void?  Interesting.  You are building a pointer to a const void
>>>> here, not a const pointer to void.  Is that what you wanted?
>>>>
>>>> (And yes I do realise this is just moved, not new code).
>>> Sorry, I misdocumented this below.  I'll review and make sure this is
>>> correct everywhere.
>> "const void" is meaningless, and maybe even invalid C.  I think the code
>> is wrong, not (just) the documentation!  This wants to be
>>     void *const
>> but it is
>>     const void *
>> as far as I can see?
>>
>> As I said, this isn't new code, but it seems very wrong!
>
I had to go back and remember where this fits in.  tl;dr:  This is fine. 
:-)  More details...

"const void *" is used as part of the overloading machinery.  It serves 
to reduce the number of built-in functions that we need to register with 
the front end.  Consider the built-in function that accesses the "lvebx" 
instruction.  In rs6000-builtin-new.def, we define it this way:

   pure vsc __builtin_altivec_lvebx (signed long, const void *);
     LVEBX altivec_lvebx {ldvec}

Note that this is a "pure" function (no side effects), and we 
contractually guarantee (through "const <type> *") that we will not 
modify the data pointed to by the second argument. Normally you might 
expect this to be "const char *" or similar. The purpose of the void 
pointer is to allow multiple overloaded functions with different type 
signatures to map to this built-in function, as follows.

In rs6000-overload.def, you'll see this as part of the overloading for 
"vec_lde":

[VEC_LDE, vec_lde, __builtin_vec_lde]
   vsc __builtin_vec_lde (signed long, const signed char *);
     LVEBX  LVEBX_SC
   vuc __builtin_vec_lde (signed long, const unsigned char *);
     LVEBX  LVEBX_UC

The two references to LVEBX here indicate that those two overloads of 
__builtin_vec_lde will map to __builtin_altivec_lvebx, above.  These two 
functions differ in their argument types and their return types.

The overload machinery will replace a call to one of the 
__builtin_vec_lde functions as follows:

  - Arguments to __builtin_vec_lde are cast to the types expected by 
__builtin_altivec_lvebx
  - __builtin_altivec_lvebx is called
  - The return value from __builtin_altivec_lvebx is cast to the type 
expected by the __builtin_vec_lde function.

For vector types, the altivec type semantics allow us to use 
reinterpret-cast semantics to interpret any vector type as another 
vector type.  That handles the return type coercion in this case.

However, we don't have that freedom with pointer types.  This is why the 
built-in function is defined with a void * argument.  Both "const signed 
char *" and "const unsigned char *" can be legitimately cast to a "const 
void *".

This isn't strictly necessary, but without such a trick, we would have 
to have two different __builtin_altivec_lvebx functions (with different 
names) to handle the different pointer types.  Defining multiple 
functions for each such situation is wasteful when defining functions 
and when looking them up, and a naming scheme would be needed for 
dealing with this.

This is the way the builtin structure has been working since the "dawn 
of time," and I'm not proposing changes to that.  I'm hopeful with the 
new system that it is a little clearer what is going on, though, since 
you can easily see the const void * arguments in the definitions.

Thanks,
Bill



* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-09 19:18           ` Bill Schmidt
@ 2021-08-09 23:44             ` Segher Boessenkool
  2021-08-10 12:17               ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-09 23:44 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Mon, Aug 09, 2021 at 02:18:48PM -0500, Bill Schmidt wrote:
> >>"const void" is meaningless, and maybe even invalid C.  I think the code
> >>is wrong, not (just) the documentation!  This wants to be
> >>    void *const
> >>but it is
> >>    const void *
> >>as far as I can see?
> >>
> >>As I said, this isn't new code, but it seems very wrong!
> >
> I had to go back and remember where this fits in.  tl;dr:  This is fine. 
> :-)  More details...
> 
> "const void *" is used as part of the overloading machinery.  It serves 
> to reduce the number of built-in functions that we need to register with 
> the front end.  Consider the built-in function that accesses the "lvebx" 
> instruction.  In rs6000-builtin-new.def, we define it this way:
> 
>   pure vsc __builtin_altivec_lvebx (signed long, const void *);
>     LVEBX altivec_lvebx {ldvec}
> 
> Note that this is a "pure" function (no side effects), and we 
> contractually guarantee (through "const <type> *") that we will not 
> modify the data pointed to by the second argument.

"void" can never be an lvalue (it is an incomplete type), so "const" on
it is meaningless (it is not invalid C though afaics).

> However, we don't have that freedom with pointer types.  This is why the 
> built-in function is defined with a void * argument.  Both "const signed 
> char *" and "const unsigned char *" can be legitimately cast to a "const 
> void *".

Why?  "void" is not an object type at all.

This is not a documented GCC extension either, and it might even
conflict with the existing void * extension (allowing arithmetic on it,
by defining sizeof(void)).  In either case it is not currently defined.

You can assign a pointer to qualified to a pointer to unqualified (and
the other way around) just fine, fwiw.  You can cast (explicitly or
implicitly) exactly the same things to void * as you can to const void *.

> This isn't strictly necessary, but without such a trick, we would have 
> to have two different __builtin_altivec_lvebx functions (with different 
> names) to handle the different pointer types.  Defining multiple 
> functions for each such situation is wasteful when defining functions 
> and when looking them up, and a naming scheme would be needed for 
> dealing with this.

So apparently the GCC overload semantics do not have much to do with how
C works otherwise?  This sounds not ideal :-/

> This is the way the builtin structure has been working since the "dawn 
> of time," and I'm not proposing changes to that.  I'm hopeful with the 
> new system that it is a little clearer what is going on, though, since 
> you can easily see the const void * arguments in the definitions.

Yeah, me too.  But it all sounds just wrong.


Segher


* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-09 23:44             ` Segher Boessenkool
@ 2021-08-10 12:17               ` Bill Schmidt
  2021-08-10 12:48                 ` Segher Boessenkool
  0 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-08-10 12:17 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm


On 8/9/21 6:44 PM, Segher Boessenkool wrote:
>
>
> This is not a documented GCC extension either, and it might even
> conflict with the existing void * extension (allowing arithmetic on it,
> by defining sizeof(void)).  In either case it is not currently defined.
>
>
I'm not sure how you get to this, but all we're doing here is standard C.

x.c:

char
foo (const void *x)
{
   const char *y = (const char *) x;
   return *y;
}

y.c:

void
foo (const void *x, char c)
{
   const char *y = (const char *) x;
   *y = c;
}

wschmidt@rain6p1:~/src$ gcc -c x.c
wschmidt@rain6p1:~/src$ gcc -c y.c
y.c: In function 'foo':
y.c:5:6: error: assignment of read-only location '*y'
    *y = c;
       ^
wschmidt@rain6p1:~/src$



* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-10 12:17               ` Bill Schmidt
@ 2021-08-10 12:48                 ` Segher Boessenkool
  2021-08-10 13:02                   ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-10 12:48 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Tue, Aug 10, 2021 at 07:17:54AM -0500, Bill Schmidt wrote:
> On 8/9/21 6:44 PM, Segher Boessenkool wrote:
> >This is not a documented GCC extension either, and it might even
> >conflict with the existing void * extension (allowing arithmetic on it,
> >by defining sizeof(void)).  In either case it is not currently defined.
> >
> I'm not sure how you get to this, but all we're doing here is standard C.

Arithmetic on void* is the GCC extension.  sizeof(void) is 1 as a GCC
extension, instead of being undefined.  Pointer arithmetic is only
defined for arrays of the type being pointed to, and you cannot have an
array of void.  You can do this as a GCC extension though; it behaves as
if it were a char* instead.
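
A small illustration of that extension (GNU C, not ISO C):

void *
advance (void *p)
{
  return p + 4;  /* void * arithmetic; GNU C treats it as char *.  */
}

_Static_assert (sizeof (void) == 1, "GNU C extension: sizeof(void) is 1");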

> x.c:
> 
> char
> foo (const void *x)
> {
>   const char *y = (const char *) x;
>   return *y;
> }

And this behaves exactly the same if you do s/const void/void/ .  The
const qualifier is meaningless on things of type void, since you cannot
have an lvalue of that type anyway.  And all type qualifiers can be cast
away (or cast into existence).

> y.c:
> 
> void
> foo (const void *x, char c)
> {
>   const char *y = (const char *) x;
>   *y = c;
> }
> 
> wschmidt@rain6p1:~/src$ gcc -c x.c
> wschmidt@rain6p1:~/src$ gcc -c y.c
> y.c: In function 'foo':
> y.c:5:6: error: assignment of read-only location '*y'
>    *y = c;
>       ^

Yes, *y is an lvalue.  *x is not: *x is an error.


It *is* allowed to have a "const void", but it means exactly the same as
just "void" (you cannot assign to either!)  And, they are compatible
types, too, (they are the *same* type in fact!), so if you ever would
treat them differently it would be mightily confusing :-)


Segher


* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-10 12:48                 ` Segher Boessenkool
@ 2021-08-10 13:02                   ` Bill Schmidt
  2021-08-10 13:40                     ` Segher Boessenkool
  0 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-08-10 13:02 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

On 8/10/21 7:48 AM, Segher Boessenkool wrote:
> On Tue, Aug 10, 2021 at 07:17:54AM -0500, Bill Schmidt wrote:
>> On 8/9/21 6:44 PM, Segher Boessenkool wrote:
>>> This is not a documented GCC extension either, and it might even
>>> conflict with the existing void * extension (allowing arithmetic on it,
>>> by defining sizeof(void)).  In either case it is not currently defined.
>>>
>> I'm not sure how you get to this, but all we're doing here is standard C.
> Arithmetic on void* is the GCC extension.  sizeof(void) is 1 as a GCC
> extension, instead of being undefined.  Pointer arithmetic is only
> defined for arrays of the type being pointed to, and you cannot have an
> array of void.  You can do this as a GCC extension though; it behaves as
> if it were a char* instead.
>
>> x.c:
>>
>> char
>> foo (const void *x)
>> {
>>    const char *y = (const char *) x;
>>    return *y;
>> }
> And this behaves exactly the same if you do s/const void/void/ .  The
> const qualifier is meaningless on things of type void, since you cannot
> have an lvalue of that type anyway.  And all type qualifiers can be cast
> away (or cast into existence).
>
>> y.c:
>>
>> void
>> foo (const void *x, char c)
>> {
>>    const char *y = (const char *) x;
>>    *y = c;
>> }
>>
>> wschmidt@rain6p1:~/src$ gcc -c x.c
>> wschmidt@rain6p1:~/src$ gcc -c y.c
>> y.c: In function 'foo':
>> y.c:5:6: error: assignment of read-only location '*y'
>>     *y = c;
>>        ^
> Yes, *y is an lvalue.  *x is not: *x is an error.
>
>
> It *is* allowed to have a "const void", but it means exactly the same as
> just "void" (you cannot assign to either!)  And, they are compatible
> types, too, (they are the *same* type in fact!), so if you ever would
> treat them differently it would be mightily confusing :-)


The whole point is that this data type is only used for interfaces, as 
shown in the example code.  Nobody wants to define const void as 
anything.  The const serves only as a contract that the pointed-to 
object, no matter what it is cast to, will not be modified.

I think you're over-thinking this. :-)

Bill

>
>
> Segher


* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-10 13:02                   ` Bill Schmidt
@ 2021-08-10 13:40                     ` Segher Boessenkool
  2021-08-10 13:49                       ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-10 13:40 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Tue, Aug 10, 2021 at 08:02:24AM -0500, Bill Schmidt wrote:
> The whole point is that this data type is only used for interfaces, as 
> shown in the example code.  Nobody wants to define const void as 
> anything.  The const serves only as a contract that the pointed-to 
> object, no matter what it is cast to, will not be modified.

So it is just documentation, nothing to do with overloading?  Any cast
(implicit as well!) will give new qualifiers, not just a new type.  So I
still do not see the point here.

I'll just read it as "void *" :-)


Segher


* Re: [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file
  2021-08-10 13:40                     ` Segher Boessenkool
@ 2021-08-10 13:49                       ` Bill Schmidt
  0 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-08-10 13:49 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm


On 8/10/21 8:40 AM, Segher Boessenkool wrote:
> On Tue, Aug 10, 2021 at 08:02:24AM -0500, Bill Schmidt wrote:
>> The whole point is that this data type is only used for interfaces, as
>> shown in the example code.  Nobody wants to define const void as
>> anything.  The const serves only as a contract that the pointed-to
>> object, no matter what it is cast to, will not be modified.
> So it is just documentation, nothing to do with overloading?  Any cast
> (implicit as well!) will give new qualifiers, not just a new type.  So I
> still do not see the point here.
>
> I'll just read it as "void *" :-)


Largely documentation, yes.  The overloads must be defined with "const 
unsigned char *" and so forth.  It would be unexpected to define the 
built-in that this maps to as "void *" rather than "const void *".  
Normally passing a "const unsigned char *" to a function requiring a 
"const void *" can be done implicitly with no cast at all, and so this 
is what people expect to see.  "Under the covers" we can of course cast 
in any way that we see fit, but specifying "const void *" really 
reinforces what people should understand is going on.
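
That is, the usual diagnostic-free pattern (f here standing in for any
such built-in):

extern int f (const void *);

int
g (const unsigned char *p)
{
  return f (p);  /* No cast needed.  */
}

Had f taken a plain "void *" instead, the same call would draw a
discarded-qualifiers warning.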

If it makes you feel better to read it as "void *", I say go for it. 
:-)  I think most people will be less confused with "const" present in 
the signature in both the built-in definition and the overload 
definition, not just in one of them.

Bill

>
>
> Segher


* Re: [PATCH 04/34] rs6000: Add VSX builtins
  2021-07-29 13:30 ` [PATCH 04/34] rs6000: Add VSX builtins Bill Schmidt
@ 2021-08-10 16:14   ` will schmidt
  2021-08-10 17:52   ` Segher Boessenkool
  1 sibling, 0 replies; 84+ messages in thread
From: will schmidt @ 2021-08-10 16:14 UTC (permalink / raw)
  To: Bill Schmidt, gcc-patches; +Cc: segher, dje.gcc, willschm

On Thu, 2021-07-29 at 08:30 -0500, Bill Schmidt wrote:
> 2021-06-07  Bill Schmidt  <wschmidt@linux.ibm.com>
> 


Hi,

> gcc/
> 	* config/rs6000/rs6000-builtin-new.def: Add vsx stanza.
> ---
>  gcc/config/rs6000/rs6000-builtin-new.def | 857 +++++++++++++++++++++++
>  1 file changed, 857 insertions(+)
> 


ok

> diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
> index f1aa5529cdd..974cdc8c37c 100644
> --- a/gcc/config/rs6000/rs6000-builtin-new.def
> +++ b/gcc/config/rs6000/rs6000-builtin-new.def
> @@ -1028,3 +1028,860 @@
>  
>    const vss __builtin_vec_set_v8hi (vss, signed short, const int<3>);
>      VEC_SET_V8HI nothing {set}
> +
> +
> +; VSX builtins.
> +[vsx]
> +  pure vd __builtin_altivec_lvx_v2df (signed long, const void *);
> +    LVX_V2DF altivec_lvx_v2df {ldvec}
> +
> +  pure vsll __builtin_altivec_lvx_v2di (signed long, const void *);
> +    LVX_V2DI altivec_lvx_v2di {ldvec}
> +
> +  pure vd __builtin_altivec_lvxl_v2df (signed long, const void *);
> +    LVXL_V2DF altivec_lvxl_v2df {ldvec}
> +
> +  pure vsll __builtin_altivec_lvxl_v2di (signed long, const void *);
> +    LVXL_V2DI altivec_lvxl_v2di {ldvec}
> +
> +  const vd __builtin_altivec_nabs_v2df (vd);
> +    NABS_V2DF vsx_nabsv2df2 {}
> +
> +  const vsll __builtin_altivec_nabs_v2di (vsll);
> +    NABS_V2DI nabsv2di2 {}
> +
> +  void __builtin_altivec_stvx_v2df (vd, signed long, void *);
> +    STVX_V2DF altivec_stvx_v2df {stvec}
> +
> +  void __builtin_altivec_stvx_v2di (vsll, signed long, void *);
> +    STVX_V2DI altivec_stvx_v2di {stvec}
> +
> +  void __builtin_altivec_stvxl_v2df (vd, signed long, void *);
> +    STVXL_V2DF altivec_stvxl_v2df {stvec}
> +
> +  void __builtin_altivec_stvxl_v2di (vsll, signed long, void *);
> +    STVXL_V2DI altivec_stvxl_v2di {stvec}
> +
> +  const vd __builtin_altivec_vand_v2df (vd, vd);
> +    VAND_V2DF andv2df3 {}
> +
> +  const vsll __builtin_altivec_vand_v2di (vsll, vsll);
> +    VAND_V2DI andv2di3 {}
> +
> +  const vull __builtin_altivec_vand_v2di_uns (vull, vull);
> +    VAND_V2DI_UNS andv2di3 {}
> +
> +  const vd __builtin_altivec_vandc_v2df (vd, vd);
> +    VANDC_V2DF andcv2df3 {}
> +
> +  const vsll __builtin_altivec_vandc_v2di (vsll, vsll);
> +    VANDC_V2DI andcv2di3 {}
> +
> +  const vull __builtin_altivec_vandc_v2di_uns (vull, vull);
> +    VANDC_V2DI_UNS andcv2di3 {}
> +
> +  const vsll __builtin_altivec_vcmpequd (vull, vull);
> +    VCMPEQUD vector_eqv2di {}
> +
> +  const int __builtin_altivec_vcmpequd_p (int, vsll, vsll);
> +    VCMPEQUD_P vector_eq_v2di_p {pred}
> +
> +  const vsll __builtin_altivec_vcmpgtsd (vsll, vsll);
> +    VCMPGTSD vector_gtv2di {}
> +
> +  const int __builtin_altivec_vcmpgtsd_p (int, vsll, vsll);
> +    VCMPGTSD_P vector_gt_v2di_p {pred}
> +
> +  const vsll __builtin_altivec_vcmpgtud (vull, vull);
> +    VCMPGTUD vector_gtuv2di {}
> +
> +  const int __builtin_altivec_vcmpgtud_p (int, vsll, vsll);
> +    VCMPGTUD_P vector_gtu_v2di_p {pred}
> +
> +  const vd __builtin_altivec_vnor_v2df (vd, vd);
> +    VNOR_V2DF norv2df3 {}
> +
> +  const vsll __builtin_altivec_vnor_v2di (vsll, vsll);
> +    VNOR_V2DI norv2di3 {}
> +
> +  const vull __builtin_altivec_vnor_v2di_uns (vull, vull);
> +    VNOR_V2DI_UNS norv2di3 {}
> +
> +  const vd __builtin_altivec_vor_v2df (vd, vd);
> +    VOR_V2DF iorv2df3 {}
> +
> +  const vsll __builtin_altivec_vor_v2di (vsll, vsll);
> +    VOR_V2DI iorv2di3 {}
> +
> +  const vull __builtin_altivec_vor_v2di_uns (vull, vull);
> +    VOR_V2DI_UNS iorv2di3 {}
> +
> +  const vd __builtin_altivec_vperm_2df (vd, vd, vuc);
> +    VPERM_2DF altivec_vperm_v2df {}
> +
> +  const vsll __builtin_altivec_vperm_2di (vsll, vsll, vuc);
> +    VPERM_2DI altivec_vperm_v2di {}
> +
> +  const vull __builtin_altivec_vperm_2di_uns (vull, vull, vuc);
> +    VPERM_2DI_UNS altivec_vperm_v2di_uns {}
> +
> +  const vd __builtin_altivec_vreve_v2df (vd);
> +    VREVE_V2DF altivec_vrevev2df2 {}
> +
> +  const vsll __builtin_altivec_vreve_v2di (vsll);
> +    VREVE_V2DI altivec_vrevev2di2 {}
> +
> +  const vd __builtin_altivec_vsel_2df (vd, vd, vd);
> +    VSEL_2DF vector_select_v2df {}
> +
> +  const vsll __builtin_altivec_vsel_2di (vsll, vsll, vsll);
> +    VSEL_2DI_B vector_select_v2di {}
> +
> +  const vull __builtin_altivec_vsel_2di_uns (vull, vull, vull);
> +    VSEL_2DI_UNS vector_select_v2di_uns {}
> +
> +  const vd __builtin_altivec_vsldoi_2df (vd, vd, const int<4>);
> +    VSLDOI_2DF altivec_vsldoi_v2df {}
> +
> +  const vsll __builtin_altivec_vsldoi_2di (vsll, vsll, const int<4>);
> +    VSLDOI_2DI altivec_vsldoi_v2di {}
> +
> +  const vd __builtin_altivec_vxor_v2df (vd, vd);
> +    VXOR_V2DF xorv2df3 {}
> +
> +  const vsll __builtin_altivec_vxor_v2di (vsll, vsll);
> +    VXOR_V2DI xorv2di3 {}
> +
> +  const vull __builtin_altivec_vxor_v2di_uns (vull, vull);
> +    VXOR_V2DI_UNS xorv2di3 {}
> +
> +  const signed __int128 __builtin_vec_ext_v1ti (vsq, signed int);
> +    VEC_EXT_V1TI nothing {extract}
> +
> +  const double __builtin_vec_ext_v2df (vd, signed int);
> +    VEC_EXT_V2DF nothing {extract}
> +
> +  const signed long long __builtin_vec_ext_v2di (vsll, signed int);
> +    VEC_EXT_V2DI nothing {extract}
> +
> +  const vsq __builtin_vec_init_v1ti (signed __int128);
> +    VEC_INIT_V1TI nothing {init}
> +
> +  const vd __builtin_vec_init_v2df (double, double);
> +    VEC_INIT_V2DF nothing {init}
> +
> +  const vsll __builtin_vec_init_v2di (signed long long, signed long long);
> +    VEC_INIT_V2DI nothing {init}
> +
> +  const vsq __builtin_vec_set_v1ti (vsq, signed __int128, const int<0,0>);
> +    VEC_SET_V1TI nothing {set}
> +
> +  const vd __builtin_vec_set_v2df (vd, double, const int<1>);
> +    VEC_SET_V2DF nothing {set}
> +
> +  const vsll __builtin_vec_set_v2di (vsll, signed long long, const int<1>);
> +    VEC_SET_V2DI nothing {set}
> +
> +  const vsc __builtin_vsx_cmpge_16qi (vsc, vsc);
> +    CMPGE_16QI vector_nltv16qi {}
> +
> +  const vsll __builtin_vsx_cmpge_2di (vsll, vsll);
> +    CMPGE_2DI vector_nltv2di {}
> +
> +  const vsi __builtin_vsx_cmpge_4si (vsi, vsi);
> +    CMPGE_4SI vector_nltv4si {}
> +
> +  const vss __builtin_vsx_cmpge_8hi (vss, vss);
> +    CMPGE_8HI vector_nltv8hi {}
> +
> +  const vsc __builtin_vsx_cmpge_u16qi (vuc, vuc);
> +    CMPGE_U16QI vector_nltuv16qi {}
> +
> +  const vsll __builtin_vsx_cmpge_u2di (vull, vull);
> +    CMPGE_U2DI vector_nltuv2di {}
> +
> +  const vsi __builtin_vsx_cmpge_u4si (vui, vui);
> +    CMPGE_U4SI vector_nltuv4si {}
> +
> +  const vss __builtin_vsx_cmpge_u8hi (vus, vus);
> +    CMPGE_U8HI vector_nltuv8hi {}
> +
> +  const vsc __builtin_vsx_cmple_16qi (vsc, vsc);
> +    CMPLE_16QI vector_ngtv16qi {}
> +
> +  const vsll __builtin_vsx_cmple_2di (vsll, vsll);
> +    CMPLE_2DI vector_ngtv2di {}
> +
> +  const vsi __builtin_vsx_cmple_4si (vsi, vsi);
> +    CMPLE_4SI vector_ngtv4si {}
> +
> +  const vss __builtin_vsx_cmple_8hi (vss, vss);
> +    CMPLE_8HI vector_ngtv8hi {}
> +
> +  const vsc __builtin_vsx_cmple_u16qi (vsc, vsc);
> +    CMPLE_U16QI vector_ngtuv16qi {}
> +
> +  const vsll __builtin_vsx_cmple_u2di (vsll, vsll);
> +    CMPLE_U2DI vector_ngtuv2di {}
> +
> +  const vsi __builtin_vsx_cmple_u4si (vsi, vsi);
> +    CMPLE_U4SI vector_ngtuv4si {}
> +
> +  const vss __builtin_vsx_cmple_u8hi (vss, vss);
> +    CMPLE_U8HI vector_ngtuv8hi {}
> +
> +  const vd __builtin_vsx_concat_2df (double, double);
> +    CONCAT_2DF vsx_concat_v2df {}
> +
> +  const vsll __builtin_vsx_concat_2di (signed long long, signed long long);
> +    CONCAT_2DI vsx_concat_v2di {}
> +
> +  const vd __builtin_vsx_cpsgndp (vd, vd);
> +    CPSGNDP vector_copysignv2df3 {}
> +
> +  const vf __builtin_vsx_cpsgnsp (vf, vf);
> +    CPSGNSP vector_copysignv4sf3 {}
> +
> +  const vsll __builtin_vsx_div_2di (vsll, vsll);
> +    DIV_V2DI vsx_div_v2di {}
> +
> +  const vd __builtin_vsx_doublee_v4sf (vf);
> +    DOUBLEE_V4SF doubleev4sf2 {}
> +
> +  const vd __builtin_vsx_doublee_v4si (vsi);
> +    DOUBLEE_V4SI doubleev4si2 {}
> +
> +  const vd __builtin_vsx_doubleh_v4sf (vf);
> +    DOUBLEH_V4SF doublehv4sf2 {}
> +
> +  const vd __builtin_vsx_doubleh_v4si (vsi);
> +    DOUBLEH_V4SI doublehv4si2 {}
> +
> +  const vd __builtin_vsx_doublel_v4sf (vf);
> +    DOUBLEL_V4SF doublelv4sf2 {}
> +
> +  const vd __builtin_vsx_doublel_v4si (vsi);
> +    DOUBLEL_V4SI doublelv4si2 {}
> +
> +  const vd __builtin_vsx_doubleo_v4sf (vf);
> +    DOUBLEO_V4SF doubleov4sf2 {}
> +
> +  const vd __builtin_vsx_doubleo_v4si (vsi);
> +    DOUBLEO_V4SI doubleov4si2 {}
> +
> +  const vf __builtin_vsx_floate_v2df (vd);
> +    FLOATE_V2DF floatev2df {}
> +
> +  const vf __builtin_vsx_floate_v2di (vsll);
> +    FLOATE_V2DI floatev2di {}
> +
> +  const vf __builtin_vsx_floato_v2df (vd);
> +    FLOATO_V2DF floatov2df {}
> +
> +  const vf __builtin_vsx_floato_v2di (vsll);
> +    FLOATO_V2DI floatov2di {}
> +
> +  pure vsq __builtin_vsx_ld_elemrev_v1ti (signed long, const void *);
> +    LD_ELEMREV_V1TI vsx_ld_elemrev_v1ti {ldvec,endian}
> +
> +  pure vd __builtin_vsx_ld_elemrev_v2df (signed long, const void *);
> +    LD_ELEMREV_V2DF vsx_ld_elemrev_v2df {ldvec,endian}
> +
> +  pure vsll __builtin_vsx_ld_elemrev_v2di (signed long, const void *);
> +    LD_ELEMREV_V2DI vsx_ld_elemrev_v2di {ldvec,endian}
> +
> +  pure vf __builtin_vsx_ld_elemrev_v4sf (signed long, const void *);
> +    LD_ELEMREV_V4SF vsx_ld_elemrev_v4sf {ldvec,endian}
> +
> +  pure vsi __builtin_vsx_ld_elemrev_v4si (signed long, const void *);
> +    LD_ELEMREV_V4SI vsx_ld_elemrev_v4si {ldvec,endian}
> +
> +  pure vss __builtin_vsx_ld_elemrev_v8hi (signed long, const void *);
> +    LD_ELEMREV_V8HI vsx_ld_elemrev_v8hi {ldvec,endian}
> +
> +  pure vsc __builtin_vsx_ld_elemrev_v16qi (signed long, const void *);
> +    LD_ELEMREV_V16QI vsx_ld_elemrev_v16qi {ldvec,endian}

Seems straightforward; I've admittedly not lined up and confirmed all
the arguments versus the builtins.

> +
> +; There is apparent intent in rs6000-builtin.def to have RS6000_BTC_SPECIAL
> +; processing for LXSDX, LXVDSX, and STXSDX, but there are no def_builtin calls
> +; for any of them.  At some point, we may want to add a set of built-ins for
> +; whichever vector types make sense for these.

Add a "TODO:" label ?

> +
> +  pure vsq __builtin_vsx_lxvd2x_v1ti (signed long, const void *);
> +    LXVD2X_V1TI vsx_load_v1ti {ldvec}
> +
> +  pure vd __builtin_vsx_lxvd2x_v2df (signed long, const void *);
> +    LXVD2X_V2DF vsx_load_v2df {ldvec}
> +
> +  pure vsll __builtin_vsx_lxvd2x_v2di (signed long, const void *);
> +    LXVD2X_V2DI vsx_load_v2di {ldvec}
> +
> +  pure vsc __builtin_vsx_lxvw4x_v16qi (signed long, const void *);
> +    LXVW4X_V16QI vsx_load_v16qi {ldvec}
> +
> +  pure vf __builtin_vsx_lxvw4x_v4sf (signed long, const void *);
> +    LXVW4X_V4SF vsx_load_v4sf {ldvec}
> +
> +  pure vsi __builtin_vsx_lxvw4x_v4si (signed long, const void *);
> +    LXVW4X_V4SI vsx_load_v4si {ldvec}
> +
> +  pure vss __builtin_vsx_lxvw4x_v8hi (signed long, const void *);
> +    LXVW4X_V8HI vsx_load_v8hi {ldvec}
> +
> +  const vd __builtin_vsx_mergeh_2df (vd, vd);
> +    VEC_MERGEH_V2DF vsx_mergeh_v2df {}
> +
> +  const vsll __builtin_vsx_mergeh_2di (vsll, vsll);
> +    VEC_MERGEH_V2DI vsx_mergeh_v2di {}
> +
> +  const vd __builtin_vsx_mergel_2df (vd, vd);
> +    VEC_MERGEL_V2DF vsx_mergel_v2df {}
> +
> +  const vsll __builtin_vsx_mergel_2di (vsll, vsll);
> +    VEC_MERGEL_V2DI vsx_mergel_v2di {}
> +
> +  const vsll __builtin_vsx_mul_2di (vsll, vsll);
> +    MUL_V2DI vsx_mul_v2di {}
> +
> +  const vsq __builtin_vsx_set_1ti (vsq, signed __int128, const int<0,0>);
> +    SET_1TI vsx_set_v1ti {set}
> +
> +  const vd __builtin_vsx_set_2df (vd, double, const int<0,1>);
> +    SET_2DF vsx_set_v2df {set}
> +
> +  const vsll __builtin_vsx_set_2di (vsll, signed long long, const int<0,1>);
> +    SET_2DI vsx_set_v2di {set}
> +
> +  const vd __builtin_vsx_splat_2df (double);
> +    SPLAT_2DF vsx_splat_v2df {}
> +
> +  const vsll __builtin_vsx_splat_2di (signed long long);
> +    SPLAT_2DI vsx_splat_v2di {}
> +
> +  void __builtin_vsx_st_elemrev_v1ti (vsq, signed long, void *);
> +    ST_ELEMREV_V1TI vsx_st_elemrev_v1ti {stvec,endian}
> +
> +  void __builtin_vsx_st_elemrev_v2df (vd, signed long, void *);
> +    ST_ELEMREV_V2DF vsx_st_elemrev_v2df {stvec,endian}
> +
> +  void __builtin_vsx_st_elemrev_v2di (vsll, signed long, void *);
> +    ST_ELEMREV_V2DI vsx_st_elemrev_v2di {stvec,endian}
> +
> +  void __builtin_vsx_st_elemrev_v4sf (vf, signed long, void *);
> +    ST_ELEMREV_V4SF vsx_st_elemrev_v4sf {stvec,endian}
> +
> +  void __builtin_vsx_st_elemrev_v4si (vsi, signed long, void *);
> +    ST_ELEMREV_V4SI vsx_st_elemrev_v4si {stvec,endian}
> +
> +  void __builtin_vsx_st_elemrev_v8hi (vss, signed long, void *);
> +    ST_ELEMREV_V8HI vsx_st_elemrev_v8hi {stvec,endian}
> +
> +  void __builtin_vsx_st_elemrev_v16qi (vsc, signed long, void *);
> +    ST_ELEMREV_V16QI vsx_st_elemrev_v16qi {stvec,endian}
> +
> +  void __builtin_vsx_stxvd2x_v1ti (vsq, signed long, void *);
> +    STXVD2X_V1TI vsx_store_v1ti {stvec}
> +
> +  void __builtin_vsx_stxvd2x_v2df (vd, signed long, void *);
> +    STXVD2X_V2DF vsx_store_v2df {stvec}
> +
> +  void __builtin_vsx_stxvd2x_v2di (vsll, signed long, void *);
> +    STXVD2X_V2DI vsx_store_v2di {stvec}
> +
> +  void __builtin_vsx_stxvw4x_v4sf (vf, signed long, void *);
> +    STXVW4X_V4SF vsx_store_v4sf {stvec}
> +
> +  void __builtin_vsx_stxvw4x_v4si (vsi, signed long, void *);
> +    STXVW4X_V4SI vsx_store_v4si {stvec}
> +
> +  void __builtin_vsx_stxvw4x_v8hi (vss, signed long, void *);
> +    STXVW4X_V8HI vsx_store_v8hi {stvec}
> +
> +  void __builtin_vsx_stxvw4x_v16qi (vsc, signed long, void *);
> +    STXVW4X_V16QI vsx_store_v16qi {stvec}
> +
> +  const vull __builtin_vsx_udiv_2di (vull, vull);
> +    UDIV_V2DI vsx_udiv_v2di {}
> +
> +  const vd __builtin_vsx_uns_doublee_v4si (vsi);
> +    UNS_DOUBLEE_V4SI unsdoubleev4si2 {}
> +
> +  const vd __builtin_vsx_uns_doubleh_v4si (vsi);
> +    UNS_DOUBLEH_V4SI unsdoublehv4si2 {}
> +
> +  const vd __builtin_vsx_uns_doublel_v4si (vsi);
> +    UNS_DOUBLEL_V4SI unsdoublelv4si2 {}
> +
> +  const vd __builtin_vsx_uns_doubleo_v4si (vsi);
> +    UNS_DOUBLEO_V4SI unsdoubleov4si2 {}
> +
> +  const vf __builtin_vsx_uns_floate_v2di (vsll);
> +    UNS_FLOATE_V2DI unsfloatev2di {}
> +
> +  const vf __builtin_vsx_uns_floato_v2di (vsll);
> +    UNS_FLOATO_V2DI unsfloatov2di {}
> +
> +; I have no idea why we have __builtin_vsx_* duplicates of these when
> +; the __builtin_altivec_* counterparts are already present.  Keeping
> +; them for compatibility, but...oy.

Oy indeed.  Perhaps add a straightforward statement such as "These are
duplicates of __builtin_altivec_* builtins, and are here for backwards
compatibility."  Perhaps also another TODO: label, to deprecate them
someday if so desired?
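
Rendered in the .def comment style, that suggestion might read:

; These are duplicates of the corresponding __builtin_altivec_*
; builtins, and are here only for backwards compatibility.
; TODO: Consider deprecating these at some point.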

No further comments, 
Thanks
-Will



> +  const vsc __builtin_vsx_vperm_16qi (vsc, vsc, vuc);
> +    VPERM_16QI_X altivec_vperm_v16qi {}
> +
> +  const vuc __builtin_vsx_vperm_16qi_uns (vuc, vuc, vuc);
> +    VPERM_16QI_UNS_X altivec_vperm_v16qi_uns {}
> +
> +  const vsq __builtin_vsx_vperm_1ti (vsq, vsq, vsc);
> +    VPERM_1TI_X altivec_vperm_v1ti {}
> +
> +  const vsq __builtin_vsx_vperm_1ti_uns (vsq, vsq, vsc);
> +    VPERM_1TI_UNS_X altivec_vperm_v1ti_uns {}
> +
> +  const vd __builtin_vsx_vperm_2df (vd, vd, vuc);
> +    VPERM_2DF_X altivec_vperm_v2df {}
> +
> +  const vsll __builtin_vsx_vperm_2di (vsll, vsll, vuc);
> +    VPERM_2DI_X altivec_vperm_v2di {}
> +
> +  const vull __builtin_vsx_vperm_2di_uns (vull, vull, vuc);
> +    VPERM_2DI_UNS_X altivec_vperm_v2di_uns {}
> +
> +  const vf __builtin_vsx_vperm_4sf (vf, vf, vuc);
> +    VPERM_4SF_X altivec_vperm_v4sf {}
> +
> +  const vsi __builtin_vsx_vperm_4si (vsi, vsi, vuc);
> +    VPERM_4SI_X altivec_vperm_v4si {}
> +
> +  const vui __builtin_vsx_vperm_4si_uns (vui, vui, vuc);
> +    VPERM_4SI_UNS_X altivec_vperm_v4si_uns {}
> +
> +  const vss __builtin_vsx_vperm_8hi (vss, vss, vuc);
> +    VPERM_8HI_X altivec_vperm_v8hi {}
> +
> +  const vus __builtin_vsx_vperm_8hi_uns (vus, vus, vuc);
> +    VPERM_8HI_UNS_X altivec_vperm_v8hi_uns {}
> +
> +  const vsll __builtin_vsx_vsigned_v2df (vd);
> +    VEC_VSIGNED_V2DF vsx_xvcvdpsxds {}
> +
> +  const vsi __builtin_vsx_vsigned_v4sf (vf);
> +    VEC_VSIGNED_V4SF vsx_xvcvspsxws {}
> +
> +  const vsi __builtin_vsx_vsignede_v2df (vd);
> +    VEC_VSIGNEDE_V2DF vsignede_v2df {}
> +
> +  const vsi __builtin_vsx_vsignedo_v2df (vd);
> +    VEC_VSIGNEDO_V2DF vsignedo_v2df {}
> +
> +  const vsll __builtin_vsx_vunsigned_v2df (vd);
> +    VEC_VUNSIGNED_V2DF vsx_xvcvdpsxds {}
> +
> +  const vsi __builtin_vsx_vunsigned_v4sf (vf);
> +    VEC_VUNSIGNED_V4SF vsx_xvcvspsxws {}
> +
> +  const vsi __builtin_vsx_vunsignede_v2df (vd);
> +    VEC_VUNSIGNEDE_V2DF vunsignede_v2df {}
> +
> +  const vsi __builtin_vsx_vunsignedo_v2df (vd);
> +    VEC_VUNSIGNEDO_V2DF vunsignedo_v2df {}
> +
> +  const vf __builtin_vsx_xscvdpsp (double);
> +    XSCVDPSP vsx_xscvdpsp {}
> +
> +  const double __builtin_vsx_xscvspdp (vf);
> +    XSCVSPDP vsx_xscvspdp {}
> +
> +  const double __builtin_vsx_xsmaxdp (double, double);
> +    XSMAXDP smaxdf3 {}
> +
> +  const double __builtin_vsx_xsmindp (double, double);
> +    XSMINDP smindf3 {}
> +
> +  const double __builtin_vsx_xsrdpi (double);
> +    XSRDPI vsx_xsrdpi {}
> +
> +  const double __builtin_vsx_xsrdpic (double);
> +    XSRDPIC vsx_xsrdpic {}
> +
> +  const double __builtin_vsx_xsrdpim (double);
> +    XSRDPIM floordf2 {}
> +
> +  const double __builtin_vsx_xsrdpip (double);
> +    XSRDPIP ceildf2 {}
> +
> +  const double __builtin_vsx_xsrdpiz (double);
> +    XSRDPIZ btruncdf2 {}
> +
> +  const signed int __builtin_vsx_xstdivdp_fe (double, double);
> +    XSTDIVDP_FE vsx_tdivdf3_fe {}
> +
> +  const signed int __builtin_vsx_xstdivdp_fg (double, double);
> +    XSTDIVDP_FG vsx_tdivdf3_fg {}
> +
> +  const signed int __builtin_vsx_xstsqrtdp_fe (double);
> +    XSTSQRTDP_FE vsx_tsqrtdf2_fe {}
> +
> +  const signed int __builtin_vsx_xstsqrtdp_fg (double);
> +    XSTSQRTDP_FG vsx_tsqrtdf2_fg {}
> +
> +  const vd __builtin_vsx_xvabsdp (vd);
> +    XVABSDP absv2df2 {}
> +
> +  const vf __builtin_vsx_xvabssp (vf);
> +    XVABSSP absv4sf2 {}
> +
> +  fpmath vd __builtin_vsx_xvadddp (vd, vd);
> +    XVADDDP addv2df3 {}
> +
> +  fpmath vf __builtin_vsx_xvaddsp (vf, vf);
> +    XVADDSP addv4sf3 {}
> +
> +  const vd __builtin_vsx_xvcmpeqdp (vd, vd);
> +    XVCMPEQDP vector_eqv2df {}
> +
> +  const signed int __builtin_vsx_xvcmpeqdp_p (signed int, vd, vd);
> +    XVCMPEQDP_P vector_eq_v2df_p {pred}
> +
> +  const vf __builtin_vsx_xvcmpeqsp (vf, vf);
> +    XVCMPEQSP vector_eqv4sf {}
> +
> +  const signed int __builtin_vsx_xvcmpeqsp_p (signed int, vf, vf);
> +    XVCMPEQSP_P vector_eq_v4sf_p {pred}
> +
> +  const vd __builtin_vsx_xvcmpgedp (vd, vd);
> +    XVCMPGEDP vector_gev2df {}
> +
> +  const signed int __builtin_vsx_xvcmpgedp_p (signed int, vd, vd);
> +    XVCMPGEDP_P vector_ge_v2df_p {pred}
> +
> +  const vf __builtin_vsx_xvcmpgesp (vf, vf);
> +    XVCMPGESP vector_gev4sf {}
> +
> +  const signed int __builtin_vsx_xvcmpgesp_p (signed int, vf, vf);
> +    XVCMPGESP_P vector_ge_v4sf_p {pred}
> +
> +  const vd __builtin_vsx_xvcmpgtdp (vd, vd);
> +    XVCMPGTDP vector_gtv2df {}
> +
> +  const signed int __builtin_vsx_xvcmpgtdp_p (signed int, vd, vd);
> +    XVCMPGTDP_P vector_gt_v2df_p {pred}
> +
> +  const vf __builtin_vsx_xvcmpgtsp (vf, vf);
> +    XVCMPGTSP vector_gtv4sf {}
> +
> +  const signed int __builtin_vsx_xvcmpgtsp_p (signed int, vf, vf);
> +    XVCMPGTSP_P vector_gt_v4sf_p {pred}
> +
> +  const vf __builtin_vsx_xvcvdpsp (vd);
> +    XVCVDPSP vsx_xvcvdpsp {}
> +
> +  const vsll __builtin_vsx_xvcvdpsxds (vd);
> +    XVCVDPSXDS vsx_fix_truncv2dfv2di2 {}
> +
> +  const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
> +    XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
> +
> +  const vsi __builtin_vsx_xvcvdpsxws (vd);
> +    XVCVDPSXWS vsx_xvcvdpsxws {}
> +
> +  const vsll __builtin_vsx_xvcvdpuxds (vd);
> +    XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
> +
> +  const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
> +    XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
> +
> +  const vull __builtin_vsx_xvcvdpuxds_uns (vd);
> +    XVCVDPUXDS_UNS vsx_fixuns_truncv2dfv2di2 {}
> +
> +  const vsi __builtin_vsx_xvcvdpuxws (vd);
> +    XVCVDPUXWS vsx_xvcvdpuxws {}
> +
> +  const vd __builtin_vsx_xvcvspdp (vf);
> +    XVCVSPDP vsx_xvcvspdp {}
> +
> +  const vsll __builtin_vsx_xvcvspsxds (vf);
> +    XVCVSPSXDS vsx_xvcvspsxds {}
> +
> +  const vsi __builtin_vsx_xvcvspsxws (vf);
> +    XVCVSPSXWS vsx_fix_truncv4sfv4si2 {}
> +
> +  const vsll __builtin_vsx_xvcvspuxds (vf);
> +    XVCVSPUXDS vsx_xvcvspuxds {}
> +
> +  const vsi __builtin_vsx_xvcvspuxws (vf);
> +    XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
> +
> +  const vd __builtin_vsx_xvcvsxddp (vsll);
> +    XVCVSXDDP vsx_floatv2div2df2 {}
> +
> +  const vd __builtin_vsx_xvcvsxddp_scale (vsll, const int<5>);
> +    XVCVSXDDP_SCALE vsx_xvcvsxddp_scale {}
> +
> +  const vf __builtin_vsx_xvcvsxdsp (vsll);
> +    XVCVSXDSP vsx_xvcvsxdsp {}
> +
> +  const vd __builtin_vsx_xvcvsxwdp (vsi);
> +    XVCVSXWDP vsx_xvcvsxwdp {}
> +
> +  const vf __builtin_vsx_xvcvsxwsp (vsi);
> +    XVCVSXWSP vsx_floatv4siv4sf2 {}
> +
> +  const vd __builtin_vsx_xvcvuxddp (vsll);
> +    XVCVUXDDP vsx_floatunsv2div2df2 {}
> +
> +  const vd __builtin_vsx_xvcvuxddp_scale (vsll, const int<5>);
> +    XVCVUXDDP_SCALE vsx_xvcvuxddp_scale {}
> +
> +  const vd __builtin_vsx_xvcvuxddp_uns (vull);
> +    XVCVUXDDP_UNS vsx_floatunsv2div2df2 {}
> +
> +  const vf __builtin_vsx_xvcvuxdsp (vull);
> +    XVCVUXDSP vsx_xvcvuxdsp {}
> +
> +  const vd __builtin_vsx_xvcvuxwdp (vsi);
> +    XVCVUXWDP vsx_xvcvuxwdp {}
> +
> +  const vf __builtin_vsx_xvcvuxwsp (vsi);
> +    XVCVUXWSP vsx_floatunsv4siv4sf2 {}
> +
> +  fpmath vd __builtin_vsx_xvdivdp (vd, vd);
> +    XVDIVDP divv2df3 {}
> +
> +  fpmath vf __builtin_vsx_xvdivsp (vf, vf);
> +    XVDIVSP divv4sf3 {}
> +
> +  const vd __builtin_vsx_xvmadddp (vd, vd, vd);
> +    XVMADDDP fmav2df4 {}
> +
> +  const vf __builtin_vsx_xvmaddsp (vf, vf, vf);
> +    XVMADDSP fmav4sf4 {}
> +
> +  const vd __builtin_vsx_xvmaxdp (vd, vd);
> +    XVMAXDP smaxv2df3 {}
> +
> +  const vf __builtin_vsx_xvmaxsp (vf, vf);
> +    XVMAXSP smaxv4sf3 {}
> +
> +  const vd __builtin_vsx_xvmindp (vd, vd);
> +    XVMINDP sminv2df3 {}
> +
> +  const vf __builtin_vsx_xvminsp (vf, vf);
> +    XVMINSP sminv4sf3 {}
> +
> +  const vd __builtin_vsx_xvmsubdp (vd, vd, vd);
> +    XVMSUBDP fmsv2df4 {}
> +
> +  const vf __builtin_vsx_xvmsubsp (vf, vf, vf);
> +    XVMSUBSP fmsv4sf4 {}
> +
> +  fpmath vd __builtin_vsx_xvmuldp (vd, vd);
> +    XVMULDP mulv2df3 {}
> +
> +  fpmath vf __builtin_vsx_xvmulsp (vf, vf);
> +    XVMULSP mulv4sf3 {}
> +
> +  const vd __builtin_vsx_xvnabsdp (vd);
> +    XVNABSDP vsx_nabsv2df2 {}
> +
> +  const vf __builtin_vsx_xvnabssp (vf);
> +    XVNABSSP vsx_nabsv4sf2 {}
> +
> +  const vd __builtin_vsx_xvnegdp (vd);
> +    XVNEGDP negv2df2 {}
> +
> +  const vf __builtin_vsx_xvnegsp (vf);
> +    XVNEGSP negv4sf2 {}
> +
> +  const vd __builtin_vsx_xvnmadddp (vd, vd, vd);
> +    XVNMADDDP nfmav2df4 {}
> +
> +  const vf __builtin_vsx_xvnmaddsp (vf, vf, vf);
> +    XVNMADDSP nfmav4sf4 {}
> +
> +  const vd __builtin_vsx_xvnmsubdp (vd, vd, vd);
> +    XVNMSUBDP nfmsv2df4 {}
> +
> +  const vf __builtin_vsx_xvnmsubsp (vf, vf, vf);
> +    XVNMSUBSP nfmsv4sf4 {}
> +
> +  const vd __builtin_vsx_xvrdpi (vd);
> +    XVRDPI vsx_xvrdpi {}
> +
> +  const vd __builtin_vsx_xvrdpic (vd);
> +    XVRDPIC vsx_xvrdpic {}
> +
> +  const vd __builtin_vsx_xvrdpim (vd);
> +    XVRDPIM vsx_floorv2df2 {}
> +
> +  const vd __builtin_vsx_xvrdpip (vd);
> +    XVRDPIP vsx_ceilv2df2 {}
> +
> +  const vd __builtin_vsx_xvrdpiz (vd);
> +    XVRDPIZ vsx_btruncv2df2 {}
> +
> +  fpmath vd __builtin_vsx_xvrecipdivdp (vd, vd);
> +    RECIP_V2DF recipv2df3 {}
> +
> +  fpmath vf __builtin_vsx_xvrecipdivsp (vf, vf);
> +    RECIP_V4SF recipv4sf3 {}
> +
> +  const vd __builtin_vsx_xvredp (vd);
> +    XVREDP vsx_frev2df2 {}
> +
> +  const vf __builtin_vsx_xvresp (vf);
> +    XVRESP vsx_frev4sf2 {}
> +
> +  const vf __builtin_vsx_xvrspi (vf);
> +    XVRSPI vsx_xvrspi {}
> +
> +  const vf __builtin_vsx_xvrspic (vf);
> +    XVRSPIC vsx_xvrspic {}
> +
> +  const vf __builtin_vsx_xvrspim (vf);
> +    XVRSPIM vsx_floorv4sf2 {}
> +
> +  const vf __builtin_vsx_xvrspip (vf);
> +    XVRSPIP vsx_ceilv4sf2 {}
> +
> +  const vf __builtin_vsx_xvrspiz (vf);
> +    XVRSPIZ vsx_btruncv4sf2 {}
> +
> +  const vd __builtin_vsx_xvrsqrtdp (vd);
> +    RSQRT_2DF rsqrtv2df2 {}
> +
> +  const vf __builtin_vsx_xvrsqrtsp (vf);
> +    RSQRT_4SF rsqrtv4sf2 {}
> +
> +  const vd __builtin_vsx_xvrsqrtedp (vd);
> +    XVRSQRTEDP rsqrtev2df2 {}
> +
> +  const vf __builtin_vsx_xvrsqrtesp (vf);
> +    XVRSQRTESP rsqrtev4sf2 {}
> +
> +  const vd __builtin_vsx_xvsqrtdp (vd);
> +    XVSQRTDP sqrtv2df2 {}
> +
> +  const vf __builtin_vsx_xvsqrtsp (vf);
> +    XVSQRTSP sqrtv4sf2 {}
> +
> +  fpmath vd __builtin_vsx_xvsubdp (vd, vd);
> +    XVSUBDP subv2df3 {}
> +
> +  fpmath vf __builtin_vsx_xvsubsp (vf, vf);
> +    XVSUBSP subv4sf3 {}
> +
> +  const signed int __builtin_vsx_xvtdivdp_fe (vd, vd);
> +    XVTDIVDP_FE vsx_tdivv2df3_fe {}
> +
> +  const signed int __builtin_vsx_xvtdivdp_fg (vd, vd);
> +    XVTDIVDP_FG vsx_tdivv2df3_fg {}
> +
> +  const signed int __builtin_vsx_xvtdivsp_fe (vf, vf);
> +    XVTDIVSP_FE vsx_tdivv4sf3_fe {}
> +
> +  const signed int __builtin_vsx_xvtdivsp_fg (vf, vf);
> +    XVTDIVSP_FG vsx_tdivv4sf3_fg {}
> +
> +  const signed int __builtin_vsx_xvtsqrtdp_fe (vd);
> +    XVTSQRTDP_FE vsx_tsqrtv2df2_fe {}
> +
> +  const signed int __builtin_vsx_xvtsqrtdp_fg (vd);
> +    XVTSQRTDP_FG vsx_tsqrtv2df2_fg {}
> +
> +  const signed int __builtin_vsx_xvtsqrtsp_fe (vf);
> +    XVTSQRTSP_FE vsx_tsqrtv4sf2_fe {}
> +
> +  const signed int __builtin_vsx_xvtsqrtsp_fg (vf);
> +    XVTSQRTSP_FG vsx_tsqrtv4sf2_fg {}
> +
> +  const vf __builtin_vsx_xxmrghw (vf, vf);
> +    XXMRGHW_4SF vsx_xxmrghw_v4sf {}
> +
> +  const vsi __builtin_vsx_xxmrghw_4si (vsi, vsi);
> +    XXMRGHW_4SI vsx_xxmrghw_v4si {}
> +
> +  const vf __builtin_vsx_xxmrglw (vf, vf);
> +    XXMRGLW_4SF vsx_xxmrglw_v4sf {}
> +
> +  const vsi __builtin_vsx_xxmrglw_4si (vsi, vsi);
> +    XXMRGLW_4SI vsx_xxmrglw_v4si {}
> +
> +  const vsc __builtin_vsx_xxpermdi_16qi (vsc, vsc, const int<2>);
> +    XXPERMDI_16QI vsx_xxpermdi_v16qi {}
> +
> +  const vsq __builtin_vsx_xxpermdi_1ti (vsq, vsq, const int<2>);
> +    XXPERMDI_1TI vsx_xxpermdi_v1ti {}
> +
> +  const vd __builtin_vsx_xxpermdi_2df (vd, vd, const int<2>);
> +    XXPERMDI_2DF vsx_xxpermdi_v2df {}
> +
> +  const vsll __builtin_vsx_xxpermdi_2di (vsll, vsll, const int<2>);
> +    XXPERMDI_2DI vsx_xxpermdi_v2di {}
> +
> +  const vf __builtin_vsx_xxpermdi_4sf (vf, vf, const int<2>);
> +    XXPERMDI_4SF vsx_xxpermdi_v4sf {}
> +
> +  const vsi __builtin_vsx_xxpermdi_4si (vsi, vsi, const int<2>);
> +    XXPERMDI_4SI vsx_xxpermdi_v4si {}
> +
> +  const vss __builtin_vsx_xxpermdi_8hi (vss, vss, const int<2>);
> +    XXPERMDI_8HI vsx_xxpermdi_v8hi {}
> +
> +  const vsc __builtin_vsx_xxsel_16qi (vsc, vsc, vsc);
> +    XXSEL_16QI vector_select_v16qi {}
> +
> +  const vuc __builtin_vsx_xxsel_16qi_uns (vuc, vuc, vuc);
> +    XXSEL_16QI_UNS vector_select_v16qi_uns {}
> +
> +  const vsq __builtin_vsx_xxsel_1ti (vsq, vsq, vsq);
> +    XXSEL_1TI vector_select_v1ti {}
> +
> +  const vsq __builtin_vsx_xxsel_1ti_uns (vsq, vsq, vsq);
> +    XXSEL_1TI_UNS vector_select_v1ti_uns {}
> +
> +  const vd __builtin_vsx_xxsel_2df (vd, vd, vd);
> +    XXSEL_2DF vector_select_v2df {}
> +
> +  const vsll __builtin_vsx_xxsel_2di (vsll, vsll, vsll);
> +    XXSEL_2DI vector_select_v2di {}
> +
> +  const vull __builtin_vsx_xxsel_2di_uns (vull, vull, vull);
> +    XXSEL_2DI_UNS vector_select_v2di_uns {}
> +
> +  const vf __builtin_vsx_xxsel_4sf (vf, vf, vf);
> +    XXSEL_4SF vector_select_v4sf {}
> +
> +  const vsi __builtin_vsx_xxsel_4si (vsi, vsi, vsi);
> +    XXSEL_4SI vector_select_v4si {}
> +
> +  const vui __builtin_vsx_xxsel_4si_uns (vui, vui, vui);
> +    XXSEL_4SI_UNS vector_select_v4si_uns {}
> +
> +  const vss __builtin_vsx_xxsel_8hi (vss, vss, vss);
> +    XXSEL_8HI vector_select_v8hi {}
> +
> +  const vus __builtin_vsx_xxsel_8hi_uns (vus, vus, vus);
> +    XXSEL_8HI_UNS vector_select_v8hi_uns {}
> +
> +  const vsc __builtin_vsx_xxsldwi_16qi (vsc, vsc, const int<2>);
> +    XXSLDWI_16QI vsx_xxsldwi_v16qi {}
> +
> +  const vd __builtin_vsx_xxsldwi_2df (vd, vd, const int<2>);
> +    XXSLDWI_2DF vsx_xxsldwi_v2df {}
> +
> +  const vsll __builtin_vsx_xxsldwi_2di (vsll, vsll, const int<2>);
> +    XXSLDWI_2DI vsx_xxsldwi_v2di {}
> +
> +  const vf __builtin_vsx_xxsldwi_4sf (vf, vf, const int<2>);
> +    XXSLDWI_4SF vsx_xxsldwi_v4sf {}
> +
> +  const vsi __builtin_vsx_xxsldwi_4si (vsi, vsi, const int<2>);
> +    XXSLDWI_4SI vsx_xxsldwi_v4si {}
> +
> +  const vss __builtin_vsx_xxsldwi_8hi (vss, vss, const int<2>);
> +    XXSLDWI_8HI vsx_xxsldwi_v8hi {}
> +
> +  const vd __builtin_vsx_xxspltd_2df (vd, const int<1>);
> +    XXSPLTD_V2DF vsx_xxspltd_v2df {}
> +
> +  const vsll __builtin_vsx_xxspltd_2di (vsll, const int<1>);
> +    XXSPLTD_V2DI vsx_xxspltd_v2di {}


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 06/34] rs6000: Add power7 and power7-64 builtins
  2021-07-29 13:30 ` [PATCH 06/34] rs6000: Add power7 and power7-64 builtins Bill Schmidt
@ 2021-08-10 16:16   ` will schmidt
  2021-08-10 17:48     ` Segher Boessenkool
  0 siblings, 1 reply; 84+ messages in thread
From: will schmidt @ 2021-08-10 16:16 UTC (permalink / raw)
  To: Bill Schmidt, gcc-patches; +Cc: segher, dje.gcc, willschm

On Thu, 2021-07-29 at 08:30 -0500, Bill Schmidt wrote:
> 2021-04-02  Bill Schmidt  <wschmidt@linux.ibm.com>
> 

Hi,


> gcc/
> 	* config/rs6000/rs6000-builtin-new.def: Add power7 and power7-64
> 	stanzas.


ok

> ---
>  gcc/config/rs6000/rs6000-builtin-new.def | 39 ++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
> index ca694be1ac3..bffce52ee47 100644
> --- a/gcc/config/rs6000/rs6000-builtin-new.def
> +++ b/gcc/config/rs6000/rs6000-builtin-new.def
> @@ -1957,3 +1957,42 @@
>  
>    const vsll __builtin_vsx_xxspltd_2di (vsll, const int<1>);
>      XXSPLTD_V2DI vsx_xxspltd_v2di {}
> +
> +
> +; Power7 builtins (ISA 2.06).
> +[power7]
> +  const unsigned int __builtin_addg6s (unsigned int, unsigned int);
> +    ADDG6S addg6s {}

Add all of the sixes...   (ok).

> +
> +  const signed long __builtin_bpermd (signed long, signed long);
> +    BPERMD bpermd_di {}
> +
> +  const unsigned int __builtin_cbcdtd (unsigned int);
> +    CBCDTD cbcdtd {}
> +
> +  const unsigned int __builtin_cdtbcd (unsigned int);
> +    CDTBCD cdtbcd {}
> +
> +  const signed int __builtin_divwe (signed int, signed int);
> +    DIVWE dive_si {}
> +
> +  const unsigned int __builtin_divweu (unsigned int, unsigned int);
> +    DIVWEU diveu_si {}
> +
> +  const vsq __builtin_pack_vector_int128 (unsigned long long, unsigned long long);
> +    PACK_V1TI packv1ti {}
> +
> +  void __builtin_ppc_speculation_barrier ();
> +    SPECBARR speculation_barrier {}
> +
> +  const unsigned long __builtin_unpack_vector_int128 (vsq, const int<1>);
> +    UNPACK_V1TI unpackv1ti {}
> +
> +
> +; Power7 builtins requiring 64-bit GPRs (even with 32-bit addressing).
> +[power7-64]
> +  const signed long long __builtin_divde (signed long long, signed long long);
> +    DIVDE dive_di {}
> +
> +  const unsigned long long __builtin_divdeu (unsigned long long, unsigned long long);
> +    DIVDEU diveu_di {}

ok

thanks
-Will




^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins
  2021-07-29 13:30 ` [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins Bill Schmidt
@ 2021-08-10 16:17   ` will schmidt
  2021-08-10 17:34     ` Segher Boessenkool
  2021-08-10 18:38   ` Segher Boessenkool
  1 sibling, 1 reply; 84+ messages in thread
From: will schmidt @ 2021-08-10 16:17 UTC (permalink / raw)
  To: Bill Schmidt, gcc-patches; +Cc: segher, dje.gcc, willschm

On Thu, 2021-07-29 at 08:30 -0500, Bill Schmidt wrote:
> 2021-06-07  Bill Schmidt  <wschmidt@linux.ibm.com>
> 
> gcc/
> 	* config/rs6000/rs6000-builtin-new.def: Add always, power5, and
> 	power6 stanzas.
> ---
>  gcc/config/rs6000/rs6000-builtin-new.def | 72 ++++++++++++++++++++++++
>  1 file changed, 72 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
> index 974cdc8c37c..ca694be1ac3 100644
> --- a/gcc/config/rs6000/rs6000-builtin-new.def
> +++ b/gcc/config/rs6000/rs6000-builtin-new.def
> @@ -184,6 +184,78 @@
>  
>  
>  
> +; Builtins that have been around since time immemorial or are just
> +; considered available everywhere.
> +[always]
> +  void __builtin_cpu_init ();
> +    CPU_INIT nothing {cpu}
> +
> +  bool __builtin_cpu_is (string);
> +    CPU_IS nothing {cpu}
> +
> +  bool __builtin_cpu_supports (string);
> +    CPU_SUPPORTS nothing {cpu}
> +
> +  unsigned long long __builtin_ppc_get_timebase ();
> +    GET_TB rs6000_get_timebase {}
> +
> +  double __builtin_mffs ();
> +    MFFS rs6000_mffs {}
> +
> +; This will break for long double == _Float128.  libgcc history.

Add a few more words to provide bigger hints for future archeological
digs?  (This is perhaps an obvious issue, but I'd need to do some
spelunking.)  I see similar comments below; maybe just a wordier
comment for the first occurrence.  Unsure...

> +  const long double __builtin_pack_longdouble (double, double);
> +    PACK_TF packtf {}
> +
> +  unsigned long __builtin_ppc_mftb ();
> +    MFTB rs6000_mftb_di {32bit}
> +
> +  void __builtin_mtfsb0 (const int<5>);
> +    MTFSB0 rs6000_mtfsb0 {}
> +
> +  void __builtin_mtfsb1 (const int<5>);
> +    MTFSB1 rs6000_mtfsb1 {}
> +
> +  void __builtin_mtfsf (const int<8>, double);
> +    MTFSF rs6000_mtfsf {}
> +
> +  const __ibm128 __builtin_pack_ibm128 (double, double);
> +    PACK_IF packif {}
> +
> +  void __builtin_set_fpscr_rn (const int[0,3]);
> +    SET_FPSCR_RN rs6000_set_fpscr_rn {}
> +
> +  const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
> +    UNPACK_IF unpackif {}
> +
> +; This will break for long double == _Float128.  libgcc history.
> +  const double __builtin_unpack_longdouble (long double, const int<1>);
> +    UNPACK_TF unpacktf {}
> +
> +
> +; Builtins that have been around just about forever, but not quite.
> +[power5]
> +  fpmath double __builtin_recipdiv (double, double);
> +    RECIP recipdf3 {}
> +
> +  fpmath float __builtin_recipdivf (float, float);
> +    RECIPF recipsf3 {}
> +
> +  fpmath double __builtin_rsqrt (double);
> +    RSQRT rsqrtdf2 {}
> +
> +  fpmath float __builtin_rsqrtf (float);
> +    RSQRTF rsqrtsf2 {}
> +
> +
> +; Power6 builtins.

I see in subsequent patches you also call out the ISA version in the
comment, so perhaps:

; Power6 builtins (ISA 2.05).

A similar comment applies to the Power5 reference above.


> +[power6]
> +  const signed long __builtin_p6_cmpb (signed long, signed long);
> +    CMPB cmpbdi3 {}
> +
> +  const signed int __builtin_p6_cmpb_32 (signed int, signed int);
> +    CMPB_32 cmpbsi3 {}
> +
> +

ok.


>  ; AltiVec builtins.
>  [altivec]
>    const vsc __builtin_altivec_abs_v16qi (vsc);


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins
  2021-08-10 16:17   ` will schmidt
@ 2021-08-10 17:34     ` Segher Boessenkool
  2021-08-10 21:29       ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-10 17:34 UTC (permalink / raw)
  To: will schmidt; +Cc: Bill Schmidt, gcc-patches, dje.gcc, willschm

On Tue, Aug 10, 2021 at 11:17:05AM -0500, will schmidt wrote:
> On Thu, 2021-07-29 at 08:30 -0500, Bill Schmidt wrote:
> > +; This will break for long double == _Float128.  libgcc history.
> > +  const long double __builtin_pack_longdouble (double, double);
> > +    PACK_TF packtf {}
> 
> Add a few more words to provide bigger hints for future archeological
> digs?  (This is perhaps an obvious issue, but I'd need to do some
> spelunking)

It is for __ibm128 only, not for other long double formats (we have
three: plain double, double double, IEEE QP).  So maybe the return type
should be changed?  The name of the builtin of course is unfortunate,
but it is too late to change :-)


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 06/34] rs6000: Add power7 and power7-64 builtins
  2021-08-10 16:16   ` will schmidt
@ 2021-08-10 17:48     ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-10 17:48 UTC (permalink / raw)
  To: will schmidt; +Cc: Bill Schmidt, gcc-patches, dje.gcc, willschm

On Tue, Aug 10, 2021 at 11:16:41AM -0500, will schmidt wrote:
> On Thu, 2021-07-29 at 08:30 -0500, Bill Schmidt wrote:
> > +  const unsigned int __builtin_addg6s (unsigned int, unsigned int);
> > +    ADDG6S addg6s {}
> 
> Add all of the sixes...   (ok).

"Add and generate sixes."  Look it up in the ISA if you aren't confused
enough yet :-)


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 04/34] rs6000: Add VSX builtins
  2021-07-29 13:30 ` [PATCH 04/34] rs6000: Add VSX builtins Bill Schmidt
  2021-08-10 16:14   ` will schmidt
@ 2021-08-10 17:52   ` Segher Boessenkool
  1 sibling, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-10 17:52 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:30:51AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-builtin-new.def: Add vsx stanza.

Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins
  2021-07-29 13:30 ` [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins Bill Schmidt
  2021-08-10 16:17   ` will schmidt
@ 2021-08-10 18:38   ` Segher Boessenkool
  2021-08-10 18:56     ` Bill Schmidt
  1 sibling, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-10 18:38 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:30:52AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-builtin-new.def: Add always, power5, and
> 	power6 stanzas.

> +  unsigned long __builtin_ppc_mftb ();
> +    MFTB rs6000_mftb_di {32bit}

I'm not sure what {32bit} means...  The builtin exists on both 32-bit
and on 64-bit, and returns what is a "long" in both cases.  The point
is that it is just a single "mfspr 268" always, which is fast, and
importantly has fixed and low latency.
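
A minimal usage sketch, assuming only the declaration quoted above:

  /* Read the time base; expands to a single "mfspr Rx,268".  */
  unsigned long start = __builtin_ppc_mftb ();
  /* ... timed work ...  */
  unsigned long elapsed = __builtin_ppc_mftb () - start;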

Modulo perhaps that, okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins
  2021-08-10 18:38   ` Segher Boessenkool
@ 2021-08-10 18:56     ` Bill Schmidt
  2021-08-10 20:33       ` Segher Boessenkool
  0 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-08-10 18:56 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

Hi Segher,

On 8/10/21 1:38 PM, Segher Boessenkool wrote:
> On Thu, Jul 29, 2021 at 08:30:52AM -0500, Bill Schmidt wrote:
>> 	* config/rs6000/rs6000-builtin-new.def: Add always, power5, and
>> 	power6 stanzas.
>> +  unsigned long __builtin_ppc_mftb ();
>> +    MFTB rs6000_mftb_di {32bit}
> I'm not sure what {32bit} means...  The builtin exists on both 32-bit
> and on 64-bit, and returns what is a "long" in both cases.  The point
> is that it is just a single "mfspr 268" always, which is fast, and
> importantly has fixed and low latency.


Right.  The implementation of this thing is that we have two different 
patterns in the machine description that get invoked depending on 
whether the target is 32-bit or 64-bit.  The syntax in the built-ins 
file only allows for one pattern.  So the {32bit} flag allows us to 
perform special processing for TARGET_32BIT, in this case to override 
the pattern.  Later in the patch series you'll find:

   if (bif_is_32bit (*bifaddr) && TARGET_32BIT)
     {
       if (fcode == RS6000_BIF_MFTB)
         icode = CODE_FOR_rs6000_mftb_si;
       else
         gcc_unreachable ();
     }

This is the only {32bit} built-in for now, and probably ever...

I'm sure there's a better way of dealing with the mode dependency on 
TARGET_32BIT, but for now this matches the old behavior as closely as 
possible. I'm happy to take suggestions on this.

Thanks for the review!
Bill
>
> Modulo perhaps that, okay for trunk.  Thanks!
>
>
> Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins
  2021-08-10 18:56     ` Bill Schmidt
@ 2021-08-10 20:33       ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-10 20:33 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Tue, Aug 10, 2021 at 01:56:58PM -0500, Bill Schmidt wrote:
> On 8/10/21 1:38 PM, Segher Boessenkool wrote:
> >On Thu, Jul 29, 2021 at 08:30:52AM -0500, Bill Schmidt wrote:
> >>	* config/rs6000/rs6000-builtin-new.def: Add always, power5, and
> >>	power6 stanzas.
> >>+  unsigned long __builtin_ppc_mftb ();
> >>+    MFTB rs6000_mftb_di {32bit}
> >I'm not sure what {32bit} means...  The builtin exists on both 32-bit
> >and on 64-bit, and returns what is a "long" in both cases.  The point
> >is that it is just a single "mfspr 268" always, which is fast, and
> >importantly has fixed and low latency.
> 
> Right.  The implementation of this thing is that we have two different 
> patterns in the machine description that get invoked depending on 
> whether the target is 32-bit or 64-bit.  The syntax in the built-ins 
> file only allows for one pattern.  So the {32bit} flag allows us to 
> perform special processing for TARGET_32BIT, in this case to override 
> the pattern.  Later in the patch series you'll find:

[ snip ]

Ah ok.

> I'm sure there's a better way of dealing with the mode dependency on 
> TARGET_32BIT, but for now this matches the old behavior as closely as 
> possible. I'm happy to take suggestions on this.

You could try to use something with Pmode, but it's not going to be
pretty in any case.  You also might have to deal with -m32 -mpowerpc64,
depending on if the original did.
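
E.g., a hypothetical Pmode-based variant of the {32bit} special case
from your earlier mail (a sketch only; note that with -m32
-mpowerpc64, Pmode is SImode even though the GPRs are 64 bits wide,
so it may still not do what we want there):

  if (fcode == RS6000_BIF_MFTB)
    icode = (Pmode == DImode
	     ? CODE_FOR_rs6000_mftb_di
	     : CODE_FOR_rs6000_mftb_si);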


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins
  2021-08-10 17:34     ` Segher Boessenkool
@ 2021-08-10 21:29       ` Bill Schmidt
  2021-08-11 10:29         ` Segher Boessenkool
  0 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-08-10 21:29 UTC (permalink / raw)
  To: Segher Boessenkool, will schmidt; +Cc: gcc-patches, dje.gcc, willschm

Hi Segher,

On 8/10/21 12:34 PM, Segher Boessenkool wrote:
> On Tue, Aug 10, 2021 at 11:17:05AM -0500, will schmidt wrote:
>> On Thu, 2021-07-29 at 08:30 -0500, Bill Schmidt wrote:
>>> +; This will break for long double == _Float128.  libgcc history.
>>> +  const long double __builtin_pack_longdouble (double, double);
>>> +    PACK_TF packtf {}
>> Add a few more words to provide bigger hints for future archeological
>> digs?  (This is perhaps an obvious issue, but I'd need to do some
>> spelunking)
> It is for __ibm128 only, not for other long double formats (we have
> three: plain double, double double, IEEE QP).  So maybe the return type
> should be changed?  The name of the builtin of course is unfortunate,
> but it is too late to change :-)


Yeah...I'm not sure how much flexibility we have here to avoid breaking 
code in the field, but it's not a big break because whoever may be using 
it has to be assuming long double = __ibm128, and probably has work to 
do anyway.

Perhaps I should commit as is for now, and then prepare a separate patch 
to change this builtin?  There may be test suite fallout, not sure offhand.

Thanks!
Bill

>
>
> Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins
  2021-08-10 21:29       ` Bill Schmidt
@ 2021-08-11 10:29         ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-11 10:29 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: will schmidt, gcc-patches, dje.gcc, willschm

On Tue, Aug 10, 2021 at 04:29:10PM -0500, Bill Schmidt wrote:
> On 8/10/21 12:34 PM, Segher Boessenkool wrote:
> >On Tue, Aug 10, 2021 at 11:17:05AM -0500, will schmidt wrote:
> >>On Thu, 2021-07-29 at 08:30 -0500, Bill Schmidt wrote:
> >>>+; This will break for long double == _Float128.  libgcc history.
> >>>+  const long double __builtin_pack_longdouble (double, double);
> >>>+    PACK_TF packtf {}
> >>Add a few more words to provide bigger hints for future archeological
> >>digs?  (This is perhaps an obvious issue, but I'd need to do some
> >>spelunking)
> >It is for __ibm128 only, not for other long double formats (we have
> >three: plain double, double double, IEEE QP).  So maybe the return type
> >should be changed?  The name of the builtin of course is unfortunate,
> >but it is too late to change :-)
> 
> Yeah...I'm not sure how much flexibility we have here to avoid breaking 
> code in the field, but it's not a big break because whoever may be using 
> it has to be assuming long double = __ibm128, and probably has work to 
> do anyway.

We do have an
  __ibm128 __builtin_pack_ibm128 (double, double);
already, so we just should get people to use that one, make it more
prominent in the documentation?  Or we can also make
__builtin_pack_longdouble warn (or even error) if used when long double
is not double-double.  Maybe an attribute (or what is it called, a
{thing} I mean) in the new description files to say "warn (or error) if
long double is not ibm128"?
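
For the first option, a usage sketch (using only the declaration
quoted above):

  double hi = 1.0, lo = 0x1p-60;
  /* Correct regardless of the current long double format.  */
  __ibm128 dd = __builtin_pack_ibm128 (hi, lo);

This stays well-defined even when long double is IEEE QP.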

> Perhaps I should commit as is for now, and then prepare a separate patch 
> to change this builtin?  There may be test suite fallout, not sure offhand.

Yes, I did approve it already, right?  Reviewing these patches I notice
things that should be improved, but that does not have to be done *now*,
or by you for that matter :-)

Cheers,


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 07/34] rs6000: Add power8-vector builtins
  2021-07-29 13:30 ` [PATCH 07/34] rs6000: Add power8-vector builtins Bill Schmidt
@ 2021-08-23 21:28   ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-23 21:28 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:30:54AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-builtin-new.def: Add power8-vector stanza.

I looked it over and didn't see errors.  Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 08/34] rs6000: Add Power9 builtins
  2021-07-29 13:30 ` [PATCH 08/34] rs6000: Add Power9 builtins Bill Schmidt
@ 2021-08-23 21:40   ` Segher Boessenkool
  2021-08-24 14:20     ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-23 21:40 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:30:55AM -0500, Bill Schmidt wrote:
> 2021-06-15  Bill Schmidt  <wschmidt@linux.ibm.com>
> 	* config/rs6000/rs6000-builtin-new.def: Add power9-vector, power9,
> 	and power9-64 stanzas.

> +; These things need some review to see whether they really require
> +; MASK_POWERPC64.  For xsxexpdp, this seems to be fine for 32-bit,
> +; because the result will always fit in 32 bits and the return
> +; value is SImode; but the pattern currently requires TARGET_64BIT.

That is wrong then?  It should never have TARGET_64BIT if it isn't
addressing memory (or the like).  Did you just typo this?

> +; On the other hand, xsxsigdp has a result that doesn't fit in
> +; 32 bits, and the return value is DImode, so it seems that
> +; TARGET_64BIT (actually TARGET_POWERPC64) is justified.  TBD. ####

Because xsxsigdp needs it, it makes sense to have it for xsxexpdp as
well, or we would get a weird holey API.

Okay for trunk (with the typo fixed if it is one).  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 09/34] rs6000: Add more type nodes to support builtin processing
  2021-07-29 13:30 ` [PATCH 09/34] rs6000: Add more type nodes to support builtin processing Bill Schmidt
@ 2021-08-23 22:15   ` Segher Boessenkool
  2021-08-24 14:38     ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-23 22:15 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:30:56AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-call.c (rs6000_init_builtins): Initialize
> 	various pointer type nodes.
> 	* config/rs6000/rs6000.h (rs6000_builtin_type_index): Add enum
> 	values for various pointer types.
> 	(ptr_V16QI_type_node): New macro.

[ ... ]

> 	(ptr_long_long_unsigned_type_node): New macro.


> +  ptr_long_integer_type_node
> +    = build_pointer_type
> +	(build_qualified_type (long_integer_type_internal_node,
> +			       TYPE_QUAL_CONST));
> +
> +  ptr_long_unsigned_type_node
> +    = build_pointer_type
> +	(build_qualified_type (long_unsigned_type_internal_node,
> +			       TYPE_QUAL_CONST));

This isn't correct formatting either.  Just use a temp variable?  Long
names and function calls do not mix, moreso with our coding conventions.

  tree t = build_qualified_type (long_unsigned_type_internal_node,
				 TYPE_QUAL_CONST));
  ptr_long_unsigned_type_node = build_pointer_type (t);

> +  if (dfloat64_type_node)
> +    ptr_dfloat64_type_node
> +      = build_pointer_type (build_qualified_type (dfloat64_type_internal_node,

You might want to use a block to make this a little more readable / less
surprising.  Okay either way.

> @@ -2517,6 +2558,47 @@ enum rs6000_builtin_type_index
>  #define vector_pair_type_node		 (rs6000_builtin_types[RS6000_BTI_vector_pair])
>  #define vector_quad_type_node		 (rs6000_builtin_types[RS6000_BTI_vector_quad])
>  #define pcvoid_type_node		 (rs6000_builtin_types[RS6000_BTI_const_ptr_void])
> +#define ptr_V16QI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_V16QI])

Not new of course, but those outer parens are pointless.  In macros
write extra parens around uses of parameters, and nowhere else.

Okay for trunk with the formatting fixed.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 10/34] rs6000: Add Power10 builtins
  2021-07-29 13:30 ` [PATCH 10/34] rs6000: Add Power10 builtins Bill Schmidt
@ 2021-08-23 23:48   ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-23 23:48 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:30:57AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-builtin-new.def: Add power10 and power10-64
> 	stanzas.

> +  void __builtin_altivec_tr_stxvrbx (vsq, signed long, signed char *);
> +    TR_STXVRBX vsx_stxvrbx {stvec}
> +
> +  void __builtin_altivec_tr_stxvrhx (vsq, signed long, signed int *);
> +    TR_STXVRHX vsx_stxvrhx {stvec}
> +
> +  void __builtin_altivec_tr_stxvrwx (vsq, signed long, signed short *);
> +    TR_STXVRWX vsx_stxvrwx {stvec}
> +
> +  void __builtin_altivec_tr_stxvrdx (vsq, signed long, signed long long *);
> +    TR_STXVRDX vsx_stxvrdx {stvec}

Is vsq for all of these correct?  Is it just a placeholder for "any
vector type"?


Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 08/34] rs6000: Add Power9 builtins
  2021-08-23 21:40   ` Segher Boessenkool
@ 2021-08-24 14:20     ` Bill Schmidt
  2021-08-24 15:38       ` Segher Boessenkool
  0 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-08-24 14:20 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

On 8/23/21 4:40 PM, Segher Boessenkool wrote:
> On Thu, Jul 29, 2021 at 08:30:55AM -0500, Bill Schmidt wrote:
>> 2021-06-15  Bill Schmidt  <wschmidt@linux.ibm.com>
>> 	* config/rs6000/rs6000-builtin-new.def: Add power9-vector, power9,
>> 	and power9-64 stanzas.
>> +; These things need some review to see whether they really require
>> +; MASK_POWERPC64.  For xsxexpdp, this seems to be fine for 32-bit,
>> +; because the result will always fit in 32 bits and the return
>> +; value is SImode; but the pattern currently requires TARGET_64BIT.
> That is wrong then?  It should never have TARGET_64BIT if it isn't
> addressing memory (or the like).  Did you just typo this?

Not a typo... I was referring to the condition in the following:

;; VSX Scalar Extract Exponent Double-Precision
(define_insn "xsxexpdp"
   [(set (match_operand:DI 0 "register_operand" "=r")
         (unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
          UNSPEC_VSX_SXEXPDP))]
   "TARGET_P9_VECTOR && TARGET_64BIT"
   "xsxexpdp %0,%x1"
   [(set_attr "type" "integer")])

>> +; On the other hand, xsxsigdp has a result that doesn't fit in
>> +; 32 bits, and the return value is DImode, so it seems that
>> +; TARGET_64BIT (actually TARGET_POWERPC64) is justified.  TBD. ####
> Because xsxsigdp needs it, it makes sense to have it for xsxexpdp as
> well, or we would get a weird holey API.

OK.  Based on this, I think I will just remove the comments here.

Thanks very much for the review!

Bill

>
> Okay for trunk (with the typo fixed if it is one).  Thanks!
>
>
> Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 09/34] rs6000: Add more type nodes to support builtin processing
  2021-08-23 22:15   ` Segher Boessenkool
@ 2021-08-24 14:38     ` Bill Schmidt
  0 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-08-24 14:38 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm


On 8/23/21 5:15 PM, Segher Boessenkool wrote:
> On Thu, Jul 29, 2021 at 08:30:56AM -0500, Bill Schmidt wrote:
>> 	* config/rs6000/rs6000-call.c (rs6000_init_builtins): Initialize
>> 	various pointer type nodes.
>> 	* config/rs6000/rs6000.h (rs6000_builtin_type_index): Add enum
>> 	values for various pointer types.
>> 	(ptr_V16QI_type_node): New macro.
> [ ... ]
>
>> 	(ptr_long_long_unsigned_type_node): New macro.
>
>> +  ptr_long_integer_type_node
>> +    = build_pointer_type
>> +	(build_qualified_type (long_integer_type_internal_node,
>> +			       TYPE_QUAL_CONST));
>> +
>> +  ptr_long_unsigned_type_node
>> +    = build_pointer_type
>> +	(build_qualified_type (long_unsigned_type_internal_node,
>> +			       TYPE_QUAL_CONST));
> This isn't correct formatting either.  Just use a temp variable?  Long
> names and function calls do not mix, moreso with our coding conventions.
>
>    tree t = build_qualified_type (long_unsigned_type_internal_node,
> 				 TYPE_QUAL_CONST));
>    ptr_long_unsigned_type_node = build_pointer_type (t);
Good choice, will do.
>> +  if (dfloat64_type_node)
>> +    ptr_dfloat64_type_node
>> +      = build_pointer_type (build_qualified_type (dfloat64_type_internal_node,
> You might want to use a block to make this a little more readable / less
> surprising.  Okay either way.
Yep.  Will use a temp variable again and that will force the block.
>> @@ -2517,6 +2558,47 @@ enum rs6000_builtin_type_index
>>   #define vector_pair_type_node		 (rs6000_builtin_types[RS6000_BTI_vector_pair])
>>   #define vector_quad_type_node		 (rs6000_builtin_types[RS6000_BTI_vector_quad])
>>   #define pcvoid_type_node		 (rs6000_builtin_types[RS6000_BTI_const_ptr_void])
>> +#define ptr_V16QI_type_node		 (rs6000_builtin_types[RS6000_BTI_ptr_V16QI])
> Not new of course, but those outer parens are pointless.  In macros
> write extra parens around uses of parameters, and nowhere else.
>
> Okay for trunk with the formatting fixed.  Thanks!

Thanks for the review!

Bill

>
> Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 08/34] rs6000: Add Power9 builtins
  2021-08-24 14:20     ` Bill Schmidt
@ 2021-08-24 15:38       ` Segher Boessenkool
  2021-08-24 16:27         ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-24 15:38 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

Hi!

On Tue, Aug 24, 2021 at 09:20:09AM -0500, Bill Schmidt wrote:
> On 8/23/21 4:40 PM, Segher Boessenkool wrote:
> >On Thu, Jul 29, 2021 at 08:30:55AM -0500, Bill Schmidt wrote:
> >>+; These things need some review to see whether they really require
> >>+; MASK_POWERPC64.  For xsxexpdp, this seems to be fine for 32-bit,
> >>+; because the result will always fit in 32 bits and the return
> >>+; value is SImode; but the pattern currently requires TARGET_64BIT.
> >That is wrong then?  It should never have TARGET_64BIT if it isn't
> >addressing memory (or the like).  Did you just typo this?
> 
> Not a typo... I was referring to the condition in the following:
> 
> ;; VSX Scalar Extract Exponent Double-Precision
> (define_insn "xsxexpdp"
>   [(set (match_operand:DI 0 "register_operand" "=r")
>         (unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
>          UNSPEC_VSX_SXEXPDP))]
>   "TARGET_P9_VECTOR && TARGET_64BIT"
>   "xsxexpdp %0,%x1"
>   [(set_attr "type" "integer")])

That looks wrong.  It should be TARGET_POWERPC64 afaics.

> >>+; On the other hand, xsxsigdp has a result that doesn't fit in
> >>+; 32 bits, and the return value is DImode, so it seems that
> >>+; TARGET_64BIT (actually TARGET_POWERPC64) is justified.  TBD. ####
> >Because xsxsigdp needs it, it makes sense to have it for xsxexpdp as
> >well, or we would get a weird holey API.

Both should have TARGET_POWERPC64 (and the underlying patterns as well
of course, we don't like ICEs so much).
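
I.e. for the pattern quoted above, simply (sketch):

  -  "TARGET_P9_VECTOR && TARGET_64BIT"
  +  "TARGET_P9_VECTOR && TARGET_POWERPC64"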


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 08/34] rs6000: Add Power9 builtins
  2021-08-24 15:38       ` Segher Boessenkool
@ 2021-08-24 16:27         ` Bill Schmidt
  0 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-08-24 16:27 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

On 8/24/21 10:38 AM, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Aug 24, 2021 at 09:20:09AM -0500, Bill Schmidt wrote:
>> On 8/23/21 4:40 PM, Segher Boessenkool wrote:
>>> On Thu, Jul 29, 2021 at 08:30:55AM -0500, Bill Schmidt wrote:
>>>> +; These things need some review to see whether they really require
>>>> +; MASK_POWERPC64.  For xsxexpdp, this seems to be fine for 32-bit,
>>>> +; because the result will always fit in 32 bits and the return
>>>> +; value is SImode; but the pattern currently requires TARGET_64BIT.
>>> That is wrong then?  It should never have TARGET_64BIT if it isn't
>>> addressing memory (or the like).  Did you just typo this?
>> Not a typo... I was referring to the condition in the following:
>>
>> ;; VSX Scalar Extract Exponent Double-Precision
>> (define_insn "xsxexpdp"
>>    [(set (match_operand:DI 0 "register_operand" "=r")
>>          (unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
>>           UNSPEC_VSX_SXEXPDP))]
>>    "TARGET_P9_VECTOR && TARGET_64BIT"
>>    "xsxexpdp %0,%x1"
>>    [(set_attr "type" "integer")])
> That looks wrong.  It should be TARGET_POWERPC64 afaics.
>
>>>> +; On the other hand, xsxsigdp has a result that doesn't fit in
>>>> +; 32 bits, and the return value is DImode, so it seems that
>>>> +; TARGET_64BIT (actually TARGET_POWERPC64) is justified.  TBD. ####
>>> Because xsxsigdp needs it, it makes sense to have it for xsxexpdp as
>>> well, or we would get a weird holey API.
> Both should have TARGET_POWERPC64 (and the underlying patterns as well
> of course, we don't like ICEs so much).

Yes, the enablement support I've added uses TARGET_POWERPC64.  I think 
we need a separate patch to fix the patterns in vsx.md.  I'll make a 
note of that.

Thanks!
Bill

>
> Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 11/34] rs6000: Add MMA builtins
  2021-07-29 13:30 ` [PATCH 11/34] rs6000: Add MMA builtins Bill Schmidt
@ 2021-08-25 22:56   ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-25 22:56 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

Hi!

On Thu, Jul 29, 2021 at 08:30:58AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-builtin-new.def: Add mma stanza.

Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 12/34] rs6000: Add miscellaneous builtins
  2021-07-29 13:30 ` [PATCH 12/34] rs6000: Add miscellaneous builtins Bill Schmidt
@ 2021-08-25 22:58   ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-25 22:58 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:30:59AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-builtin-new.def: Add ieee128-hw, dfp,
> 	crypto, and htm stanzas.

Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 13/34] rs6000: Add Cell builtins
  2021-07-29 13:31 ` [PATCH 13/34] rs6000: Add Cell builtins Bill Schmidt
@ 2021-08-25 22:59   ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-25 22:59 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:31:00AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-builtin-new.def: Add cell stanza.

This one is fine, too.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 14/34] rs6000: Add remaining overloads
  2021-07-29 13:31 ` [PATCH 14/34] rs6000: Add remaining overloads Bill Schmidt
@ 2021-08-25 23:27   ` Segher Boessenkool
  2021-08-26 12:59     ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-25 23:27 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:31:01AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-overload.def: Add remaining overloads.

> +; TODO: Note that the entry for VEC_ADDE currently gets ignored in
> +; altivec_resolve_overloaded_builtin.  Revisit whether we can remove
> +; that.  We still need to register the legal builtin forms here.
> +[VEC_ADDE, vec_adde, __builtin_vec_adde]
> +  vsq __builtin_vec_adde (vsq, vsq, vsq);
> +    VADDEUQM  VADDEUQM_VSQ
> +  vuq __builtin_vec_adde (vuq, vuq, vuq);
> +    VADDEUQM  VADDEUQM_VUQ

I'm not sure what this means.  "Currently" is the problem I think.  Do
you mean that the existing code (before this patch) ignores it already?

> +; #### XVRSPIP{TARGET_VSX};VRFIP
> +[VEC_CEIL, vec_ceil, __builtin_vec_ceil]
> +  vf __builtin_vec_ceil (vf);
> +    VRFIP
> +  vd __builtin_vec_ceil (vd);
> +    XVRDPIP

Is that a comment you forgot to remove, or is there work to be done here?

> +[VEC_CMPEQ, vec_cmpeq, __builtin_vec_cmpeq]
> +; #### XVCMPEQSP{TARGET_VSX};VCMPEQFP

And this.

> +; #### XVCMPEQSP_P{TARGET_VSX};VCMPEQFP_P
> +[VEC_CMPEQ_P, SKIP, __builtin_vec_vcmpeq_p]

And more!

And more later.  It isn't clear to me at all what those comments mean,
> and they are formatted haphazardly, so it looks like a WIP?

> +; Note that the entries for VEC_MUL are currently ignored.  See rs6000-c.c:
> +; altivec_resolve_overloaded_builtin, where there is special-case code for
> +; VEC_MUL.  TODO: Is this really necessary?  Investigate.  Seven missing
> +; prototypes here...no corresponding builtins.  Also added "vmulld" in P10
> +; which could be used instead of MUL_V2DI, conditionally?

Space after "..." :-P

> +; Opportunity for improvement: We can use XVRESP instead of VREFP for
> +; TARGET_VSX.  We would need conditional dispatch to allow two possibilities.
> +; Some syntax like "XVRESP{TARGET_VSX};VREFP".
> +; TODO. ####
> +[VEC_RE, vec_re, __builtin_vec_re]

Don't we already anyway?  The only difference is whether all VSRs are
allowed or only the VRs, no?  The RTL generated is just the same?  Or
maybe I am overlooking something :-)

> +; **************************************************************************
> +; **************************************************************************
> +; ****    Deprecated overloads that should never have existed at all    ****
> +; **************************************************************************
> +; **************************************************************************

The coding conventions say not to use showy block comments like that,
but it seems appropriate here :-)


Okay for trunk with the #### looked at.  Please don't repost this one.
Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 14/34] rs6000: Add remaining overloads
  2021-08-25 23:27   ` Segher Boessenkool
@ 2021-08-26 12:59     ` Bill Schmidt
  2021-08-26 13:58       ` Segher Boessenkool
  0 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-08-26 12:59 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

Hi Segher,

On 8/25/21 6:27 PM, Segher Boessenkool wrote:
> On Thu, Jul 29, 2021 at 08:31:01AM -0500, Bill Schmidt wrote:
>> 	* config/rs6000/rs6000-overload.def: Add remaining overloads.
>> +; TODO: Note that the entry for VEC_ADDE currently gets ignored in
>> +; altivec_resolve_overloaded_builtin.  Revisit whether we can remove
>> +; that.  We still need to register the legal builtin forms here.
>> +[VEC_ADDE, vec_adde, __builtin_vec_adde]
>> +  vsq __builtin_vec_adde (vsq, vsq, vsq);
>> +    VADDEUQM  VADDEUQM_VSQ
>> +  vuq __builtin_vec_adde (vuq, vuq, vuq);
>> +    VADDEUQM  VADDEUQM_VUQ
> I'm not sure what this means.  "Currently" is the problem I think.  Do
> you mean that the existing code (before this patch) ignores it already?

Right, exactly.  There is special-case code there for handling ADDE 
builtins, as there is for a number of other cases.  As a future cleanup, 
we'd like to have as little special-case code as possible.  For this 
conversion effort, I elected to hold off on that and leave the TODO here 
in the file.
>> +; #### XVRSPIP{TARGET_VSX};VRFIP
>> +[VEC_CEIL, vec_ceil, __builtin_vec_ceil]
>> +  vf __builtin_vec_ceil (vf);
>> +    VRFIP
>> +  vd __builtin_vec_ceil (vd);
>> +    XVRDPIP
> Is that a comment you forgot to remove, or is there work to be done here?

Sorry about these.  For a number of overloads, there is one builtin we 
have to use in case VSX isn't available, but a better one we can use if 
it is.  I had been thinking about adding special syntax in this file to 
select between multiple builtins.  However, since then I've realized 
it's probably better to have a single built-in mapping to a pattern that 
dispatches to the other two based on TARGET_VSX being available.  We 
probably have support for all these cases like that in vector.md as it is.
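
Something like this hypothetical expander shape (all names here are
invented for illustration; the real vector.md patterns will differ):

  (define_expand "vec_ceil_disp"
    [(match_operand:V4SF 0 "register_operand")
     (match_operand:V4SF 1 "register_operand")]
    ""
  {
    if (TARGET_VSX)
      /* Hypothetical gen_* name standing in for the VSX xvrspip
	 pattern.  */
      emit_insn (gen_vsx_xvrspip_v4sf (operands[0], operands[1]));
    else
      /* The VMX vrfip pattern; name assumed here.  */
      emit_insn (gen_altivec_vrfip (operands[0], operands[1]));
    DONE;
  })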

tl;dr:  I'll remove these comments. :-)

>> +[VEC_CMPEQ, vec_cmpeq, __builtin_vec_cmpeq]
>> +; #### XVCMPEQSP{TARGET_VSX};VCMPEQFP
> And this.
>
>> +; #### XVCMPEQSP_P{TARGET_VSX};VCMPEQFP_P
>> +[VEC_CMPEQ_P, SKIP, __builtin_vec_vcmpeq_p]
> And more!
>
> And more later.  It isn't clear to me at all what those comments mean,
> and they are formatted haphazardly, so it looks like a WIP?
>
>> +; Note that the entries for VEC_MUL are currently ignored.  See rs6000-c.c:
>> +; altivec_resolve_overloaded_builtin, where there is special-case code for
>> +; VEC_MUL.  TODO: Is this really necessary?  Investigate.  Seven missing
>> +; prototypes here...no corresponding builtins.  Also added "vmulld" in P10
>> +; which could be used instead of MUL_V2DI, conditionally?
> Space after "..." :-P
>
>> +; Opportunity for improvement: We can use XVRESP instead of VREFP for
>> +; TARGET_VSX.  We would need conditional dispatch to allow two possibilities.
>> +; Some syntax like "XVRESP{TARGET_VSX};VREFP".
>> +; TODO. ####
>> +[VEC_RE, vec_re, __builtin_vec_re]
> Don't we already anyway?  The only difference is whether all VSRs are
> allowed or only the VRs, no?  The RTL generated is just the same?  Or
> maybe I am overlooking something :-)

Right, this is the same conclusion I came to -- I should be able to just 
use vector.md pattern names.  Future improvement (and maybe altogether 
unnecessary; I'll do some testing).
>> +; **************************************************************************
>> +; **************************************************************************
>> +; ****    Deprecated overloads that should never have existed at all    ****
>> +; **************************************************************************
>> +; **************************************************************************
> The coding conventions say not to use showy block comments like that,
> but it seems appropriate here :-)
>
>
> Okay for trunk with the #### looked at.  Please don't repost this one.
> Thanks!

Awww. :-P

Thanks very much for the review!  I know these were tedious to look 
through, and I appreciate it very much.

Bill

>
> Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 14/34] rs6000: Add remaining overloads
  2021-08-26 12:59     ` Bill Schmidt
@ 2021-08-26 13:58       ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-26 13:58 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

Hi!

On Thu, Aug 26, 2021 at 07:59:04AM -0500, Bill Schmidt wrote:
> On 8/25/21 6:27 PM, Segher Boessenkool wrote:
> >On Thu, Jul 29, 2021 at 08:31:01AM -0500, Bill Schmidt wrote:
> >>	* config/rs6000/rs6000-overload.def: Add remaining overloads.
> >>+; TODO: Note that the entry for VEC_ADDE currently gets ignored in
> >>+; altivec_resolve_overloaded_builtin.  Revisit whether we can remove
> >>+; that.  We still need to register the legal builtin forms here.
> >>+[VEC_ADDE, vec_adde, __builtin_vec_adde]
> >>+  vsq __builtin_vec_adde (vsq, vsq, vsq);
> >>+    VADDEUQM  VADDEUQM_VSQ
> >>+  vuq __builtin_vec_adde (vuq, vuq, vuq);
> >>+    VADDEUQM  VADDEUQM_VUQ
> >I'm not sure what this means.  "Currently" is the problem I think.  Do
> >you mean that the existing code (before this patch) ignores it already?
> 
> Right, exactly.  There is special-case code there for handling ADDE 
> builtins, as there is for a number of other cases.  As a future cleanup, 
> we'd like to have as little special-case code as possible.  For this 
> conversion effort, I elected to hold off on that and leave the TODO here 
> in the file.

Ah, so it means the same as s/currently gets/is/ :-)

It is fine to not shave this yak right now of course.

> >>+; Note that the entries for VEC_MUL are currently ignored.  See 
> >>rs6000-c.c:
> >>+; altivec_resolve_overloaded_builtin, where there is special-case code 
> >>for
> >>+; VEC_MUL.  TODO: Is this really necessary?  Investigate.  Seven missing
> >>+; prototypes here...no corresponding builtins.  Also added "vmulld" in 
> >>P10
> >>+; which could be used instead of MUL_V2DI, conditionally?
> >Space after "..." :-P

(Remember to fix this important problem! :-) )

> >>+; Opportunity for improvement: We can use XVRESP instead of VREFP for
> >>+; TARGET_VSX.  We would need conditional dispatch to allow two 
> >>possibilities.
> >>+; Some syntax like "XVRESP{TARGET_VSX};VREFP".
> >>+; TODO. ####
> >>+[VEC_RE, vec_re, __builtin_vec_re]
> >Don't we already anyway?  The only difference is whether all VSRs are
> >allowed or only the VRs, no?  The RTL generated is just the same?  Or
> >maybe I am overlooking something :-)
> 
> Right, this is the same conclusion I came to -- I should be able to just 
> use vector.md pattern names.  Future improvement (and maybe altogether 
> unnecessary; I'll do some testing).

We still need to have both builtins, even if they do the same thing.
External names are forever and all that.

> >Okay for trunk with the #### looked at.  Please don't repost this one.
> >Thanks!
> 
> Awww. :-P

Hey there are 20 more patches in this series, you'll have plenty more
opportunity to torture me :-)


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 15/34] rs6000: Execute the automatic built-in initialization code
  2021-07-29 13:31 ` [PATCH 15/34] rs6000: Execute the automatic built-in initialization code Bill Schmidt
@ 2021-08-26 23:15   ` Segher Boessenkool
  2021-08-27 12:35     ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-26 23:15 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

Hi!

On Thu, Jul 29, 2021 at 08:31:02AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-call.c (rs6000-builtins.h): New #include.
> 	(rs6000_init_builtins): Call rs6000_autoinit_builtins; skip the old
> 	initialization logic when new builtins are enabled.

s/; s/.  S/

> +  /* Execute the autogenerated initialization code for builtins.  */
> +  rs6000_autoinit_builtins ();

The name "autoinit" isn't so great (what "self" does this "auto" refer
to?), but perhaps some later patch fixes this up?  It is minor of
course, but the bigger something is, the better the name it deserves.
Names shape thoughts, and we should make the architecture of our code as
clear as possible.

> +#ifdef SUBTARGET_INIT_BUILTINS
> +      SUBTARGET_INIT_BUILTINS;
> +#endif

Let's see how this shapes up.  Preferably we won't have an #ifdef but
an empty macro (or a "do {} while (0)"), etc.
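
I.e. something along these lines (sketch), so the call site can use it
unconditionally:

  #ifndef SUBTARGET_INIT_BUILTINS
  #define SUBTARGET_INIT_BUILTINS do {} while (0)
  #endif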

Okay for trunk, if this is revisited later.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 15/34] rs6000: Execute the automatic built-in initialization code
  2021-08-26 23:15   ` Segher Boessenkool
@ 2021-08-27 12:35     ` Bill Schmidt
  2021-08-27 12:49       ` Segher Boessenkool
  0 siblings, 1 reply; 84+ messages in thread
From: Bill Schmidt @ 2021-08-27 12:35 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

Hi Segher,

On 8/26/21 6:15 PM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Jul 29, 2021 at 08:31:02AM -0500, Bill Schmidt wrote:
>> 	* config/rs6000/rs6000-call.c (rs6000-builtins.h): New #include.
>> 	(rs6000_init_builtins): Call rs6000_autoinit_builtins; skip the old
>> 	initialization logic when new builtins are enabled.
> s/; s/.  S/
>
>> +  /* Execute the autogenerated initialization code for builtins.  */
>> +  rs6000_autoinit_builtins ();
> The name "autoinit" isn't so great (what "self" does this "auto" refer
> to?), but perhaps some later patch fixes this up?  It is minor of
> course, but the bigger something is, the better the name it deserves.
> Names shape thoughts, and we should make the architecture of our code as
> clear as possible.

Well, "autoinit" was meant to mean "automated initialization."  But I 
take your point.  What would you say to "rs6000_init_generated_builtins"?
>
>> +#ifdef SUBTARGET_INIT_BUILTINS
>> +      SUBTARGET_INIT_BUILTINS;
>> +#endif
> Let's see how this shapes up.  Preferably we won't have an #ifdef but
> an empty macro (or a "do {} while (0)"), etc.
To be clear, this isn't new code.  It's just the only part of the old 
code that isn't replaced by the generated initialization. You'll see it 
repeated at the end of rs6000_init_builtins.  This is used on our port 
for adding one builtin from Darwin.  It's a standard macro, also used in 
the aarch64, i386, and netbsd ports.
>
> Okay for trunk, if this is revisited later.  Thanks!

Thanks for the review!  I'll commit after we agree on a name. (This will 
require minor changes to rs6000-gen-builtins.c to change the name of the 
generated function in rs6000-builtins.[ch].)

Bill

>
> Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 15/34] rs6000: Execute the automatic built-in initialization code
  2021-08-27 12:35     ` Bill Schmidt
@ 2021-08-27 12:49       ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-27 12:49 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Fri, Aug 27, 2021 at 07:35:05AM -0500, Bill Schmidt wrote:
> On 8/26/21 6:15 PM, Segher Boessenkool wrote:
> >On Thu, Jul 29, 2021 at 08:31:02AM -0500, Bill Schmidt wrote:
> >>+  /* Execute the autogenerated initialization code for builtins.  */
> >>+  rs6000_autoinit_builtins ();
> >The name "autoinit" isn't so great (what "self" does this "auto" refer
> >to?), but perhaps some later patch fixes this up?  It is minor of
> >course, but the bigger something is, the better the name it deserves.
> >Names shape thoughts, and we should make the architecture of our code as
> >clear as possible.
> 
> Well, "autoinit" was meant to mean "automated initialization."  But I 
> take your point.

*Everything* GCC does is "automated" in some way; the term is
meaningless :-)

> What would you say to "rs6000_init_generated_builtins"?

That is fine.

Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 16/34] rs6000: Darwin builtin support
  2021-07-29 13:31 ` [PATCH 16/34] rs6000: Darwin builtin support Bill Schmidt
@ 2021-08-27 18:01   ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-27 18:01 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:31:03AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/darwin.h (SUBTARGET_INIT_BUILTINS): Use the new
> 	decl when new_builtins_are_live.
> 	* config/rs6000/rs6000-builtin-new.def (__builtin_cfstring): New
> 	built-in.

Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 17/34] rs6000: Add sanity to V2DI_type_node definitions
  2021-07-29 13:31 ` [PATCH 17/34] rs6000: Add sanity to V2DI_type_node definitions Bill Schmidt
@ 2021-08-27 19:27   ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-27 19:27 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:31:04AM -0500, Bill Schmidt wrote:
> It seems quite strange for these to be "vector long" for 64-bit and
> "vector long long" for 32-bit, when "vector long long" will do for both.

Yeah.  For most builtins the only thing that matters is the mode the
types map to.  Similarly we want "vector int" instead of "vector long"
on both ilp32 and lp64.

> +    V2DI_type_node = rs6000_vector_type (TARGET_POWERPC64 ? "__vector long"
> +					 : "__vector long long",
> +					 long_long_integer_type_node, 2);

The same as before: either ? and : on the same line, or both should
start a new line.  Or use some temporary name.  Or write an "if" around
the whole thing (this compiles to the same machine code -- source code
should be optimised for the human reader, the compiler does not care at
all, you almost never can use that as an excuse :-) )
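
E.g. the "if" variant (sketch):

  if (TARGET_POWERPC64)
    V2DI_type_node = rs6000_vector_type ("__vector long",
					 long_long_integer_type_node, 2);
  else
    V2DI_type_node = rs6000_vector_type ("__vector long long",
					 long_long_integer_type_node, 2);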

Anyway, you know what is needed :-)  Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 18/34] rs6000: Always initialize vector_pair and vector_quad nodes
  2021-07-29 13:31 ` [PATCH 18/34] rs6000: Always initialize vector_pair and vector_quad nodes Bill Schmidt
@ 2021-08-27 19:34   ` Segher Boessenkool
  0 siblings, 0 replies; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-27 19:34 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

On Thu, Jul 29, 2021 at 08:31:05AM -0500, Bill Schmidt wrote:
> 	* config/rs6000/rs6000-call.c (rs6000_init_builtins): Remove
> 	TARGET_EXTRA_BUILTINS guard.

> +  vector_pair_type_node = make_node (OPAQUE_TYPE);

> +  vector_quad_type_node = make_node (OPAQUE_TYPE);

Although those types are called "vector", they are not, so this does
work correctly even if not TARGET_EXTRA_BUILTINS.

Ideally we will have that macro always enabled eventually, but that is later
work.  Okay for trunk.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 19/34] rs6000: Handle overloads during program parsing
  2021-07-29 13:31 ` [PATCH 19/34] rs6000: Handle overloads during program parsing Bill Schmidt
@ 2021-08-27 23:07   ` Segher Boessenkool
  2021-08-31  3:34     ` Bill Schmidt
  0 siblings, 1 reply; 84+ messages in thread
From: Segher Boessenkool @ 2021-08-27 23:07 UTC (permalink / raw)
  To: Bill Schmidt; +Cc: gcc-patches, dje.gcc, willschm

Hi!

On Thu, Jul 29, 2021 at 08:31:06AM -0500, Bill Schmidt wrote:
> Although this patch looks quite large, the changes are fairly minimal.
> Most of it is duplicating the large function that does the overload
> resolution using the automatically generated data structures instead of
> the old hand-generated ones.  This doesn't make the patch terribly easy to
> review, unfortunately.

Yeah, and it's pretty important that it does the same as the old code did.

> Just be aware that generally we aren't changing
> the logic and functionality of overload handling.

Good :-)

> 	* config/rs6000/rs6000-c.c (rs6000-builtins.h): New include.
> 	(altivec_resolve_new_overloaded_builtin): New forward decl.
> 	(rs6000_new_builtin_type_compatible): New function.
> 	(altivec_resolve_overloaded_builtin): Call
> 	altivec_resolve_new_overloaded_builtin.
> 	(altivec_build_new_resolved_builtin): New function.
> 	(altivec_resolve_new_overloaded_builtin): Likewise.
> 	* config/rs6000/rs6000-call.c (rs6000_new_builtin_is_supported_p):
> 	Likewise.

No "_p" please (it already has a verb in the name, and explicit ones are
much clearer anyway).

Does everything else belong in the C frontend file but this last
function not?  Maybe this could be split up better.  Maybe there should
be a separate file for just the builtin support, it probably is big
enough?

(This is all for later of course, but please think about it.  Code
rearrangement (or even refactoring) can be done at any time; there is
no time pressure on it.)

> +static tree
> +altivec_resolve_new_overloaded_builtin (location_t, tree, void *);

This fits on one line, please do so (this is a declaration, not a
function definition; those are put with the name in the first column,
to make searching for them a lot easier).

> +static bool
> +rs6000_new_builtin_type_compatible (tree t, tree u)
> +{
> +  if (t == error_mark_node)
> +    return false;
> +
> +  if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (u))
> +    return true;
> +
> +  if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
> +	   && is_float128_p (t) && is_float128_p (u))

Indent is wrong here?

> +  if (POINTER_TYPE_P (t) && POINTER_TYPE_P (u))
> +    {
> +      t = TREE_TYPE (t);
> +      u = TREE_TYPE (u);
> +      if (TYPE_READONLY (u))
> +	t = build_qualified_type (t, TYPE_QUAL_CONST);
> +    }

Just use TYPE_MAIN_VARIANT, don't make garbage here?
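
Something like this, perhaps (sketch; assuming the callers only compare
the resulting type nodes):

      t = TYPE_MAIN_VARIANT (TREE_TYPE (t));
      u = TYPE_MAIN_VARIANT (TREE_TYPE (u));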

> +  /* The AltiVec overloading implementation is overall gross, but this
> +     is particularly disgusting.  The vec_{all,any}_{ge,le} builtins
> +     are completely different for floating-point vs. integer vector
> +     types, because the former has vcmpgefp, but the latter should use
> +     vcmpgtXX.

Yes.  The integer comparisons were reduced to just two in original VMX
(eq and gt, well, signed and unsigned versions of the latter), but that
cannot be done for floating point (all of a<b, a==b, and a>b can be false
at the same time (for a or b NaN), so this cannot be compressed to just
two functions; we really need three at least).

We have the same thing in xscmp* btw, it's not just VMX.  Having three
ops for the majority of comparisons (and doing the rest with two or so)
is nicer than having 14 :-)

There aren't builtins for most of that, thankfully.

> +  if (TARGET_DEBUG_BUILTIN)
> +    fprintf (stderr, "altivec_resolve_overloaded_builtin, code = %4d, %s\n",
> +	     (int)fcode, IDENTIFIER_POINTER (DECL_NAME (fndecl)));

(space after cast, also in debug code)

> +	  const char *name
> +	    = fcode == RS6000_OVLD_VEC_ADDE ? "vec_adde": "vec_sube";

(space before and after colon)

> +	  const char *name = fcode == RS6000_OVLD_VEC_ADDEC ?
> +	    "vec_addec": "vec_subec";

(ditto.  also, ? cannot end a line.  maybe just
	  const char *name;
	  name = fcode == RS6000_OVLD_VEC_ADDEC ? "vec_addec" : "vec_subec";
)

> +      const char *name
> +	= fcode == RS6000_OVLD_VEC_SPLATS ? "vec_splats": "vec_promote";

(more)

> +	  case E_SFmode: type = V4SF_type_node; size = 4; break;
> +	  case E_DFmode: type = V2DF_type_node; size = 2; break;

Don't put multiple statements on one line.  Put the label on its own,
too, for that matter.

> +	}
> +	return build_constructor (type, vec);

This is wrong indenting.  Where it started, I have no idea.  You figure
it out :-)

> + bad:
> +  {
> +    const char *name = rs6000_overload_info[adj_fcode].ovld_name;
> +    error ("invalid parameter combination for AltiVec intrinsic %qs", name);
> +    return error_mark_node;
> +  }

A huge function with a lot of "goto bad;" just *screams* "this needs to
be factored".

> +    case ENB_P5:
> +      if (!TARGET_POPCNTB)
> +	return false;
> +      break;

    case ENB_P5:
      return TARGET_POPCNTB;

and similar for all further cases.  It is shorter and does not have
negations, win-win!
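
For instance (sketch; I am assuming here that the P6 stanza is keyed off
TARGET_CMPB, adjust as appropriate):

    case ENB_P6:
      return TARGET_CMPB;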

> +      break;
> +    };

Stray semicolon.  Did this not warn?

Could you please try to factor this better?


Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [PATCH 19/34] rs6000: Handle overloads during program parsing
  2021-08-27 23:07   ` Segher Boessenkool
@ 2021-08-31  3:34     ` Bill Schmidt
  0 siblings, 0 replies; 84+ messages in thread
From: Bill Schmidt @ 2021-08-31  3:34 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, dje.gcc, willschm

Hi Segher,

On 8/27/21 6:07 PM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Jul 29, 2021 at 08:31:06AM -0500, Bill Schmidt wrote:
>> Although this patch looks quite large, the changes are fairly minimal.
>> Most of it is duplicating the large function that does the overload
>> resolution using the automatically generated data structures instead of
>> the old hand-generated ones.  This doesn't make the patch terribly easy to
>> review, unfortunately.
> Yeah, and it's pretty important that it does the same as the old code did.
>
>> Just be aware that generally we aren't changing
>> the logic and functionality of overload handling.
> Good :-)
>
>> 	* config/rs6000/rs6000-c.c (rs6000-builtins.h): New include.
>> 	(altivec_resolve_new_overloaded_builtin): New forward decl.
>> 	(rs6000_new_builtin_type_compatible): New function.
>> 	(altivec_resolve_overloaded_builtin): Call
>> 	altivec_resolve_new_overloaded_builtin.
>> 	(altivec_build_new_resolved_builtin): New function.
>> 	(altivec_resolve_new_overloaded_builtin): Likewise.
>> 	* config/rs6000/rs6000-call.c (rs6000_new_builtin_is_supported_p):
>> 	Likewise.
> No "_p" please (it already has a verb in the name, and explicit ones are
> much clearer anyway).

OK.  This requires a small change to the gen program to match.
>
> Does everything else belong in the C frontend file but this last
> function not?  Maybe this could be split up better.  Maybe there should
> be a separate file for just the builtin support; it probably is big
> enough?
>
> (This is all for later of course, but please think about it.  Code
> rearrangement (or even refactoring) can be done at any time; there is
> no time pressure on it.)

Yes, it's a fair point to figure out whether this should be refactored 
better.  This particular function is used both by the overloading 
support and in other places, so it really belongs elsewhere, I think.  
In general, we might consider simplifying the target hooks that land in 
the C frontend file and moving the guts into a separate builtin support 
file.  Certainly rs6000-call.c is still too large, and factoring out the 
builtins parts would be an improvement.  I'll put this on the list for 
follow-up after the main series lands.
>
>> +static tree
>> +altivec_resolve_new_overloaded_builtin (location_t, tree, void *);
> This fits on one line, please do so (this is a declaration, not a
> function definition; those are put with the name in the first column,
> to make searching for them a lot easier).
>
>> +static bool
>> +rs6000_new_builtin_type_compatible (tree t, tree u)
>> +{
>> +  if (t == error_mark_node)
>> +    return false;
>> +
>> +  if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (u))
>> +    return true;
>> +
>> +  if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
>> +	   && is_float128_p (t) && is_float128_p (u))
> Indent is wrong here?

Yeah, how'd that happen...
>
>> +  if (POINTER_TYPE_P (t) && POINTER_TYPE_P (u))
>> +    {
>> +      t = TREE_TYPE (t);
>> +      u = TREE_TYPE (u);
>> +      if (TYPE_READONLY (u))
>> +	t = build_qualified_type (t, TYPE_QUAL_CONST);
>> +    }
> Just use TYPE_MAIN_VARIANT, don't make garbage here?

I don't understand how TYPE_MAIN_VARIANT is appropriate.  Please explain?
>
>> +  /* The AltiVec overloading implementation is overall gross, but this
>> +     is particularly disgusting.  The vec_{all,any}_{ge,le} builtins
>> +     are completely different for floating-point vs. integer vector
>> +     types, because the former has vcmpgefp, but the latter should use
>> +     vcmpgtXX.
> Yes.  The integer comparisons were reduced to just two in original VMX
> (eq and gt, well, signed and unsigned versions of the latter), but that
> cannot be done for floating point (all of a<b, a==b, and a>b can be false
> at the same time (for a or b NaN), so this cannot be compressed to just
> two functions; we really need three at least).
>
> We have the same thing in xscmp* btw, it's not just VMX.  Having three
> ops for the majority of comparisons (and doing the rest with two or so)
> is nicer than having 14 :-)
>
> There aren't builtins for most of that, thankfully.
>
>> +  if (TARGET_DEBUG_BUILTIN)
>> +    fprintf (stderr, "altivec_resolve_overloaded_builtin, code = %4d, %s\n",
>> +	     (int)fcode, IDENTIFIER_POINTER (DECL_NAME (fndecl)));
> (space after cast, also in debug code)
>
>> +	  const char *name
>> +	    = fcode == RS6000_OVLD_VEC_ADDE ? "vec_adde": "vec_sube";
> (space before and after colon)

Yeah, there is a lot of bad formatting in the copied code.  I probably 
should have tried to clean it all up, but generally just did so where I was 
making specific changes.  I'll see what I can do about these things.
>
>> +	  const char *name = fcode == RS6000_OVLD_VEC_ADDEC ?
>> +	    "vec_addec": "vec_subec";
> (ditto.  also, ? cannot end a line.  maybe just
> 	  const char *name;
> 	  name = fcode == RS6000_OVLD_VEC_ADDEC ? "vec_addec" : "vec_subec";
> )
>
>> +      const char *name
>> +	= fcode == RS6000_OVLD_VEC_SPLATS ? "vec_splats": "vec_promote";
> (more)
>
>> +	  case E_SFmode: type = V4SF_type_node; size = 4; break;
>> +	  case E_DFmode: type = V2DF_type_node; size = 2; break;
> Don't put multiple statements on one line.  Put the label on its own,
> too, for that matter.
>
>> +	}
>> +	return build_constructor (type, vec);
> This is wrong indenting.  Where it started, I have no idea.  You figure
> it out :-)
>
>> + bad:
>> +  {
>> +    const char *name = rs6000_overload_info[adj_fcode].ovld_name;
>> +    error ("invalid parameter combination for AltiVec intrinsic %qs", name);
>> +    return error_mark_node;
>> +  }
> A huge function with a lot of "goto bad;" just *screams* "this needs to
> be factored".

It does indeed.  Let me put that on the list for after the main patch 
series is done, if that's ok.
>
>> +    case ENB_P5:
>> +      if (!TARGET_POPCNTB)
>> +	return false;
>> +      break;
>      case ENB_P5:
>        return TARGET_POPCNTB;
>
> and similar for all further cases.  It is shorter and does not have
> negations, win-win!

Good call.
>
>> +      break;
>> +    };
> Stray semicolon.  Did this not warn?

It did not!  Or at least I didn't notice if it did.  I believe that 
-Werror may be somehow turned off for this file due to a bunch of 
excessive warnings involving format errors, though.  I've been meaning 
to look into how that's happening...
>
> Could you please try to factor this better?

I promise to do so later.  Right now I'm trying hard not to screw up the 
logic of this messed-up beast.  I think having it committed in the 
messier, but closer to original, state will be a good intermediate 
step.  I definitely want to take a shot at making this better down the road.

Thanks very much for the review!

Bill

>
>
> Segher

^ permalink raw reply	[flat|nested] 84+ messages in thread

end of thread, other threads:[~2021-08-31  3:34 UTC | newest]

Thread overview: 84+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-29 13:30 [PATCHv4 00/34] Replace the Power target-specific builtin machinery Bill Schmidt
2021-07-29 13:30 ` [PATCH 01/34] rs6000: Incorporate new builtins code into the build machinery Bill Schmidt
2021-08-04 22:29   ` Segher Boessenkool
2021-08-05 13:47     ` Bill Schmidt
2021-08-05 16:04       ` Segher Boessenkool
2021-07-29 13:30 ` [PATCH 02/34] rs6000: Add gengtype handling to " Bill Schmidt
2021-08-04 22:52   ` Segher Boessenkool
2021-07-29 13:30 ` [PATCH 03/34] rs6000: Add the rest of the [altivec] stanza to the builtins file Bill Schmidt
2021-08-07  0:01   ` Segher Boessenkool
2021-08-08 16:53     ` Bill Schmidt
2021-08-08 20:27       ` Segher Boessenkool
2021-08-08 20:53         ` Bill Schmidt
2021-08-09 18:05           ` Segher Boessenkool
2021-08-09 19:18           ` Bill Schmidt
2021-08-09 23:44             ` Segher Boessenkool
2021-08-10 12:17               ` Bill Schmidt
2021-08-10 12:48                 ` Segher Boessenkool
2021-08-10 13:02                   ` Bill Schmidt
2021-08-10 13:40                     ` Segher Boessenkool
2021-08-10 13:49                       ` Bill Schmidt
2021-07-29 13:30 ` [PATCH 04/34] rs6000: Add VSX builtins Bill Schmidt
2021-08-10 16:14   ` will schmidt
2021-08-10 17:52   ` Segher Boessenkool
2021-07-29 13:30 ` [PATCH 05/34] rs6000: Add available-everywhere and ancient builtins Bill Schmidt
2021-08-10 16:17   ` will schmidt
2021-08-10 17:34     ` Segher Boessenkool
2021-08-10 21:29       ` Bill Schmidt
2021-08-11 10:29         ` Segher Boessenkool
2021-08-10 18:38   ` Segher Boessenkool
2021-08-10 18:56     ` Bill Schmidt
2021-08-10 20:33       ` Segher Boessenkool
2021-07-29 13:30 ` [PATCH 06/34] rs6000: Add power7 and power7-64 builtins Bill Schmidt
2021-08-10 16:16   ` will schmidt
2021-08-10 17:48     ` Segher Boessenkool
2021-07-29 13:30 ` [PATCH 07/34] rs6000: Add power8-vector builtins Bill Schmidt
2021-08-23 21:28   ` Segher Boessenkool
2021-07-29 13:30 ` [PATCH 08/34] rs6000: Add Power9 builtins Bill Schmidt
2021-08-23 21:40   ` Segher Boessenkool
2021-08-24 14:20     ` Bill Schmidt
2021-08-24 15:38       ` Segher Boessenkool
2021-08-24 16:27         ` Bill Schmidt
2021-07-29 13:30 ` [PATCH 09/34] rs6000: Add more type nodes to support builtin processing Bill Schmidt
2021-08-23 22:15   ` Segher Boessenkool
2021-08-24 14:38     ` Bill Schmidt
2021-07-29 13:30 ` [PATCH 10/34] rs6000: Add Power10 builtins Bill Schmidt
2021-08-23 23:48   ` Segher Boessenkool
2021-07-29 13:30 ` [PATCH 11/34] rs6000: Add MMA builtins Bill Schmidt
2021-08-25 22:56   ` Segher Boessenkool
2021-07-29 13:30 ` [PATCH 12/34] rs6000: Add miscellaneous builtins Bill Schmidt
2021-08-25 22:58   ` Segher Boessenkool
2021-07-29 13:31 ` [PATCH 13/34] rs6000: Add Cell builtins Bill Schmidt
2021-08-25 22:59   ` Segher Boessenkool
2021-07-29 13:31 ` [PATCH 14/34] rs6000: Add remaining overloads Bill Schmidt
2021-08-25 23:27   ` Segher Boessenkool
2021-08-26 12:59     ` Bill Schmidt
2021-08-26 13:58       ` Segher Boessenkool
2021-07-29 13:31 ` [PATCH 15/34] rs6000: Execute the automatic built-in initialization code Bill Schmidt
2021-08-26 23:15   ` Segher Boessenkool
2021-08-27 12:35     ` Bill Schmidt
2021-08-27 12:49       ` Segher Boessenkool
2021-07-29 13:31 ` [PATCH 16/34] rs6000: Darwin builtin support Bill Schmidt
2021-08-27 18:01   ` Segher Boessenkool
2021-07-29 13:31 ` [PATCH 17/34] rs6000: Add sanity to V2DI_type_node definitions Bill Schmidt
2021-08-27 19:27   ` Segher Boessenkool
2021-07-29 13:31 ` [PATCH 18/34] rs6000: Always initialize vector_pair and vector_quad nodes Bill Schmidt
2021-08-27 19:34   ` Segher Boessenkool
2021-07-29 13:31 ` [PATCH 19/34] rs6000: Handle overloads during program parsing Bill Schmidt
2021-08-27 23:07   ` Segher Boessenkool
2021-08-31  3:34     ` Bill Schmidt
2021-07-29 13:31 ` [PATCH 20/34] rs6000: Handle gimple folding of target built-ins Bill Schmidt
2021-07-29 13:31 ` [PATCH 21/34] rs6000: Handle some recent MMA builtin changes Bill Schmidt
2021-07-29 13:31 ` [PATCH 22/34] rs6000: Support for vectorizing built-in functions Bill Schmidt
2021-07-29 13:31 ` [PATCH 23/34] rs6000: Builtin expansion, part 1 Bill Schmidt
2021-07-29 13:31 ` [PATCH 24/34] rs6000: Builtin expansion, part 2 Bill Schmidt
2021-07-29 13:31 ` [PATCH 25/34] rs6000: Builtin expansion, part 3 Bill Schmidt
2021-07-29 13:31 ` [PATCH 26/34] rs6000: Builtin expansion, part 4 Bill Schmidt
2021-07-29 13:31 ` [PATCH 27/34] rs6000: Builtin expansion, part 5 Bill Schmidt
2021-07-29 13:31 ` [PATCH 28/34] rs6000: Builtin expansion, part 6 Bill Schmidt
2021-07-29 13:31 ` [PATCH 29/34] rs6000: Update rs6000_builtin_decl Bill Schmidt
2021-07-29 13:31 ` [PATCH 30/34] rs6000: Miscellaneous uses of rs6000_builtins_decl_x Bill Schmidt
2021-07-29 13:31 ` [PATCH 31/34] rs6000: Debug support Bill Schmidt
2021-07-29 13:31 ` [PATCH 32/34] rs6000: Update altivec.h for automated interfaces Bill Schmidt
2021-07-29 13:31 ` [PATCH 33/34] rs6000: Test case adjustments Bill Schmidt
2021-07-29 13:31 ` [PATCH 34/34] rs6000: Enable the new builtin support Bill Schmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).