public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v5 0/6] Add Loongson SX/ASX instruction support to LoongArch target.
@ 2023-08-24  3:13 Chenghui Pan
  2023-08-24  3:13 ` [PATCH v5 1/6] LoongArch: Add Loongson SX vector directive compilation framework Chenghui Pan
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Chenghui Pan @ 2023-08-24  3:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua, Chenghui Pan

This is an update of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627413.html

Changes since last version of patch set:
- Fix regression test fail of pr54346.c with RUNTESTFLAGS="--target_board=unix/-mlsx".
  This is caused by the code simplification of loongarch_expand_vec_perm_const_2 () in
  last version.
- Combine vilvh/xvilvh insn's RTL template impl.
- Add dg-skip-if for loongarch*-*-* in vshuf test in g++.dg/torture, because
  vshuf/xvshuf insn's result is undefined when 6 or 7 bit of vector's element is set,
  and insns with this condition are generated in these testcases.

Brief version history of patch set:

v1 -> v2:
- Reduce usage of "unspec" in RTL template.
- Append Support of ADDR_REG_REG in LSX and LASX.
- Constraint docs are appended in gcc/doc/md.texi and ccomment block.
- Codes related to vecarg are removed.
- Testsuite of LSX and LASX is added in v2. (Because of the size limitation of
  mail list, these patches are not shown)
- Adjust the loongarch_expand_vector_init() function to reduce instruction 
  output amount.
- Some minor implementation changes of RTL templates.

v2 -> v3:
- Revert vabsd/xvabsd RTL templates to unspec impl.
- Resolve warning in gcc/config/loongarch/loongarch.cc when bootstrapping 
  with BOOT_CFLAGS="-O2 -ftree-vectorize -fno-vect-cost-model -mlasx".
- Remove redundant definitions in lasxintrin.h.
- Refine commit info.

v3 -> v4:
- Code simplification.
- Testsuite patches are splited from this patch set again and will be submitted 
independently in the future.

Lulu Cheng (6):
  LoongArch: Add Loongson SX vector directive compilation framework.
  LoongArch: Add Loongson SX base instruction support.
  LoongArch: Add Loongson SX directive builtin function support.
  LoongArch: Add Loongson ASX vector directive compilation framework.
  LoongArch: Add Loongson ASX base instruction support.
  LoongArch: Add Loongson ASX directive builtin function support.

 gcc/config.gcc                                |    2 +-
 gcc/config/loongarch/constraints.md           |  131 +-
 .../loongarch/genopts/loongarch-strings       |    4 +
 gcc/config/loongarch/genopts/loongarch.opt.in |   12 +-
 gcc/config/loongarch/lasx.md                  | 5104 ++++++++++++++++
 gcc/config/loongarch/lasxintrin.h             | 5338 +++++++++++++++++
 gcc/config/loongarch/loongarch-builtins.cc    | 2686 ++++++++-
 gcc/config/loongarch/loongarch-c.cc           |   18 +
 gcc/config/loongarch/loongarch-def.c          |    6 +
 gcc/config/loongarch/loongarch-def.h          |    9 +-
 gcc/config/loongarch/loongarch-driver.cc      |   10 +
 gcc/config/loongarch/loongarch-driver.h       |    2 +
 gcc/config/loongarch/loongarch-ftypes.def     |  666 +-
 gcc/config/loongarch/loongarch-modes.def      |   39 +
 gcc/config/loongarch/loongarch-opts.cc        |   89 +-
 gcc/config/loongarch/loongarch-opts.h         |    3 +
 gcc/config/loongarch/loongarch-protos.h       |   35 +
 gcc/config/loongarch/loongarch-str.h          |    3 +
 gcc/config/loongarch/loongarch.cc             | 4660 +++++++++++++-
 gcc/config/loongarch/loongarch.h              |  117 +-
 gcc/config/loongarch/loongarch.md             |   56 +-
 gcc/config/loongarch/loongarch.opt            |   12 +-
 gcc/config/loongarch/lsx.md                   | 4467 ++++++++++++++
 gcc/config/loongarch/lsxintrin.h              | 5181 ++++++++++++++++
 gcc/config/loongarch/predicates.md            |  333 +-
 gcc/doc/md.texi                               |   11 +
 gcc/testsuite/g++.dg/torture/vshuf-v16hi.C    |    1 +
 gcc/testsuite/g++.dg/torture/vshuf-v16qi.C    |    1 +
 gcc/testsuite/g++.dg/torture/vshuf-v2df.C     |    2 +
 gcc/testsuite/g++.dg/torture/vshuf-v2di.C     |    1 +
 gcc/testsuite/g++.dg/torture/vshuf-v4sf.C     |    2 +-
 gcc/testsuite/g++.dg/torture/vshuf-v8hi.C     |    1 +
 32 files changed, 28717 insertions(+), 285 deletions(-)
 create mode 100644 gcc/config/loongarch/lasx.md
 create mode 100644 gcc/config/loongarch/lasxintrin.h
 create mode 100644 gcc/config/loongarch/lsx.md
 create mode 100644 gcc/config/loongarch/lsxintrin.h

-- 
2.36.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v5 1/6] LoongArch: Add Loongson SX vector directive compilation framework.
  2023-08-24  3:13 [PATCH v5 0/6] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
@ 2023-08-24  3:13 ` Chenghui Pan
  2023-08-24  3:13 ` [PATCH v5 2/6] LoongArch: Add Loongson SX base instruction support Chenghui Pan
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Chenghui Pan @ 2023-08-24  3:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua

From: Lulu Cheng <chenglulu@loongson.cn>

gcc/ChangeLog:

	* config/loongarch/genopts/loongarch-strings: Add compilation framework.
	* config/loongarch/genopts/loongarch.opt.in: Ditto.
	* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
	* config/loongarch/loongarch-def.c: Ditto.
	* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
	(ISA_EXT_SIMD_LSX): Ditto.
	(N_SWITCH_TYPES): Ditto.
	(SW_LSX): Ditto.
	(struct loongarch_isa): Ditto.
	* config/loongarch/loongarch-driver.cc (APPEND_SWITCH): Ditto.
	(driver_get_normalized_m_opts): Ditto.
	* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): Ditto.
	* config/loongarch/loongarch-opts.cc (loongarch_config_target): Ditto.
	(isa_str): Ditto.
	* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
	* config/loongarch/loongarch-str.h (OPTSTR_LSX): Ditto.
	* config/loongarch/loongarch.opt: Ditto.
---
 .../loongarch/genopts/loongarch-strings       |  3 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  8 +-
 gcc/config/loongarch/loongarch-c.cc           |  7 ++
 gcc/config/loongarch/loongarch-def.c          |  4 +
 gcc/config/loongarch/loongarch-def.h          |  7 +-
 gcc/config/loongarch/loongarch-driver.cc      | 10 +++
 gcc/config/loongarch/loongarch-driver.h       |  1 +
 gcc/config/loongarch/loongarch-opts.cc        | 82 ++++++++++++++++++-
 gcc/config/loongarch/loongarch-opts.h         |  1 +
 gcc/config/loongarch/loongarch-str.h          |  2 +
 gcc/config/loongarch/loongarch.opt            |  8 +-
 11 files changed, 128 insertions(+), 5 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings b/gcc/config/loongarch/genopts/loongarch-strings
index a40998ead97..24a5025061f 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -40,6 +40,9 @@ OPTSTR_SOFT_FLOAT     soft-float
 OPTSTR_SINGLE_FLOAT   single-float
 OPTSTR_DOUBLE_FLOAT   double-float
 
+# SIMD extensions
+OPTSTR_LSX	lsx
+
 # -mabi=
 OPTSTR_ABI_BASE	      abi
 STR_ABI_BASE_LP64D    lp64d
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in b/gcc/config/loongarch/genopts/loongarch.opt.in
index 4b9b4ac273e..338d77a7e40 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -76,6 +76,9 @@ m@@OPTSTR_DOUBLE_FLOAT@@
 Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F64) Negative(m@@OPTSTR_SOFT_FLOAT@@)
 Allow hardware floating-point instructions to cover both 32-bit and 64-bit operations.
 
+m@@OPTSTR_LSX@@
+Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
+Enable LoongArch SIMD Extension (LSX).
 
 ;; Base target models (implies ISA & tune parameters)
 Enum
@@ -125,11 +128,14 @@ Target RejectNegative Joined ToLower Enum(abi_base) Var(la_opt_abi_base) Init(M_
 Variable
 int la_opt_abi_ext = M_OPTION_NOT_SEEN
 
-
 mbranch-cost=
 Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
 -mbranch-cost=COST	Set the cost of branches to roughly COST instructions.
 
+mmemvec-cost=
+Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) IntegerRange(1, 5)
+mmemvec-cost=COST      Set the cost of vector memory access instructions.
+
 mcheck-zero-division
 Target Mask(CHECK_ZERO_DIV)
 Trap on integer divide by zero.
diff --git a/gcc/config/loongarch/loongarch-c.cc b/gcc/config/loongarch/loongarch-c.cc
index 67911b78f28..b065921adc3 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -99,6 +99,13 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   else
     builtin_define ("__loongarch_frlen=0");
 
+  if (ISA_HAS_LSX)
+    {
+      builtin_define ("__loongarch_simd");
+      builtin_define ("__loongarch_sx");
+      builtin_define ("__loongarch_sx_width=128");
+    }
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c b/gcc/config/loongarch/loongarch-def.c
index 6729c857f7c..28e24c62249 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -49,10 +49,12 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LOONGARCH64] = {
       .base = ISA_BASE_LA64V100,
       .fpu = ISA_EXT_FPU64,
+      .simd = 0,
   },
   [CPU_LA464] = {
       .base = ISA_BASE_LA64V100,
       .fpu = ISA_EXT_FPU64,
+      .simd = ISA_EXT_SIMD_LSX,
   },
 };
 
@@ -147,6 +149,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU64] = STR_ISA_EXT_FPU64,
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
+  [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
 };
 
 const char*
@@ -176,6 +179,7 @@ loongarch_switch_strings[] = {
   [SW_SOFT_FLOAT]	  = OPTSTR_SOFT_FLOAT,
   [SW_SINGLE_FLOAT]	  = OPTSTR_SINGLE_FLOAT,
   [SW_DOUBLE_FLOAT]	  = OPTSTR_DOUBLE_FLOAT,
+  [SW_LSX]		  = OPTSTR_LSX,
 };
 
 
diff --git a/gcc/config/loongarch/loongarch-def.h b/gcc/config/loongarch/loongarch-def.h
index fb8bb88eb52..f34cffcfb9b 100644
--- a/gcc/config/loongarch/loongarch-def.h
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -63,7 +63,8 @@ extern const char* loongarch_isa_ext_strings[];
 #define ISA_EXT_FPU32	      1
 #define ISA_EXT_FPU64	      2
 #define N_ISA_EXT_FPU_TYPES   3
-#define N_ISA_EXT_TYPES	      3
+#define ISA_EXT_SIMD_LSX      3
+#define N_ISA_EXT_TYPES	      4
 
 /* enum abi_base */
 extern const char* loongarch_abi_base_strings[];
@@ -97,7 +98,8 @@ extern const char* loongarch_switch_strings[];
 #define SW_SOFT_FLOAT	      0
 #define SW_SINGLE_FLOAT	      1
 #define SW_DOUBLE_FLOAT	      2
-#define N_SWITCH_TYPES	      3
+#define SW_LSX		      3
+#define N_SWITCH_TYPES	      4
 
 /* The common default value for variables whose assignments
    are triggered by command-line options.  */
@@ -111,6 +113,7 @@ struct loongarch_isa
 {
   unsigned char base;	    /* ISA_BASE_ */
   unsigned char fpu;	    /* ISA_EXT_FPU_ */
+  unsigned char simd;	    /* ISA_EXT_SIMD_ */
 };
 
 struct loongarch_abi
diff --git a/gcc/config/loongarch/loongarch-driver.cc b/gcc/config/loongarch/loongarch-driver.cc
index 11ce082417f..aa5011bd86a 100644
--- a/gcc/config/loongarch/loongarch-driver.cc
+++ b/gcc/config/loongarch/loongarch-driver.cc
@@ -160,6 +160,10 @@ driver_get_normalized_m_opts (int argc, const char **argv)
    APPEND_LTR (" %<m" OPTSTR_##NAME "=* " \
 	       " -m" OPTSTR_##NAME "=")
 
+#undef APPEND_SWITCH
+#define APPEND_SWITCH(S) \
+   APPEND_LTR (" %<m" S " -m" S)
+
   for (int i = 0; i < N_SWITCH_TYPES; i++)
     {
       APPEND_LTR (" %<m");
@@ -175,6 +179,12 @@ driver_get_normalized_m_opts (int argc, const char **argv)
   APPEND_OPT (ISA_EXT_FPU);
   APPEND_VAL (loongarch_isa_ext_strings[la_target.isa.fpu]);
 
+  if (la_target.isa.simd)
+    {
+      APPEND_LTR (" %<m" OPTSTR_LSX " -m");
+      APPEND_VAL (loongarch_isa_ext_strings[la_target.isa.simd]);
+    }
+
   APPEND_OPT (CMODEL);
   APPEND_VAL (loongarch_cmodel_strings[la_target.cmodel]);
 
diff --git a/gcc/config/loongarch/loongarch-driver.h b/gcc/config/loongarch/loongarch-driver.h
index ba8817a4621..db663818b7c 100644
--- a/gcc/config/loongarch/loongarch-driver.h
+++ b/gcc/config/loongarch/loongarch-driver.h
@@ -51,6 +51,7 @@ driver_get_normalized_m_opts (int argc, const char **argv);
   LA_SET_FLAG_SPEC (SOFT_FLOAT)				      \
   LA_SET_FLAG_SPEC (SINGLE_FLOAT)			      \
   LA_SET_FLAG_SPEC (DOUBLE_FLOAT)			      \
+  LA_SET_FLAG_SPEC (LSX)				      \
   " %:get_normalized_m_opts()"
 
 #define DRIVER_SELF_SPECS \
diff --git a/gcc/config/loongarch/loongarch-opts.cc b/gcc/config/loongarch/loongarch-opts.cc
index a52e25236ea..9753cf1290b 100644
--- a/gcc/config/loongarch/loongarch-opts.cc
+++ b/gcc/config/loongarch/loongarch-opts.cc
@@ -83,6 +83,7 @@ const int loongarch_switch_mask[N_SWITCH_TYPES] = {
   /* SW_SOFT_FLOAT */    M(FORCE_SOFTF),
   /* SW_SINGLE_FLOAT */  M(FORCE_F32),
   /* SW_DOUBLE_FLOAT */  M(FORCE_F64),
+  /* SW_LSX */		 M(LSX),
 };
 #undef M
 
@@ -142,7 +143,7 @@ loongarch_config_target (struct loongarch_target *target,
   obstack_init (&msg_obstack);
 
   struct {
-    int arch, tune, fpu, abi_base, abi_ext, cmodel;
+    int arch, tune, fpu, abi_base, abi_ext, cmodel, simd;
   } constrained = {
       M_OPT_ABSENT(opt_arch)     ? 0 : 1,
       M_OPT_ABSENT(opt_tune)     ? 0 : 1,
@@ -150,6 +151,7 @@ loongarch_config_target (struct loongarch_target *target,
       M_OPT_ABSENT(opt_abi_base) ? 0 : 1,
       M_OPT_ABSENT(opt_abi_ext)  ? 0 : 1,
       M_OPT_ABSENT(opt_cmodel)   ? 0 : 1,
+      0
   };
 
 #define on(NAME) ((loongarch_switch_mask[(SW_##NAME)] & opt_switches) \
@@ -251,6 +253,73 @@ config_target_isa:
     ((t.cpu_arch == CPU_NATIVE && constrained.arch) ?
      t.isa.fpu : DEFAULT_ISA_EXT_FPU);
 
+  /* LoongArch SIMD extensions.  */
+  int simd_switch;
+  if (on (LSX))
+    {
+      constrained.simd = 1;
+      switch (on_switch)
+	{
+	  case SW_LSX:
+	    t.isa.simd = ISA_EXT_SIMD_LSX;
+	    break;
+
+	  default:
+	    gcc_unreachable ();
+	}
+    }
+  simd_switch = on_switch;
+
+  /* All SIMD extensions imply a 64-bit FPU:
+     - silently adjust t.isa.fpu to "fpu64" if it is unconstrained.
+     - warn if -msingle-float / -msoft-float is on, then disable SIMD extensions
+     - abort if -mfpu=0 / -mfpu=32 is forced.  */
+
+  if (t.isa.simd != 0 && t.isa.fpu != ISA_EXT_FPU64)
+    {
+    if (!constrained.fpu)
+      {
+	/* As long as the arch-default "t.isa.simd" is set to non-zero
+	   for an element "t" in loongarch_cpu_default_isa, "t.isa.fpu"
+	   should be set to "ISA_EXT_FPU64" accordingly.  Thus reaching
+	   here must be the result of forcing -mlsx/-mlasx explicitly.  */
+	gcc_assert (constrained.simd);
+
+	inform (UNKNOWN_LOCATION,
+		"%<-m%s%> promotes %<%s%> to %<%s%s%>",
+		OPTSTR_ISA_EXT_FPU, loongarch_isa_ext_strings[t.isa.fpu],
+		OPTSTR_ISA_EXT_FPU, loongarch_isa_ext_strings[ISA_EXT_FPU64]);
+
+	t.isa.fpu = ISA_EXT_FPU64;
+      }
+    else if (on (SOFT_FLOAT) || on (SINGLE_FLOAT))
+      {
+	if (constrained.simd)
+	  inform (UNKNOWN_LOCATION,
+		  "%<-m%s%> is disabled by %<-m%s%>, because it requires %<%s%s%>",
+		  loongarch_switch_strings[simd_switch],
+		  loongarch_switch_strings[on_switch],
+		  OPTSTR_ISA_EXT_FPU, loongarch_isa_ext_strings[ISA_EXT_FPU64]);
+
+	/* SIMD that comes from arch default.  */
+	t.isa.simd = 0;
+      }
+    else
+      {
+	/* -mfpu=0 / -mfpu=32 is set.  */
+	if (constrained.simd)
+	  fatal_error (UNKNOWN_LOCATION,
+		       "%<-m%s=%s%> conflicts with %<-m%s%>,"
+		       "which requires %<%s%s%>",
+		       OPTSTR_ISA_EXT_FPU, loongarch_isa_ext_strings[t.isa.fpu],
+		       loongarch_switch_strings[simd_switch],
+		       OPTSTR_ISA_EXT_FPU,
+		       loongarch_isa_ext_strings[ISA_EXT_FPU64]);
+
+	/* Same as above.  */
+	t.isa.simd = 0;
+      }
+    }
 
   /* 4.  ABI-ISA compatibility */
   /* Note:
@@ -530,6 +599,17 @@ isa_str (const struct loongarch_isa *isa, char separator)
       APPEND_STRING (OPTSTR_ISA_EXT_FPU)
       APPEND_STRING (loongarch_isa_ext_strings[isa->fpu])
     }
+
+  switch (isa->simd)
+    {
+      case ISA_EXT_SIMD_LSX:
+	APPEND1 (separator);
+	APPEND_STRING (loongarch_isa_ext_strings[isa->simd]);
+	break;
+
+      default:
+	gcc_assert (isa->simd == 0);
+    }
   APPEND1 ('\0')
 
   /* Add more here.  */
diff --git a/gcc/config/loongarch/loongarch-opts.h b/gcc/config/loongarch/loongarch-opts.h
index b1ff54426e4..d067c05dfc9 100644
--- a/gcc/config/loongarch/loongarch-opts.h
+++ b/gcc/config/loongarch/loongarch-opts.h
@@ -66,6 +66,7 @@ loongarch_config_target (struct loongarch_target *target,
 				   || la_target.abi.base == ABI_BASE_LP64F \
 				   || la_target.abi.base == ABI_BASE_LP64S)
 
+#define ISA_HAS_LSX		  (la_target.isa.simd == ISA_EXT_SIMD_LSX)
 #define TARGET_ARCH_NATIVE	  (la_target.cpu_arch == CPU_NATIVE)
 #define LARCH_ACTUAL_ARCH	  (TARGET_ARCH_NATIVE \
 				   ? (la_target.cpu_native < N_ARCH_TYPES \
diff --git a/gcc/config/loongarch/loongarch-str.h b/gcc/config/loongarch/loongarch-str.h
index af2e82a321f..6fa1b1571c5 100644
--- a/gcc/config/loongarch/loongarch-str.h
+++ b/gcc/config/loongarch/loongarch-str.h
@@ -42,6 +42,8 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTSTR_SINGLE_FLOAT "single-float"
 #define OPTSTR_DOUBLE_FLOAT "double-float"
 
+#define OPTSTR_LSX "lsx"
+
 #define OPTSTR_ABI_BASE "abi"
 #define STR_ABI_BASE_LP64D "lp64d"
 #define STR_ABI_BASE_LP64F "lp64f"
diff --git a/gcc/config/loongarch/loongarch.opt b/gcc/config/loongarch/loongarch.opt
index 68018ade73f..5c7e6d37220 100644
--- a/gcc/config/loongarch/loongarch.opt
+++ b/gcc/config/loongarch/loongarch.opt
@@ -83,6 +83,9 @@ mdouble-float
 Target Driver RejectNegative Var(la_opt_switches) Mask(FORCE_F64) Negative(msoft-float)
 Allow hardware floating-point instructions to cover both 32-bit and 64-bit operations.
 
+mlsx
+Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(mlsx)
+Enable LoongArch SIMD Extension (LSX).
 
 ;; Base target models (implies ISA & tune parameters)
 Enum
@@ -132,11 +135,14 @@ Target RejectNegative Joined ToLower Enum(abi_base) Var(la_opt_abi_base) Init(M_
 Variable
 int la_opt_abi_ext = M_OPTION_NOT_SEEN
 
-
 mbranch-cost=
 Target RejectNegative Joined UInteger Var(loongarch_branch_cost)
 -mbranch-cost=COST	Set the cost of branches to roughly COST instructions.
 
+mmemvec-cost=
+Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) IntegerRange(1, 5)
+mmemvec-cost=COST      Set the cost of vector memory access instructions.
+
 mcheck-zero-division
 Target Mask(CHECK_ZERO_DIV)
 Trap on integer divide by zero.
-- 
2.36.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v5 2/6] LoongArch: Add Loongson SX base instruction support.
  2023-08-24  3:13 [PATCH v5 0/6] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
  2023-08-24  3:13 ` [PATCH v5 1/6] LoongArch: Add Loongson SX vector directive compilation framework Chenghui Pan
@ 2023-08-24  3:13 ` Chenghui Pan
  2023-08-24  3:13 ` [PATCH v5 3/6] LoongArch: Add Loongson SX directive builtin function support Chenghui Pan
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Chenghui Pan @ 2023-08-24  3:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua

From: Lulu Cheng <chenglulu@loongson.cn>

gcc/ChangeLog:

	* config/loongarch/constraints.md (M): Add Loongson LSX base instruction support.
	(N): Ditto.
	(O): Ditto.
	(P): Ditto.
	(R): Ditto.
	(S): Ditto.
	(YG): Ditto.
	(YA): Ditto.
	(YB): Ditto.
	(Yb): Ditto.
	(Yh): Ditto.
	(Yw): Ditto.
	(YI): Ditto.
	(YC): Ditto.
	(YZ): Ditto.
	(Unv5): Ditto.
	(Uuv5): Ditto.
	(Usv5): Ditto.
	(Uuv6): Ditto.
	(Urv8): Ditto.
	* config/loongarch/loongarch-builtins.cc (loongarch_gen_const_int_vector): Ditto.
	* config/loongarch/loongarch-modes.def (VECTOR_MODES): Ditto.
	(VECTOR_MODE): Ditto.
	(INT_MODE): Ditto.
	* config/loongarch/loongarch-protos.h (loongarch_split_move_insn_p): Ditto.
	(loongarch_split_move_insn): Ditto.
	(loongarch_split_128bit_move): Ditto.
	(loongarch_split_128bit_move_p): Ditto.
	(loongarch_split_lsx_copy_d): Ditto.
	(loongarch_split_lsx_insert_d): Ditto.
	(loongarch_split_lsx_fill_d): Ditto.
	(loongarch_expand_vec_cmp): Ditto.
	(loongarch_const_vector_same_val_p): Ditto.
	(loongarch_const_vector_same_bytes_p): Ditto.
	(loongarch_const_vector_same_int_p): Ditto.
	(loongarch_const_vector_shuffle_set_p): Ditto.
	(loongarch_const_vector_bitimm_set_p): Ditto.
	(loongarch_const_vector_bitimm_clr_p): Ditto.
	(loongarch_lsx_vec_parallel_const_half): Ditto.
	(loongarch_gen_const_int_vector): Ditto.
	(loongarch_lsx_output_division): Ditto.
	(loongarch_expand_vector_init): Ditto.
	(loongarch_expand_vec_unpack): Ditto.
	(loongarch_expand_vec_perm): Ditto.
	(loongarch_expand_vector_extract): Ditto.
	(loongarch_expand_vector_reduc): Ditto.
	(loongarch_ldst_scaled_shift): Ditto.
	(loongarch_expand_vec_cond_expr): Ditto.
	(loongarch_expand_vec_cond_mask_expr): Ditto.
	(loongarch_builtin_vectorized_function): Ditto.
	(loongarch_gen_const_int_vector_shuffle): Ditto.
	(loongarch_build_signbit_mask): Ditto.
	* config/loongarch/loongarch.cc (loongarch_pass_aggregate_num_fpr): Ditto.
	(loongarch_setup_incoming_varargs): Ditto.
	(loongarch_emit_move): Ditto.
	(loongarch_const_vector_bitimm_set_p): Ditto.
	(loongarch_const_vector_bitimm_clr_p): Ditto.
	(loongarch_const_vector_same_val_p): Ditto.
	(loongarch_const_vector_same_bytes_p): Ditto.
	(loongarch_const_vector_same_int_p): Ditto.
	(loongarch_const_vector_shuffle_set_p): Ditto.
	(loongarch_symbol_insns): Ditto.
	(loongarch_cannot_force_const_mem): Ditto.
	(loongarch_valid_offset_p): Ditto.
	(loongarch_valid_index_p): Ditto.
	(loongarch_classify_address): Ditto.
	(loongarch_address_insns): Ditto.
	(loongarch_ldst_scaled_shift): Ditto.
	(loongarch_const_insns): Ditto.
	(loongarch_split_move_insn_p): Ditto.
	(loongarch_subword_at_byte): Ditto.
	(loongarch_legitimize_move): Ditto.
	(loongarch_builtin_vectorization_cost): Ditto.
	(loongarch_split_move_p): Ditto.
	(loongarch_split_move): Ditto.
	(loongarch_split_move_insn): Ditto.
	(loongarch_output_move_index_float): Ditto.
	(loongarch_split_128bit_move_p): Ditto.
	(loongarch_split_128bit_move): Ditto.
	(loongarch_split_lsx_copy_d): Ditto.
	(loongarch_split_lsx_insert_d): Ditto.
	(loongarch_split_lsx_fill_d): Ditto.
	(loongarch_output_move): Ditto.
	(loongarch_extend_comparands): Ditto.
	(loongarch_print_operand_reloc): Ditto.
	(loongarch_print_operand): Ditto.
	(loongarch_hard_regno_mode_ok_uncached): Ditto.
	(loongarch_hard_regno_call_part_clobbered): Ditto.
	(loongarch_hard_regno_nregs): Ditto.
	(loongarch_class_max_nregs): Ditto.
	(loongarch_can_change_mode_class): Ditto.
	(loongarch_mode_ok_for_mov_fmt_p): Ditto.
	(loongarch_secondary_reload): Ditto.
	(loongarch_vector_mode_supported_p): Ditto.
	(loongarch_preferred_simd_mode): Ditto.
	(loongarch_autovectorize_vector_modes): Ditto.
	(loongarch_lsx_output_division): Ditto.
	(loongarch_option_override_internal): Ditto.
	(loongarch_hard_regno_caller_save_mode): Ditto.
	(MAX_VECT_LEN): Ditto.
	(loongarch_spill_class): Ditto.
	(struct expand_vec_perm_d): Ditto.
	(loongarch_promote_function_mode): Ditto.
	(loongarch_expand_vselect): Ditto.
	(loongarch_starting_frame_offset): Ditto.
	(loongarch_expand_vselect_vconcat): Ditto.
	(TARGET_ASM_ALIGNED_DI_OP): Ditto.
	(TARGET_OPTION_OVERRIDE): Ditto.
	(TARGET_LEGITIMIZE_ADDRESS): Ditto.
	(loongarch_expand_lsx_shuffle): Ditto.
	(TARGET_ASM_SELECT_RTX_SECTION): Ditto.
	(TARGET_ASM_FUNCTION_RODATA_SECTION): Ditto.
	(TARGET_SCHED_INIT): Ditto.
	(TARGET_SCHED_REORDER): Ditto.
	(TARGET_SCHED_REORDER2): Ditto.
	(TARGET_SCHED_VARIABLE_ISSUE): Ditto.
	(TARGET_SCHED_ADJUST_COST): Ditto.
	(TARGET_SCHED_ISSUE_RATE): Ditto.
	(TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD): Ditto.
	(TARGET_FUNCTION_OK_FOR_SIBCALL): Ditto.
	(TARGET_VALID_POINTER_MODE): Ditto.
	(TARGET_REGISTER_MOVE_COST): Ditto.
	(TARGET_MEMORY_MOVE_COST): Ditto.
	(TARGET_RTX_COSTS): Ditto.
	(TARGET_ADDRESS_COST): Ditto.
	(TARGET_IN_SMALL_DATA_P): Ditto.
	(TARGET_PREFERRED_RELOAD_CLASS): Ditto.
	(TARGET_ASM_FILE_START_FILE_DIRECTIVE): Ditto.
	(loongarch_expand_vec_perm): Ditto.
	(TARGET_EXPAND_BUILTIN_VA_START): Ditto.
	(TARGET_PROMOTE_FUNCTION_MODE): Ditto.
	(TARGET_RETURN_IN_MEMORY): Ditto.
	(TARGET_FUNCTION_VALUE): Ditto.
	(TARGET_LIBCALL_VALUE): Ditto.
	(loongarch_try_expand_lsx_vshuf_const): Ditto.
	(TARGET_ASM_OUTPUT_MI_THUNK): Ditto.
	(TARGET_ASM_CAN_OUTPUT_MI_THUNK): Ditto.
	(TARGET_PRINT_OPERAND): Ditto.
	(TARGET_PRINT_OPERAND_ADDRESS): Ditto.
	(TARGET_PRINT_OPERAND_PUNCT_VALID_P): Ditto.
	(TARGET_SETUP_INCOMING_VARARGS): Ditto.
	(TARGET_STRICT_ARGUMENT_NAMING): Ditto.
	(TARGET_MUST_PASS_IN_STACK): Ditto.
	(TARGET_PASS_BY_REFERENCE): Ditto.
	(TARGET_ARG_PARTIAL_BYTES): Ditto.
	(TARGET_FUNCTION_ARG): Ditto.
	(TARGET_FUNCTION_ARG_ADVANCE): Ditto.
	(TARGET_FUNCTION_ARG_BOUNDARY): Ditto.
	(TARGET_SCALAR_MODE_SUPPORTED_P): Ditto.
	(TARGET_INIT_BUILTINS): Ditto.
	(loongarch_expand_vec_perm_const_1): Ditto.
	(loongarch_expand_vec_perm_const_2): Ditto.
	(loongarch_vectorize_vec_perm_const): Ditto.
	(loongarch_sched_reassociation_width): Ditto.
	(loongarch_expand_vector_extract): Ditto.
	(emit_reduc_half): Ditto.
	(loongarch_expand_vector_reduc): Ditto.
	(loongarch_expand_vec_unpack): Ditto.
	(loongarch_lsx_vec_parallel_const_half): Ditto.
	(loongarch_constant_elt_p): Ditto.
	(loongarch_gen_const_int_vector_shuffle): Ditto.
	(loongarch_expand_vector_init): Ditto.
	(loongarch_expand_lsx_cmp): Ditto.
	(loongarch_expand_vec_cond_expr): Ditto.
	(loongarch_expand_vec_cond_mask_expr): Ditto.
	(loongarch_expand_vec_cmp): Ditto.
	(loongarch_case_values_threshold): Ditto.
	(loongarch_build_const_vector): Ditto.
	(loongarch_build_signbit_mask): Ditto.
	(loongarch_builtin_support_vector_misalignment): Ditto.
	(TARGET_ASM_ALIGNED_HI_OP): Ditto.
	(TARGET_ASM_ALIGNED_SI_OP): Ditto.
	(TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST): Ditto.
	(TARGET_VECTOR_MODE_SUPPORTED_P): Ditto.
	(TARGET_VECTORIZE_PREFERRED_SIMD_MODE): Ditto.
	(TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES): Ditto.
	(TARGET_VECTORIZE_VEC_PERM_CONST): Ditto.
	(TARGET_SCHED_REASSOCIATION_WIDTH): Ditto.
	(TARGET_CASE_VALUES_THRESHOLD): Ditto.
	(TARGET_HARD_REGNO_CALL_PART_CLOBBERED): Ditto.
	(TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT): Ditto.
	* config/loongarch/loongarch.h (TARGET_SUPPORTS_WIDE_INT): Ditto.
	(UNITS_PER_LSX_REG): Ditto.
	(BITS_PER_LSX_REG): Ditto.
	(BIGGEST_ALIGNMENT): Ditto.
	(LSX_REG_FIRST): Ditto.
	(LSX_REG_LAST): Ditto.
	(LSX_REG_NUM): Ditto.
	(LSX_REG_P): Ditto.
	(LSX_REG_RTX_P): Ditto.
	(IMM13_OPERAND): Ditto.
	(LSX_SUPPORTED_MODE_P): Ditto.
	* config/loongarch/loongarch.md (unknown,add,sub,not,nor,and,or,xor): Ditto.
	(unknown,add,sub,not,nor,and,or,xor,simd_add): Ditto.
	(unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC): Ditto.
	(mode" ): Ditto.
	(DF): Ditto.
	(SF): Ditto.
	(sf): Ditto.
	(DI): Ditto.
	(SI): Ditto.
	* config/loongarch/predicates.md (const_lsx_branch_operand): Ditto.
	(const_uimm3_operand): Ditto.
	(const_8_to_11_operand): Ditto.
	(const_12_to_15_operand): Ditto.
	(const_uimm4_operand): Ditto.
	(const_uimm6_operand): Ditto.
	(const_uimm7_operand): Ditto.
	(const_uimm8_operand): Ditto.
	(const_imm5_operand): Ditto.
	(const_imm10_operand): Ditto.
	(const_imm13_operand): Ditto.
	(reg_imm10_operand): Ditto.
	(aq8b_operand): Ditto.
	(aq8h_operand): Ditto.
	(aq8w_operand): Ditto.
	(aq8d_operand): Ditto.
	(aq10b_operand): Ditto.
	(aq10h_operand): Ditto.
	(aq10w_operand): Ditto.
	(aq10d_operand): Ditto.
	(aq12b_operand): Ditto.
	(aq12h_operand): Ditto.
	(aq12w_operand): Ditto.
	(aq12d_operand): Ditto.
	(const_m1_operand): Ditto.
	(reg_or_m1_operand): Ditto.
	(const_exp_2_operand): Ditto.
	(const_exp_4_operand): Ditto.
	(const_exp_8_operand): Ditto.
	(const_exp_16_operand): Ditto.
	(const_exp_32_operand): Ditto.
	(const_0_or_1_operand): Ditto.
	(const_0_to_3_operand): Ditto.
	(const_0_to_7_operand): Ditto.
	(const_2_or_3_operand): Ditto.
	(const_4_to_7_operand): Ditto.
	(const_8_to_15_operand): Ditto.
	(const_16_to_31_operand): Ditto.
	(qi_mask_operand): Ditto.
	(hi_mask_operand): Ditto.
	(si_mask_operand): Ditto.
	(d_operand): Ditto.
	(db4_operand): Ditto.
	(db7_operand): Ditto.
	(db8_operand): Ditto.
	(ib3_operand): Ditto.
	(sb4_operand): Ditto.
	(sb5_operand): Ditto.
	(sb8_operand): Ditto.
	(sd8_operand): Ditto.
	(ub4_operand): Ditto.
	(ub8_operand): Ditto.
	(uh4_operand): Ditto.
	(uw4_operand): Ditto.
	(uw5_operand): Ditto.
	(uw6_operand): Ditto.
	(uw8_operand): Ditto.
	(addiur2_operand): Ditto.
	(addiusp_operand): Ditto.
	(andi16_operand): Ditto.
	(movep_src_register): Ditto.
	(movep_src_operand): Ditto.
	(fcc_reload_operand): Ditto.
	(muldiv_target_operand): Ditto.
	(const_vector_same_val_operand): Ditto.
	(const_vector_same_simm5_operand): Ditto.
	(const_vector_same_uimm5_operand): Ditto.
	(const_vector_same_ximm5_operand): Ditto.
	(const_vector_same_uimm6_operand): Ditto.
	(par_const_vector_shf_set_operand): Ditto.
	(reg_or_vector_same_val_operand): Ditto.
	(reg_or_vector_same_simm5_operand): Ditto.
	(reg_or_vector_same_uimm5_operand): Ditto.
	(reg_or_vector_same_ximm5_operand): Ditto.
	(reg_or_vector_same_uimm6_operand): Ditto.
	* doc/md.texi: Ditto.
	* config/loongarch/lsx.md: New file.

gcc/testsuite/ChangeLog:

	* g++.dg/torture/vshuf-v16qi.C: Skip loongarch*-*-* because of
	  vshuf insn's undefined result when 6 or 7 bit of vector's
	  element is set.
	* g++.dg/torture/vshuf-v2df.C: Ditto.
	* g++.dg/torture/vshuf-v2di.C: Ditto.
	* g++.dg/torture/vshuf-v4sf.C: Ditto.
	* g++.dg/torture/vshuf-v8hi.C: Ditto.
---
 gcc/config/loongarch/constraints.md        |  131 +-
 gcc/config/loongarch/loongarch-builtins.cc |   10 +
 gcc/config/loongarch/loongarch-modes.def   |   38 +
 gcc/config/loongarch/loongarch-protos.h    |   31 +
 gcc/config/loongarch/loongarch.cc          | 2214 +++++++++-
 gcc/config/loongarch/loongarch.h           |   65 +-
 gcc/config/loongarch/loongarch.md          |   44 +-
 gcc/config/loongarch/lsx.md                | 4467 ++++++++++++++++++++
 gcc/config/loongarch/predicates.md         |  333 +-
 gcc/doc/md.texi                            |   11 +
 gcc/testsuite/g++.dg/torture/vshuf-v16qi.C |    1 +
 gcc/testsuite/g++.dg/torture/vshuf-v2df.C  |    2 +
 gcc/testsuite/g++.dg/torture/vshuf-v2di.C  |    1 +
 gcc/testsuite/g++.dg/torture/vshuf-v4sf.C  |    2 +-
 gcc/testsuite/g++.dg/torture/vshuf-v8hi.C  |    1 +
 15 files changed, 7167 insertions(+), 184 deletions(-)
 create mode 100644 gcc/config/loongarch/lsx.md

diff --git a/gcc/config/loongarch/constraints.md b/gcc/config/loongarch/constraints.md
index 7a38cd07ae9..39505e45efe 100644
--- a/gcc/config/loongarch/constraints.md
+++ b/gcc/config/loongarch/constraints.md
@@ -76,12 +76,13 @@
 ;;     "Le"
 ;;	 "A signed 32-bit constant can be expressed as Lb + I, but not a
 ;;	  single Lb or I."
-;; "M" <-----unused
-;; "N" <-----unused
-;; "O" <-----unused
-;; "P" <-----unused
+;; "M" "A constant that cannot be loaded using @code{lui}, @code{addiu}
+;;	or @code{ori}."
+;; "N" "A constant in the range -65535 to -1 (inclusive)."
+;; "O" "A signed 15-bit constant."
+;; "P" "A constant in the range 1 to 65535 (inclusive)."
 ;; "Q" <-----unused
-;; "R" <-----unused
+;; "R" "An address that can be used in a non-macro load or store."
 ;; "S" <-----unused
 ;; "T" <-----unused
 ;; "U" <-----unused
@@ -214,6 +215,63 @@ (define_constraint "Le"
   (and (match_code "const_int")
        (match_test "loongarch_addu16i_imm12_operand_p (ival, SImode)")))
 
+(define_constraint "M"
+  "A constant that cannot be loaded using @code{lui}, @code{addiu}
+   or @code{ori}."
+  (and (match_code "const_int")
+       (not (match_test "IMM12_OPERAND (ival)"))
+       (not (match_test "IMM12_OPERAND_UNSIGNED (ival)"))
+       (not (match_test "LU12I_OPERAND (ival)"))))
+
+(define_constraint "N"
+  "A constant in the range -65535 to -1 (inclusive)."
+  (and (match_code "const_int")
+       (match_test "ival >= -0xffff && ival < 0")))
+
+(define_constraint "O"
+  "A signed 15-bit constant."
+  (and (match_code "const_int")
+       (match_test "ival >= -0x4000 && ival < 0x4000")))
+
+(define_constraint "P"
+  "A constant in the range 1 to 65535 (inclusive)."
+  (and (match_code "const_int")
+       (match_test "ival > 0 && ival < 0x10000")))
+
+;; General constraints
+
+(define_memory_constraint "R"
+  "An address that can be used in a non-macro load or store."
+  (and (match_code "mem")
+       (match_test "loongarch_address_insns (XEXP (op, 0), mode, false) == 1")))
+(define_constraint "S"
+  "@internal
+   A constant call address."
+  (and (match_operand 0 "call_insn_operand")
+       (match_test "CONSTANT_P (op)")))
+
+(define_constraint "YG"
+  "@internal
+   A vector zero."
+  (and (match_code "const_vector")
+       (match_test "op == CONST0_RTX (mode)")))
+
+(define_constraint "YA"
+  "@internal
+   An unsigned 6-bit constant."
+  (and (match_code "const_int")
+       (match_test "UIMM6_OPERAND (ival)")))
+
+(define_constraint "YB"
+  "@internal
+   A signed 10-bit constant."
+  (and (match_code "const_int")
+       (match_test "IMM10_OPERAND (ival)")))
+
+(define_constraint "Yb"
+   "@internal"
+   (match_operand 0 "qi_mask_operand"))
+
 (define_constraint "Yd"
   "@internal
    A constant @code{move_operand} that can be safely loaded using
@@ -221,10 +279,73 @@ (define_constraint "Yd"
   (and (match_operand 0 "move_operand")
        (match_test "CONSTANT_P (op)")))
 
+(define_constraint "Yh"
+   "@internal"
+    (match_operand 0 "hi_mask_operand"))
+
+(define_constraint "Yw"
+   "@internal"
+    (match_operand 0 "si_mask_operand"))
+
 (define_constraint "Yx"
    "@internal"
    (match_operand 0 "low_bitmask_operand"))
 
+(define_constraint "YI"
+  "@internal
+   A replicated vector const in which the replicated value is in the range
+   [-512,511]."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_int_p (op, mode, -512, 511)")))
+
+(define_constraint "YC"
+  "@internal
+   A replicated vector const in which the replicated value has a single
+   bit set."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_bitimm_set_p (op, mode)")))
+
+(define_constraint "YZ"
+  "@internal
+   A replicated vector const in which the replicated value has a single
+   bit clear."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_bitimm_clr_p (op, mode)")))
+
+(define_constraint "Unv5"
+  "@internal
+   A replicated vector const in which the replicated value is in the range
+   [-31,0]."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_int_p (op, mode, -31, 0)")))
+
+(define_constraint "Uuv5"
+  "@internal
+   A replicated vector const in which the replicated value is in the range
+   [0,31]."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_int_p (op, mode, 0, 31)")))
+
+(define_constraint "Usv5"
+  "@internal
+   A replicated vector const in which the replicated value is in the range
+   [-16,15]."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_int_p (op, mode, -16, 15)")))
+
+(define_constraint "Uuv6"
+  "@internal
+   A replicated vector const in which the replicated value is in the range
+   [0,63]."
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_int_p (op, mode, 0, 63)")))
+
+(define_constraint "Urv8"
+  "@internal
+   A replicated vector const with replicated byte values as well as elements"
+  (and (match_code "const_vector")
+       (match_test "loongarch_const_vector_same_bytes_p (op, mode)")))
+
 (define_memory_constraint "ZC"
   "A memory operand whose address is formed by a base register and offset
    that is suitable for use in instructions with the same addressing mode
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index b929f224dfa..ebe70a986c3 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "fold-const.h"
 #include "expr.h"
 #include "langhooks.h"
+#include "emit-rtl.h"
 
 /* Macros to create an enumeration identifier for a function prototype.  */
 #define LARCH_FTYPE_NAME1(A, B) LARCH_##A##_FTYPE_##B
@@ -297,6 +298,15 @@ loongarch_prepare_builtin_arg (struct expand_operand *op, tree exp,
   create_input_operand (op, value, TYPE_MODE (TREE_TYPE (arg)));
 }
 
+/* Return a const_int vector of VAL with mode MODE.  */
+
+rtx
+loongarch_gen_const_int_vector (machine_mode mode, HOST_WIDE_INT val)
+{
+  rtx c = gen_int_mode (val, GET_MODE_INNER (mode));
+  return gen_const_vec_duplicate (mode, c);
+}
+
 /* Expand instruction ICODE as part of a built-in function sequence.
    Use the first NOPS elements of OPS as the instruction's operands.
    HAS_TARGET_P is true if operand 0 is a target; it is false if the
diff --git a/gcc/config/loongarch/loongarch-modes.def b/gcc/config/loongarch/loongarch-modes.def
index 8082ce993a5..6f57b60525d 100644
--- a/gcc/config/loongarch/loongarch-modes.def
+++ b/gcc/config/loongarch/loongarch-modes.def
@@ -23,3 +23,41 @@ FLOAT_MODE (TF, 16, ieee_quad_format);
 
 /* For floating point conditions in FCC registers.  */
 CC_MODE (FCC);
+
+/* Vector modes.  */
+VECTOR_MODES (INT, 4);	      /* V4QI  V2HI      */
+VECTOR_MODES (INT, 8);	      /* V8QI  V4HI V2SI */
+VECTOR_MODES (FLOAT, 8);      /*       V4HF V2SF */
+
+/* For LARCH LSX 128 bits.  */
+VECTOR_MODES (INT, 16);	      /* V16QI V8HI V4SI V2DI */
+VECTOR_MODES (FLOAT, 16);     /*	    V4SF V2DF */
+
+VECTOR_MODES (INT, 32);	      /* V32QI V16HI V8SI V4DI */
+VECTOR_MODES (FLOAT, 32);     /*	     V8SF V4DF */
+
+/* Double-sized vector modes for vec_concat.  */
+/* VECTOR_MODE (INT, QI, 32);	  V32QI	*/
+/* VECTOR_MODE (INT, HI, 16);	  V16HI	*/
+/* VECTOR_MODE (INT, SI, 8);	  V8SI	*/
+/* VECTOR_MODE (INT, DI, 4);	  V4DI	*/
+/* VECTOR_MODE (FLOAT, SF, 8);	  V8SF	*/
+/* VECTOR_MODE (FLOAT, DF, 4);	  V4DF	*/
+
+VECTOR_MODE (INT, QI, 64);    /* V64QI	*/
+VECTOR_MODE (INT, HI, 32);    /* V32HI	*/
+VECTOR_MODE (INT, SI, 16);    /* V16SI	*/
+VECTOR_MODE (INT, DI, 8);     /* V8DI */
+VECTOR_MODE (FLOAT, SF, 16);  /* V16SF	*/
+VECTOR_MODE (FLOAT, DF, 8);   /* V8DF */
+
+VECTOR_MODES (FRACT, 4);	/* V4QQ  V2HQ */
+VECTOR_MODES (UFRACT, 4);	/* V4UQQ V2UHQ */
+VECTOR_MODES (ACCUM, 4);	/*       V2HA */
+VECTOR_MODES (UACCUM, 4);	/*       V2UHA */
+
+INT_MODE (OI, 32);
+
+/* Keep the OI modes from confusing the compiler into thinking
+   that these modes could actually be used for computation.  They are
+   only holders for vectors during data movement.  */
diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h
index b71b188507a..fc33527cdcf 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -85,10 +85,18 @@ extern bool loongarch_split_move_p (rtx, rtx);
 extern void loongarch_split_move (rtx, rtx, rtx);
 extern bool loongarch_addu16i_imm12_operand_p (HOST_WIDE_INT, machine_mode);
 extern void loongarch_split_plus_constant (rtx *, machine_mode);
+extern bool loongarch_split_move_insn_p (rtx, rtx);
+extern void loongarch_split_move_insn (rtx, rtx, rtx);
+extern void loongarch_split_128bit_move (rtx, rtx);
+extern bool loongarch_split_128bit_move_p (rtx, rtx);
+extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx (*)(rtx, rtx, rtx));
+extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx);
+extern void loongarch_split_lsx_fill_d (rtx, rtx);
 extern const char *loongarch_output_move (rtx, rtx);
 extern bool loongarch_cfun_has_cprestore_slot_p (void);
 #ifdef RTX_CODE
 extern void loongarch_expand_scc (rtx *);
+extern bool loongarch_expand_vec_cmp (rtx *);
 extern void loongarch_expand_conditional_branch (rtx *);
 extern void loongarch_expand_conditional_move (rtx *);
 extern void loongarch_expand_conditional_trap (rtx);
@@ -110,6 +118,15 @@ extern bool loongarch_small_data_pattern_p (rtx);
 extern rtx loongarch_rewrite_small_data (rtx);
 extern rtx loongarch_return_addr (int, rtx);
 
+extern bool loongarch_const_vector_same_val_p (rtx, machine_mode);
+extern bool loongarch_const_vector_same_bytes_p (rtx, machine_mode);
+extern bool loongarch_const_vector_same_int_p (rtx, machine_mode, HOST_WIDE_INT,
+					  HOST_WIDE_INT);
+extern bool loongarch_const_vector_shuffle_set_p (rtx, machine_mode);
+extern bool loongarch_const_vector_bitimm_set_p (rtx, machine_mode);
+extern bool loongarch_const_vector_bitimm_clr_p (rtx, machine_mode);
+extern rtx loongarch_lsx_vec_parallel_const_half (machine_mode, bool);
+extern rtx loongarch_gen_const_int_vector (machine_mode, HOST_WIDE_INT);
 extern enum reg_class loongarch_secondary_reload_class (enum reg_class,
 							machine_mode,
 							rtx, bool);
@@ -129,6 +146,7 @@ extern const char *loongarch_output_equal_conditional_branch (rtx_insn *,
 							      rtx *,
 							      bool);
 extern const char *loongarch_output_division (const char *, rtx *);
+extern const char *loongarch_lsx_output_division (const char *, rtx *);
 extern const char *loongarch_output_probe_stack_range (rtx, rtx, rtx);
 extern bool loongarch_hard_regno_rename_ok (unsigned int, unsigned int);
 extern int loongarch_dspalu_bypass_p (rtx, rtx);
@@ -156,6 +174,13 @@ union loongarch_gen_fn_ptrs
 extern void loongarch_expand_atomic_qihi (union loongarch_gen_fn_ptrs,
 					  rtx, rtx, rtx, rtx, rtx);
 
+extern void loongarch_expand_vector_init (rtx, rtx);
+extern void loongarch_expand_vec_unpack (rtx op[2], bool, bool);
+extern void loongarch_expand_vec_perm (rtx, rtx, rtx, rtx);
+extern void loongarch_expand_vector_extract (rtx, rtx, int);
+extern void loongarch_expand_vector_reduc (rtx (*)(rtx, rtx, rtx), rtx, rtx);
+
+extern int loongarch_ldst_scaled_shift (machine_mode);
 extern bool loongarch_signed_immediate_p (unsigned HOST_WIDE_INT, int, int);
 extern bool loongarch_unsigned_immediate_p (unsigned HOST_WIDE_INT, int, int);
 extern bool loongarch_12bit_offset_address_p (rtx, machine_mode);
@@ -171,6 +196,9 @@ extern bool loongarch_split_symbol_type (enum loongarch_symbol_type);
 typedef rtx (*mulsidi3_gen_fn) (rtx, rtx, rtx);
 
 extern void loongarch_register_frame_header_opt (void);
+extern void loongarch_expand_vec_cond_expr (machine_mode, machine_mode, rtx *);
+extern void loongarch_expand_vec_cond_mask_expr (machine_mode, machine_mode,
+						 rtx *);
 
 /* Routines implemented in loongarch-c.c.  */
 void loongarch_cpu_cpp_builtins (cpp_reader *);
@@ -180,6 +208,9 @@ extern void loongarch_atomic_assign_expand_fenv (tree *, tree *, tree *);
 extern tree loongarch_builtin_decl (unsigned int, bool);
 extern rtx loongarch_expand_builtin (tree, rtx, rtx subtarget ATTRIBUTE_UNUSED,
 				     machine_mode, int);
+extern tree loongarch_builtin_vectorized_function (unsigned int, tree, tree);
+extern rtx loongarch_gen_const_int_vector_shuffle (machine_mode, int);
 extern tree loongarch_build_builtin_va_list (void);
 
+extern rtx loongarch_build_signbit_mask (machine_mode, bool, bool);
 #endif /* ! GCC_LOONGARCH_PROTOS_H */
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 86d58784113..7ffa6bbb73d 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -432,7 +432,7 @@ loongarch_flatten_aggregate_argument (const_tree type,
 
 static unsigned
 loongarch_pass_aggregate_num_fpr (const_tree type,
-					loongarch_aggregate_field fields[2])
+				  loongarch_aggregate_field fields[2])
 {
   int n = loongarch_flatten_aggregate_argument (type, fields);
 
@@ -773,7 +773,7 @@ loongarch_setup_incoming_varargs (cumulative_args_t cum,
     {
       rtx ptr = plus_constant (Pmode, virtual_incoming_args_rtx,
 			       REG_PARM_STACK_SPACE (cfun->decl)
-				 - gp_saved * UNITS_PER_WORD);
+			       - gp_saved * UNITS_PER_WORD);
       rtx mem = gen_frame_mem (BLKmode, ptr);
       set_mem_alias_set (mem, get_varargs_alias_set ());
 
@@ -1049,7 +1049,7 @@ rtx
 loongarch_emit_move (rtx dest, rtx src)
 {
   return (can_create_pseudo_p () ? emit_move_insn (dest, src)
-				 : emit_move_insn_1 (dest, src));
+	  : emit_move_insn_1 (dest, src));
 }
 
 /* Save register REG to MEM.  Make the instruction frame-related.  */
@@ -1675,6 +1675,140 @@ loongarch_symbol_binds_local_p (const_rtx x)
     return false;
 }
 
+/* Return true if OP is a constant vector with the number of units in MODE,
+   and each unit has the same bit set.  */
+
+bool
+loongarch_const_vector_bitimm_set_p (rtx op, machine_mode mode)
+{
+  if (GET_CODE (op) == CONST_VECTOR && op != CONST0_RTX (mode))
+    {
+      unsigned HOST_WIDE_INT val = UINTVAL (CONST_VECTOR_ELT (op, 0));
+      int vlog2 = exact_log2 (val & GET_MODE_MASK (GET_MODE_INNER (mode)));
+
+      if (vlog2 != -1)
+	{
+	  gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_INT);
+	  gcc_assert (vlog2 >= 0 && vlog2 <= GET_MODE_UNIT_BITSIZE (mode) - 1);
+	  return loongarch_const_vector_same_val_p (op, mode);
+	}
+    }
+
+  return false;
+}
+
+/* Return true if OP is a constant vector with the number of units in MODE,
+   and each unit has the same bit clear.  */
+
+bool
+loongarch_const_vector_bitimm_clr_p (rtx op, machine_mode mode)
+{
+  if (GET_CODE (op) == CONST_VECTOR && op != CONSTM1_RTX (mode))
+    {
+      unsigned HOST_WIDE_INT val = ~UINTVAL (CONST_VECTOR_ELT (op, 0));
+      int vlog2 = exact_log2 (val & GET_MODE_MASK (GET_MODE_INNER (mode)));
+
+      if (vlog2 != -1)
+	{
+	  gcc_assert (GET_MODE_CLASS (mode) == MODE_VECTOR_INT);
+	  gcc_assert (vlog2 >= 0 && vlog2 <= GET_MODE_UNIT_BITSIZE (mode) - 1);
+	  return loongarch_const_vector_same_val_p (op, mode);
+	}
+    }
+
+  return false;
+}
+
+/* Return true if OP is a constant vector with the number of units in MODE,
+   and each unit has the same value.  */
+
+bool
+loongarch_const_vector_same_val_p (rtx op, machine_mode mode)
+{
+  int i, nunits = GET_MODE_NUNITS (mode);
+  rtx first;
+
+  if (GET_CODE (op) != CONST_VECTOR || GET_MODE (op) != mode)
+    return false;
+
+  first = CONST_VECTOR_ELT (op, 0);
+  for (i = 1; i < nunits; i++)
+    if (!rtx_equal_p (first, CONST_VECTOR_ELT (op, i)))
+      return false;
+
+  return true;
+}
+
+/* Return true if OP is a constant vector with the number of units in MODE,
+   and each unit has the same value as well as replicated bytes in the value.
+*/
+
+bool
+loongarch_const_vector_same_bytes_p (rtx op, machine_mode mode)
+{
+  int i, bytes;
+  HOST_WIDE_INT val, first_byte;
+  rtx first;
+
+  if (!loongarch_const_vector_same_val_p (op, mode))
+    return false;
+
+  first = CONST_VECTOR_ELT (op, 0);
+  bytes = GET_MODE_UNIT_SIZE (mode);
+  val = INTVAL (first);
+  first_byte = val & 0xff;
+  for (i = 1; i < bytes; i++)
+    {
+      val >>= 8;
+      if ((val & 0xff) != first_byte)
+	return false;
+    }
+
+  return true;
+}
+
+/* Return true if OP is a constant vector with the number of units in MODE,
+   and each unit has the same integer value in the range [LOW, HIGH].  */
+
+bool
+loongarch_const_vector_same_int_p (rtx op, machine_mode mode, HOST_WIDE_INT low,
+				   HOST_WIDE_INT high)
+{
+  HOST_WIDE_INT value;
+  rtx elem0;
+
+  if (!loongarch_const_vector_same_val_p (op, mode))
+    return false;
+
+  elem0 = CONST_VECTOR_ELT (op, 0);
+  if (!CONST_INT_P (elem0))
+    return false;
+
+  value = INTVAL (elem0);
+  return (value >= low && value <= high);
+}
+
+/* Return true if OP is a constant vector with repeated 4-element sets
+   in mode MODE.  */
+
+bool
+loongarch_const_vector_shuffle_set_p (rtx op, machine_mode mode)
+{
+  int nunits = GET_MODE_NUNITS (mode);
+  int nsets = nunits / 4;
+  int set = 0;
+  int i, j;
+
+  /* Check if we have the same 4-element sets.  */
+  for (j = 0; j < nsets; j++, set = 4 * j)
+    for (i = 0; i < 4; i++)
+      if ((INTVAL (XVECEXP (op, 0, i))
+	   != (INTVAL (XVECEXP (op, 0, set + i)) - set))
+	  || !IN_RANGE (INTVAL (XVECEXP (op, 0, set + i)), 0, set + 3))
+	return false;
+  return true;
+}
+
 /* Return true if rtx constants of mode MODE should be put into a small
    data section.  */
 
@@ -1792,6 +1926,11 @@ loongarch_symbolic_constant_p (rtx x, enum loongarch_symbol_type *symbol_type)
 static int
 loongarch_symbol_insns (enum loongarch_symbol_type type, machine_mode mode)
 {
+  /* LSX LD.* and ST.* cannot support loading symbols via an immediate
+     operand.  */
+  if (LSX_SUPPORTED_MODE_P (mode))
+    return 0;
+
   switch (type)
     {
     case SYMBOL_GOT_DISP:
@@ -1838,7 +1977,8 @@ loongarch_cannot_force_const_mem (machine_mode mode, rtx x)
      references, reload will consider forcing C into memory and using
      one of the instruction's memory alternatives.  Returning false
      here will force it to use an input reload instead.  */
-  if (CONST_INT_P (x) && loongarch_legitimate_constant_p (mode, x))
+  if ((CONST_INT_P (x) || GET_CODE (x) == CONST_VECTOR)
+      && loongarch_legitimate_constant_p (mode, x))
     return true;
 
   split_const (x, &base, &offset);
@@ -1915,6 +2055,12 @@ loongarch_valid_offset_p (rtx x, machine_mode mode)
       && !IMM12_OPERAND (INTVAL (x) + GET_MODE_SIZE (mode) - UNITS_PER_WORD))
     return false;
 
+  /* LSX LD.* and ST.* supports 10-bit signed offsets.  */
+  if (LSX_SUPPORTED_MODE_P (mode)
+      && !loongarch_signed_immediate_p (INTVAL (x), 10,
+					loongarch_ldst_scaled_shift (mode)))
+    return false;
+
   return true;
 }
 
@@ -1999,7 +2145,7 @@ loongarch_valid_lo_sum_p (enum loongarch_symbol_type symbol_type,
 
 static bool
 loongarch_valid_index_p (struct loongarch_address_info *info, rtx x,
-			  machine_mode mode, bool strict_p)
+			 machine_mode mode, bool strict_p)
 {
   rtx index;
 
@@ -2052,7 +2198,7 @@ loongarch_classify_address (struct loongarch_address_info *info, rtx x,
 	}
 
       if (loongarch_valid_base_register_p (XEXP (x, 1), mode, strict_p)
-	 && loongarch_valid_index_p (info, XEXP (x, 0), mode, strict_p))
+	  && loongarch_valid_index_p (info, XEXP (x, 0), mode, strict_p))
 	{
 	  info->reg = XEXP (x, 1);
 	  return true;
@@ -2128,6 +2274,7 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p)
 {
   struct loongarch_address_info addr;
   int factor;
+  bool lsx_p = !might_split_p && LSX_SUPPORTED_MODE_P (mode);
 
   if (!loongarch_classify_address (&addr, x, mode, false))
     return 0;
@@ -2145,15 +2292,29 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p)
     switch (addr.type)
       {
       case ADDRESS_REG:
+	if (lsx_p)
+	  {
+	    /* LSX LD.* and ST.* supports 10-bit signed offsets.  */
+	    if (loongarch_signed_immediate_p (INTVAL (addr.offset), 10,
+					      loongarch_ldst_scaled_shift (mode)))
+	      return 1;
+	    else
+	      return 0;
+	  }
+	return factor;
+
       case ADDRESS_REG_REG:
-      case ADDRESS_CONST_INT:
 	return factor;
 
+      case ADDRESS_CONST_INT:
+	return lsx_p ? 0 : factor;
+
       case ADDRESS_LO_SUM:
 	return factor + 1;
 
       case ADDRESS_SYMBOLIC:
-	return factor * loongarch_symbol_insns (addr.symbol_type, mode);
+	return lsx_p ? 0
+	  : factor * loongarch_symbol_insns (addr.symbol_type, mode);
       }
   return 0;
 }
@@ -2179,6 +2340,19 @@ loongarch_signed_immediate_p (unsigned HOST_WIDE_INT x, int bits,
   return loongarch_unsigned_immediate_p (x, bits, shift);
 }
 
+/* Return the scale shift that applied to LSX LD/ST address offset.  */
+
+int
+loongarch_ldst_scaled_shift (machine_mode mode)
+{
+  int shift = exact_log2 (GET_MODE_UNIT_SIZE (mode));
+
+  if (shift < 0 || shift > 8)
+    gcc_unreachable ();
+
+  return shift;
+}
+
 /* Return true if X is a legitimate address with a 12-bit offset
    or addr.type is ADDRESS_LO_SUM.
    MODE is the mode of the value being accessed.  */
@@ -2246,6 +2420,9 @@ loongarch_const_insns (rtx x)
       return loongarch_integer_cost (INTVAL (x));
 
     case CONST_VECTOR:
+      if (LSX_SUPPORTED_MODE_P (GET_MODE (x))
+	  && loongarch_const_vector_same_int_p (x, GET_MODE (x), -512, 511))
+	return 1;
       /* Fall through.  */
     case CONST_DOUBLE:
       return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
@@ -2280,7 +2457,7 @@ loongarch_const_insns (rtx x)
     case SYMBOL_REF:
     case LABEL_REF:
       return loongarch_symbol_insns (
-	loongarch_classify_symbol (x), MAX_MACHINE_MODE);
+		loongarch_classify_symbol (x), MAX_MACHINE_MODE);
 
     default:
       return 0;
@@ -2302,7 +2479,26 @@ loongarch_split_const_insns (rtx x)
   return low + high;
 }
 
-static bool loongarch_split_move_insn_p (rtx dest, rtx src);
+bool loongarch_split_move_insn_p (rtx dest, rtx src);
+/* Return one word of 128-bit value OP, taking into account the fixed
+   endianness of certain registers.  BYTE selects from the byte address.  */
+
+rtx
+loongarch_subword_at_byte (rtx op, unsigned int byte)
+{
+  machine_mode mode;
+
+  mode = GET_MODE (op);
+  if (mode == VOIDmode)
+    mode = TImode;
+
+  gcc_assert (!FP_REG_RTX_P (op));
+
+  if (MEM_P (op))
+    return loongarch_rewrite_small_data (adjust_address (op, word_mode, byte));
+
+  return simplify_gen_subreg (word_mode, op, mode, byte);
+}
 
 /* Return the number of instructions needed to implement INSN,
    given that it loads from or stores to MEM.  */
@@ -3063,9 +3259,10 @@ loongarch_legitimize_move (machine_mode mode, rtx dest, rtx src)
 
   /* Both src and dest are non-registers;  one special case is supported where
      the source is (const_int 0) and the store can source the zero register.
-     */
+     LSX is never able to source the zero register directly in
+     memory operations.  */
   if (!register_operand (dest, mode) && !register_operand (src, mode)
-      && !const_0_operand (src, mode))
+      && (!const_0_operand (src, mode) || LSX_SUPPORTED_MODE_P (mode)))
     {
       loongarch_emit_move (dest, force_reg (mode, src));
       return true;
@@ -3637,6 +3834,54 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code,
     }
 }
 
+/* Vectorizer cost model implementation.  */
+
+/* Implement targetm.vectorize.builtin_vectorization_cost.  */
+
+static int
+loongarch_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
+				      tree vectype,
+				      int misalign ATTRIBUTE_UNUSED)
+{
+  unsigned elements;
+
+  switch (type_of_cost)
+    {
+      case scalar_stmt:
+      case scalar_load:
+      case vector_stmt:
+      case vector_load:
+      case vec_to_scalar:
+      case scalar_to_vec:
+      case cond_branch_not_taken:
+      case vec_promote_demote:
+      case scalar_store:
+      case vector_store:
+	return 1;
+
+      case vec_perm:
+	return 1;
+
+      case unaligned_load:
+      case vector_gather_load:
+	return 2;
+
+      case unaligned_store:
+      case vector_scatter_store:
+	return 10;
+
+      case cond_branch_taken:
+	return 3;
+
+      case vec_construct:
+	elements = TYPE_VECTOR_SUBPARTS (vectype);
+	return elements / 2 + 1;
+
+      default:
+	gcc_unreachable ();
+    }
+}
+
 /* Implement TARGET_ADDRESS_COST.  */
 
 static int
@@ -3691,6 +3936,11 @@ loongarch_split_move_p (rtx dest, rtx src)
       if (FP_REG_RTX_P (src) && MEM_P (dest))
 	return false;
     }
+
+  /* Check if LSX moves need splitting.  */
+  if (LSX_SUPPORTED_MODE_P (GET_MODE (dest)))
+    return loongarch_split_128bit_move_p (dest, src);
+
   /* Otherwise split all multiword moves.  */
   return size > UNITS_PER_WORD;
 }
@@ -3704,7 +3954,9 @@ loongarch_split_move (rtx dest, rtx src, rtx insn_)
   rtx low_dest;
 
   gcc_checking_assert (loongarch_split_move_p (dest, src));
-  if (FP_REG_RTX_P (dest) || FP_REG_RTX_P (src))
+  if (LSX_SUPPORTED_MODE_P (GET_MODE (dest)))
+    loongarch_split_128bit_move (dest, src);
+  else if (FP_REG_RTX_P (dest) || FP_REG_RTX_P (src))
     {
       if (!TARGET_64BIT && GET_MODE (dest) == DImode)
 	emit_insn (gen_move_doubleword_fprdi (dest, src));
@@ -3808,12 +4060,21 @@ loongarch_split_plus_constant (rtx *op, machine_mode mode)
 
 /* Return true if a move from SRC to DEST in INSN should be split.  */
 
-static bool
+bool
 loongarch_split_move_insn_p (rtx dest, rtx src)
 {
   return loongarch_split_move_p (dest, src);
 }
 
+/* Split a move from SRC to DEST in INSN, given that
+   loongarch_split_move_insn_p holds.  */
+
+void
+loongarch_split_move_insn (rtx dest, rtx src, rtx insn)
+{
+  loongarch_split_move (dest, src, insn);
+}
+
 /* Implement TARGET_CONSTANT_ALIGNMENT.  */
 
 static HOST_WIDE_INT
@@ -3860,7 +4121,7 @@ const char *
 loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr)
 {
   int index = exact_log2 (GET_MODE_SIZE (mode));
-  if (!IN_RANGE (index, 2, 3))
+  if (!IN_RANGE (index, 2, 4))
     return NULL;
 
   struct loongarch_address_info info;
@@ -3869,20 +4130,216 @@ loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr)
       || !loongarch_legitimate_address_p (mode, x, false))
     return NULL;
 
-  const char *const insn[][2] =
+  const char *const insn[][3] =
     {
 	{
 	  "fstx.s\t%1,%0",
-	  "fstx.d\t%1,%0"
+	  "fstx.d\t%1,%0",
+	  "vstx\t%w1,%0"
 	},
 	{
 	  "fldx.s\t%0,%1",
-	  "fldx.d\t%0,%1"
-	},
+	  "fldx.d\t%0,%1",
+	  "vldx\t%w0,%1"
+	}
     };
 
   return insn[ldr][index-2];
 }
+/* Return true if a 128-bit move from SRC to DEST should be split.  */
+
+bool
+loongarch_split_128bit_move_p (rtx dest, rtx src)
+{
+  /* LSX-to-LSX moves can be done in a single instruction.  */
+  if (FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
+    return false;
+
+  /* Check for LSX loads and stores.  */
+  if (FP_REG_RTX_P (dest) && MEM_P (src))
+    return false;
+  if (FP_REG_RTX_P (src) && MEM_P (dest))
+    return false;
+
+  /* Check for LSX set to an immediate const vector with valid replicated
+     element.  */
+  if (FP_REG_RTX_P (dest)
+      && loongarch_const_vector_same_int_p (src, GET_MODE (src), -512, 511))
+    return false;
+
+  /* Check for LSX load zero immediate.  */
+  if (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))
+    return false;
+
+  return true;
+}
+
+/* Split a 128-bit move from SRC to DEST.  */
+
+void
+loongarch_split_128bit_move (rtx dest, rtx src)
+{
+  int byte, index;
+  rtx low_dest, low_src, d, s;
+
+  if (FP_REG_RTX_P (dest))
+    {
+      gcc_assert (!MEM_P (src));
+
+      rtx new_dest = dest;
+      if (!TARGET_64BIT)
+	{
+	  if (GET_MODE (dest) != V4SImode)
+	    new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0);
+	}
+      else
+	{
+	  if (GET_MODE (dest) != V2DImode)
+	    new_dest = simplify_gen_subreg (V2DImode, dest, GET_MODE (dest), 0);
+	}
+
+      for (byte = 0, index = 0; byte < GET_MODE_SIZE (TImode);
+	   byte += UNITS_PER_WORD, index++)
+	{
+	  s = loongarch_subword_at_byte (src, byte);
+	  if (!TARGET_64BIT)
+	    emit_insn (gen_lsx_vinsgr2vr_w (new_dest, s, new_dest,
+					    GEN_INT (1 << index)));
+	  else
+	    emit_insn (gen_lsx_vinsgr2vr_d (new_dest, s, new_dest,
+					    GEN_INT (1 << index)));
+	}
+    }
+  else if (FP_REG_RTX_P (src))
+    {
+      gcc_assert (!MEM_P (dest));
+
+      rtx new_src = src;
+      if (!TARGET_64BIT)
+	{
+	  if (GET_MODE (src) != V4SImode)
+	    new_src = simplify_gen_subreg (V4SImode, src, GET_MODE (src), 0);
+	}
+      else
+	{
+	  if (GET_MODE (src) != V2DImode)
+	    new_src = simplify_gen_subreg (V2DImode, src, GET_MODE (src), 0);
+	}
+
+      for (byte = 0, index = 0; byte < GET_MODE_SIZE (TImode);
+	   byte += UNITS_PER_WORD, index++)
+	{
+	  d = loongarch_subword_at_byte (dest, byte);
+	  if (!TARGET_64BIT)
+	    emit_insn (gen_lsx_vpickve2gr_w (d, new_src, GEN_INT (index)));
+	  else
+	    emit_insn (gen_lsx_vpickve2gr_d (d, new_src, GEN_INT (index)));
+	}
+    }
+  else
+    {
+      low_dest = loongarch_subword_at_byte (dest, 0);
+      low_src = loongarch_subword_at_byte (src, 0);
+      gcc_assert (REG_P (low_dest) && REG_P (low_src));
+      /* Make sure the source register is not written before reading.  */
+      if (REGNO (low_dest) <= REGNO (low_src))
+	{
+	  for (byte = 0; byte < GET_MODE_SIZE (TImode);
+	       byte += UNITS_PER_WORD)
+	    {
+	      d = loongarch_subword_at_byte (dest, byte);
+	      s = loongarch_subword_at_byte (src, byte);
+	      loongarch_emit_move (d, s);
+	    }
+	}
+      else
+	{
+	  for (byte = GET_MODE_SIZE (TImode) - UNITS_PER_WORD; byte >= 0;
+	       byte -= UNITS_PER_WORD)
+	    {
+	      d = loongarch_subword_at_byte (dest, byte);
+	      s = loongarch_subword_at_byte (src, byte);
+	      loongarch_emit_move (d, s);
+	    }
+	}
+    }
+}
+
+
+/* Split a COPY_S.D with operands DEST, SRC and INDEX.  GEN is a function
+   used to generate subregs.  */
+
+void
+loongarch_split_lsx_copy_d (rtx dest, rtx src, rtx index,
+			    rtx (*gen_fn)(rtx, rtx, rtx))
+{
+  gcc_assert ((GET_MODE (src) == V2DImode && GET_MODE (dest) == DImode)
+	      || (GET_MODE (src) == V2DFmode && GET_MODE (dest) == DFmode));
+
+  /* Note that low is always from the lower index, and high is always
+     from the higher index.  */
+  rtx low = loongarch_subword (dest, false);
+  rtx high = loongarch_subword (dest, true);
+  rtx new_src = simplify_gen_subreg (V4SImode, src, GET_MODE (src), 0);
+
+  emit_insn (gen_fn (low, new_src, GEN_INT (INTVAL (index) * 2)));
+  emit_insn (gen_fn (high, new_src, GEN_INT (INTVAL (index) * 2 + 1)));
+}
+
+/* Split a INSERT.D with operand DEST, SRC1.INDEX and SRC2.  */
+
+void
+loongarch_split_lsx_insert_d (rtx dest, rtx src1, rtx index, rtx src2)
+{
+  int i;
+  gcc_assert (GET_MODE (dest) == GET_MODE (src1));
+  gcc_assert ((GET_MODE (dest) == V2DImode
+	       && (GET_MODE (src2) == DImode || src2 == const0_rtx))
+	      || (GET_MODE (dest) == V2DFmode && GET_MODE (src2) == DFmode));
+
+  /* Note that low is always from the lower index, and high is always
+     from the higher index.  */
+  rtx low = loongarch_subword (src2, false);
+  rtx high = loongarch_subword (src2, true);
+  rtx new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0);
+  rtx new_src1 = simplify_gen_subreg (V4SImode, src1, GET_MODE (src1), 0);
+  i = exact_log2 (INTVAL (index));
+  gcc_assert (i != -1);
+
+  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, low, new_src1,
+				  GEN_INT (1 << (i * 2))));
+  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, high, new_dest,
+				  GEN_INT (1 << (i * 2 + 1))));
+}
+
+/* Split FILL.D.  */
+
+void
+loongarch_split_lsx_fill_d (rtx dest, rtx src)
+{
+  gcc_assert ((GET_MODE (dest) == V2DImode
+	       && (GET_MODE (src) == DImode || src == const0_rtx))
+	      || (GET_MODE (dest) == V2DFmode && GET_MODE (src) == DFmode));
+
+  /* Note that low is always from the lower index, and high is always
+     from the higher index.  */
+  rtx low, high;
+  if (src == const0_rtx)
+    {
+      low = src;
+      high = src;
+    }
+  else
+    {
+      low = loongarch_subword (src, false);
+      high = loongarch_subword (src, true);
+    }
+  rtx new_dest = simplify_gen_subreg (V4SImode, dest, GET_MODE (dest), 0);
+  emit_insn (gen_lsx_vreplgr2vr_w (new_dest, low));
+  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, high, new_dest, GEN_INT (1 << 1)));
+  emit_insn (gen_lsx_vinsgr2vr_w (new_dest, high, new_dest, GEN_INT (1 << 3)));
+}
+
 
 /* Return the appropriate instructions to move SRC into DEST.  Assume
    that SRC is operand 1 and DEST is operand 0.  */
@@ -3894,10 +4351,25 @@ loongarch_output_move (rtx dest, rtx src)
   enum rtx_code src_code = GET_CODE (src);
   machine_mode mode = GET_MODE (dest);
   bool dbl_p = (GET_MODE_SIZE (mode) == 8);
+  bool lsx_p = LSX_SUPPORTED_MODE_P (mode);
 
   if (loongarch_split_move_p (dest, src))
     return "#";
 
+  if ((lsx_p)
+      && dest_code == REG && FP_REG_P (REGNO (dest))
+      && src_code == CONST_VECTOR
+      && CONST_INT_P (CONST_VECTOR_ELT (src, 0)))
+    {
+      gcc_assert (loongarch_const_vector_same_int_p (src, mode, -512, 511));
+      switch (GET_MODE_SIZE (mode))
+	{
+	case 16:
+	  return "vrepli.%v0\t%w0,%E1";
+	default: gcc_unreachable ();
+	}
+    }
+
   if ((src_code == REG && GP_REG_P (REGNO (src)))
       || (src == CONST0_RTX (mode)))
     {
@@ -3907,7 +4379,21 @@ loongarch_output_move (rtx dest, rtx src)
 	    return "or\t%0,%z1,$r0";
 
 	  if (FP_REG_P (REGNO (dest)))
-	    return dbl_p ? "movgr2fr.d\t%0,%z1" : "movgr2fr.w\t%0,%z1";
+	    {
+	      if (lsx_p)
+		{
+		  gcc_assert (src == CONST0_RTX (GET_MODE (src)));
+		  switch (GET_MODE_SIZE (mode))
+		    {
+		    case 16:
+		      return "vrepli.b\t%w0,0";
+		    default:
+		      gcc_unreachable ();
+		    }
+		}
+
+	      return dbl_p ? "movgr2fr.d\t%0,%z1" : "movgr2fr.w\t%0,%z1";
+	    }
 	}
       if (dest_code == MEM)
 	{
@@ -3949,7 +4435,10 @@ loongarch_output_move (rtx dest, rtx src)
     {
       if (src_code == REG)
 	if (FP_REG_P (REGNO (src)))
-	  return dbl_p ? "movfr2gr.d\t%0,%1" : "movfr2gr.s\t%0,%1";
+	  {
+	    gcc_assert (!lsx_p);
+	    return dbl_p ? "movfr2gr.d\t%0,%1" : "movfr2gr.s\t%0,%1";
+	  }
 
       if (src_code == MEM)
 	{
@@ -3994,7 +4483,7 @@ loongarch_output_move (rtx dest, rtx src)
 	  enum loongarch_symbol_type type = SYMBOL_PCREL;
 
 	  if (UNSPEC_ADDRESS_P (x))
-	     type = UNSPEC_ADDRESS_TYPE (x);
+	    type = UNSPEC_ADDRESS_TYPE (x);
 
 	  if (type == SYMBOL_TLS_LE)
 	    return "lu12i.w\t%0,%h1";
@@ -4029,7 +4518,20 @@ loongarch_output_move (rtx dest, rtx src)
   if (src_code == REG && FP_REG_P (REGNO (src)))
     {
       if (dest_code == REG && FP_REG_P (REGNO (dest)))
-	return dbl_p ? "fmov.d\t%0,%1" : "fmov.s\t%0,%1";
+	{
+	  if (lsx_p)
+	    {
+	      switch (GET_MODE_SIZE (mode))
+		{
+		case 16:
+		  return "vori.b\t%w0,%w1,0";
+		default:
+		  gcc_unreachable ();
+		}
+	    }
+
+	  return dbl_p ? "fmov.d\t%0,%1" : "fmov.s\t%0,%1";
+	}
 
       if (dest_code == MEM)
 	{
@@ -4040,6 +4542,17 @@ loongarch_output_move (rtx dest, rtx src)
 	  if (insn)
 	    return insn;
 
+	  if (lsx_p)
+	    {
+	      switch (GET_MODE_SIZE (mode))
+		{
+		case 16:
+		  return "vst\t%w1,%0";
+		default:
+		  gcc_unreachable ();
+		}
+	    }
+
 	  return dbl_p ? "fst.d\t%1,%0" : "fst.s\t%1,%0";
 	}
     }
@@ -4055,6 +4568,16 @@ loongarch_output_move (rtx dest, rtx src)
 	  if (insn)
 	    return insn;
 
+	  if (lsx_p)
+	    {
+	      switch (GET_MODE_SIZE (mode))
+		{
+		case 16:
+		  return "vld\t%w0,%1";
+		default:
+		  gcc_unreachable ();
+		}
+	    }
 	  return dbl_p ? "fld.d\t%0,%1" : "fld.s\t%0,%1";
 	}
     }
@@ -4244,6 +4767,7 @@ loongarch_extend_comparands (rtx_code code, rtx *op0, rtx *op1)
     }
 }
 
+
 /* Convert a comparison into something that can be used in a branch.  On
    entry, *OP0 and *OP1 are the values being compared and *CODE is the code
    used to compare them.  Update them to describe the final comparison.  */
@@ -5003,9 +5527,12 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool hi64_part,
 
    'A'	Print a _DB suffix if the memory model requires a release.
    'b'	Print the address of a memory operand, without offset.
+   'B'	Print CONST_INT OP element 0 of a replicated CONST_VECTOR
+	  as an unsigned byte [0..255].
    'c'  Print an integer.
    'C'	Print the integer branch condition for comparison OP.
    'd'	Print CONST_INT OP in decimal.
+   'E'	Print CONST_INT OP element 0 of a replicated CONST_VECTOR in decimal.
    'F'	Print the FPU branch condition for comparison OP.
    'G'	Print a DBAR insn if the memory model requires a release.
    'H'  Print address 52-61bit relocation associated with OP.
@@ -5021,13 +5548,16 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool hi64_part,
    't'	Like 'T', but with the EQ/NE cases reversed
    'V'	Print exact log2 of CONST_INT OP element 0 of a replicated
 	  CONST_VECTOR in decimal.
+   'v'	Print the insn size suffix b, h, w or d for vector modes V16QI, V8HI,
+	  V4SI, V2SI, and w, d for vector modes V4SF, V2DF respectively.
    'W'	Print the inverse of the FPU branch condition for comparison OP.
+   'w'	Print a LSX register.
    'X'	Print CONST_INT OP in hexadecimal format.
    'x'	Print the low 16 bits of CONST_INT OP in hexadecimal format.
    'Y'	Print loongarch_fp_conditions[INTVAL (OP)]
    'y'	Print exact log2 of CONST_INT OP in decimal.
    'Z'	Print OP and a comma for 8CC, otherwise print nothing.
-   'z'	Print $0 if OP is zero, otherwise print OP normally.  */
+   'z'	Print $r0 if OP is zero, otherwise print OP normally.  */
 
 static void
 loongarch_print_operand (FILE *file, rtx op, int letter)
@@ -5049,6 +5579,18 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
       if (loongarch_memmodel_needs_rel_acq_fence ((enum memmodel) INTVAL (op)))
        fputs ("_db", file);
       break;
+    case 'E':
+      if (GET_CODE (op) == CONST_VECTOR)
+	{
+	  gcc_assert (loongarch_const_vector_same_val_p (op, GET_MODE (op)));
+	  op = CONST_VECTOR_ELT (op, 0);
+	  gcc_assert (CONST_INT_P (op));
+	  fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
+	}
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
 
     case 'c':
       if (CONST_INT_P (op))
@@ -5099,6 +5641,18 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
       loongarch_print_operand_reloc (file, op, false /* hi64_part*/,
 				     false /* lo_reloc */);
       break;
+    case 'B':
+      if (GET_CODE (op) == CONST_VECTOR)
+	{
+	  gcc_assert (loongarch_const_vector_same_val_p (op, GET_MODE (op)));
+	  op = CONST_VECTOR_ELT (op, 0);
+	  gcc_assert (CONST_INT_P (op));
+	  unsigned HOST_WIDE_INT val8 = UINTVAL (op) & GET_MODE_MASK (QImode);
+	  fprintf (file, HOST_WIDE_INT_PRINT_UNSIGNED, val8);
+	}
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
 
     case 'm':
       if (CONST_INT_P (op))
@@ -5145,10 +5699,45 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
 	output_operand_lossage ("invalid use of '%%%c'", letter);
       break;
 
-    case 'W':
-      loongarch_print_float_branch_condition (file, reverse_condition (code),
-					      letter);
-      break;
+    case 'v':
+      switch (GET_MODE (op))
+	{
+	case E_V16QImode:
+	case E_V32QImode:
+	  fprintf (file, "b");
+	  break;
+	case E_V8HImode:
+	case E_V16HImode:
+	  fprintf (file, "h");
+	  break;
+	case E_V4SImode:
+	case E_V4SFmode:
+	case E_V8SImode:
+	case E_V8SFmode:
+	  fprintf (file, "w");
+	  break;
+	case E_V2DImode:
+	case E_V2DFmode:
+	case E_V4DImode:
+	case E_V4DFmode:
+	  fprintf (file, "d");
+	  break;
+	default:
+	  output_operand_lossage ("invalid use of '%%%c'", letter);
+	}
+      break;
+
+    case 'W':
+      loongarch_print_float_branch_condition (file, reverse_condition (code),
+					      letter);
+      break;
+
+    case 'w':
+      if (code == REG && LSX_REG_P (REGNO (op)))
+	fprintf (file, "$vr%s", &reg_names[REGNO (op)][2]);
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
 
     case 'x':
       if (CONST_INT_P (op))
@@ -5521,9 +6110,13 @@ loongarch_hard_regno_mode_ok_uncached (unsigned int regno, machine_mode mode)
   size = GET_MODE_SIZE (mode);
   mclass = GET_MODE_CLASS (mode);
 
-  if (GP_REG_P (regno))
+  if (GP_REG_P (regno) && !LSX_SUPPORTED_MODE_P (mode))
     return ((regno - GP_REG_FIRST) & 1) == 0 || size <= UNITS_PER_WORD;
 
+  /* For LSX, allow TImode and 128-bit vector modes in all FPR.  */
+  if (FP_REG_P (regno) && LSX_SUPPORTED_MODE_P (mode))
+    return true;
+
   if (FP_REG_P (regno))
     {
       if (mclass == MODE_FLOAT
@@ -5550,6 +6143,17 @@ loongarch_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
   return loongarch_hard_regno_mode_ok_p[mode][regno];
 }
 
+
+static bool
+loongarch_hard_regno_call_part_clobbered (unsigned int,
+					  unsigned int regno, machine_mode mode)
+{
+  if (ISA_HAS_LSX && FP_REG_P (regno) && GET_MODE_SIZE (mode) > 8)
+    return true;
+
+  return false;
+}
+
 /* Implement TARGET_HARD_REGNO_NREGS.  */
 
 static unsigned int
@@ -5561,7 +6165,12 @@ loongarch_hard_regno_nregs (unsigned int regno, machine_mode mode)
     return (GET_MODE_SIZE (mode) + 3) / 4;
 
   if (FP_REG_P (regno))
-    return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG;
+    {
+      if (LSX_SUPPORTED_MODE_P (mode))
+	return 1;
+
+      return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG;
+    }
 
   /* All other registers are word-sized.  */
   return (GET_MODE_SIZE (mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
@@ -5588,8 +6197,12 @@ loongarch_class_max_nregs (enum reg_class rclass, machine_mode mode)
   if (hard_reg_set_intersect_p (left, reg_class_contents[(int) FP_REGS]))
     {
       if (loongarch_hard_regno_mode_ok (FP_REG_FIRST, mode))
-	size = MIN (size, UNITS_PER_FPREG);
-
+	{
+	  if (LSX_SUPPORTED_MODE_P (mode))
+	    size = MIN (size, UNITS_PER_LSX_REG);
+	  else
+	    size = MIN (size, UNITS_PER_FPREG);
+	}
       left &= ~reg_class_contents[FP_REGS];
     }
   if (!hard_reg_set_empty_p (left))
@@ -5600,9 +6213,13 @@ loongarch_class_max_nregs (enum reg_class rclass, machine_mode mode)
 /* Implement TARGET_CAN_CHANGE_MODE_CLASS.  */
 
 static bool
-loongarch_can_change_mode_class (machine_mode, machine_mode,
+loongarch_can_change_mode_class (machine_mode from, machine_mode to,
 				 reg_class_t rclass)
 {
+  /* Allow conversions between different LSX vector modes.  */
+  if (LSX_SUPPORTED_MODE_P (from) && LSX_SUPPORTED_MODE_P (to))
+    return true;
+
   return !reg_classes_intersect_p (FP_REGS, rclass);
 }
 
@@ -5622,7 +6239,7 @@ loongarch_mode_ok_for_mov_fmt_p (machine_mode mode)
       return TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT;
 
     default:
-      return 0;
+      return LSX_SUPPORTED_MODE_P (mode);
     }
 }
 
@@ -5779,7 +6396,12 @@ loongarch_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x,
       if (regno < 0
 	  || (MEM_P (x)
 	      && (GET_MODE_SIZE (mode) == 4 || GET_MODE_SIZE (mode) == 8)))
-	/* In this case we can use fld.s, fst.s, fld.d or fst.d.  */
+	/* In this case we can use lwc1, swc1, ldc1 or sdc1.  We'll use
+	   pairs of lwc1s and swc1s if ldc1 and sdc1 are not supported.  */
+	return NO_REGS;
+
+      if (MEM_P (x) && LSX_SUPPORTED_MODE_P (mode))
+	/* In this case we can use LSX LD.* and ST.*.  */
 	return NO_REGS;
 
       if (GP_REG_P (regno) || x == CONST0_RTX (mode))
@@ -5814,6 +6436,14 @@ loongarch_valid_pointer_mode (scalar_int_mode mode)
   return mode == SImode || (TARGET_64BIT && mode == DImode);
 }
 
+/* Implement TARGET_VECTOR_MODE_SUPPORTED_P.  */
+
+static bool
+loongarch_vector_mode_supported_p (machine_mode mode)
+{
+  return LSX_SUPPORTED_MODE_P (mode);
+}
+
 /* Implement TARGET_SCALAR_MODE_SUPPORTED_P.  */
 
 static bool
@@ -5826,6 +6456,48 @@ loongarch_scalar_mode_supported_p (scalar_mode mode)
   return default_scalar_mode_supported_p (mode);
 }
 
+/* Implement TARGET_VECTORIZE_PREFERRED_SIMD_MODE.  */
+
+static machine_mode
+loongarch_preferred_simd_mode (scalar_mode mode)
+{
+  if (!ISA_HAS_LSX)
+    return word_mode;
+
+  switch (mode)
+    {
+    case E_QImode:
+      return E_V16QImode;
+    case E_HImode:
+      return E_V8HImode;
+    case E_SImode:
+      return E_V4SImode;
+    case E_DImode:
+      return E_V2DImode;
+
+    case E_SFmode:
+      return E_V4SFmode;
+
+    case E_DFmode:
+      return E_V2DFmode;
+
+    default:
+      break;
+    }
+  return word_mode;
+}
+
+static unsigned int
+loongarch_autovectorize_vector_modes (vector_modes *modes, bool)
+{
+  if (ISA_HAS_LSX)
+    {
+      modes->safe_push (V16QImode);
+    }
+
+  return 0;
+}
+
 /* Return the assembly code for INSN, which has the operands given by
    OPERANDS, and which branches to OPERANDS[0] if some condition is true.
    BRANCH_IF_TRUE is the asm template that should be used if OPERANDS[0]
@@ -5990,6 +6662,29 @@ loongarch_output_division (const char *division, rtx *operands)
   return s;
 }
 
+/* Return the assembly code for LSX DIV_{S,U}.DF or MOD_{S,U}.DF instructions,
+   which has the operands given by OPERANDS.  Add in a divide-by-zero check
+   if needed.  */
+
+const char *
+loongarch_lsx_output_division (const char *division, rtx *operands)
+{
+  const char *s;
+
+  s = division;
+  if (TARGET_CHECK_ZERO_DIV)
+    {
+      if (ISA_HAS_LSX)
+	{
+	  output_asm_insn ("vsetallnez.%v0\t$fcc7,%w2",operands);
+	  output_asm_insn (s, operands);
+	  output_asm_insn ("bcnez\t$fcc7,1f", operands);
+	}
+      s = "break\t7\n1:";
+    }
+  return s;
+}
+
 /* Implement TARGET_SCHED_ADJUST_COST.  We assume that anti and output
    dependencies have no cost.  */
 
@@ -6259,6 +6954,9 @@ loongarch_option_override_internal (struct gcc_options *opts)
   if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib)
     error ("%qs cannot be used for compiling a shared library",
 	   "-mdirect-extern-access");
+  if (loongarch_vector_access_cost == 0)
+    loongarch_vector_access_cost = 5;
+
 
   switch (la_target.cmodel)
     {
@@ -6477,64 +7175,60 @@ loongarch_trampoline_init (rtx m_tramp, tree fndecl, rtx chain_value)
   emit_insn (gen_clear_cache (addr, end_addr));
 }
 
-/* Implement HARD_REGNO_CALLER_SAVE_MODE.  */
-
-machine_mode
-loongarch_hard_regno_caller_save_mode (unsigned int regno, unsigned int nregs,
-				       machine_mode mode)
-{
-  /* For performance, avoid saving/restoring upper parts of a register
-     by returning MODE as save mode when the mode is known.  */
-  if (mode == VOIDmode)
-    return choose_hard_reg_mode (regno, nregs, NULL);
-  else
-    return mode;
-}
+/* Generate or test for an insn that supports a constant permutation.  */
 
-/* Implement TARGET_SPILL_CLASS.  */
+#define MAX_VECT_LEN 32
 
-static reg_class_t
-loongarch_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
-		       machine_mode mode ATTRIBUTE_UNUSED)
+struct expand_vec_perm_d
 {
-  return NO_REGS;
-}
-
-/* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
+  rtx target, op0, op1;
+  unsigned char perm[MAX_VECT_LEN];
+  machine_mode vmode;
+  unsigned char nelt;
+  bool one_vector_p;
+  bool testing_p;
+};
 
-/* This function is equivalent to default_promote_function_mode_always_promote
-   except that it returns a promoted mode even if type is NULL_TREE.  This is
-   needed by libcalls which have no type (only a mode) such as fixed conversion
-   routines that take a signed or unsigned char/short argument and convert it
-   to a fixed type.  */
+/* Construct (set target (vec_select op0 (parallel perm))) and
+   return true if that's a valid instruction in the active ISA.  */
 
-static machine_mode
-loongarch_promote_function_mode (const_tree type ATTRIBUTE_UNUSED,
-				 machine_mode mode,
-				 int *punsignedp ATTRIBUTE_UNUSED,
-				 const_tree fntype ATTRIBUTE_UNUSED,
-				 int for_return ATTRIBUTE_UNUSED)
+static bool
+loongarch_expand_vselect (rtx target, rtx op0,
+			  const unsigned char *perm, unsigned nelt)
 {
-  int unsignedp;
+  rtx rperm[MAX_VECT_LEN], x;
+  rtx_insn *insn;
+  unsigned i;
 
-  if (type != NULL_TREE)
-    return promote_mode (type, mode, punsignedp);
+  for (i = 0; i < nelt; ++i)
+    rperm[i] = GEN_INT (perm[i]);
 
-  unsignedp = *punsignedp;
-  PROMOTE_MODE (mode, unsignedp, type);
-  *punsignedp = unsignedp;
-  return mode;
+  x = gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (nelt, rperm));
+  x = gen_rtx_VEC_SELECT (GET_MODE (target), op0, x);
+  x = gen_rtx_SET (target, x);
+
+  insn = emit_insn (x);
+  if (recog_memoized (insn) < 0)
+    {
+      remove_insn (insn);
+      return false;
+    }
+  return true;
 }
 
-/* Implement TARGET_STARTING_FRAME_OFFSET.  See loongarch_compute_frame_info
-   for details about the frame layout.  */
+/* Similar, but generate a vec_concat from op0 and op1 as well.  */
 
-static HOST_WIDE_INT
-loongarch_starting_frame_offset (void)
+static bool
+loongarch_expand_vselect_vconcat (rtx target, rtx op0, rtx op1,
+				  const unsigned char *perm, unsigned nelt)
 {
-  if (FRAME_GROWS_DOWNWARD)
-    return 0;
-  return crtl->outgoing_args_size;
+  machine_mode v2mode;
+  rtx x;
+
+  if (!GET_MODE_2XWIDER_MODE (GET_MODE (op0)).exists (&v2mode))
+    return false;
+  x = gen_rtx_VEC_CONCAT (v2mode, op0, op1);
+  return loongarch_expand_vselect (target, x, perm, nelt);
 }
 
 static tree
@@ -6797,105 +7491,1279 @@ loongarch_set_handled_components (sbitmap components)
 #define TARGET_ASM_ALIGNED_SI_OP "\t.word\t"
 #undef TARGET_ASM_ALIGNED_DI_OP
 #define TARGET_ASM_ALIGNED_DI_OP "\t.dword\t"
+/* Construct (set target (vec_select op0 (parallel selector))) and
+   return true if that's a valid instruction in the active ISA.  */
 
-#undef TARGET_OPTION_OVERRIDE
-#define TARGET_OPTION_OVERRIDE loongarch_option_override
-
-#undef TARGET_LEGITIMIZE_ADDRESS
-#define TARGET_LEGITIMIZE_ADDRESS loongarch_legitimize_address
-
-#undef TARGET_ASM_SELECT_RTX_SECTION
-#define TARGET_ASM_SELECT_RTX_SECTION loongarch_select_rtx_section
-#undef TARGET_ASM_FUNCTION_RODATA_SECTION
-#define TARGET_ASM_FUNCTION_RODATA_SECTION loongarch_function_rodata_section
+static bool
+loongarch_expand_lsx_shuffle (struct expand_vec_perm_d *d)
+{
+  rtx x, elts[MAX_VECT_LEN];
+  rtvec v;
+  rtx_insn *insn;
+  unsigned i;
 
-#undef TARGET_SCHED_INIT
-#define TARGET_SCHED_INIT loongarch_sched_init
-#undef TARGET_SCHED_REORDER
-#define TARGET_SCHED_REORDER loongarch_sched_reorder
-#undef TARGET_SCHED_REORDER2
-#define TARGET_SCHED_REORDER2 loongarch_sched_reorder2
-#undef TARGET_SCHED_VARIABLE_ISSUE
-#define TARGET_SCHED_VARIABLE_ISSUE loongarch_variable_issue
-#undef TARGET_SCHED_ADJUST_COST
-#define TARGET_SCHED_ADJUST_COST loongarch_adjust_cost
-#undef TARGET_SCHED_ISSUE_RATE
-#define TARGET_SCHED_ISSUE_RATE loongarch_issue_rate
-#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
-#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
-  loongarch_multipass_dfa_lookahead
+  if (!ISA_HAS_LSX)
+    return false;
 
-#undef TARGET_FUNCTION_OK_FOR_SIBCALL
-#define TARGET_FUNCTION_OK_FOR_SIBCALL loongarch_function_ok_for_sibcall
+  for (i = 0; i < d->nelt; i++)
+    elts[i] = GEN_INT (d->perm[i]);
 
-#undef TARGET_VALID_POINTER_MODE
-#define TARGET_VALID_POINTER_MODE loongarch_valid_pointer_mode
-#undef TARGET_REGISTER_MOVE_COST
-#define TARGET_REGISTER_MOVE_COST loongarch_register_move_cost
-#undef TARGET_MEMORY_MOVE_COST
-#define TARGET_MEMORY_MOVE_COST loongarch_memory_move_cost
-#undef TARGET_RTX_COSTS
-#define TARGET_RTX_COSTS loongarch_rtx_costs
-#undef TARGET_ADDRESS_COST
-#define TARGET_ADDRESS_COST loongarch_address_cost
+  v = gen_rtvec_v (d->nelt, elts);
+  x = gen_rtx_PARALLEL (VOIDmode, v);
 
-#undef TARGET_IN_SMALL_DATA_P
-#define TARGET_IN_SMALL_DATA_P loongarch_in_small_data_p
+  if (!loongarch_const_vector_shuffle_set_p (x, d->vmode))
+    return false;
 
-#undef TARGET_PREFERRED_RELOAD_CLASS
-#define TARGET_PREFERRED_RELOAD_CLASS loongarch_preferred_reload_class
+  x = gen_rtx_VEC_SELECT (d->vmode, d->op0, x);
+  x = gen_rtx_SET (d->target, x);
 
-#undef TARGET_ASM_FILE_START_FILE_DIRECTIVE
-#define TARGET_ASM_FILE_START_FILE_DIRECTIVE true
+  insn = emit_insn (x);
+  if (recog_memoized (insn) < 0)
+    {
+      remove_insn (insn);
+      return false;
+    }
+  return true;
+}
 
-#undef TARGET_EXPAND_BUILTIN_VA_START
-#define TARGET_EXPAND_BUILTIN_VA_START loongarch_va_start
+void
+loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
+{
+  machine_mode vmode = GET_MODE (target);
 
-#undef TARGET_PROMOTE_FUNCTION_MODE
-#define TARGET_PROMOTE_FUNCTION_MODE loongarch_promote_function_mode
-#undef TARGET_RETURN_IN_MEMORY
-#define TARGET_RETURN_IN_MEMORY loongarch_return_in_memory
+  switch (vmode)
+    {
+    case E_V16QImode:
+      emit_insn (gen_lsx_vshuf_b (target, op1, op0, sel));
+      break;
+    case E_V2DFmode:
+      emit_insn (gen_lsx_vshuf_d_f (target, sel, op1, op0));
+      break;
+    case E_V2DImode:
+      emit_insn (gen_lsx_vshuf_d (target, sel, op1, op0));
+      break;
+    case E_V4SFmode:
+      emit_insn (gen_lsx_vshuf_w_f (target, sel, op1, op0));
+      break;
+    case E_V4SImode:
+      emit_insn (gen_lsx_vshuf_w (target, sel, op1, op0));
+      break;
+    case E_V8HImode:
+      emit_insn (gen_lsx_vshuf_h (target, sel, op1, op0));
+      break;
+    default:
+      break;
+    }
+}
 
-#undef TARGET_FUNCTION_VALUE
-#define TARGET_FUNCTION_VALUE loongarch_function_value
-#undef TARGET_LIBCALL_VALUE
-#define TARGET_LIBCALL_VALUE loongarch_libcall_value
+static bool
+loongarch_try_expand_lsx_vshuf_const (struct expand_vec_perm_d *d)
+{
+  int i;
+  rtx target, op0, op1, sel, tmp;
+  rtx rperm[MAX_VECT_LEN];
 
-#undef TARGET_ASM_OUTPUT_MI_THUNK
-#define TARGET_ASM_OUTPUT_MI_THUNK loongarch_output_mi_thunk
-#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
-#define TARGET_ASM_CAN_OUTPUT_MI_THUNK \
-  hook_bool_const_tree_hwi_hwi_const_tree_true
+  if (d->vmode == E_V2DImode || d->vmode == E_V2DFmode
+	|| d->vmode == E_V4SImode || d->vmode == E_V4SFmode
+	|| d->vmode == E_V8HImode || d->vmode == E_V16QImode)
+    {
+      target = d->target;
+      op0 = d->op0;
+      op1 = d->one_vector_p ? d->op0 : d->op1;
 
-#undef TARGET_PRINT_OPERAND
-#define TARGET_PRINT_OPERAND loongarch_print_operand
-#undef TARGET_PRINT_OPERAND_ADDRESS
-#define TARGET_PRINT_OPERAND_ADDRESS loongarch_print_operand_address
-#undef TARGET_PRINT_OPERAND_PUNCT_VALID_P
-#define TARGET_PRINT_OPERAND_PUNCT_VALID_P \
-  loongarch_print_operand_punct_valid_p
+      if (GET_MODE (op0) != GET_MODE (op1)
+	  || GET_MODE (op0) != GET_MODE (target))
+	return false;
 
-#undef TARGET_SETUP_INCOMING_VARARGS
-#define TARGET_SETUP_INCOMING_VARARGS loongarch_setup_incoming_varargs
-#undef TARGET_STRICT_ARGUMENT_NAMING
-#define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true
-#undef TARGET_MUST_PASS_IN_STACK
-#define TARGET_MUST_PASS_IN_STACK must_pass_in_stack_var_size
-#undef TARGET_PASS_BY_REFERENCE
-#define TARGET_PASS_BY_REFERENCE loongarch_pass_by_reference
-#undef TARGET_ARG_PARTIAL_BYTES
-#define TARGET_ARG_PARTIAL_BYTES loongarch_arg_partial_bytes
-#undef TARGET_FUNCTION_ARG
-#define TARGET_FUNCTION_ARG loongarch_function_arg
-#undef TARGET_FUNCTION_ARG_ADVANCE
-#define TARGET_FUNCTION_ARG_ADVANCE loongarch_function_arg_advance
-#undef TARGET_FUNCTION_ARG_BOUNDARY
-#define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary
+      if (d->testing_p)
+	return true;
 
-#undef TARGET_SCALAR_MODE_SUPPORTED_P
-#define TARGET_SCALAR_MODE_SUPPORTED_P loongarch_scalar_mode_supported_p
+      for (i = 0; i < d->nelt; i += 1)
+	{
+	  rperm[i] = GEN_INT (d->perm[i]);
+	}
 
-#undef TARGET_INIT_BUILTINS
+      if (d->vmode == E_V2DFmode)
+	{
+	  sel = gen_rtx_CONST_VECTOR (E_V2DImode, gen_rtvec_v (d->nelt, rperm));
+	  tmp = gen_rtx_SUBREG (E_V2DImode, d->target, 0);
+	  emit_move_insn (tmp, sel);
+	}
+      else if (d->vmode == E_V4SFmode)
+	{
+	  sel = gen_rtx_CONST_VECTOR (E_V4SImode, gen_rtvec_v (d->nelt, rperm));
+	  tmp = gen_rtx_SUBREG (E_V4SImode, d->target, 0);
+	  emit_move_insn (tmp, sel);
+	}
+      else
+	{
+	  sel = gen_rtx_CONST_VECTOR (d->vmode, gen_rtvec_v (d->nelt, rperm));
+	  emit_move_insn (d->target, sel);
+	}
+
+      switch (d->vmode)
+	{
+	case E_V2DFmode:
+	  emit_insn (gen_lsx_vshuf_d_f (target, target, op1, op0));
+	  break;
+	case E_V2DImode:
+	  emit_insn (gen_lsx_vshuf_d (target, target, op1, op0));
+	  break;
+	case E_V4SFmode:
+	  emit_insn (gen_lsx_vshuf_w_f (target, target, op1, op0));
+	  break;
+	case E_V4SImode:
+	  emit_insn (gen_lsx_vshuf_w (target, target, op1, op0));
+	  break;
+	case E_V8HImode:
+	  emit_insn (gen_lsx_vshuf_h (target, target, op1, op0));
+	  break;
+	case E_V16QImode:
+	  emit_insn (gen_lsx_vshuf_b (target, op1, op0, target));
+	  break;
+	default:
+	  break;
+	}
+
+      return true;
+    }
+  return false;
+}
+
+static bool
+loongarch_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
+{
+  unsigned int i, nelt = d->nelt;
+  unsigned char perm2[MAX_VECT_LEN];
+
+  if (d->one_vector_p)
+    {
+      /* Try interleave with alternating operands.  */
+      memcpy (perm2, d->perm, sizeof (perm2));
+      for (i = 1; i < nelt; i += 2)
+	perm2[i] += nelt;
+      if (loongarch_expand_vselect_vconcat (d->target, d->op0, d->op1, perm2,
+					    nelt))
+	return true;
+    }
+  else
+    {
+      if (loongarch_expand_vselect_vconcat (d->target, d->op0, d->op1,
+					    d->perm, nelt))
+	return true;
+
+      /* Try again with swapped operands.  */
+      for (i = 0; i < nelt; ++i)
+	perm2[i] = (d->perm[i] + nelt) & (2 * nelt - 1);
+      if (loongarch_expand_vselect_vconcat (d->target, d->op1, d->op0, perm2,
+					    nelt))
+	return true;
+    }
+
+  if (loongarch_expand_lsx_shuffle (d))
+    return true;
+  return false;
+}
+
+/* Implementation of constant vector permuatation.  This function identifies
+ * recognized pattern of permuation selector argument, and use one or more
+ * instruction(s) to finish the permutation job correctly.  For unsupported
+ * patterns, it will return false.  */
+
+static bool
+loongarch_expand_vec_perm_const_2 (struct expand_vec_perm_d *d)
+{
+  /* Although we have the LSX vec_perm<mode> template, there's still some
+     128bit vector permuatation operations send to vectorize_vec_perm_const.
+     In this case, we just simpliy wrap them by single vshuf.* instruction,
+     because LSX vshuf.* instruction just have the same behavior that GCC
+     expects.  */
+  return loongarch_try_expand_lsx_vshuf_const (d);
+}
+
+/* Implement TARGET_VECTORIZE_VEC_PERM_CONST.  */
+
+static bool
+loongarch_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
+				    rtx target, rtx op0, rtx op1,
+				    const vec_perm_indices &sel)
+{
+  if (vmode != op_mode)
+    return false;
+
+  struct expand_vec_perm_d d;
+  int i, nelt, which;
+  unsigned char orig_perm[MAX_VECT_LEN];
+  bool ok;
+
+  d.target = target;
+  if (op0)
+    {
+      rtx nop0 = force_reg (vmode, op0);
+      if (op0 == op1)
+	op1 = nop0;
+      op0 = nop0;
+    }
+  if (op1)
+    op1 = force_reg (vmode, op1);
+  d.op0 = op0;
+  d.op1 = op1;
+
+  d.vmode = vmode;
+  gcc_assert (VECTOR_MODE_P (vmode));
+  d.nelt = nelt = GET_MODE_NUNITS (vmode);
+  d.testing_p = !target;
+
+  /* This is overly conservative, but ensures we don't get an
+     uninitialized warning on ORIG_PERM.  */
+  memset (orig_perm, 0, MAX_VECT_LEN);
+  for (i = which = 0; i < nelt; ++i)
+    {
+      int ei = sel[i] & (2 * nelt - 1);
+      which |= (ei < nelt ? 1 : 2);
+      orig_perm[i] = ei;
+    }
+  memcpy (d.perm, orig_perm, MAX_VECT_LEN);
+
+  switch (which)
+    {
+    default:
+      gcc_unreachable ();
+
+    case 3:
+      d.one_vector_p = false;
+      if (d.testing_p || !rtx_equal_p (d.op0, d.op1))
+	break;
+      /* FALLTHRU */
+
+    case 2:
+      for (i = 0; i < nelt; ++i)
+	d.perm[i] &= nelt - 1;
+      d.op0 = d.op1;
+      d.one_vector_p = true;
+      break;
+
+    case 1:
+      d.op1 = d.op0;
+      d.one_vector_p = true;
+      break;
+    }
+
+  if (d.testing_p)
+    {
+      d.target = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 1);
+      d.op1 = d.op0 = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 2);
+      if (!d.one_vector_p)
+	d.op1 = gen_raw_REG (d.vmode, LAST_VIRTUAL_REGISTER + 3);
+
+      ok = loongarch_expand_vec_perm_const_2 (&d);
+      if (ok)
+	return ok;
+
+      start_sequence ();
+      ok = loongarch_expand_vec_perm_const_1 (&d);
+      end_sequence ();
+      return ok;
+    }
+
+  ok = loongarch_expand_vec_perm_const_2 (&d);
+  if (!ok)
+    ok = loongarch_expand_vec_perm_const_1 (&d);
+
+  /* If we were given a two-vector permutation which just happened to
+     have both input vectors equal, we folded this into a one-vector
+     permutation.  There are several loongson patterns that are matched
+     via direct vec_select+vec_concat expansion, but we do not have
+     support in loongarch_expand_vec_perm_const_1 to guess the adjustment
+     that should be made for a single operand.  Just try again with
+     the original permutation.  */
+  if (!ok && which == 3)
+    {
+      d.op0 = op0;
+      d.op1 = op1;
+      d.one_vector_p = false;
+      memcpy (d.perm, orig_perm, MAX_VECT_LEN);
+      ok = loongarch_expand_vec_perm_const_1 (&d);
+    }
+
+  return ok;
+}
+
+/* Implement TARGET_SCHED_REASSOCIATION_WIDTH.  */
+
+static int
+loongarch_sched_reassociation_width (unsigned int opc, machine_mode mode)
+{
+  switch (LARCH_ACTUAL_TUNE)
+    {
+    case CPU_LOONGARCH64:
+    case CPU_LA464:
+      /* Vector part.  */
+      if (LSX_SUPPORTED_MODE_P (mode))
+	{
+	  /* Integer vector instructions execute in FP unit.
+	     The width of integer/float-point vector instructions is 3.  */
+	  return 3;
+	}
+
+      /* Scalar part.  */
+      else if (INTEGRAL_MODE_P (mode))
+	return 1;
+      else if (FLOAT_MODE_P (mode))
+	{
+	  if (opc == PLUS_EXPR)
+	    {
+	      return 2;
+	    }
+	  return 4;
+	}
+      break;
+    default:
+      break;
+    }
+  return 1;
+}
+
+/* Implement extract a scalar element from vecotr register */
+
+void
+loongarch_expand_vector_extract (rtx target, rtx vec, int elt)
+{
+  machine_mode mode = GET_MODE (vec);
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+  rtx tmp;
+
+  switch (mode)
+    {
+    case E_V8HImode:
+    case E_V16QImode:
+      break;
+
+    default:
+      break;
+    }
+
+  tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, GEN_INT (elt)));
+  tmp = gen_rtx_VEC_SELECT (inner_mode, vec, tmp);
+
+  /* Let the rtl optimizers know about the zero extension performed.  */
+  if (inner_mode == QImode || inner_mode == HImode)
+    {
+      tmp = gen_rtx_ZERO_EXTEND (SImode, tmp);
+      target = gen_lowpart (SImode, target);
+    }
+  if (inner_mode == SImode || inner_mode == DImode)
+    {
+      tmp = gen_rtx_SIGN_EXTEND (inner_mode, tmp);
+    }
+
+  emit_insn (gen_rtx_SET (target, tmp));
+}
+
+/* Generate code to copy vector bits i / 2 ... i - 1 from vector SRC
+   to bits 0 ... i / 2 - 1 of vector DEST, which has the same mode.
+   The upper bits of DEST are undefined, though they shouldn't cause
+   exceptions (some bits from src or all zeros are ok).  */
+
+static void
+emit_reduc_half (rtx dest, rtx src, int i)
+{
+  rtx tem, d = dest;
+  switch (GET_MODE (src))
+    {
+    case E_V4SFmode:
+      tem = gen_lsx_vbsrl_w_f (dest, src, GEN_INT (i == 128 ? 8 : 4));
+      break;
+    case E_V2DFmode:
+      tem = gen_lsx_vbsrl_d_f (dest, src, GEN_INT (8));
+      break;
+    case E_V16QImode:
+    case E_V8HImode:
+    case E_V4SImode:
+    case E_V2DImode:
+      d = gen_reg_rtx (V2DImode);
+      tem = gen_lsx_vbsrl_d (d, gen_lowpart (V2DImode, src), GEN_INT (i/16));
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  emit_insn (tem);
+  if (d != dest)
+    emit_move_insn (dest, gen_lowpart (GET_MODE (dest), d));
+}
+
+/* Expand a vector reduction.  FN is the binary pattern to reduce;
+   DEST is the destination; IN is the input vector.  */
+
+void
+loongarch_expand_vector_reduc (rtx (*fn) (rtx, rtx, rtx), rtx dest, rtx in)
+{
+  rtx half, dst, vec = in;
+  machine_mode mode = GET_MODE (in);
+  int i;
+
+  for (i = GET_MODE_BITSIZE (mode);
+       i > GET_MODE_UNIT_BITSIZE (mode);
+       i >>= 1)
+    {
+      half = gen_reg_rtx (mode);
+      emit_reduc_half (half, vec, i);
+      if (i == GET_MODE_UNIT_BITSIZE (mode) * 2)
+	dst = dest;
+      else
+	dst = gen_reg_rtx (mode);
+      emit_insn (fn (dst, half, vec));
+      vec = dst;
+    }
+}
+
+/* Expand an integral vector unpack operation.  */
+
+void
+loongarch_expand_vec_unpack (rtx operands[2], bool unsigned_p, bool high_p)
+{
+  machine_mode imode = GET_MODE (operands[1]);
+  rtx (*unpack) (rtx, rtx, rtx);
+  rtx (*cmpFunc) (rtx, rtx, rtx);
+  rtx tmp, dest;
+
+  if (ISA_HAS_LSX)
+    {
+      switch (imode)
+	{
+	case E_V4SImode:
+	  if (high_p != 0)
+	    unpack = gen_lsx_vilvh_w;
+	  else
+	    unpack = gen_lsx_vilvl_w;
+
+	  cmpFunc = gen_lsx_vslt_w;
+	  break;
+
+	case E_V8HImode:
+	  if (high_p != 0)
+	    unpack = gen_lsx_vilvh_h;
+	  else
+	    unpack = gen_lsx_vilvl_h;
+
+	  cmpFunc = gen_lsx_vslt_h;
+	  break;
+
+	case E_V16QImode:
+	  if (high_p != 0)
+	    unpack = gen_lsx_vilvh_b;
+	  else
+	    unpack = gen_lsx_vilvl_b;
+
+	  cmpFunc = gen_lsx_vslt_b;
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	  break;
+	}
+
+      if (!unsigned_p)
+	{
+	  /* Extract sign extention for each element comparing each element
+	     with immediate zero.  */
+	  tmp = gen_reg_rtx (imode);
+	  emit_insn (cmpFunc (tmp, operands[1], CONST0_RTX (imode)));
+	}
+      else
+	tmp = force_reg (imode, CONST0_RTX (imode));
+
+      dest = gen_reg_rtx (imode);
+
+      emit_insn (unpack (dest, operands[1], tmp));
+      emit_move_insn (operands[0], gen_lowpart (GET_MODE (operands[0]), dest));
+      return;
+    }
+  gcc_unreachable ();
+}
+
+/* Construct and return PARALLEL RTX with CONST_INTs for HIGH (high_p == TRUE)
+   or LOW (high_p == FALSE) half of a vector for mode MODE.  */
+
+rtx
+loongarch_lsx_vec_parallel_const_half (machine_mode mode, bool high_p)
+{
+  int nunits = GET_MODE_NUNITS (mode);
+  rtvec v = rtvec_alloc (nunits / 2);
+  int base;
+  int i;
+
+  base = high_p ? nunits / 2 : 0;
+
+  for (i = 0; i < nunits / 2; i++)
+    RTVEC_ELT (v, i) = GEN_INT (base + i);
+
+  return gen_rtx_PARALLEL (VOIDmode, v);
+}
+
+/* A subroutine of loongarch_expand_vec_init, match constant vector
+   elements.  */
+
+static inline bool
+loongarch_constant_elt_p (rtx x)
+{
+  return CONST_INT_P (x) || GET_CODE (x) == CONST_DOUBLE;
+}
+
+rtx
+loongarch_gen_const_int_vector_shuffle (machine_mode mode, int val)
+{
+  int nunits = GET_MODE_NUNITS (mode);
+  int nsets = nunits / 4;
+  rtx elts[MAX_VECT_LEN];
+  int set = 0;
+  int i, j;
+
+  /* Generate a const_int vector replicating the same 4-element set
+     from an immediate.  */
+  for (j = 0; j < nsets; j++, set = 4 * j)
+    for (i = 0; i < 4; i++)
+      elts[set + i] = GEN_INT (set + ((val >> (2 * i)) & 0x3));
+
+  return gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (nunits, elts));
+}
+
+/* Expand a vector initialization.  */
+
+void
+loongarch_expand_vector_init (rtx target, rtx vals)
+{
+  machine_mode vmode = GET_MODE (target);
+  machine_mode imode = GET_MODE_INNER (vmode);
+  unsigned i, nelt = GET_MODE_NUNITS (vmode);
+  unsigned nvar = 0;
+  bool all_same = true;
+  rtx x;
+
+  for (i = 0; i < nelt; ++i)
+    {
+      x = XVECEXP (vals, 0, i);
+      if (!loongarch_constant_elt_p (x))
+	nvar++;
+      if (i > 0 && !rtx_equal_p (x, XVECEXP (vals, 0, 0)))
+	all_same = false;
+    }
+
+  if (ISA_HAS_LSX)
+    {
+      if (all_same)
+	{
+	  rtx same = XVECEXP (vals, 0, 0);
+	  rtx temp, temp2;
+
+	  if (CONST_INT_P (same) && nvar == 0
+	      && loongarch_signed_immediate_p (INTVAL (same), 10, 0))
+	    {
+	      switch (vmode)
+		{
+		case E_V16QImode:
+		case E_V8HImode:
+		case E_V4SImode:
+		case E_V2DImode:
+		  temp = gen_rtx_CONST_VECTOR (vmode, XVEC (vals, 0));
+		  emit_move_insn (target, temp);
+		  return;
+
+		default:
+		  gcc_unreachable ();
+		}
+	    }
+	  temp = gen_reg_rtx (imode);
+	  if (imode == GET_MODE (same))
+	    temp2 = same;
+	  else if (GET_MODE_SIZE (imode) >= UNITS_PER_WORD)
+	    {
+	      if (GET_CODE (same) == MEM)
+		{
+		  rtx reg_tmp = gen_reg_rtx (GET_MODE (same));
+		  loongarch_emit_move (reg_tmp, same);
+		  temp2 = simplify_gen_subreg (imode, reg_tmp,
+					       GET_MODE (reg_tmp), 0);
+		}
+	      else
+		temp2 = simplify_gen_subreg (imode, same, GET_MODE (same), 0);
+	    }
+	  else
+	    {
+	      if (GET_CODE (same) == MEM)
+		{
+		  rtx reg_tmp = gen_reg_rtx (GET_MODE (same));
+		  loongarch_emit_move (reg_tmp, same);
+		  temp2 = lowpart_subreg (imode, reg_tmp, GET_MODE (reg_tmp));
+		}
+	      else
+		temp2 = lowpart_subreg (imode, same, GET_MODE (same));
+	    }
+	  emit_move_insn (temp, temp2);
+
+	  switch (vmode)
+	    {
+	    case E_V16QImode:
+	    case E_V8HImode:
+	    case E_V4SImode:
+	    case E_V2DImode:
+	      loongarch_emit_move (target, gen_rtx_VEC_DUPLICATE (vmode, temp));
+	      break;
+
+	    case E_V4SFmode:
+	      emit_insn (gen_lsx_vreplvei_w_f_scalar (target, temp));
+	      break;
+
+	    case E_V2DFmode:
+	      emit_insn (gen_lsx_vreplvei_d_f_scalar (target, temp));
+	      break;
+
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+      else
+	{
+	  emit_move_insn (target, CONST0_RTX (vmode));
+
+	  for (i = 0; i < nelt; ++i)
+	    {
+	      rtx temp = gen_reg_rtx (imode);
+	      emit_move_insn (temp, XVECEXP (vals, 0, i));
+	      switch (vmode)
+		{
+		case E_V16QImode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_b_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv16qi (target, temp, GEN_INT (i)));
+		  break;
+
+		case E_V8HImode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_h_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv8hi (target, temp, GEN_INT (i)));
+		  break;
+
+		case E_V4SImode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_w_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv4si (target, temp, GEN_INT (i)));
+		  break;
+
+		case E_V2DImode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_d_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv2di (target, temp, GEN_INT (i)));
+		  break;
+
+		case E_V4SFmode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_w_f_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv4sf (target, temp, GEN_INT (i)));
+		  break;
+
+		case E_V2DFmode:
+		  if (i == 0)
+		    emit_insn (gen_lsx_vreplvei_d_f_scalar (target, temp));
+		  else
+		    emit_insn (gen_vec_setv2df (target, temp, GEN_INT (i)));
+		  break;
+
+		default:
+		  gcc_unreachable ();
+		}
+	    }
+	}
+      return;
+    }
+
+  /* Load constants from the pool, or whatever's handy.  */
+  if (nvar == 0)
+    {
+      emit_move_insn (target, gen_rtx_CONST_VECTOR (vmode, XVEC (vals, 0)));
+      return;
+    }
+
+  /* For two-part initialization, always use CONCAT.  */
+  if (nelt == 2)
+    {
+      rtx op0 = force_reg (imode, XVECEXP (vals, 0, 0));
+      rtx op1 = force_reg (imode, XVECEXP (vals, 0, 1));
+      x = gen_rtx_VEC_CONCAT (vmode, op0, op1);
+      emit_insn (gen_rtx_SET (target, x));
+      return;
+    }
+
+  /* Loongson is the only cpu with vectors with more elements.  */
+  gcc_assert (0);
+}
+
+/* Implement HARD_REGNO_CALLER_SAVE_MODE.  */
+
+machine_mode
+loongarch_hard_regno_caller_save_mode (unsigned int regno, unsigned int nregs,
+				       machine_mode mode)
+{
+  /* For performance, avoid saving/restoring upper parts of a register
+     by returning MODE as save mode when the mode is known.  */
+  if (mode == VOIDmode)
+    return choose_hard_reg_mode (regno, nregs, NULL);
+  else
+    return mode;
+}
+
+/* Generate RTL for comparing CMP_OP0 and CMP_OP1 using condition COND and
+   store the result -1 or 0 in DEST.  */
+
+static void
+loongarch_expand_lsx_cmp (rtx dest, enum rtx_code cond, rtx op0, rtx op1)
+{
+  machine_mode cmp_mode = GET_MODE (op0);
+  int unspec = -1;
+  bool negate = false;
+
+  switch (cmp_mode)
+    {
+    case E_V16QImode:
+    case E_V32QImode:
+    case E_V8HImode:
+    case E_V16HImode:
+    case E_V4SImode:
+    case E_V8SImode:
+    case E_V2DImode:
+    case E_V4DImode:
+      switch (cond)
+	{
+	case NE:
+	  cond = reverse_condition (cond);
+	  negate = true;
+	  break;
+	case EQ:
+	case LT:
+	case LE:
+	case LTU:
+	case LEU:
+	  break;
+	case GE:
+	case GT:
+	case GEU:
+	case GTU:
+	  std::swap (op0, op1);
+	  cond = swap_condition (cond);
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+      loongarch_emit_binary (cond, dest, op0, op1);
+      if (negate)
+	emit_move_insn (dest, gen_rtx_NOT (GET_MODE (dest), dest));
+      break;
+
+    case E_V4SFmode:
+    case E_V2DFmode:
+      switch (cond)
+	{
+	case UNORDERED:
+	case ORDERED:
+	case EQ:
+	case NE:
+	case UNEQ:
+	case UNLE:
+	case UNLT:
+	  break;
+	case LTGT: cond = NE; break;
+	case UNGE: cond = UNLE; std::swap (op0, op1); break;
+	case UNGT: cond = UNLT; std::swap (op0, op1); break;
+	case LE: unspec = UNSPEC_LSX_VFCMP_SLE; break;
+	case LT: unspec = UNSPEC_LSX_VFCMP_SLT; break;
+	case GE: unspec = UNSPEC_LSX_VFCMP_SLE; std::swap (op0, op1); break;
+	case GT: unspec = UNSPEC_LSX_VFCMP_SLT; std::swap (op0, op1); break;
+	default:
+		 gcc_unreachable ();
+	}
+      if (unspec < 0)
+	loongarch_emit_binary (cond, dest, op0, op1);
+      else
+	{
+	  rtx x = gen_rtx_UNSPEC (GET_MODE (dest),
+				  gen_rtvec (2, op0, op1), unspec);
+	  emit_insn (gen_rtx_SET (dest, x));
+	}
+      break;
+
+    default:
+      gcc_unreachable ();
+      break;
+    }
+}
+
+/* Expand VEC_COND_EXPR, where:
+   MODE is mode of the result
+   VIMODE equivalent integer mode
+   OPERANDS operands of VEC_COND_EXPR.  */
+
+void
+loongarch_expand_vec_cond_expr (machine_mode mode, machine_mode vimode,
+				rtx *operands)
+{
+  rtx cond = operands[3];
+  rtx cmp_op0 = operands[4];
+  rtx cmp_op1 = operands[5];
+  rtx cmp_res = gen_reg_rtx (vimode);
+
+  loongarch_expand_lsx_cmp (cmp_res, GET_CODE (cond), cmp_op0, cmp_op1);
+
+  /* We handle the following cases:
+     1) r = a CMP b ? -1 : 0
+     2) r = a CMP b ? -1 : v
+     3) r = a CMP b ?  v : 0
+     4) r = a CMP b ? v1 : v2  */
+
+  /* Case (1) above.  We only move the results.  */
+  if (operands[1] == CONSTM1_RTX (vimode)
+      && operands[2] == CONST0_RTX (vimode))
+    emit_move_insn (operands[0], cmp_res);
+  else
+    {
+      rtx src1 = gen_reg_rtx (vimode);
+      rtx src2 = gen_reg_rtx (vimode);
+      rtx mask = gen_reg_rtx (vimode);
+      rtx bsel;
+
+      /* Move the vector result to use it as a mask.  */
+      emit_move_insn (mask, cmp_res);
+
+      if (register_operand (operands[1], mode))
+	{
+	  rtx xop1 = operands[1];
+	  if (mode != vimode)
+	    {
+	      xop1 = gen_reg_rtx (vimode);
+	      emit_move_insn (xop1, gen_rtx_SUBREG (vimode, operands[1], 0));
+	    }
+	  emit_move_insn (src1, xop1);
+	}
+      else
+	{
+	  gcc_assert (operands[1] == CONSTM1_RTX (vimode));
+	  /* Case (2) if the below doesn't move the mask to src2.  */
+	  emit_move_insn (src1, mask);
+	}
+
+      if (register_operand (operands[2], mode))
+	{
+	  rtx xop2 = operands[2];
+	  if (mode != vimode)
+	    {
+	      xop2 = gen_reg_rtx (vimode);
+	      emit_move_insn (xop2, gen_rtx_SUBREG (vimode, operands[2], 0));
+	    }
+	  emit_move_insn (src2, xop2);
+	}
+      else
+	{
+	  gcc_assert (operands[2] == CONST0_RTX (mode));
+	  /* Case (3) if the above didn't move the mask to src1.  */
+	  emit_move_insn (src2, mask);
+	}
+
+      /* We deal with case (4) if the mask wasn't moved to either src1 or src2.
+	 In any case, we eventually do vector mask-based copy.  */
+      bsel = gen_rtx_IOR (vimode,
+			  gen_rtx_AND (vimode,
+				       gen_rtx_NOT (vimode, mask), src2),
+			  gen_rtx_AND (vimode, mask, src1));
+      /* The result is placed back to a register with the mask.  */
+      emit_insn (gen_rtx_SET (mask, bsel));
+      emit_move_insn (operands[0], gen_rtx_SUBREG (mode, mask, 0));
+    }
+}
+
+void
+loongarch_expand_vec_cond_mask_expr (machine_mode mode, machine_mode vimode,
+				    rtx *operands)
+{
+  rtx cmp_res = operands[3];
+
+  /* We handle the following cases:
+     1) r = a CMP b ? -1 : 0
+     2) r = a CMP b ? -1 : v
+     3) r = a CMP b ?  v : 0
+     4) r = a CMP b ? v1 : v2  */
+
+  /* Case (1) above.  We only move the results.  */
+  if (operands[1] == CONSTM1_RTX (vimode)
+      && operands[2] == CONST0_RTX (vimode))
+    emit_move_insn (operands[0], cmp_res);
+  else
+    {
+      rtx src1 = gen_reg_rtx (vimode);
+      rtx src2 = gen_reg_rtx (vimode);
+      rtx mask = gen_reg_rtx (vimode);
+      rtx bsel;
+
+      /* Move the vector result to use it as a mask.  */
+      emit_move_insn (mask, cmp_res);
+
+      if (register_operand (operands[1], mode))
+	{
+	  rtx xop1 = operands[1];
+	  if (mode != vimode)
+	    {
+	      xop1 = gen_reg_rtx (vimode);
+	      emit_move_insn (xop1, gen_rtx_SUBREG (vimode, operands[1], 0));
+	    }
+	  emit_move_insn (src1, xop1);
+	}
+      else
+	{
+	  gcc_assert (operands[1] == CONSTM1_RTX (vimode));
+	  /* Case (2) if the below doesn't move the mask to src2.  */
+	  emit_move_insn (src1, mask);
+	}
+
+      if (register_operand (operands[2], mode))
+	{
+	  rtx xop2 = operands[2];
+	  if (mode != vimode)
+	    {
+	      xop2 = gen_reg_rtx (vimode);
+	      emit_move_insn (xop2, gen_rtx_SUBREG (vimode, operands[2], 0));
+	    }
+	  emit_move_insn (src2, xop2);
+	}
+      else
+	{
+	  gcc_assert (operands[2] == CONST0_RTX (mode));
+	  /* Case (3) if the above didn't move the mask to src1.  */
+	  emit_move_insn (src2, mask);
+	}
+
+      /* We deal with case (4) if the mask wasn't moved to either src1 or src2.
+	 In any case, we eventually do vector mask-based copy.  */
+      bsel = gen_rtx_IOR (vimode,
+			  gen_rtx_AND (vimode,
+				       gen_rtx_NOT (vimode, mask), src2),
+			  gen_rtx_AND (vimode, mask, src1));
+      /* The result is placed back to a register with the mask.  */
+      emit_insn (gen_rtx_SET (mask, bsel));
+      emit_move_insn (operands[0], gen_rtx_SUBREG (mode, mask, 0));
+    }
+}
+
+/* Expand integer vector comparison */
+bool
+loongarch_expand_vec_cmp (rtx operands[])
+{
+
+  rtx_code code = GET_CODE (operands[1]);
+  loongarch_expand_lsx_cmp (operands[0], code, operands[2], operands[3]);
+  return true;
+}
+
+/* Implement TARGET_CASE_VALUES_THRESHOLD.  */
+
+unsigned int
+loongarch_case_values_threshold (void)
+{
+  return default_case_values_threshold ();
+}
+
+/* Implement TARGET_SPILL_CLASS.  */
+
+static reg_class_t
+loongarch_spill_class (reg_class_t rclass ATTRIBUTE_UNUSED,
+		       machine_mode mode ATTRIBUTE_UNUSED)
+{
+  return NO_REGS;
+}
+
+/* Implement TARGET_PROMOTE_FUNCTION_MODE.  */
+
+/* This function is equivalent to default_promote_function_mode_always_promote
+   except that it returns a promoted mode even if type is NULL_TREE.  This is
+   needed by libcalls which have no type (only a mode) such as fixed conversion
+   routines that take a signed or unsigned char/short argument and convert it
+   to a fixed type.  */
+
+static machine_mode
+loongarch_promote_function_mode (const_tree type ATTRIBUTE_UNUSED,
+				 machine_mode mode,
+				 int *punsignedp ATTRIBUTE_UNUSED,
+				 const_tree fntype ATTRIBUTE_UNUSED,
+				 int for_return ATTRIBUTE_UNUSED)
+{
+  int unsignedp;
+
+  if (type != NULL_TREE)
+    return promote_mode (type, mode, punsignedp);
+
+  unsignedp = *punsignedp;
+  PROMOTE_MODE (mode, unsignedp, type);
+  *punsignedp = unsignedp;
+  return mode;
+}
+
+/* Implement TARGET_STARTING_FRAME_OFFSET.  See loongarch_compute_frame_info
+   for details about the frame layout.  */
+
+static HOST_WIDE_INT
+loongarch_starting_frame_offset (void)
+{
+  if (FRAME_GROWS_DOWNWARD)
+    return 0;
+  return crtl->outgoing_args_size;
+}
+
+/* A subroutine of loongarch_build_signbit_mask.  If VECT is true,
+   then replicate the value for all elements of the vector
+   register.  */
+
+rtx
+loongarch_build_const_vector (machine_mode mode, bool vect, rtx value)
+{
+  int i, n_elt;
+  rtvec v;
+  machine_mode scalar_mode;
+
+  switch (mode)
+    {
+    case E_V32QImode:
+    case E_V16QImode:
+    case E_V32HImode:
+    case E_V16HImode:
+    case E_V8HImode:
+    case E_V8SImode:
+    case E_V4SImode:
+    case E_V8DImode:
+    case E_V4DImode:
+    case E_V2DImode:
+      gcc_assert (vect);
+      /* FALLTHRU */
+    case E_V8SFmode:
+    case E_V4SFmode:
+    case E_V8DFmode:
+    case E_V4DFmode:
+    case E_V2DFmode:
+      n_elt = GET_MODE_NUNITS (mode);
+      v = rtvec_alloc (n_elt);
+      scalar_mode = GET_MODE_INNER (mode);
+
+      RTVEC_ELT (v, 0) = value;
+
+      for (i = 1; i < n_elt; ++i)
+	RTVEC_ELT (v, i) = vect ? value : CONST0_RTX (scalar_mode);
+
+      return gen_rtx_CONST_VECTOR (mode, v);
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Create a mask for the sign bit in MODE
+   for an register.  If VECT is true, then replicate the mask for
+   all elements of the vector register.  If INVERT is true, then create
+   a mask excluding the sign bit.  */
+
+rtx
+loongarch_build_signbit_mask (machine_mode mode, bool vect, bool invert)
+{
+  machine_mode vec_mode, imode;
+  wide_int w;
+  rtx mask, v;
+
+  switch (mode)
+    {
+    case E_V16SImode:
+    case E_V16SFmode:
+    case E_V8SImode:
+    case E_V4SImode:
+    case E_V8SFmode:
+    case E_V4SFmode:
+      vec_mode = mode;
+      imode = SImode;
+      break;
+
+    case E_V8DImode:
+    case E_V4DImode:
+    case E_V2DImode:
+    case E_V8DFmode:
+    case E_V4DFmode:
+    case E_V2DFmode:
+      vec_mode = mode;
+      imode = DImode;
+      break;
+
+    case E_TImode:
+    case E_TFmode:
+      vec_mode = VOIDmode;
+      imode = TImode;
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+  w = wi::set_bit_in_zero (GET_MODE_BITSIZE (inner_mode) - 1,
+			   GET_MODE_BITSIZE (inner_mode));
+  if (invert)
+    w = wi::bit_not (w);
+
+  /* Force this value into the low part of a fp vector constant.  */
+  mask = immed_wide_int_const (w, imode);
+  mask = gen_lowpart (inner_mode, mask);
+
+  if (vec_mode == VOIDmode)
+    return force_reg (inner_mode, mask);
+
+  v = loongarch_build_const_vector (vec_mode, vect, mask);
+  return force_reg (vec_mode, v);
+}
+
+static bool
+loongarch_builtin_support_vector_misalignment (machine_mode mode,
+					       const_tree type,
+					       int misalignment,
+					       bool is_packed)
+{
+  if (ISA_HAS_LSX && STRICT_ALIGNMENT)
+    {
+      if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing)
+	return false;
+      if (misalignment == -1)
+	return false;
+    }
+  return default_builtin_support_vector_misalignment (mode, type, misalignment,
+						      is_packed);
+}
+
+/* Initialize the GCC target structure.  */
+#undef TARGET_ASM_ALIGNED_HI_OP
+#define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
+#undef TARGET_ASM_ALIGNED_SI_OP
+#define TARGET_ASM_ALIGNED_SI_OP "\t.word\t"
+#undef TARGET_ASM_ALIGNED_DI_OP
+#define TARGET_ASM_ALIGNED_DI_OP "\t.dword\t"
+
+#undef TARGET_OPTION_OVERRIDE
+#define TARGET_OPTION_OVERRIDE loongarch_option_override
+
+#undef TARGET_LEGITIMIZE_ADDRESS
+#define TARGET_LEGITIMIZE_ADDRESS loongarch_legitimize_address
+
+#undef TARGET_ASM_SELECT_RTX_SECTION
+#define TARGET_ASM_SELECT_RTX_SECTION loongarch_select_rtx_section
+#undef TARGET_ASM_FUNCTION_RODATA_SECTION
+#define TARGET_ASM_FUNCTION_RODATA_SECTION loongarch_function_rodata_section
+
+#undef TARGET_SCHED_INIT
+#define TARGET_SCHED_INIT loongarch_sched_init
+#undef TARGET_SCHED_REORDER
+#define TARGET_SCHED_REORDER loongarch_sched_reorder
+#undef TARGET_SCHED_REORDER2
+#define TARGET_SCHED_REORDER2 loongarch_sched_reorder2
+#undef TARGET_SCHED_VARIABLE_ISSUE
+#define TARGET_SCHED_VARIABLE_ISSUE loongarch_variable_issue
+#undef TARGET_SCHED_ADJUST_COST
+#define TARGET_SCHED_ADJUST_COST loongarch_adjust_cost
+#undef TARGET_SCHED_ISSUE_RATE
+#define TARGET_SCHED_ISSUE_RATE loongarch_issue_rate
+#undef TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD
+#define TARGET_SCHED_FIRST_CYCLE_MULTIPASS_DFA_LOOKAHEAD \
+  loongarch_multipass_dfa_lookahead
+
+#undef TARGET_FUNCTION_OK_FOR_SIBCALL
+#define TARGET_FUNCTION_OK_FOR_SIBCALL loongarch_function_ok_for_sibcall
+
+#undef TARGET_VALID_POINTER_MODE
+#define TARGET_VALID_POINTER_MODE loongarch_valid_pointer_mode
+#undef TARGET_REGISTER_MOVE_COST
+#define TARGET_REGISTER_MOVE_COST loongarch_register_move_cost
+#undef TARGET_MEMORY_MOVE_COST
+#define TARGET_MEMORY_MOVE_COST loongarch_memory_move_cost
+#undef TARGET_RTX_COSTS
+#define TARGET_RTX_COSTS loongarch_rtx_costs
+#undef TARGET_ADDRESS_COST
+#define TARGET_ADDRESS_COST loongarch_address_cost
+#undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
+#define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \
+  loongarch_builtin_vectorization_cost
+
+
+#undef TARGET_IN_SMALL_DATA_P
+#define TARGET_IN_SMALL_DATA_P loongarch_in_small_data_p
+
+#undef TARGET_PREFERRED_RELOAD_CLASS
+#define TARGET_PREFERRED_RELOAD_CLASS loongarch_preferred_reload_class
+
+#undef TARGET_ASM_FILE_START_FILE_DIRECTIVE
+#define TARGET_ASM_FILE_START_FILE_DIRECTIVE true
+
+#undef TARGET_EXPAND_BUILTIN_VA_START
+#define TARGET_EXPAND_BUILTIN_VA_START loongarch_va_start
+
+#undef TARGET_PROMOTE_FUNCTION_MODE
+#define TARGET_PROMOTE_FUNCTION_MODE loongarch_promote_function_mode
+#undef TARGET_RETURN_IN_MEMORY
+#define TARGET_RETURN_IN_MEMORY loongarch_return_in_memory
+
+#undef TARGET_FUNCTION_VALUE
+#define TARGET_FUNCTION_VALUE loongarch_function_value
+#undef TARGET_LIBCALL_VALUE
+#define TARGET_LIBCALL_VALUE loongarch_libcall_value
+
+#undef TARGET_ASM_OUTPUT_MI_THUNK
+#define TARGET_ASM_OUTPUT_MI_THUNK loongarch_output_mi_thunk
+#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
+#define TARGET_ASM_CAN_OUTPUT_MI_THUNK \
+  hook_bool_const_tree_hwi_hwi_const_tree_true
+
+#undef TARGET_PRINT_OPERAND
+#define TARGET_PRINT_OPERAND loongarch_print_operand
+#undef TARGET_PRINT_OPERAND_ADDRESS
+#define TARGET_PRINT_OPERAND_ADDRESS loongarch_print_operand_address
+#undef TARGET_PRINT_OPERAND_PUNCT_VALID_P
+#define TARGET_PRINT_OPERAND_PUNCT_VALID_P \
+  loongarch_print_operand_punct_valid_p
+
+#undef TARGET_SETUP_INCOMING_VARARGS
+#define TARGET_SETUP_INCOMING_VARARGS loongarch_setup_incoming_varargs
+#undef TARGET_STRICT_ARGUMENT_NAMING
+#define TARGET_STRICT_ARGUMENT_NAMING hook_bool_CUMULATIVE_ARGS_true
+#undef TARGET_MUST_PASS_IN_STACK
+#define TARGET_MUST_PASS_IN_STACK must_pass_in_stack_var_size
+#undef TARGET_PASS_BY_REFERENCE
+#define TARGET_PASS_BY_REFERENCE loongarch_pass_by_reference
+#undef TARGET_ARG_PARTIAL_BYTES
+#define TARGET_ARG_PARTIAL_BYTES loongarch_arg_partial_bytes
+#undef TARGET_FUNCTION_ARG
+#define TARGET_FUNCTION_ARG loongarch_function_arg
+#undef TARGET_FUNCTION_ARG_ADVANCE
+#define TARGET_FUNCTION_ARG_ADVANCE loongarch_function_arg_advance
+#undef TARGET_FUNCTION_ARG_BOUNDARY
+#define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary
+
+#undef TARGET_VECTOR_MODE_SUPPORTED_P
+#define TARGET_VECTOR_MODE_SUPPORTED_P loongarch_vector_mode_supported_p
+
+#undef TARGET_SCALAR_MODE_SUPPORTED_P
+#define TARGET_SCALAR_MODE_SUPPORTED_P loongarch_scalar_mode_supported_p
+
+#undef TARGET_VECTORIZE_PREFERRED_SIMD_MODE
+#define TARGET_VECTORIZE_PREFERRED_SIMD_MODE loongarch_preferred_simd_mode
+
+#undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES
+#define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES \
+  loongarch_autovectorize_vector_modes
+
+#undef TARGET_INIT_BUILTINS
 #define TARGET_INIT_BUILTINS loongarch_init_builtins
 #undef TARGET_BUILTIN_DECL
 #define TARGET_BUILTIN_DECL loongarch_builtin_decl
@@ -6942,6 +8810,14 @@ loongarch_set_handled_components (sbitmap components)
 
 #undef TARGET_MAX_ANCHOR_OFFSET
 #define TARGET_MAX_ANCHOR_OFFSET (IMM_REACH/2-1)
+#undef TARGET_VECTORIZE_VEC_PERM_CONST
+#define TARGET_VECTORIZE_VEC_PERM_CONST loongarch_vectorize_vec_perm_const
+
+#undef TARGET_SCHED_REASSOCIATION_WIDTH
+#define TARGET_SCHED_REASSOCIATION_WIDTH loongarch_sched_reassociation_width
+
+#undef TARGET_CASE_VALUES_THRESHOLD
+#define TARGET_CASE_VALUES_THRESHOLD loongarch_case_values_threshold
 
 #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV
 #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV loongarch_atomic_assign_expand_fenv
@@ -6960,6 +8836,10 @@ loongarch_set_handled_components (sbitmap components)
 #undef TARGET_MODES_TIEABLE_P
 #define TARGET_MODES_TIEABLE_P loongarch_modes_tieable_p
 
+#undef TARGET_HARD_REGNO_CALL_PART_CLOBBERED
+#define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \
+  loongarch_hard_regno_call_part_clobbered
+
 #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS
 #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 2
 
@@ -7010,6 +8890,10 @@ loongarch_set_handled_components (sbitmap components)
 #define TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS \
   loongarch_set_handled_components
 
+#undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
+#define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
+  loongarch_builtin_support_vector_misalignment
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-loongarch.h"
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index eca723293a1..e939dd826d1 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -23,6 +23,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "config/loongarch/loongarch-opts.h"
 
+#define TARGET_SUPPORTS_WIDE_INT 1
+
 /* Macros to silence warnings about numbers being signed in traditional
    C and unsigned in ISO C when compiled on 32-bit hosts.  */
 
@@ -179,6 +181,11 @@ along with GCC; see the file COPYING3.  If not see
 #define MIN_UNITS_PER_WORD 4
 #endif
 
+/* Width of a LSX vector register in bytes.  */
+#define UNITS_PER_LSX_REG 16
+/* Width of a LSX vector register in bits.  */
+#define BITS_PER_LSX_REG (UNITS_PER_LSX_REG * BITS_PER_UNIT)
+
 /* For LARCH, width of a floating point register.  */
 #define UNITS_PER_FPREG (TARGET_DOUBLE_FLOAT ? 8 : 4)
 
@@ -241,8 +248,10 @@ along with GCC; see the file COPYING3.  If not see
 #define STRUCTURE_SIZE_BOUNDARY 8
 
 /* There is no point aligning anything to a rounder boundary than
-   LONG_DOUBLE_TYPE_SIZE.  */
-#define BIGGEST_ALIGNMENT (LONG_DOUBLE_TYPE_SIZE)
+   LONG_DOUBLE_TYPE_SIZE, unless under LSX the bigggest alignment is
+   BITS_PER_LSX_REG/..  */
+#define BIGGEST_ALIGNMENT \
+  (ISA_HAS_LSX ? BITS_PER_LSX_REG : LONG_DOUBLE_TYPE_SIZE)
 
 /* All accesses must be aligned.  */
 #define STRICT_ALIGNMENT (TARGET_STRICT_ALIGN)
@@ -378,6 +387,9 @@ along with GCC; see the file COPYING3.  If not see
 #define FP_REG_FIRST 32
 #define FP_REG_LAST 63
 #define FP_REG_NUM (FP_REG_LAST - FP_REG_FIRST + 1)
+#define LSX_REG_FIRST FP_REG_FIRST
+#define LSX_REG_LAST  FP_REG_LAST
+#define LSX_REG_NUM   FP_REG_NUM
 
 /* The DWARF 2 CFA column which tracks the return address from a
    signal handler context.  This means that to maintain backwards
@@ -395,8 +407,11 @@ along with GCC; see the file COPYING3.  If not see
   ((unsigned int) ((int) (REGNO) - FP_REG_FIRST) < FP_REG_NUM)
 #define FCC_REG_P(REGNO) \
   ((unsigned int) ((int) (REGNO) - FCC_REG_FIRST) < FCC_REG_NUM)
+#define LSX_REG_P(REGNO) \
+  ((unsigned int) ((int) (REGNO) - LSX_REG_FIRST) < LSX_REG_NUM)
 
 #define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X)))
+#define LSX_REG_RTX_P(X) (REG_P (X) && LSX_REG_P (REGNO (X)))
 
 /* Select a register mode required for caller save of hard regno REGNO.  */
 #define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \
@@ -577,6 +592,11 @@ enum reg_class
 #define IMM12_OPERAND(VALUE) \
   ((unsigned HOST_WIDE_INT) (VALUE) + IMM_REACH / 2 < IMM_REACH)
 
+/* True if VALUE is a signed 13-bit number.  */
+
+#define IMM13_OPERAND(VALUE) \
+  ((unsigned HOST_WIDE_INT) (VALUE) + 0x1000 < 0x2000)
+
 /* True if VALUE is a signed 16-bit number.  */
 
 #define IMM16_OPERAND(VALUE) \
@@ -706,6 +726,13 @@ enum reg_class
 #define FP_ARG_FIRST (FP_REG_FIRST + 0)
 #define FP_ARG_LAST (FP_ARG_FIRST + MAX_ARGS_IN_REGISTERS - 1)
 
+/* True if MODE is vector and supported in a LSX vector register.  */
+#define LSX_SUPPORTED_MODE_P(MODE)			\
+  (ISA_HAS_LSX						\
+   && GET_MODE_SIZE (MODE) == UNITS_PER_LSX_REG		\
+   && (GET_MODE_CLASS (MODE) == MODE_VECTOR_INT		\
+       || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT))
+
 /* 1 if N is a possible register number for function argument passing.
    We have no FP argument registers when soft-float.  */
 
@@ -926,7 +953,39 @@ typedef struct {
   { "s7",	30 + GP_REG_FIRST },					\
   { "s8",	31 + GP_REG_FIRST },					\
   { "v0",	 4 + GP_REG_FIRST },					\
-  { "v1",	 5 + GP_REG_FIRST }					\
+  { "v1",	 5 + GP_REG_FIRST },					\
+  { "vr0",	 0 + FP_REG_FIRST },					\
+  { "vr1",	 1 + FP_REG_FIRST },					\
+  { "vr2",	 2 + FP_REG_FIRST },					\
+  { "vr3",	 3 + FP_REG_FIRST },					\
+  { "vr4",	 4 + FP_REG_FIRST },					\
+  { "vr5",	 5 + FP_REG_FIRST },					\
+  { "vr6",	 6 + FP_REG_FIRST },					\
+  { "vr7",	 7 + FP_REG_FIRST },					\
+  { "vr8",	 8 + FP_REG_FIRST },					\
+  { "vr9",	 9 + FP_REG_FIRST },					\
+  { "vr10",	10 + FP_REG_FIRST },					\
+  { "vr11",	11 + FP_REG_FIRST },					\
+  { "vr12",	12 + FP_REG_FIRST },					\
+  { "vr13",	13 + FP_REG_FIRST },					\
+  { "vr14",	14 + FP_REG_FIRST },					\
+  { "vr15",	15 + FP_REG_FIRST },					\
+  { "vr16",	16 + FP_REG_FIRST },					\
+  { "vr17",	17 + FP_REG_FIRST },					\
+  { "vr18",	18 + FP_REG_FIRST },					\
+  { "vr19",	19 + FP_REG_FIRST },					\
+  { "vr20",	20 + FP_REG_FIRST },					\
+  { "vr21",	21 + FP_REG_FIRST },					\
+  { "vr22",	22 + FP_REG_FIRST },					\
+  { "vr23",	23 + FP_REG_FIRST },					\
+  { "vr24",	24 + FP_REG_FIRST },					\
+  { "vr25",	25 + FP_REG_FIRST },					\
+  { "vr26",	26 + FP_REG_FIRST },					\
+  { "vr27",	27 + FP_REG_FIRST },					\
+  { "vr28",	28 + FP_REG_FIRST },					\
+  { "vr29",	29 + FP_REG_FIRST },					\
+  { "vr30",	30 + FP_REG_FIRST },					\
+  { "vr31",	31 + FP_REG_FIRST }					\
 }
 
 /* Globalizing directive for a label.  */
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index b37e070660f..7b8978e2533 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -158,11 +158,12 @@ (define_attr "move_type"
    const,signext,pick_ins,logical,arith,sll0,andi,shift_shift"
   (const_string "unknown"))
 
-(define_attr "alu_type" "unknown,add,sub,not,nor,and,or,xor"
+(define_attr "alu_type" "unknown,add,sub,not,nor,and,or,xor,simd_add"
   (const_string "unknown"))
 
 ;; Main data type used by the insn
-(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC"
+(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC,
+  V2DI,V4SI,V8HI,V16QI,V2DF,V4SF"
   (const_string "unknown"))
 
 ;; True if the main data type is twice the size of a word.
@@ -234,7 +235,12 @@ (define_attr "type"
    prefetch,prefetchx,condmove,mgtf,mftg,const,arith,logical,
    shift,slt,signext,clz,trap,imul,idiv,move,
    fmove,fadd,fmul,fmadd,fdiv,frdiv,fabs,flogb,fneg,fcmp,fcopysign,fcvt,
-   fscaleb,fsqrt,frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost"
+   fscaleb,fsqrt,frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost,
+   simd_div,simd_fclass,simd_flog2,simd_fadd,simd_fcvt,simd_fmul,simd_fmadd,
+   simd_fdiv,simd_bitins,simd_bitmov,simd_insert,simd_sld,simd_mul,simd_fcmp,
+   simd_fexp2,simd_int_arith,simd_bit,simd_shift,simd_splat,simd_fill,
+   simd_permute,simd_shf,simd_sat,simd_pcnt,simd_copy,simd_branch,simd_clsx,
+   simd_fminmax,simd_logic,simd_move,simd_load,simd_store"
   (cond [(eq_attr "jirl" "!unset") (const_string "call")
 	 (eq_attr "got" "load") (const_string "load")
 
@@ -414,11 +420,20 @@ (define_mode_attr ifmt [(SI "w") (DI "l")])
 
 ;; This attribute gives the upper-case mode name for one unit of a
 ;; floating-point mode or vector mode.
-(define_mode_attr UNITMODE [(SF "SF") (DF "DF")])
+(define_mode_attr UNITMODE [(SF "SF") (DF "DF") (V2SF "SF") (V4SF "SF")
+			    (V16QI "QI") (V8HI "HI") (V4SI "SI") (V2DI "DI")
+			    (V2DF "DF")])
+
+;; As above, but in lower case.
+(define_mode_attr unitmode [(SF "sf") (DF "df") (V2SF "sf") (V4SF "sf")
+			    (V16QI "qi") (V8QI "qi") (V8HI "hi") (V4HI "hi")
+			    (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")])
 
 ;; This attribute gives the integer mode that has half the size of
 ;; the controlling mode.
-(define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
+(define_mode_attr HALFMODE [(DF "SI") (DI "SI") (V2SF "SI")
+			    (V2SI "SI") (V4HI "SI") (V8QI "SI")
+			    (TF "DI")])
 
 ;; This attribute gives the integer mode that has the same size of a
 ;; floating-point mode.
@@ -445,6 +460,18 @@ (define_code_iterator neg_bitwise [and ior])
 ;; from the same template.
 (define_code_iterator any_div [div udiv mod umod])
 
+;; This code iterator allows addition and subtraction to be generated
+;; from the same template.
+(define_code_iterator addsub [plus minus])
+
+;; This code iterator allows addition and multiplication to be generated
+;; from the same template.
+(define_code_iterator addmul [plus mult])
+
+;; This code iterator allows addition subtraction and multiplication to be
+;; generated from the same template
+(define_code_iterator addsubmul [plus minus mult])
+
 ;; This code iterator allows all native floating-point comparisons to be
 ;; generated from the same template.
 (define_code_iterator fcond [unordered uneq unlt unle eq lt le
@@ -684,7 +711,6 @@ (define_insn "sub<mode>3"
   [(set_attr "alu_type" "sub")
    (set_attr "mode" "<MODE>")])
 
-
 (define_insn "*subsi3_extended"
   [(set (match_operand:DI 0 "register_operand" "= r")
 	(sign_extend:DI
@@ -1228,7 +1254,7 @@ (define_insn "smina<mode>3"
   "fmina.<fmt>\t%0,%1,%2"
   [(set_attr "type" "fmove")
    (set_attr "mode" "<MODE>")])
-\f
+
 ;;
 ;;  ....................
 ;;
@@ -2541,7 +2567,6 @@ (define_insn "rotr<mode>3"
   [(set_attr "type" "shift,shift")
    (set_attr "mode" "<MODE>")])
 
-\f
 ;; The following templates were added to generate "bstrpick.d + alsl.d"
 ;; instruction pairs.
 ;; It is required that the values of const_immalsl_operand and
@@ -3606,6 +3631,9 @@ (define_insn "loongarch_crcc_w_<size>_w"
 (include "generic.md")
 (include "la464.md")
 
+; The LoongArch SX Instructions.
+(include "lsx.md")
+
 (define_c_enum "unspec" [
   UNSPEC_ADDRESS_FIRST
 ])
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
new file mode 100644
index 00000000000..fb4d228ba84
--- /dev/null
+++ b/gcc/config/loongarch/lsx.md
@@ -0,0 +1,4467 @@
+;; Machine Description for LARCH Loongson SX ASE
+;;
+;; Copyright (C) 2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+;;
+
+(define_c_enum "unspec" [
+  UNSPEC_LSX_ABSD_S
+  UNSPEC_LSX_VABSD_U
+  UNSPEC_LSX_VAVG_S
+  UNSPEC_LSX_VAVG_U
+  UNSPEC_LSX_VAVGR_S
+  UNSPEC_LSX_VAVGR_U
+  UNSPEC_LSX_VBITCLR
+  UNSPEC_LSX_VBITCLRI
+  UNSPEC_LSX_VBITREV
+  UNSPEC_LSX_VBITREVI
+  UNSPEC_LSX_VBITSET
+  UNSPEC_LSX_VBITSETI
+  UNSPEC_LSX_BRANCH_V
+  UNSPEC_LSX_BRANCH
+  UNSPEC_LSX_VFCMP_CAF
+  UNSPEC_LSX_VFCLASS
+  UNSPEC_LSX_VFCMP_CUNE
+  UNSPEC_LSX_VFCVT
+  UNSPEC_LSX_VFCVTH
+  UNSPEC_LSX_VFCVTL
+  UNSPEC_LSX_VFLOGB
+  UNSPEC_LSX_VFRECIP
+  UNSPEC_LSX_VFRINT
+  UNSPEC_LSX_VFRSQRT
+  UNSPEC_LSX_VFCMP_SAF
+  UNSPEC_LSX_VFCMP_SEQ
+  UNSPEC_LSX_VFCMP_SLE
+  UNSPEC_LSX_VFCMP_SLT
+  UNSPEC_LSX_VFCMP_SNE
+  UNSPEC_LSX_VFCMP_SOR
+  UNSPEC_LSX_VFCMP_SUEQ
+  UNSPEC_LSX_VFCMP_SULE
+  UNSPEC_LSX_VFCMP_SULT
+  UNSPEC_LSX_VFCMP_SUN
+  UNSPEC_LSX_VFCMP_SUNE
+  UNSPEC_LSX_VFTINT_S
+  UNSPEC_LSX_VFTINT_U
+  UNSPEC_LSX_VSAT_S
+  UNSPEC_LSX_VSAT_U
+  UNSPEC_LSX_VREPLVEI
+  UNSPEC_LSX_VSRAR
+  UNSPEC_LSX_VSRARI
+  UNSPEC_LSX_VSRLR
+  UNSPEC_LSX_VSRLRI
+  UNSPEC_LSX_VSHUF
+  UNSPEC_LSX_VMUH_S
+  UNSPEC_LSX_VMUH_U
+  UNSPEC_LSX_VEXTW_S
+  UNSPEC_LSX_VEXTW_U
+  UNSPEC_LSX_VSLLWIL_S
+  UNSPEC_LSX_VSLLWIL_U
+  UNSPEC_LSX_VSRAN
+  UNSPEC_LSX_VSSRAN_S
+  UNSPEC_LSX_VSSRAN_U
+  UNSPEC_LSX_VSRAIN
+  UNSPEC_LSX_VSRAINS_S
+  UNSPEC_LSX_VSRAINS_U
+  UNSPEC_LSX_VSRARN
+  UNSPEC_LSX_VSRLN
+  UNSPEC_LSX_VSRLRN
+  UNSPEC_LSX_VSSRLRN_U
+  UNSPEC_LSX_VFRSTPI
+  UNSPEC_LSX_VFRSTP
+  UNSPEC_LSX_VSHUF4I
+  UNSPEC_LSX_VBSRL_V
+  UNSPEC_LSX_VBSLL_V
+  UNSPEC_LSX_VEXTRINS
+  UNSPEC_LSX_VMSKLTZ
+  UNSPEC_LSX_VSIGNCOV
+  UNSPEC_LSX_VFTINTRNE
+  UNSPEC_LSX_VFTINTRP
+  UNSPEC_LSX_VFTINTRM
+  UNSPEC_LSX_VFTINT_W_D
+  UNSPEC_LSX_VFFINT_S_L
+  UNSPEC_LSX_VFTINTRZ_W_D
+  UNSPEC_LSX_VFTINTRP_W_D
+  UNSPEC_LSX_VFTINTRM_W_D
+  UNSPEC_LSX_VFTINTRNE_W_D
+  UNSPEC_LSX_VFTINTL_L_S
+  UNSPEC_LSX_VFFINTH_D_W
+  UNSPEC_LSX_VFFINTL_D_W
+  UNSPEC_LSX_VFTINTRZL_L_S
+  UNSPEC_LSX_VFTINTRZH_L_S
+  UNSPEC_LSX_VFTINTRPL_L_S
+  UNSPEC_LSX_VFTINTRPH_L_S
+  UNSPEC_LSX_VFTINTRMH_L_S
+  UNSPEC_LSX_VFTINTRML_L_S
+  UNSPEC_LSX_VFTINTRNEL_L_S
+  UNSPEC_LSX_VFTINTRNEH_L_S
+  UNSPEC_LSX_VFTINTH_L_H
+  UNSPEC_LSX_VFRINTRNE_S
+  UNSPEC_LSX_VFRINTRNE_D
+  UNSPEC_LSX_VFRINTRZ_S
+  UNSPEC_LSX_VFRINTRZ_D
+  UNSPEC_LSX_VFRINTRP_S
+  UNSPEC_LSX_VFRINTRP_D
+  UNSPEC_LSX_VFRINTRM_S
+  UNSPEC_LSX_VFRINTRM_D
+  UNSPEC_LSX_VSSRARN_S
+  UNSPEC_LSX_VSSRARN_U
+  UNSPEC_LSX_VSSRLN_U
+  UNSPEC_LSX_VSSRLN
+  UNSPEC_LSX_VSSRLRN
+  UNSPEC_LSX_VLDI
+  UNSPEC_LSX_VSHUF_B
+  UNSPEC_LSX_VLDX
+  UNSPEC_LSX_VSTX
+  UNSPEC_LSX_VEXTL_QU_DU
+  UNSPEC_LSX_VSETEQZ_V
+  UNSPEC_LSX_VADDWEV
+  UNSPEC_LSX_VADDWEV2
+  UNSPEC_LSX_VADDWEV3
+  UNSPEC_LSX_VADDWOD
+  UNSPEC_LSX_VADDWOD2
+  UNSPEC_LSX_VADDWOD3
+  UNSPEC_LSX_VSUBWEV
+  UNSPEC_LSX_VSUBWEV2
+  UNSPEC_LSX_VSUBWOD
+  UNSPEC_LSX_VSUBWOD2
+  UNSPEC_LSX_VMULWEV
+  UNSPEC_LSX_VMULWEV2
+  UNSPEC_LSX_VMULWEV3
+  UNSPEC_LSX_VMULWOD
+  UNSPEC_LSX_VMULWOD2
+  UNSPEC_LSX_VMULWOD3
+  UNSPEC_LSX_VHADDW_Q_D
+  UNSPEC_LSX_VHADDW_QU_DU
+  UNSPEC_LSX_VHSUBW_Q_D
+  UNSPEC_LSX_VHSUBW_QU_DU
+  UNSPEC_LSX_VMADDWEV
+  UNSPEC_LSX_VMADDWEV2
+  UNSPEC_LSX_VMADDWEV3
+  UNSPEC_LSX_VMADDWOD
+  UNSPEC_LSX_VMADDWOD2
+  UNSPEC_LSX_VMADDWOD3
+  UNSPEC_LSX_VROTR
+  UNSPEC_LSX_VADD_Q
+  UNSPEC_LSX_VSUB_Q
+  UNSPEC_LSX_VEXTH_Q_D
+  UNSPEC_LSX_VEXTH_QU_DU
+  UNSPEC_LSX_VMSKGEZ
+  UNSPEC_LSX_VMSKNZ
+  UNSPEC_LSX_VEXTL_Q_D
+  UNSPEC_LSX_VSRLNI
+  UNSPEC_LSX_VSRLRNI
+  UNSPEC_LSX_VSSRLNI
+  UNSPEC_LSX_VSSRLNI2
+  UNSPEC_LSX_VSSRLRNI
+  UNSPEC_LSX_VSSRLRNI2
+  UNSPEC_LSX_VSRANI
+  UNSPEC_LSX_VSRARNI
+  UNSPEC_LSX_VSSRANI
+  UNSPEC_LSX_VSSRANI2
+  UNSPEC_LSX_VSSRARNI
+  UNSPEC_LSX_VSSRARNI2
+  UNSPEC_LSX_VPERMI
+])
+
+;; This attribute gives suffix for integers in VHMODE.
+(define_mode_attr dlsxfmt
+  [(V2DI "q")
+   (V4SI "d")
+   (V8HI "w")
+   (V16QI "h")])
+
+(define_mode_attr dlsxfmt_u
+  [(V2DI "qu")
+   (V4SI "du")
+   (V8HI "wu")
+   (V16QI "hu")])
+
+(define_mode_attr d2lsxfmt
+  [(V4SI "q")
+   (V8HI "d")
+   (V16QI "w")])
+
+(define_mode_attr d2lsxfmt_u
+  [(V4SI "qu")
+   (V8HI "du")
+   (V16QI "wu")])
+
+;; The attribute gives two double modes for vector modes.
+(define_mode_attr VD2MODE
+  [(V4SI "V2DI")
+   (V8HI "V2DI")
+   (V16QI "V4SI")])
+
+;; All vector modes with 128 bits.
+(define_mode_iterator LSX      [V2DF V4SF V2DI V4SI V8HI V16QI])
+
+;; Same as LSX.  Used by vcond to iterate two modes.
+(define_mode_iterator LSX_2    [V2DF V4SF V2DI V4SI V8HI V16QI])
+
+;; Only used for vilvh and splitting insert_d and copy_{u,s}.d.
+(define_mode_iterator LSX_D    [V2DI V2DF])
+
+;; Only used for copy_{u,s}.w and vilvh.
+(define_mode_iterator LSX_W    [V4SI V4SF])
+
+;; Only integer modes.
+(define_mode_iterator ILSX     [V2DI V4SI V8HI V16QI])
+
+;; As ILSX but excludes V16QI.
+(define_mode_iterator ILSX_DWH [V2DI V4SI V8HI])
+
+;; As LSX but excludes V16QI.
+(define_mode_iterator LSX_DWH  [V2DF V4SF V2DI V4SI V8HI])
+
+;; As ILSX but excludes V2DI.
+(define_mode_iterator ILSX_WHB [V4SI V8HI V16QI])
+
+;; Only integer modes equal or larger than a word.
+(define_mode_iterator ILSX_DW  [V2DI V4SI])
+
+;; Only integer modes smaller than a word.
+(define_mode_iterator ILSX_HB  [V8HI V16QI])
+
+;;;; Only integer modes for fixed-point madd_q/maddr_q.
+;;(define_mode_iterator ILSX_WH  [V4SI V8HI])
+
+;; Only floating-point modes.
+(define_mode_iterator FLSX     [V2DF V4SF])
+
+;; Only used for immediate set shuffle elements instruction.
+(define_mode_iterator LSX_WHB_W [V4SI V8HI V16QI V4SF])
+
+;; The attribute gives the integer vector mode with same size.
+(define_mode_attr VIMODE
+  [(V2DF "V2DI")
+   (V4SF "V4SI")
+   (V2DI "V2DI")
+   (V4SI "V4SI")
+   (V8HI "V8HI")
+   (V16QI "V16QI")])
+
+;; The attribute gives half modes for vector modes.
+(define_mode_attr VHMODE
+  [(V8HI "V16QI")
+   (V4SI "V8HI")
+   (V2DI "V4SI")])
+
+;; The attribute gives double modes for vector modes.
+(define_mode_attr VDMODE
+  [(V2DI "V2DI")
+   (V4SI "V2DI")
+   (V8HI "V4SI")
+   (V16QI "V8HI")])
+
+;; The attribute gives half modes with same number of elements for vector modes.
+(define_mode_attr VTRUNCMODE
+  [(V8HI "V8QI")
+   (V4SI "V4HI")
+   (V2DI "V2SI")])
+
+;; Double-sized Vector MODE with same elemet type. "Vector, Enlarged-MODE"
+(define_mode_attr VEMODE
+  [(V4SF "V8SF")
+   (V4SI "V8SI")
+   (V2DI "V4DI")
+   (V2DF "V4DF")])
+
+;; This attribute gives the mode of the result for "vpickve2gr_b, copy_u_b" etc.
+(define_mode_attr VRES
+  [(V2DF "DF")
+   (V4SF "SF")
+   (V2DI "DI")
+   (V4SI "SI")
+   (V8HI "SI")
+   (V16QI "SI")])
+
+;; Only used with LSX_D iterator.
+(define_mode_attr lsx_d
+  [(V2DI "reg_or_0")
+   (V2DF "register")])
+
+;; This attribute gives the integer vector mode with same size.
+(define_mode_attr mode_i
+  [(V2DF "v2di")
+   (V4SF "v4si")
+   (V2DI "v2di")
+   (V4SI "v4si")
+   (V8HI "v8hi")
+   (V16QI "v16qi")])
+
+;; This attribute gives suffix for LSX instructions.
+(define_mode_attr lsxfmt
+  [(V2DF "d")
+   (V4SF "w")
+   (V2DI "d")
+   (V4SI "w")
+   (V8HI "h")
+   (V16QI "b")])
+
+;; This attribute gives suffix for LSX instructions.
+(define_mode_attr lsxfmt_u
+  [(V2DF "du")
+   (V4SF "wu")
+   (V2DI "du")
+   (V4SI "wu")
+   (V8HI "hu")
+   (V16QI "bu")])
+
+;; This attribute gives suffix for integers in VHMODE.
+(define_mode_attr hlsxfmt
+  [(V2DI "w")
+   (V4SI "h")
+   (V8HI "b")])
+
+;; This attribute gives suffix for integers in VHMODE.
+(define_mode_attr hlsxfmt_u
+  [(V2DI "wu")
+   (V4SI "hu")
+   (V8HI "bu")])
+
+;; This attribute gives define_insn suffix for LSX instructions that need
+;; distinction between integer and floating point.
+(define_mode_attr lsxfmt_f
+  [(V2DF "d_f")
+   (V4SF "w_f")
+   (V2DI "d")
+   (V4SI "w")
+   (V8HI "h")
+   (V16QI "b")])
+
+(define_mode_attr flsxfmt_f
+  [(V2DF "d_f")
+   (V4SF "s_f")
+   (V2DI "d")
+   (V4SI "w")
+   (V8HI "h")
+   (V16QI "b")])
+
+(define_mode_attr flsxfmt
+  [(V2DF "d")
+   (V4SF "s")
+   (V2DI "d")
+   (V4SI "s")])
+
+(define_mode_attr flsxfrint
+  [(V2DF "d")
+   (V4SF "s")])
+
+(define_mode_attr ilsxfmt
+  [(V2DF "l")
+   (V4SF "w")])
+
+(define_mode_attr ilsxfmt_u
+  [(V2DF "lu")
+   (V4SF "wu")])
+
+;; This is used to form an immediate operand constraint using
+;; "const_<indeximm>_operand".
+(define_mode_attr indeximm
+  [(V2DF "0_or_1")
+   (V4SF "0_to_3")
+   (V2DI "0_or_1")
+   (V4SI "0_to_3")
+   (V8HI "uimm3")
+   (V16QI "uimm4")])
+
+;; This attribute represents bitmask needed for vec_merge using
+;; "const_<bitmask>_operand".
+(define_mode_attr bitmask
+  [(V2DF "exp_2")
+   (V4SF "exp_4")
+   (V2DI "exp_2")
+   (V4SI "exp_4")
+   (V8HI "exp_8")
+   (V16QI "exp_16")])
+
+;; This attribute is used to form an immediate operand constraint using
+;; "const_<bitimm>_operand".
+(define_mode_attr bitimm
+  [(V16QI "uimm3")
+   (V8HI  "uimm4")
+   (V4SI  "uimm5")
+   (V2DI  "uimm6")])
+
+
+(define_int_iterator FRINT_S [UNSPEC_LSX_VFRINTRP_S
+			    UNSPEC_LSX_VFRINTRZ_S
+			    UNSPEC_LSX_VFRINT
+			    UNSPEC_LSX_VFRINTRM_S])
+
+(define_int_iterator FRINT_D [UNSPEC_LSX_VFRINTRP_D
+			    UNSPEC_LSX_VFRINTRZ_D
+			    UNSPEC_LSX_VFRINT
+			    UNSPEC_LSX_VFRINTRM_D])
+
+(define_int_attr frint_pattern_s
+  [(UNSPEC_LSX_VFRINTRP_S  "ceil")
+   (UNSPEC_LSX_VFRINTRZ_S  "btrunc")
+   (UNSPEC_LSX_VFRINT	   "rint")
+   (UNSPEC_LSX_VFRINTRM_S  "floor")])
+
+(define_int_attr frint_pattern_d
+  [(UNSPEC_LSX_VFRINTRP_D  "ceil")
+   (UNSPEC_LSX_VFRINTRZ_D  "btrunc")
+   (UNSPEC_LSX_VFRINT	   "rint")
+   (UNSPEC_LSX_VFRINTRM_D  "floor")])
+
+(define_int_attr frint_suffix
+  [(UNSPEC_LSX_VFRINTRP_S  "rp")
+   (UNSPEC_LSX_VFRINTRP_D  "rp")
+   (UNSPEC_LSX_VFRINTRZ_S  "rz")
+   (UNSPEC_LSX_VFRINTRZ_D  "rz")
+   (UNSPEC_LSX_VFRINT	   "")
+   (UNSPEC_LSX_VFRINTRM_S  "rm")
+   (UNSPEC_LSX_VFRINTRM_D  "rm")])
+
+(define_expand "vec_init<mode><unitmode>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand:LSX 1 "")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vector_init (operands[0], operands[1]);
+  DONE;
+})
+
+;; vpickev pattern with implicit type conversion.
+(define_insn "vec_pack_trunc_<mode>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(vec_concat:<VHMODE>
+	  (truncate:<VTRUNCMODE>
+	    (match_operand:ILSX_DWH 1 "register_operand" "f"))
+	  (truncate:<VTRUNCMODE>
+	    (match_operand:ILSX_DWH 2 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vpickev.<hlsxfmt>\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "vec_unpacks_hi_v4sf"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(float_extend:V2DF
+	  (vec_select:V2SF
+	    (match_operand:V4SF 1 "register_operand" "f")
+	    (match_dup 2))))]
+  "ISA_HAS_LSX"
+{
+  operands[2] = loongarch_lsx_vec_parallel_const_half (V4SFmode,
+      true/*high_p*/);
+})
+
+(define_expand "vec_unpacks_lo_v4sf"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(float_extend:V2DF
+	  (vec_select:V2SF
+	    (match_operand:V4SF 1 "register_operand" "f")
+	    (match_dup 2))))]
+  "ISA_HAS_LSX"
+{
+  operands[2] = loongarch_lsx_vec_parallel_const_half (V4SFmode,
+      false/*high_p*/);
+})
+
+(define_expand "vec_unpacks_hi_<mode>"
+  [(match_operand:<VDMODE> 0 "register_operand")
+   (match_operand:ILSX_WHB 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/, true/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacks_lo_<mode>"
+  [(match_operand:<VDMODE> 0 "register_operand")
+   (match_operand:ILSX_WHB 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/, false/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacku_hi_<mode>"
+  [(match_operand:<VDMODE> 0 "register_operand")
+   (match_operand:ILSX_WHB 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_unpack (operands, true/*unsigned_p*/, true/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacku_lo_<mode>"
+  [(match_operand:<VDMODE> 0 "register_operand")
+   (match_operand:ILSX_WHB 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_unpack (operands, true/*unsigned_p*/, false/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_extract<mode><unitmode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:ILSX 1 "register_operand")
+   (match_operand 2 "const_<indeximm>_operand")]
+  "ISA_HAS_LSX"
+{
+  if (<UNITMODE>mode == QImode || <UNITMODE>mode == HImode)
+    {
+      rtx dest1 = gen_reg_rtx (SImode);
+      emit_insn (gen_lsx_vpickve2gr_<lsxfmt> (dest1, operands[1], operands[2]));
+      emit_move_insn (operands[0],
+		      gen_lowpart (<UNITMODE>mode, dest1));
+    }
+  else
+    emit_insn (gen_lsx_vpickve2gr_<lsxfmt> (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
+(define_expand "vec_extract<mode><unitmode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:FLSX 1 "register_operand")
+   (match_operand 2 "const_<indeximm>_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx temp;
+  HOST_WIDE_INT val = INTVAL (operands[2]);
+
+  if (val == 0)
+    temp = operands[1];
+  else
+    {
+      rtx n = GEN_INT (val * GET_MODE_SIZE (<UNITMODE>mode));
+      temp = gen_reg_rtx (<MODE>mode);
+      emit_insn (gen_lsx_vbsrl_<lsxfmt_f> (temp, operands[1], n));
+    }
+  emit_insn (gen_lsx_vec_extract_<lsxfmt_f> (operands[0], temp));
+  DONE;
+})
+
+(define_insn_and_split "lsx_vec_extract_<lsxfmt_f>"
+  [(set (match_operand:<UNITMODE> 0 "register_operand" "=f")
+	(vec_select:<UNITMODE>
+	  (match_operand:FLSX 1 "register_operand" "f")
+	  (parallel [(const_int 0)])))]
+  "ISA_HAS_LSX"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (match_dup 1))]
+{
+  operands[1] = gen_rtx_REG (<UNITMODE>mode, REGNO (operands[1]));
+}
+  [(set_attr "move_type" "fmove")
+   (set_attr "mode" "<UNITMODE>")])
+
+(define_expand "vec_set<mode>"
+  [(match_operand:ILSX 0 "register_operand")
+   (match_operand:<UNITMODE> 1 "reg_or_0_operand")
+   (match_operand 2 "const_<indeximm>_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lsx_vinsgr2vr_<lsxfmt> (operands[0], operands[1],
+					 operands[0], index));
+  DONE;
+})
+
+(define_expand "vec_set<mode>"
+  [(match_operand:FLSX 0 "register_operand")
+   (match_operand:<UNITMODE> 1 "register_operand")
+   (match_operand 2 "const_<indeximm>_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lsx_vextrins_<lsxfmt_f>_scalar (operands[0], operands[1],
+						 operands[0], index));
+  DONE;
+})
+
+(define_expand "vec_cmp<mode><mode_i>"
+  [(set (match_operand:<VIMODE> 0 "register_operand")
+	(match_operator 1 ""
+	  [(match_operand:LSX 2 "register_operand")
+	   (match_operand:LSX 3 "register_operand")]))]
+  "ISA_HAS_LSX"
+{
+  bool ok = loongarch_expand_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmpu<ILSX:mode><mode_i>"
+  [(set (match_operand:<VIMODE> 0 "register_operand")
+	(match_operator 1 ""
+	  [(match_operand:ILSX 2 "register_operand")
+	   (match_operand:ILSX 3 "register_operand")]))]
+  "ISA_HAS_LSX"
+{
+  bool ok = loongarch_expand_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vcondu<LSX:mode><ILSX:mode>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand:LSX 1 "reg_or_m1_operand")
+   (match_operand:LSX 2 "reg_or_0_operand")
+   (match_operator 3 ""
+     [(match_operand:ILSX 4 "register_operand")
+      (match_operand:ILSX 5 "register_operand")])]
+  "ISA_HAS_LSX
+   && (GET_MODE_NUNITS (<LSX:MODE>mode) == GET_MODE_NUNITS (<ILSX:MODE>mode))"
+{
+  loongarch_expand_vec_cond_expr (<LSX:MODE>mode, <LSX:VIMODE>mode, operands);
+  DONE;
+})
+
+(define_expand "vcond<LSX:mode><LSX_2:mode>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand:LSX 1 "reg_or_m1_operand")
+   (match_operand:LSX 2 "reg_or_0_operand")
+   (match_operator 3 ""
+     [(match_operand:LSX_2 4 "register_operand")
+      (match_operand:LSX_2 5 "register_operand")])]
+  "ISA_HAS_LSX
+   && (GET_MODE_NUNITS (<LSX:MODE>mode) == GET_MODE_NUNITS (<LSX_2:MODE>mode))"
+{
+  loongarch_expand_vec_cond_expr (<LSX:MODE>mode, <LSX:VIMODE>mode, operands);
+  DONE;
+})
+
+(define_expand "vcond_mask_<ILSX:mode><ILSX:mode>"
+  [(match_operand:ILSX 0 "register_operand")
+   (match_operand:ILSX 1 "reg_or_m1_operand")
+   (match_operand:ILSX 2 "reg_or_0_operand")
+   (match_operand:ILSX 3 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_cond_mask_expr (<ILSX:MODE>mode,
+				      <ILSX:VIMODE>mode, operands);
+  DONE;
+})
+
+(define_insn "lsx_vinsgr2vr_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(vec_merge:ILSX
+	  (vec_duplicate:ILSX
+	    (match_operand:<UNITMODE> 1 "reg_or_0_operand" "rJ"))
+	  (match_operand:ILSX 2 "register_operand" "0")
+	  (match_operand 3 "const_<bitmask>_operand" "")))]
+  "ISA_HAS_LSX"
+{
+  if (!TARGET_64BIT && (<MODE>mode == V2DImode || <MODE>mode == V2DFmode))
+    return "#";
+  else
+    return "vinsgr2vr.<lsxfmt>\t%w0,%z1,%y3";
+}
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+(define_split
+  [(set (match_operand:LSX_D 0 "register_operand")
+	(vec_merge:LSX_D
+	  (vec_duplicate:LSX_D
+	    (match_operand:<UNITMODE> 1 "<LSX_D:lsx_d>_operand"))
+	  (match_operand:LSX_D 2 "register_operand")
+	  (match_operand 3 "const_<bitmask>_operand")))]
+  "reload_completed && ISA_HAS_LSX && !TARGET_64BIT"
+  [(const_int 0)]
+{
+  loongarch_split_lsx_insert_d (operands[0], operands[2], operands[3], operands[1]);
+  DONE;
+})
+
+(define_insn "lsx_vextrins_<lsxfmt_f>_internal"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(vec_merge:LSX
+	  (vec_duplicate:LSX
+	    (vec_select:<UNITMODE>
+	      (match_operand:LSX 1 "register_operand" "f")
+	      (parallel [(const_int 0)])))
+	  (match_operand:LSX 2 "register_operand" "0")
+	  (match_operand 3 "const_<bitmask>_operand" "")))]
+  "ISA_HAS_LSX"
+  "vextrins.<lsxfmt>\t%w0,%w1,%y3<<4"
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+;; Operand 3 is a scalar.
+(define_insn "lsx_vextrins_<lsxfmt_f>_scalar"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(vec_merge:FLSX
+	  (vec_duplicate:FLSX
+	    (match_operand:<UNITMODE> 1 "register_operand" "f"))
+	  (match_operand:FLSX 2 "register_operand" "0")
+	  (match_operand 3 "const_<bitmask>_operand" "")))]
+  "ISA_HAS_LSX"
+  "vextrins.<lsxfmt>\t%w0,%w1,%y3<<4"
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vpickve2gr_<lsxfmt><u>"
+  [(set (match_operand:<VRES> 0 "register_operand" "=r")
+	(any_extend:<VRES>
+	  (vec_select:<UNITMODE>
+	    (match_operand:ILSX_HB 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_<indeximm>_operand" "")]))))]
+  "ISA_HAS_LSX"
+  "vpickve2gr.<lsxfmt><u>\t%0,%w1,%2"
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vpickve2gr_<lsxfmt_f><u>"
+  [(set (match_operand:<UNITMODE> 0 "register_operand" "=r")
+	(any_extend:<UNITMODE>
+	  (vec_select:<UNITMODE>
+	    (match_operand:LSX_W 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_<indeximm>_operand" "")]))))]
+  "ISA_HAS_LSX"
+  "vpickve2gr.<lsxfmt><u>\t%0,%w1,%2"
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn_and_split "lsx_vpickve2gr_du"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(vec_select:DI
+	  (match_operand:V2DI 1 "register_operand" "f")
+	  (parallel [(match_operand 2 "const_0_or_1_operand" "")])))]
+  "ISA_HAS_LSX"
+{
+  if (TARGET_64BIT)
+    return "vpickve2gr.du\t%0,%w1,%2";
+  else
+    return "#";
+}
+  "reload_completed && ISA_HAS_LSX && !TARGET_64BIT"
+  [(const_int 0)]
+{
+  loongarch_split_lsx_copy_d (operands[0], operands[1], operands[2],
+			      gen_lsx_vpickve2gr_wu);
+  DONE;
+}
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "V2DI")])
+
+(define_insn_and_split "lsx_vpickve2gr_<lsxfmt_f>"
+  [(set (match_operand:<UNITMODE> 0 "register_operand" "=r")
+	(vec_select:<UNITMODE>
+	  (match_operand:LSX_D 1 "register_operand" "f")
+	  (parallel [(match_operand 2 "const_<indeximm>_operand" "")])))]
+  "ISA_HAS_LSX"
+{
+  if (TARGET_64BIT)
+    return "vpickve2gr.<lsxfmt>\t%0,%w1,%2";
+  else
+    return "#";
+}
+  "reload_completed && ISA_HAS_LSX && !TARGET_64BIT"
+  [(const_int 0)]
+{
+  loongarch_split_lsx_copy_d (operands[0], operands[1], operands[2],
+			      gen_lsx_vpickve2gr_w);
+  DONE;
+}
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_expand "abs<mode>2"
+  [(match_operand:ILSX 0 "register_operand" "=f")
+   (abs:ILSX (match_operand:ILSX 1 "register_operand" "f"))]
+  "ISA_HAS_LSX"
+{
+  if (ISA_HAS_LSX)
+  {
+    emit_insn (gen_vabs<mode>2 (operands[0], operands[1]));
+    DONE;
+  }
+  else
+  {
+    rtx reg = gen_reg_rtx (<MODE>mode);
+    emit_move_insn (reg, CONST0_RTX (<MODE>mode));
+    emit_insn (gen_lsx_vadda_<lsxfmt> (operands[0], operands[1], reg));
+    DONE;
+  }
+})
+
+(define_expand "neg<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand")
+	(neg:ILSX (match_operand:ILSX 1 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  emit_insn (gen_vneg<mode>2 (operands[0], operands[1]));
+  DONE;
+})
+
+(define_expand "neg<mode>2"
+  [(set (match_operand:FLSX 0 "register_operand")
+	(neg:FLSX (match_operand:FLSX 1 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  rtx reg = gen_reg_rtx (<MODE>mode);
+  emit_move_insn (reg, CONST0_RTX (<MODE>mode));
+  emit_insn (gen_sub<mode>3 (operands[0], reg, operands[1]));
+  DONE;
+})
+
+(define_expand "lsx_vrepli<mode>"
+  [(match_operand:ILSX 0 "register_operand")
+   (match_operand 1 "const_imm10_operand")]
+  "ISA_HAS_LSX"
+{
+  if (<MODE>mode == V16QImode)
+    operands[1] = GEN_INT (trunc_int_for_mode (INTVAL (operands[1]),
+					       <UNITMODE>mode));
+  emit_move_insn (operands[0],
+		  loongarch_gen_const_int_vector (<MODE>mode, INTVAL (operands[1])));
+  DONE;
+})
+
+(define_expand "vec_perm<mode>"
+ [(match_operand:LSX 0 "register_operand")
+  (match_operand:LSX 1 "register_operand")
+  (match_operand:LSX 2 "register_operand")
+  (match_operand:LSX 3 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  loongarch_expand_vec_perm (operands[0], operands[1],
+			     operands[2], operands[3]);
+  DONE;
+})
+
+(define_insn "lsx_vshuf_<lsxfmt_f>"
+  [(set (match_operand:LSX_DWH 0 "register_operand" "=f")
+	(unspec:LSX_DWH [(match_operand:LSX_DWH 1 "register_operand" "0")
+			 (match_operand:LSX_DWH 2 "register_operand" "f")
+			 (match_operand:LSX_DWH 3 "register_operand" "f")]
+			UNSPEC_LSX_VSHUF))]
+  "ISA_HAS_LSX"
+  "vshuf.<lsxfmt>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_sld")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "mov<mode>"
+  [(set (match_operand:LSX 0)
+	(match_operand:LSX 1))]
+  "ISA_HAS_LSX"
+{
+  if (loongarch_legitimize_move (<MODE>mode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:LSX 0)
+	(match_operand:LSX 1))]
+  "ISA_HAS_LSX"
+{
+  if (loongarch_legitimize_move (<MODE>mode, operands[0], operands[1]))
+    DONE;
+})
+
+(define_insn "mov<mode>_lsx"
+  [(set (match_operand:LSX 0 "nonimmediate_operand" "=f,f,R,*r,*f")
+	(match_operand:LSX 1 "move_operand" "fYGYI,R,f,*f,*r"))]
+  "ISA_HAS_LSX"
+{ return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "type" "simd_move,simd_load,simd_store,simd_copy,simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+(define_split
+  [(set (match_operand:LSX 0 "nonimmediate_operand")
+	(match_operand:LSX 1 "move_operand"))]
+  "reload_completed && ISA_HAS_LSX
+   && loongarch_split_move_insn_p (operands[0], operands[1])"
+  [(const_int 0)]
+{
+  loongarch_split_move_insn (operands[0], operands[1], curr_insn);
+  DONE;
+})
+
+;; Offset load
+(define_expand "lsx_ld_<lsxfmt_f>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq10<lsxfmt>_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (operands[0], gen_rtx_MEM (<MODE>mode, addr));
+  DONE;
+})
+
+;; Offset store
+(define_expand "lsx_st_<lsxfmt_f>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq10<lsxfmt>_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (gen_rtx_MEM (<MODE>mode, addr), operands[0]);
+  DONE;
+})
+
+;; Integer operations
+(define_insn "add<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f,f")
+	(plus:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_ximm5_operand" "f,Unv5,Uuv5")))]
+  "ISA_HAS_LSX"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "vadd.<lsxfmt>\t%w0,%w1,%w2";
+    case 1:
+      {
+	HOST_WIDE_INT val = INTVAL (CONST_VECTOR_ELT (operands[2], 0));
+
+	operands[2] = GEN_INT (-val);
+	return "vsubi.<lsxfmt_u>\t%w0,%w1,%d2";
+      }
+    case 2:
+      return "vaddi.<lsxfmt_u>\t%w0,%w1,%E2";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "alu_type" "simd_add")
+   (set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(minus:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LSX"
+  "@
+   vsub.<lsxfmt>\t%w0,%w1,%w2
+   vsubi.<lsxfmt_u>\t%w0,%w1,%E2"
+  [(set_attr "alu_type" "simd_add")
+   (set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(mult:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		   (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vmul.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vmadd_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(plus:ILSX (mult:ILSX (match_operand:ILSX 2 "register_operand" "f")
+			      (match_operand:ILSX 3 "register_operand" "f"))
+		   (match_operand:ILSX 1 "register_operand" "0")))]
+  "ISA_HAS_LSX"
+  "vmadd.<lsxfmt>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vmsub_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(minus:ILSX (match_operand:ILSX 1 "register_operand" "0")
+		    (mult:ILSX (match_operand:ILSX 2 "register_operand" "f")
+			       (match_operand:ILSX 3 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vmsub.<lsxfmt>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "div<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(div:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		  (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+{ return loongarch_lsx_output_division ("vdiv.<lsxfmt>\t%w0,%w1,%w2", operands); }
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "udiv<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(udiv:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		   (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+{ return loongarch_lsx_output_division ("vdiv.<lsxfmt_u>\t%w0,%w1,%w2", operands); }
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mod<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(mod:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		  (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+{ return loongarch_lsx_output_division ("vmod.<lsxfmt>\t%w0,%w1,%w2", operands); }
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umod<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(umod:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		   (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+{ return loongarch_lsx_output_division ("vmod.<lsxfmt_u>\t%w0,%w1,%w2", operands); }
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "xor<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f,f")
+	(xor:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  "ISA_HAS_LSX"
+  "@
+   vxor.v\t%w0,%w1,%w2
+   vbitrevi.%v0\t%w0,%w1,%V2
+   vxori.b\t%w0,%w1,%B2"
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "ior<mode>3"
+  [(set (match_operand:LSX 0 "register_operand" "=f,f,f")
+	(ior:LSX
+	  (match_operand:LSX 1 "register_operand" "f,f,f")
+	  (match_operand:LSX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  "ISA_HAS_LSX"
+  "@
+   vor.v\t%w0,%w1,%w2
+   vbitseti.%v0\t%w0,%w1,%V2
+   vori.b\t%w0,%w1,%B2"
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "and<mode>3"
+  [(set (match_operand:LSX 0 "register_operand" "=f,f,f")
+	(and:LSX
+	  (match_operand:LSX 1 "register_operand" "f,f,f")
+	  (match_operand:LSX 2 "reg_or_vector_same_val_operand" "f,YZ,Urv8")))]
+  "ISA_HAS_LSX"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "vand.v\t%w0,%w1,%w2";
+    case 1:
+      {
+	rtx elt0 = CONST_VECTOR_ELT (operands[2], 0);
+	unsigned HOST_WIDE_INT val = ~UINTVAL (elt0);
+	operands[2] = loongarch_gen_const_int_vector (<MODE>mode, val & (-val));
+	return "vbitclri.%v0\t%w0,%w1,%V2";
+      }
+    case 2:
+      return "vandi.b\t%w0,%w1,%B2";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "one_cmpl<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(not:ILSX (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vnor.v\t%w0,%w1,%w1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "TI")])
+
+(define_insn "vlshr<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(lshiftrt:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LSX"
+  "@
+   vsrl.<lsxfmt>\t%w0,%w1,%w2
+   vsrli.<lsxfmt>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vashr<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(ashiftrt:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LSX"
+  "@
+   vsra.<lsxfmt>\t%w0,%w1,%w2
+   vsrai.<lsxfmt>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vashl<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(ashift:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LSX"
+  "@
+   vsll.<lsxfmt>\t%w0,%w1,%w2
+   vslli.<lsxfmt>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; Floating-point operations
+(define_insn "add<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(plus:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		   (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfadd.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(minus:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		    (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfsub.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(mult:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		   (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfmul.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fmul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "div<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(div:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		  (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfdiv.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fma<mode>4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(fma:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		  (match_operand:FLSX 2 "register_operand" "f")
+		  (match_operand:FLSX 3 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfmadd.<flsxfmt>\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fnma<mode>4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(fma:FLSX (neg:FLSX (match_operand:FLSX 1 "register_operand" "f"))
+		  (match_operand:FLSX 2 "register_operand" "f")
+		  (match_operand:FLSX 3 "register_operand" "0")))]
+  "ISA_HAS_LSX"
+  "vfnmsub.<flsxfmt>\t%w0,%w1,%w2,%w0"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sqrt<mode>2"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(sqrt:FLSX (match_operand:FLSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfsqrt.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+;; Built-in functions
+(define_insn "lsx_vadda_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(plus:ILSX (abs:ILSX (match_operand:ILSX 1 "register_operand" "f"))
+		   (abs:ILSX (match_operand:ILSX 2 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vadda.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "ssadd<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(ss_plus:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vsadd.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "usadd<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(us_plus:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vsadd.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vabsd_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_ABSD_S))]
+  "ISA_HAS_LSX"
+  "vabsd.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vabsd_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VABSD_U))]
+  "ISA_HAS_LSX"
+  "vabsd.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vavg_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VAVG_S))]
+  "ISA_HAS_LSX"
+  "vavg.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vavg_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VAVG_U))]
+  "ISA_HAS_LSX"
+  "vavg.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vavgr_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VAVGR_S))]
+  "ISA_HAS_LSX"
+  "vavgr.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vavgr_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VAVGR_U))]
+  "ISA_HAS_LSX"
+  "vavgr.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitclr_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VBITCLR))]
+  "ISA_HAS_LSX"
+  "vbitclr.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitclri_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VBITCLRI))]
+  "ISA_HAS_LSX"
+  "vbitclri.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitrev_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VBITREV))]
+  "ISA_HAS_LSX"
+  "vbitrev.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitrevi_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		       (match_operand 2 "const_lsx_branch_operand" "")]
+		     UNSPEC_LSX_VBITREVI))]
+  "ISA_HAS_LSX"
+  "vbitrevi.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitsel_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(ior:ILSX (and:ILSX (not:ILSX
+			      (match_operand:ILSX 3 "register_operand" "f"))
+			    (match_operand:ILSX 1 "register_operand" "f"))
+		  (and:ILSX (match_dup 3)
+			    (match_operand:ILSX 2 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vbitsel.v\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_bitmov")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitseli_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(ior:V16QI (and:V16QI (not:V16QI
+				(match_operand:V16QI 1 "register_operand" "0"))
+			      (match_operand:V16QI 2 "register_operand" "f"))
+		   (and:V16QI (match_dup 1)
+			      (match_operand:V16QI 3 "const_vector_same_val_operand" "Urv8"))))]
+  "ISA_HAS_LSX"
+  "vbitseli.b\t%w0,%w2,%B3"
+  [(set_attr "type" "simd_bitmov")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vbitset_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VBITSET))]
+  "ISA_HAS_LSX"
+  "vbitset.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbitseti_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VBITSETI))]
+  "ISA_HAS_LSX"
+  "vbitseti.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_code_iterator ICC [eq le leu lt ltu])
+
+(define_code_attr icc
+  [(eq  "eq")
+   (le  "le")
+   (leu "le")
+   (lt  "lt")
+   (ltu "lt")])
+
+(define_code_attr icci
+  [(eq  "eqi")
+   (le  "lei")
+   (leu "lei")
+   (lt  "lti")
+   (ltu "lti")])
+
+(define_code_attr cmpi
+  [(eq   "s")
+   (le   "s")
+   (leu  "u")
+   (lt   "s")
+   (ltu  "u")])
+
+(define_code_attr cmpi_1
+  [(eq   "")
+   (le   "")
+   (leu  "u")
+   (lt   "")
+   (ltu  "u")])
+
+(define_insn "lsx_vs<ICC:icc>_<ILSX:lsxfmt><ICC:cmpi_1>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(ICC:ILSX
+	  (match_operand:ILSX 1 "register_operand" "f,f")
+	  (match_operand:ILSX 2 "reg_or_vector_same_<ICC:cmpi>imm5_operand" "f,U<ICC:cmpi>v5")))]
+  "ISA_HAS_LSX"
+  "@
+   vs<ICC:icc>.<ILSX:lsxfmt><ICC:cmpi_1>\t%w0,%w1,%w2
+   vs<ICC:icci>.<ILSX:lsxfmt><ICC:cmpi_1>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfclass_<flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")]
+			 UNSPEC_LSX_VFCLASS))]
+  "ISA_HAS_LSX"
+  "vfclass.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fclass")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfcmp_caf_<flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")
+			  (match_operand:FLSX 2 "register_operand" "f")]
+			 UNSPEC_LSX_VFCMP_CAF))]
+  "ISA_HAS_LSX"
+  "vfcmp.caf.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfcmp_cune_<FLSX:flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")
+			  (match_operand:FLSX 2 "register_operand" "f")]
+			 UNSPEC_LSX_VFCMP_CUNE))]
+  "ISA_HAS_LSX"
+  "vfcmp.cune.<FLSX:flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+(define_code_iterator vfcond [unordered ordered eq ne le lt uneq unle unlt])
+
+(define_code_attr fcc
+  [(unordered "cun")
+   (ordered   "cor")
+   (eq	      "ceq")
+   (ne	      "cne")
+   (uneq      "cueq")
+   (unle      "cule")
+   (unlt      "cult")
+   (le	      "cle")
+   (lt	      "clt")])
+
+(define_int_iterator FSC_UNS [UNSPEC_LSX_VFCMP_SAF UNSPEC_LSX_VFCMP_SUN UNSPEC_LSX_VFCMP_SOR
+			      UNSPEC_LSX_VFCMP_SEQ UNSPEC_LSX_VFCMP_SNE UNSPEC_LSX_VFCMP_SUEQ
+			      UNSPEC_LSX_VFCMP_SUNE UNSPEC_LSX_VFCMP_SULE UNSPEC_LSX_VFCMP_SULT
+			      UNSPEC_LSX_VFCMP_SLE UNSPEC_LSX_VFCMP_SLT])
+
+(define_int_attr fsc
+  [(UNSPEC_LSX_VFCMP_SAF  "saf")
+   (UNSPEC_LSX_VFCMP_SUN  "sun")
+   (UNSPEC_LSX_VFCMP_SOR  "sor")
+   (UNSPEC_LSX_VFCMP_SEQ  "seq")
+   (UNSPEC_LSX_VFCMP_SNE  "sne")
+   (UNSPEC_LSX_VFCMP_SUEQ "sueq")
+   (UNSPEC_LSX_VFCMP_SUNE "sune")
+   (UNSPEC_LSX_VFCMP_SULE "sule")
+   (UNSPEC_LSX_VFCMP_SULT "sult")
+   (UNSPEC_LSX_VFCMP_SLE  "sle")
+   (UNSPEC_LSX_VFCMP_SLT  "slt")])
+
+(define_insn "lsx_vfcmp_<vfcond:fcc>_<FLSX:flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(vfcond:<VIMODE> (match_operand:FLSX 1 "register_operand" "f")
+			 (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfcmp.<vfcond:fcc>.<FLSX:flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfcmp_<fsc>_<FLSX:flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")
+			  (match_operand:FLSX 2 "register_operand" "f")]
+			 FSC_UNS))]
+  "ISA_HAS_LSX"
+  "vfcmp.<fsc>.<FLSX:flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+(define_mode_attr fint
+  [(V4SF "v4si")
+   (V2DF "v2di")])
+
+(define_mode_attr FINTCNV
+  [(V4SF "I2S")
+   (V2DF "I2D")])
+
+(define_mode_attr FINTCNV_2
+  [(V4SF "S2I")
+   (V2DF "D2I")])
+
+(define_insn "float<fint><FLSX:mode>2"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(float:FLSX (match_operand:<VIMODE> 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vffint.<flsxfmt>.<ilsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "floatuns<fint><FLSX:mode>2"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unsigned_float:FLSX
+	  (match_operand:<VIMODE> 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vffint.<flsxfmt>.<ilsxfmt_u>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV>")
+   (set_attr "mode" "<MODE>")])
+
+(define_mode_attr FFQ
+  [(V4SF "V8HI")
+   (V2DF "V4SI")])
+
+(define_insn "lsx_vreplgr2vr_<lsxfmt_f>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(vec_duplicate:ILSX
+	  (match_operand:<UNITMODE> 1 "reg_or_0_operand" "r,J")))]
+  "ISA_HAS_LSX"
+{
+  if (which_alternative == 1)
+    return "ldi.<lsxfmt>\t%w0,0";
+
+  if (!TARGET_64BIT && (<MODE>mode == V2DImode || <MODE>mode == V2DFmode))
+    return "#";
+  else
+    return "vreplgr2vr.<lsxfmt>\t%w0,%z1";
+}
+  [(set_attr "type" "simd_fill")
+   (set_attr "mode" "<MODE>")])
+
+(define_split
+  [(set (match_operand:LSX_D 0 "register_operand")
+	(vec_duplicate:LSX_D
+	  (match_operand:<UNITMODE> 1 "register_operand")))]
+  "reload_completed && ISA_HAS_LSX && !TARGET_64BIT"
+  [(const_int 0)]
+{
+  loongarch_split_lsx_fill_d (operands[0], operands[1]);
+  DONE;
+})
+
+(define_insn "logb<mode>2"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFLOGB))]
+  "ISA_HAS_LSX"
+  "vflogb.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_flog2")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smax<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(smax:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		   (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfmax.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfmaxa_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(if_then_else:FLSX
+	   (gt (abs:FLSX (match_operand:FLSX 1 "register_operand" "f"))
+	       (abs:FLSX (match_operand:FLSX 2 "register_operand" "f")))
+	   (match_dup 1)
+	   (match_dup 2)))]
+  "ISA_HAS_LSX"
+  "vfmaxa.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(smin:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		   (match_operand:FLSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfmin.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfmina_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(if_then_else:FLSX
+	   (lt (abs:FLSX (match_operand:FLSX 1 "register_operand" "f"))
+	       (abs:FLSX (match_operand:FLSX 2 "register_operand" "f")))
+	   (match_dup 1)
+	   (match_dup 2)))]
+  "ISA_HAS_LSX"
+  "vfmina.<flsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfrecip_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRECIP))]
+  "ISA_HAS_LSX"
+  "vfrecip.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfrint_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINT))]
+  "ISA_HAS_LSX"
+  "vfrint.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfrsqrt_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRSQRT))]
+  "ISA_HAS_LSX"
+  "vfrsqrt.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vftint_s_<ilsxfmt>_<flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")]
+			 UNSPEC_LSX_VFTINT_S))]
+  "ISA_HAS_LSX"
+  "vftint.<ilsxfmt>.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vftint_u_<ilsxfmt_u>_<flsxfmt>"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")]
+			 UNSPEC_LSX_VFTINT_U))]
+  "ISA_HAS_LSX"
+  "vftint.<ilsxfmt_u>.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fix_trunc<FLSX:mode><mode_i>2"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(fix:<VIMODE> (match_operand:FLSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vftintrz.<ilsxfmt>.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fixuns_trunc<FLSX:mode><mode_i>2"
+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
+	(unsigned_fix:<VIMODE> (match_operand:FLSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vftintrz.<ilsxfmt_u>.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vh<optab>w_h<u>_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(addsub:V8HI
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LSX"
+  "vh<optab>w.h<u>.b<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vh<optab>w_w<u>_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(addsub:V4SI
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LSX"
+  "vh<optab>w.w<u>.h<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vh<optab>w_d<u>_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(addsub:V2DI
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)])))
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)])))))]
+  "ISA_HAS_LSX"
+  "vh<optab>w.d<u>.w<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vpackev_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(vec_select:V16QI
+	  (vec_concat:V32QI
+	    (match_operand:V16QI 1 "register_operand" "f")
+	    (match_operand:V16QI 2 "register_operand" "f"))
+	  (parallel [(const_int 0)  (const_int 16)
+		     (const_int 2)  (const_int 18)
+		     (const_int 4)  (const_int 20)
+		     (const_int 6)  (const_int 22)
+		     (const_int 8)  (const_int 24)
+		     (const_int 10) (const_int 26)
+		     (const_int 12) (const_int 28)
+		     (const_int 14) (const_int 30)])))]
+  "ISA_HAS_LSX"
+  "vpackev.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vpackev_h"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(vec_select:V8HI
+	  (vec_concat:V16HI
+	    (match_operand:V8HI 1 "register_operand" "f")
+	    (match_operand:V8HI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 2) (const_int 10)
+		     (const_int 4) (const_int 12)
+		     (const_int 6) (const_int 14)])))]
+  "ISA_HAS_LSX"
+  "vpackev.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vpackev_w"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(vec_select:V4SI
+	  (vec_concat:V8SI
+	    (match_operand:V4SI 1 "register_operand" "f")
+	    (match_operand:V4SI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 2) (const_int 6)])))]
+  "ISA_HAS_LSX"
+  "vpackev.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vpackev_w_f"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(vec_select:V4SF
+	  (vec_concat:V8SF
+	    (match_operand:V4SF 1 "register_operand" "f")
+	    (match_operand:V4SF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 2) (const_int 6)])))]
+  "ISA_HAS_LSX"
+  "vpackev.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vilvh_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(vec_select:V16QI
+	  (vec_concat:V32QI
+	    (match_operand:V16QI 1 "register_operand" "f")
+	    (match_operand:V16QI 2 "register_operand" "f"))
+	  (parallel [(const_int 8)  (const_int 24)
+		     (const_int 9)  (const_int 25)
+		     (const_int 10) (const_int 26)
+		     (const_int 11) (const_int 27)
+		     (const_int 12) (const_int 28)
+		     (const_int 13) (const_int 29)
+		     (const_int 14) (const_int 30)
+		     (const_int 15) (const_int 31)])))]
+  "ISA_HAS_LSX"
+  "vilvh.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vilvh_h"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(vec_select:V8HI
+	  (vec_concat:V16HI
+	    (match_operand:V8HI 1 "register_operand" "f")
+	    (match_operand:V8HI 2 "register_operand" "f"))
+	  (parallel [(const_int 4) (const_int 12)
+		     (const_int 5) (const_int 13)
+		     (const_int 6) (const_int 14)
+		     (const_int 7) (const_int 15)])))]
+  "ISA_HAS_LSX"
+  "vilvh.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_mode_attr vilvh_suffix
+  [(V4SI "") (V4SF "_f")
+   (V2DI "") (V2DF "_f")])
+
+(define_insn "lsx_vilvh_w<vilvh_suffix>"
+  [(set (match_operand:LSX_W 0 "register_operand" "=f")
+	(vec_select:LSX_W
+	  (vec_concat:<VEMODE>
+	    (match_operand:LSX_W 1 "register_operand" "f")
+	    (match_operand:LSX_W 2 "register_operand" "f"))
+	  (parallel [(const_int 2) (const_int 6)
+		     (const_int 3) (const_int 7)])))]
+  "ISA_HAS_LSX"
+  "vilvh.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vilvh_d<vilvh_suffix>"
+  [(set (match_operand:LSX_D 0 "register_operand" "=f")
+	(vec_select:LSX_D
+	  (vec_concat:<VEMODE>
+	    (match_operand:LSX_D 1 "register_operand" "f")
+	    (match_operand:LSX_D 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 3)])))]
+  "ISA_HAS_LSX"
+  "vilvh.d\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vpackod_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(vec_select:V16QI
+	  (vec_concat:V32QI
+	    (match_operand:V16QI 1 "register_operand" "f")
+	    (match_operand:V16QI 2 "register_operand" "f"))
+	  (parallel [(const_int 1)  (const_int 17)
+		     (const_int 3)  (const_int 19)
+		     (const_int 5)  (const_int 21)
+		     (const_int 7)  (const_int 23)
+		     (const_int 9)  (const_int 25)
+		     (const_int 11) (const_int 27)
+		     (const_int 13) (const_int 29)
+		     (const_int 15) (const_int 31)])))]
+  "ISA_HAS_LSX"
+  "vpackod.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vpackod_h"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(vec_select:V8HI
+	  (vec_concat:V16HI
+	    (match_operand:V8HI 1 "register_operand" "f")
+	    (match_operand:V8HI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 9)
+		     (const_int 3) (const_int 11)
+		     (const_int 5) (const_int 13)
+		     (const_int 7) (const_int 15)])))]
+  "ISA_HAS_LSX"
+  "vpackod.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vpackod_w"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(vec_select:V4SI
+	  (vec_concat:V8SI
+	    (match_operand:V4SI 1 "register_operand" "f")
+	    (match_operand:V4SI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 5)
+		     (const_int 3) (const_int 7)])))]
+  "ISA_HAS_LSX"
+  "vpackod.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vpackod_w_f"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(vec_select:V4SF
+	  (vec_concat:V8SF
+	    (match_operand:V4SF 1 "register_operand" "f")
+	    (match_operand:V4SF 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 5)
+		     (const_int 3) (const_int 7)])))]
+  "ISA_HAS_LSX"
+  "vpackod.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vilvl_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(vec_select:V16QI
+	  (vec_concat:V32QI
+	    (match_operand:V16QI 1 "register_operand" "f")
+	    (match_operand:V16QI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 16)
+		     (const_int 1) (const_int 17)
+		     (const_int 2) (const_int 18)
+		     (const_int 3) (const_int 19)
+		     (const_int 4) (const_int 20)
+		     (const_int 5) (const_int 21)
+		     (const_int 6) (const_int 22)
+		     (const_int 7) (const_int 23)])))]
+  "ISA_HAS_LSX"
+  "vilvl.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vilvl_h"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(vec_select:V8HI
+	  (vec_concat:V16HI
+	    (match_operand:V8HI 1 "register_operand" "f")
+	    (match_operand:V8HI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 1) (const_int 9)
+		     (const_int 2) (const_int 10)
+		     (const_int 3) (const_int 11)])))]
+  "ISA_HAS_LSX"
+  "vilvl.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vilvl_w"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(vec_select:V4SI
+	  (vec_concat:V8SI
+	    (match_operand:V4SI 1 "register_operand" "f")
+	    (match_operand:V4SI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 1) (const_int 5)])))]
+  "ISA_HAS_LSX"
+  "vilvl.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vilvl_w_f"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(vec_select:V4SF
+	  (vec_concat:V8SF
+	    (match_operand:V4SF 1 "register_operand" "f")
+	    (match_operand:V4SF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 1) (const_int 5)])))]
+  "ISA_HAS_LSX"
+  "vilvl.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vilvl_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(vec_select:V2DI
+	  (vec_concat:V4DI
+	    (match_operand:V2DI 1 "register_operand" "f")
+	    (match_operand:V2DI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)])))]
+  "ISA_HAS_LSX"
+  "vilvl.d\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vilvl_d_f"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(vec_select:V2DF
+	  (vec_concat:V4DF
+	    (match_operand:V2DF 1 "register_operand" "f")
+	    (match_operand:V2DF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)])))]
+  "ISA_HAS_LSX"
+  "vilvl.d\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "smax<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(smax:ILSX (match_operand:ILSX 1 "register_operand" "f,f")
+		   (match_operand:ILSX 2 "reg_or_vector_same_simm5_operand" "f,Usv5")))]
+  "ISA_HAS_LSX"
+  "@
+   vmax.<lsxfmt>\t%w0,%w1,%w2
+   vmaxi.<lsxfmt>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umax<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(umax:ILSX (match_operand:ILSX 1 "register_operand" "f,f")
+		   (match_operand:ILSX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LSX"
+  "@
+   vmax.<lsxfmt_u>\t%w0,%w1,%w2
+   vmaxi.<lsxfmt_u>\t%w0,%w1,%B2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(smin:ILSX (match_operand:ILSX 1 "register_operand" "f,f")
+		   (match_operand:ILSX 2 "reg_or_vector_same_simm5_operand" "f,Usv5")))]
+  "ISA_HAS_LSX"
+  "@
+   vmin.<lsxfmt>\t%w0,%w1,%w2
+   vmini.<lsxfmt>\t%w0,%w1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umin<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(umin:ILSX (match_operand:ILSX 1 "register_operand" "f,f")
+		   (match_operand:ILSX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LSX"
+  "@
+   vmin.<lsxfmt_u>\t%w0,%w1,%w2
+   vmini.<lsxfmt_u>\t%w0,%w1,%B2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vclo_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(clz:ILSX (not:ILSX (match_operand:ILSX 1 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vclo.<lsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "clz<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(clz:ILSX (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vclz.<lsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_nor_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f,f")
+	(and:ILSX (not:ILSX (match_operand:ILSX 1 "register_operand" "f,f"))
+		  (not:ILSX (match_operand:ILSX 2 "reg_or_vector_same_val_operand" "f,Urv8"))))]
+  "ISA_HAS_LSX"
+  "@
+   vnor.v\t%w0,%w1,%w2
+   vnori.b\t%w0,%w1,%B2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vpickev_b"
+[(set (match_operand:V16QI 0 "register_operand" "=f")
+      (vec_select:V16QI
+	(vec_concat:V32QI
+	  (match_operand:V16QI 1 "register_operand" "f")
+	  (match_operand:V16QI 2 "register_operand" "f"))
+	(parallel [(const_int 0) (const_int 2)
+		   (const_int 4) (const_int 6)
+		   (const_int 8) (const_int 10)
+		   (const_int 12) (const_int 14)
+		   (const_int 16) (const_int 18)
+		   (const_int 20) (const_int 22)
+		   (const_int 24) (const_int 26)
+		   (const_int 28) (const_int 30)])))]
+  "ISA_HAS_LSX"
+  "vpickev.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vpickev_h"
+[(set (match_operand:V8HI 0 "register_operand" "=f")
+      (vec_select:V8HI
+	(vec_concat:V16HI
+	  (match_operand:V8HI 1 "register_operand" "f")
+	  (match_operand:V8HI 2 "register_operand" "f"))
+	(parallel [(const_int 0) (const_int 2)
+		   (const_int 4) (const_int 6)
+		   (const_int 8) (const_int 10)
+		   (const_int 12) (const_int 14)])))]
+  "ISA_HAS_LSX"
+  "vpickev.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vpickev_w"
+[(set (match_operand:V4SI 0 "register_operand" "=f")
+      (vec_select:V4SI
+	(vec_concat:V8SI
+	  (match_operand:V4SI 1 "register_operand" "f")
+	  (match_operand:V4SI 2 "register_operand" "f"))
+	(parallel [(const_int 0) (const_int 2)
+		   (const_int 4) (const_int 6)])))]
+  "ISA_HAS_LSX"
+  "vpickev.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vpickev_w_f"
+[(set (match_operand:V4SF 0 "register_operand" "=f")
+      (vec_select:V4SF
+	(vec_concat:V8SF
+	  (match_operand:V4SF 1 "register_operand" "f")
+	  (match_operand:V4SF 2 "register_operand" "f"))
+	(parallel [(const_int 0) (const_int 2)
+		   (const_int 4) (const_int 6)])))]
+  "ISA_HAS_LSX"
+  "vpickev.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vpickod_b"
+[(set (match_operand:V16QI 0 "register_operand" "=f")
+      (vec_select:V16QI
+	(vec_concat:V32QI
+	  (match_operand:V16QI 1 "register_operand" "f")
+	  (match_operand:V16QI 2 "register_operand" "f"))
+	(parallel [(const_int 1) (const_int 3)
+		   (const_int 5) (const_int 7)
+		   (const_int 9) (const_int 11)
+		   (const_int 13) (const_int 15)
+		   (const_int 17) (const_int 19)
+		   (const_int 21) (const_int 23)
+		   (const_int 25) (const_int 27)
+		   (const_int 29) (const_int 31)])))]
+  "ISA_HAS_LSX"
+  "vpickod.b\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vpickod_h"
+[(set (match_operand:V8HI 0 "register_operand" "=f")
+      (vec_select:V8HI
+	(vec_concat:V16HI
+	  (match_operand:V8HI 1 "register_operand" "f")
+	  (match_operand:V8HI 2 "register_operand" "f"))
+	(parallel [(const_int 1) (const_int 3)
+		   (const_int 5) (const_int 7)
+		   (const_int 9) (const_int 11)
+		   (const_int 13) (const_int 15)])))]
+  "ISA_HAS_LSX"
+  "vpickod.h\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vpickod_w"
+[(set (match_operand:V4SI 0 "register_operand" "=f")
+      (vec_select:V4SI
+	(vec_concat:V8SI
+	  (match_operand:V4SI 1 "register_operand" "f")
+	  (match_operand:V4SI 2 "register_operand" "f"))
+	(parallel [(const_int 1) (const_int 3)
+		   (const_int 5) (const_int 7)])))]
+  "ISA_HAS_LSX"
+  "vpickod.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vpickod_w_f"
+[(set (match_operand:V4SF 0 "register_operand" "=f")
+      (vec_select:V4SF
+	(vec_concat:V8SF
+	  (match_operand:V4SF 1 "register_operand" "f")
+	  (match_operand:V4SF 2 "register_operand" "f"))
+	(parallel [(const_int 1) (const_int 3)
+		   (const_int 5) (const_int 7)])))]
+  "ISA_HAS_LSX"
+  "vpickod.w\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "popcount<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(popcount:ILSX (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vpcnt.<lsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_pcnt")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsat_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VSAT_S))]
+  "ISA_HAS_LSX"
+  "vsat.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_sat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsat_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VSAT_U))]
+  "ISA_HAS_LSX"
+  "vsat.<lsxfmt_u>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_sat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vshuf4i_<lsxfmt_f>"
+  [(set (match_operand:LSX_WHB_W 0 "register_operand" "=f")
+	(vec_select:LSX_WHB_W
+	  (match_operand:LSX_WHB_W 1 "register_operand" "f")
+	  (match_operand 2 "par_const_vector_shf_set_operand" "")))]
+  "ISA_HAS_LSX"
+{
+  HOST_WIDE_INT val = 0;
+  unsigned int i;
+
+  /* We convert the selection to an immediate.  */
+  for (i = 0; i < 4; i++)
+    val |= INTVAL (XVECEXP (operands[2], 0, i)) << (2 * i);
+
+  operands[2] = GEN_INT (val);
+  return "vshuf4i.<lsxfmt>\t%w0,%w1,%X2";
+}
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrar_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSRAR))]
+  "ISA_HAS_LSX"
+  "vsrar.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrari_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VSRARI))]
+  "ISA_HAS_LSX"
+  "vsrari.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrlr_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSRLR))]
+  "ISA_HAS_LSX"
+  "vsrlr.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrlri_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")]
+		     UNSPEC_LSX_VSRLRI))]
+  "ISA_HAS_LSX"
+  "vsrlri.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssub_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(ss_minus:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vssub.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssub_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(us_minus:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vssub.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vreplve_<lsxfmt_f>"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(vec_duplicate:LSX
+	  (vec_select:<UNITMODE>
+	    (match_operand:LSX 1 "register_operand" "f")
+	    (parallel [(match_operand:SI 2 "register_operand" "r")]))))]
+  "ISA_HAS_LSX"
+  "vreplve.<lsxfmt>\t%w0,%w1,%z2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vreplvei_<lsxfmt_f>"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(vec_duplicate:LSX
+	  (vec_select:<UNITMODE>
+	    (match_operand:LSX 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_<indeximm>_operand" "")]))))]
+  "ISA_HAS_LSX"
+  "vreplvei.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vreplvei_<lsxfmt_f>_scalar"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+      (vec_duplicate:LSX
+	(match_operand:<UNITMODE> 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vreplvei.<lsxfmt>\t%w0,%w1,0"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfcvt_h_s"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(unspec:V8HI [(match_operand:V4SF 1 "register_operand" "f")
+		      (match_operand:V4SF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFCVT))]
+  "ISA_HAS_LSX"
+  "vfcvt.h.s\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vfcvt_s_d"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFCVT))]
+  "ISA_HAS_LSX"
+  "vfcvt.s.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "vec_pack_trunc_v2df"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(vec_concat:V4SF
+	  (float_truncate:V2SF (match_operand:V2DF 1 "register_operand" "f"))
+	  (float_truncate:V2SF (match_operand:V2DF 2 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vfcvt.s.d\t%w0,%w2,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfcvth_s_h"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V8HI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFCVTH))]
+  "ISA_HAS_LSX"
+  "vfcvth.s.h\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfcvth_d_s"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(float_extend:V2DF
+	(vec_select:V2SF
+	  (match_operand:V4SF 1 "register_operand" "f")
+	  (parallel [(const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LSX"
+  "vfcvth.d.s\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vfcvtl_s_h"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V8HI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFCVTL))]
+  "ISA_HAS_LSX"
+  "vfcvtl.s.h\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfcvtl_d_s"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(float_extend:V2DF
+	(vec_select:V2SF
+	  (match_operand:V4SF 1 "register_operand" "f")
+	  (parallel [(const_int 0) (const_int 1)]))))]
+  "ISA_HAS_LSX"
+  "vfcvtl.d.s\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DF")])
+
+(define_code_attr lsxbr
+  [(eq "bz")
+   (ne "bnz")])
+
+(define_code_attr lsxeq_v
+  [(eq "eqz")
+   (ne "nez")])
+
+(define_code_attr lsxne_v
+  [(eq "nez")
+   (ne "eqz")])
+
+(define_code_attr lsxeq
+  [(eq "anyeqz")
+   (ne "allnez")])
+
+(define_code_attr lsxne
+  [(eq "allnez")
+   (ne "anyeqz")])
+
+(define_insn "lsx_<lsxbr>_<lsxfmt_f>"
+ [(set (pc) (if_then_else
+	      (equality_op
+		(unspec:SI [(match_operand:LSX 1 "register_operand" "f")]
+			    UNSPEC_LSX_BRANCH)
+		  (match_operand:SI 2 "const_0_operand"))
+		  (label_ref (match_operand 0))
+		  (pc)))
+      (clobber (match_scratch:FCC 3 "=z"))]
+ "ISA_HAS_LSX"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					 "vset<lsxeq>.<lsxfmt>\t%Z3%w1\n\tbcnez\t%Z3%0",
+					 "vset<lsxne>.<lsxfmt>\t%Z3%w1\n\tbcnez\t%Z3%0");
+}
+ [(set_attr "type" "simd_branch")
+  (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_<lsxbr>_v_<lsxfmt_f>"
+ [(set (pc) (if_then_else
+	      (equality_op
+		(unspec:SI [(match_operand:LSX 1 "register_operand" "f")]
+			    UNSPEC_LSX_BRANCH_V)
+		  (match_operand:SI 2 "const_0_operand"))
+		  (label_ref (match_operand 0))
+		  (pc)))
+      (clobber (match_scratch:FCC 3 "=z"))]
+ "ISA_HAS_LSX"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					 "vset<lsxeq_v>.v\t%Z3%w1\n\tbcnez\t%Z3%0",
+					 "vset<lsxne_v>.v\t%Z3%w1\n\tbcnez\t%Z3%0");
+}
+ [(set_attr "type" "simd_branch")
+  (set_attr "mode" "TI")])
+
+;; vec_concate
+(define_expand "vec_concatv2di"
+  [(set (match_operand:V2DI 0 "register_operand")
+	(vec_concat:V2DI
+	  (match_operand:DI 1 "register_operand")
+	  (match_operand:DI 2 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  emit_insn (gen_lsx_vinsgr2vr_d (operands[0], operands[1],
+				  operands[0], GEN_INT (0)));
+  emit_insn (gen_lsx_vinsgr2vr_d (operands[0], operands[2],
+				  operands[0], GEN_INT (1)));
+  DONE;
+})
+
+
+(define_insn "vandn<mode>3"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(and:LSX (not:LSX (match_operand:LSX 1 "register_operand" "f"))
+		 (match_operand:LSX 2 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vandn.v\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vabs<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(abs:ILSX (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vsigncov.<lsxfmt>\t%w0,%w1,%w1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vneg<mode>2"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(neg:ILSX (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vneg.<lsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vmuh_s_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMUH_S))]
+  "ISA_HAS_LSX"
+  "vmuh.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vmuh_u_<lsxfmt_u>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMUH_U))]
+  "ISA_HAS_LSX"
+  "vmuh.<lsxfmt_u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vextw_s_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTW_S))]
+  "ISA_HAS_LSX"
+  "vextw_s.d\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vextw_u_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTW_U))]
+  "ISA_HAS_LSX"
+  "vextw_u.d\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vsllwil_s_<dlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VDMODE> 0 "register_operand" "=f")
+	(unspec:<VDMODE> [(match_operand:ILSX_WHB 1 "register_operand" "f")
+			  (match_operand 2 "const_<bitimm>_operand" "")]
+			 UNSPEC_LSX_VSLLWIL_S))]
+  "ISA_HAS_LSX"
+  "vsllwil.<dlsxfmt>.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsllwil_u_<dlsxfmt_u>_<lsxfmt_u>"
+  [(set (match_operand:<VDMODE> 0 "register_operand" "=f")
+	(unspec:<VDMODE> [(match_operand:ILSX_WHB 1 "register_operand" "f")
+			  (match_operand 2 "const_<bitimm>_operand" "")]
+			 UNSPEC_LSX_VSLLWIL_U))]
+  "ISA_HAS_LSX"
+  "vsllwil.<dlsxfmt_u>.<lsxfmt_u>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsran_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSRAN))]
+  "ISA_HAS_LSX"
+  "vsran.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssran_s_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRAN_S))]
+  "ISA_HAS_LSX"
+  "vssran.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssran_u_<hlsxfmt_u>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRAN_U))]
+  "ISA_HAS_LSX"
+  "vssran.<hlsxfmt_u>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrain_<hlsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand 2 "const_<bitimm>_operand" "")]
+			 UNSPEC_LSX_VSRAIN))]
+  "ISA_HAS_LSX"
+  "vsrain.<hlsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; FIXME: bitimm
+(define_insn "lsx_vsrains_s_<hlsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand 2 "const_<bitimm>_operand" "")]
+			 UNSPEC_LSX_VSRAINS_S))]
+  "ISA_HAS_LSX"
+  "vsrains_s.<hlsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; FIXME: bitimm
+(define_insn "lsx_vsrains_u_<hlsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand 2 "const_<bitimm>_operand" "")]
+			 UNSPEC_LSX_VSRAINS_U))]
+  "ISA_HAS_LSX"
+  "vsrains_u.<hlsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrarn_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSRARN))]
+  "ISA_HAS_LSX"
+  "vsrarn.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrarn_s_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRARN_S))]
+  "ISA_HAS_LSX"
+  "vssrarn.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrarn_u_<hlsxfmt_u>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRARN_U))]
+  "ISA_HAS_LSX"
+  "vssrarn.<hlsxfmt_u>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrln_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSRLN))]
+  "ISA_HAS_LSX"
+  "vsrln.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrln_u_<hlsxfmt_u>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRLN_U))]
+  "ISA_HAS_LSX"
+  "vssrln.<hlsxfmt_u>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrlrn_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSRLRN))]
+  "ISA_HAS_LSX"
+  "vsrlrn.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrlrn_u_<hlsxfmt_u>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRLRN_U))]
+  "ISA_HAS_LSX"
+  "vssrlrn.<hlsxfmt_u>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfrstpi_<lsxfmt>"
+  [(set (match_operand:ILSX_HB 0 "register_operand" "=f")
+	(unspec:ILSX_HB [(match_operand:ILSX_HB 1 "register_operand" "0")
+			 (match_operand:ILSX_HB 2 "register_operand" "f")
+			 (match_operand 3 "const_uimm5_operand" "")]
+			UNSPEC_LSX_VFRSTPI))]
+  "ISA_HAS_LSX"
+  "vfrstpi.<lsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vfrstp_<lsxfmt>"
+  [(set (match_operand:ILSX_HB 0 "register_operand" "=f")
+	(unspec:ILSX_HB [(match_operand:ILSX_HB 1 "register_operand" "0")
+			 (match_operand:ILSX_HB 2 "register_operand" "f")
+			 (match_operand:ILSX_HB 3 "register_operand" "f")]
+			UNSPEC_LSX_VFRSTP))]
+  "ISA_HAS_LSX"
+  "vfrstp.<lsxfmt>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vshuf4i_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand")]
+		     UNSPEC_LSX_VSHUF4I))]
+  "ISA_HAS_LSX"
+  "vshuf4i.d\t%w0,%w2,%3"
+  [(set_attr "type" "simd_sld")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vbsrl_<lsxfmt_f>"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(unspec:LSX [(match_operand:LSX 1 "register_operand" "f")
+		     (match_operand 2 "const_uimm5_operand" "")]
+		    UNSPEC_LSX_VBSRL_V))]
+  "ISA_HAS_LSX"
+  "vbsrl.v\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vbsll_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_uimm5_operand" "")]
+		     UNSPEC_LSX_VBSLL_V))]
+  "ISA_HAS_LSX"
+  "vbsll.v\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vextrins_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VEXTRINS))]
+  "ISA_HAS_LSX"
+  "vextrins.<lsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vmskltz_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")]
+		     UNSPEC_LSX_VMSKLTZ))]
+  "ISA_HAS_LSX"
+  "vmskltz.<lsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsigncov_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSIGNCOV))]
+  "ISA_HAS_LSX"
+  "vsigncov.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "copysign<mode>3"
+  [(set (match_dup 4)
+	(and:FLSX
+	  (not:FLSX (match_dup 3))
+	  (match_operand:FLSX 1 "register_operand")))
+   (set (match_dup 5)
+	(and:FLSX (match_dup 3)
+		  (match_operand:FLSX 2 "register_operand")))
+   (set (match_operand:FLSX 0 "register_operand")
+	(ior:FLSX (match_dup 4) (match_dup 5)))]
+  "ISA_HAS_LSX"
+{
+  operands[3] = loongarch_build_signbit_mask (<MODE>mode, 1, 0);
+
+  operands[4] = gen_reg_rtx (<MODE>mode);
+  operands[5] = gen_reg_rtx (<MODE>mode);
+})
+
+(define_insn "absv2df2"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(abs:V2DF (match_operand:V2DF 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vbitclri.d\t%w0,%w1,63"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "absv4sf2"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(abs:V4SF (match_operand:V4SF 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vbitclri.w\t%w0,%w1,31"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "vfmadd<mode>4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(fma:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		  (match_operand:FLSX 2 "register_operand" "f")
+		  (match_operand:FLSX 3 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vfmadd.<flsxfmt>\t%w0,%w1,$w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fms<mode>4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(fma:FLSX (match_operand:FLSX 1 "register_operand" "f")
+		  (match_operand:FLSX 2 "register_operand" "f")
+		  (neg:FLSX (match_operand:FLSX 3 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vfmsub.<flsxfmt>\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vfnmsub<mode>4_nmsub4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(neg:FLSX
+	  (fma:FLSX
+	    (match_operand:FLSX 1 "register_operand" "f")
+	    (match_operand:FLSX 2 "register_operand" "f")
+	    (neg:FLSX (match_operand:FLSX 3 "register_operand" "f")))))]
+  "ISA_HAS_LSX"
+  "vfnmsub.<flsxfmt>\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "vfnmadd<mode>4_nmadd4"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(neg:FLSX
+	  (fma:FLSX
+	    (match_operand:FLSX 1 "register_operand" "f")
+	    (match_operand:FLSX 2 "register_operand" "f")
+	    (match_operand:FLSX 3 "register_operand" "f"))))]
+  "ISA_HAS_LSX"
+  "vfnmadd.<flsxfmt>\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vftintrne_w_s"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRNE))]
+  "ISA_HAS_LSX"
+  "vftintrne.w.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrne_l_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRNE))]
+  "ISA_HAS_LSX"
+  "vftintrne.l.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftintrp_w_s"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRP))]
+  "ISA_HAS_LSX"
+  "vftintrp.w.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrp_l_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRP))]
+  "ISA_HAS_LSX"
+  "vftintrp.l.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftintrm_w_s"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRM))]
+  "ISA_HAS_LSX"
+  "vftintrm.w.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrm_l_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRM))]
+  "ISA_HAS_LSX"
+  "vftintrm.l.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftint_w_d"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINT_W_D))]
+  "ISA_HAS_LSX"
+  "vftint.w.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vffint_s_l"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFFINT_S_L))]
+  "ISA_HAS_LSX"
+  "vffint.s.l\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vftintrz_w_d"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRZ_W_D))]
+  "ISA_HAS_LSX"
+  "vftintrz.w.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftintrp_w_d"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRP_W_D))]
+  "ISA_HAS_LSX"
+  "vftintrp.w.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftintrm_w_d"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRM_W_D))]
+  "ISA_HAS_LSX"
+  "vftintrm.w.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftintrne_w_d"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V2DF 1 "register_operand" "f")
+		      (match_operand:V2DF 2 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRNE_W_D))]
+  "ISA_HAS_LSX"
+  "vftintrne.w.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vftinth_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTH_L_H))]
+  "ISA_HAS_LSX"
+  "vftinth.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintl_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTL_L_S))]
+  "ISA_HAS_LSX"
+  "vftintl.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vffinth_d_w"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V4SI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFFINTH_D_W))]
+  "ISA_HAS_LSX"
+  "vffinth.d.w\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vffintl_d_w"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V4SI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFFINTL_D_W))]
+  "ISA_HAS_LSX"
+  "vffintl.d.w\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vftintrzh_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRZH_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrzh.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrzl_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRZL_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrzl.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrph_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRPH_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrph.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrpl_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRPL_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrpl.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrmh_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRMH_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrmh.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrml_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRML_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrml.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrneh_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRNEH_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrneh.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vftintrnel_l_s"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFTINTRNEL_L_S))]
+  "ISA_HAS_LSX"
+  "vftintrnel.l.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfrintrne_s"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRNE_S))]
+  "ISA_HAS_LSX"
+  "vfrintrne.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfrintrne_d"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRNE_D))]
+  "ISA_HAS_LSX"
+  "vfrintrne.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vfrintrz_s"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRZ_S))]
+  "ISA_HAS_LSX"
+  "vfrintrz.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfrintrz_d"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRZ_D))]
+  "ISA_HAS_LSX"
+  "vfrintrz.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vfrintrp_s"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRP_S))]
+  "ISA_HAS_LSX"
+  "vfrintrp.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfrintrp_d"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRP_D))]
+  "ISA_HAS_LSX"
+  "vfrintrp.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+(define_insn "lsx_vfrintrm_s"
+  [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRM_S))]
+  "ISA_HAS_LSX"
+  "vfrintrm.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lsx_vfrintrm_d"
+  [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")]
+		     UNSPEC_LSX_VFRINTRM_D))]
+  "ISA_HAS_LSX"
+  "vfrintrm.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+;; Vector versions of the floating-point frint patterns.
+;; Expands to btrunc, ceil, floor, rint.
+(define_insn "<FRINT_S:frint_pattern_s>v4sf2"
+ [(set (match_operand:V4SF 0 "register_operand" "=f")
+	(unspec:V4SF [(match_operand:V4SF 1 "register_operand" "f")]
+			 FRINT_S))]
+  "ISA_HAS_LSX"
+  "vfrint<FRINT_S:frint_suffix>.s\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "<FRINT_D:frint_pattern_d>v2df2"
+ [(set (match_operand:V2DF 0 "register_operand" "=f")
+	(unspec:V2DF [(match_operand:V2DF 1 "register_operand" "f")]
+			 FRINT_D))]
+  "ISA_HAS_LSX"
+  "vfrint<FRINT_D:frint_suffix>.d\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V2DF")])
+
+;; Expands to round.
+(define_insn "round<mode>2"
+ [(set (match_operand:FLSX 0 "register_operand" "=f")
+	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+			 UNSPEC_LSX_VFRINT))]
+  "ISA_HAS_LSX"
+  "vfrint.<flsxfrint>\t%w0,%w1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; Offset load and broadcast
+(define_expand "lsx_vldrepl_<lsxfmt_f>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq12<lsxfmt>_operand")]
+  "ISA_HAS_LSX"
+{
+  emit_insn (gen_lsx_vldrepl_<lsxfmt_f>_insn
+	     (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
+(define_insn "lsx_vldrepl_<lsxfmt_f>_insn"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+	(vec_duplicate:LSX
+	  (mem:<UNITMODE> (plus:DI (match_operand:DI 1 "register_operand" "r")
+				   (match_operand 2 "aq12<lsxfmt>_operand")))))]
+  "ISA_HAS_LSX"
+{
+    return "vldrepl.<lsxfmt>\t%w0,%1,%2";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+(define_insn "lsx_vldrepl_<lsxfmt_f>_insn_0"
+  [(set (match_operand:LSX 0 "register_operand" "=f")
+    (vec_duplicate:LSX
+      (mem:<UNITMODE> (match_operand:DI 1 "register_operand" "r"))))]
+  "ISA_HAS_LSX"
+{
+    return "vldrepl.<lsxfmt>\t%w0,%1,0";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+;; Offset store by sel
+(define_expand "lsx_vstelm_<lsxfmt_f>"
+  [(match_operand:LSX 0 "register_operand")
+   (match_operand 3 "const_<indeximm>_operand")
+   (match_operand 2 "aq8<lsxfmt>_operand")
+   (match_operand 1 "pmode_register_operand")]
+  "ISA_HAS_LSX"
+{
+  emit_insn (gen_lsx_vstelm_<lsxfmt_f>_insn
+	     (operands[1], operands[2], operands[0], operands[3]));
+  DONE;
+})
+
+(define_insn "lsx_vstelm_<lsxfmt_f>_insn"
+  [(set (mem:<UNITMODE> (plus:DI (match_operand:DI 0 "register_operand" "r")
+				 (match_operand 1 "aq8<lsxfmt>_operand")))
+	(vec_select:<UNITMODE>
+	  (match_operand:LSX 2 "register_operand" "f")
+	  (parallel [(match_operand 3 "const_<indeximm>_operand" "")])))]
+
+  "ISA_HAS_LSX"
+{
+  return "vstelm.<lsxfmt>\t%w2,%0,%1,%3";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+;; Offset is "0"
+(define_insn "lsx_vstelm_<lsxfmt_f>_insn_0"
+  [(set (mem:<UNITMODE> (match_operand:DI 0 "register_operand" "r"))
+    (vec_select:<UNITMODE>
+      (match_operand:LSX 1 "register_operand" "f")
+      (parallel [(match_operand:SI 2 "const_<indeximm>_operand")])))]
+  "ISA_HAS_LSX"
+{
+    return "vstelm.<lsxfmt>\t%w1,%0,0,%2";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+(define_expand "lsx_vld"
+  [(match_operand:V16QI 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq12b_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (operands[0], gen_rtx_MEM (V16QImode, addr));
+  DONE;
+})
+
+(define_expand "lsx_vst"
+  [(match_operand:V16QI 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq12b_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (gen_rtx_MEM (V16QImode, addr), operands[0]);
+  DONE;
+})
+
+(define_insn "lsx_vssrln_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRLN))]
+  "ISA_HAS_LSX"
+  "vssrln.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "lsx_vssrlrn_<hlsxfmt>_<lsxfmt>"
+  [(set (match_operand:<VHMODE> 0 "register_operand" "=f")
+	(unspec:<VHMODE> [(match_operand:ILSX_DWH 1 "register_operand" "f")
+			  (match_operand:ILSX_DWH 2 "register_operand" "f")]
+			 UNSPEC_LSX_VSSRLRN))]
+  "ISA_HAS_LSX"
+  "vssrlrn.<hlsxfmt>.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vorn<mode>3"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(ior:ILSX (not:ILSX (match_operand:ILSX 2 "register_operand" "f"))
+		  (match_operand:ILSX 1 "register_operand" "f")))]
+  "ISA_HAS_LSX"
+  "vorn.v\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vldi"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand 1 "const_imm13_operand")]
+		    UNSPEC_LSX_VLDI))]
+  "ISA_HAS_LSX"
+{
+  HOST_WIDE_INT val = INTVAL (operands[1]);
+  if (val < 0)
+  {
+    HOST_WIDE_INT modeVal = (val & 0xf00) >> 8;
+    if (modeVal < 13)
+      return  "vldi\t%w0,%1";
+    else
+      sorry ("imm13 only support 0000 ~ 1100 in bits 9 ~ 12 when bit '13' is 1");
+    return "#";
+  }
+  else
+    return "vldi\t%w0,%1";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vshuf_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f")
+		       (match_operand:V16QI 2 "register_operand" "f")
+		       (match_operand:V16QI 3 "register_operand" "f")]
+		      UNSPEC_LSX_VSHUF_B))]
+  "ISA_HAS_LSX"
+  "vshuf.b\t%w0,%w1,%w2,%w3"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vldx"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(unspec:V16QI [(match_operand:DI 1 "register_operand" "r")
+		       (match_operand:DI 2 "reg_or_0_operand" "rJ")]
+		      UNSPEC_LSX_VLDX))]
+  "ISA_HAS_LSX"
+{
+  return "vldx\t%w0,%1,%z2";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vstx"
+  [(set (mem:V16QI (plus:DI (match_operand:DI 1 "register_operand" "r")
+			    (match_operand:DI 2 "reg_or_0_operand" "rJ")))
+	(unspec: V16QI [(match_operand:V16QI 0 "register_operand" "f")]
+		      UNSPEC_LSX_VSTX))]
+
+  "ISA_HAS_LSX"
+{
+  return "vstx\t%w0,%1,%z2";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "DI")])
+
+(define_insn "lsx_vextl_qu_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTL_QU_DU))]
+  "ISA_HAS_LSX"
+  "vextl.qu.du\t%w0,%w1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vseteqz_v"
+  [(set (match_operand:FCC 0 "register_operand" "=z")
+	(eq:FCC
+	  (unspec:SI [(match_operand:V16QI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VSETEQZ_V)
+	  (match_operand:SI 2 "const_0_operand")))]
+  "ISA_HAS_LSX"
+{
+  return "vseteqz.v\t%0,%1";
+}
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "FCC")])
+
+;; Vector reduction operation
+(define_expand "reduc_plus_scal_v2di"
+  [(match_operand:DI 0 "register_operand")
+   (match_operand:V2DI 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (V2DImode);
+  emit_insn (gen_lsx_vhaddw_q_d (tmp, operands[1], operands[1]));
+  emit_insn (gen_vec_extractv2didi (operands[0], tmp, const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_plus_scal_v4si"
+  [(match_operand:SI 0 "register_operand")
+   (match_operand:V4SI 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (V2DImode);
+  rtx tmp1 = gen_reg_rtx (V2DImode);
+  emit_insn (gen_lsx_vhaddw_d_w (tmp, operands[1], operands[1]));
+  emit_insn (gen_lsx_vhaddw_q_d (tmp1, tmp, tmp));
+  emit_insn (gen_vec_extractv4sisi (operands[0], gen_lowpart (V4SImode,tmp1),
+				    const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_plus_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:FLSX 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_add<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_<optab>_scal_<mode>"
+  [(any_bitwise:<UNITMODE>
+      (match_operand:<UNITMODE> 0 "register_operand")
+      (match_operand:ILSX 1 "register_operand"))]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_<optab><mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_smax_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:LSX 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_smax<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_smin_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:LSX 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_smin<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_umax_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:ILSX 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_umax<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_umin_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:ILSX 1 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_umin<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_insn "lsx_v<optab>wev_d_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(addsubmul:V2DI
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)])))
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.d.w<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_v<optab>wev_w_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(addsubmul:V4SI
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.w.h<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_v<optab>wev_h_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(addsubmul:V8HI
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.h.b<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_v<optab>wod_d_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(addsubmul:V2DI
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)])))
+	  (any_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.d.w<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_v<optab>wod_w_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(addsubmul:V4SI
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (any_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.w.h<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_v<optab>wod_h_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(addsubmul:V8HI
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (any_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.h.b<u>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_v<optab>wev_d_wu_w"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(addmul:V2DI
+	  (zero_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)])))
+	  (sign_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.d.wu.w\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_v<optab>wev_w_hu_h"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(addmul:V4SI
+	  (zero_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))
+	  (sign_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.w.hu.h\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_v<optab>wev_h_bu_b"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(addmul:V8HI
+	  (zero_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))
+	  (sign_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wev.h.bu.b\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_v<optab>wod_d_wu_w"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(addmul:V2DI
+	  (zero_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)])))
+	  (sign_extend:V2DI
+	    (vec_select:V2SI
+	      (match_operand:V4SI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.d.wu.w\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_v<optab>wod_w_hu_h"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(addmul:V4SI
+	  (zero_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (sign_extend:V4SI
+	    (vec_select:V4HI
+	      (match_operand:V8HI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.w.hu.h\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_v<optab>wod_h_bu_b"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(addmul:V8HI
+	  (zero_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (sign_extend:V8HI
+	    (vec_select:V8QI
+	      (match_operand:V16QI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))))]
+  "ISA_HAS_LSX"
+  "v<optab>wod.h.bu.b\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vaddwev_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWEV))]
+  "ISA_HAS_LSX"
+  "vaddwev.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vaddwev_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWEV2))]
+  "ISA_HAS_LSX"
+  "vaddwev.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vaddwod_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWOD))]
+  "ISA_HAS_LSX"
+  "vaddwod.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vaddwod_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWOD2))]
+  "ISA_HAS_LSX"
+  "vaddwod.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsubwev_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSUBWEV))]
+  "ISA_HAS_LSX"
+  "vsubwev.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsubwev_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSUBWEV2))]
+  "ISA_HAS_LSX"
+  "vsubwev.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsubwod_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSUBWOD))]
+  "ISA_HAS_LSX"
+  "vsubwod.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsubwod_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSUBWOD2))]
+  "ISA_HAS_LSX"
+  "vsubwod.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vaddwev_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWEV3))]
+  "ISA_HAS_LSX"
+  "vaddwev.q.du.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vaddwod_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADDWOD3))]
+  "ISA_HAS_LSX"
+  "vaddwod.q.du.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwev_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWEV3))]
+  "ISA_HAS_LSX"
+  "vmulwev.q.du.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwod_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWOD3))]
+  "ISA_HAS_LSX"
+  "vmulwod.q.du.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwev_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWEV))]
+  "ISA_HAS_LSX"
+  "vmulwev.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwev_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWEV2))]
+  "ISA_HAS_LSX"
+  "vmulwev.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwod_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWOD))]
+  "ISA_HAS_LSX"
+  "vmulwod.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmulwod_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VMULWOD2))]
+  "ISA_HAS_LSX"
+  "vmulwod.q.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vhaddw_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VHADDW_Q_D))]
+  "ISA_HAS_LSX"
+  "vhaddw.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vhaddw_qu_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VHADDW_QU_DU))]
+  "ISA_HAS_LSX"
+  "vhaddw.qu.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vhsubw_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VHSUBW_Q_D))]
+  "ISA_HAS_LSX"
+  "vhsubw.q.d\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vhsubw_qu_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VHSUBW_QU_DU))]
+  "ISA_HAS_LSX"
+  "vhsubw.qu.du\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwev_d_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(plus:V2DI
+	  (match_operand:V2DI 1 "register_operand" "0")
+	  (mult:V2DI
+	    (any_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)])))
+	    (any_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.d.w<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwev_w_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(plus:V4SI
+	  (match_operand:V4SI 1 "register_operand" "0")
+	  (mult:V4SI
+	    (any_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)])))
+	    (any_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.w.h<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vmaddwev_h_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(plus:V8HI
+	  (match_operand:V8HI 1 "register_operand" "0")
+	  (mult:V8HI
+	    (any_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)])))
+	    (any_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.h.b<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vmaddwod_d_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(plus:V2DI
+	  (match_operand:V2DI 1 "register_operand" "0")
+	  (mult:V2DI
+	    (any_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)])))
+	    (any_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.d.w<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwod_w_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(plus:V4SI
+	  (match_operand:V4SI 1 "register_operand" "0")
+	  (mult:V4SI
+	    (any_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)])))
+	    (any_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.w.h<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vmaddwod_h_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(plus:V8HI
+	  (match_operand:V8HI 1 "register_operand" "0")
+	  (mult:V8HI
+	    (any_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)])))
+	    (any_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.h.b<u>\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vmaddwev_d_wu_w"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(plus:V2DI
+	  (match_operand:V2DI 1 "register_operand" "0")
+	  (mult:V2DI
+	    (zero_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)])))
+	    (sign_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.d.wu.w\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwev_w_hu_h"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(plus:V4SI
+	  (match_operand:V4SI 1 "register_operand" "0")
+	  (mult:V4SI
+	    (zero_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)])))
+	    (sign_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.w.hu.h\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vmaddwev_h_bu_b"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(plus:V8HI
+	  (match_operand:V8HI 1 "register_operand" "0")
+	  (mult:V8HI
+	    (zero_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)])))
+	    (sign_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwev.h.bu.b\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vmaddwod_d_wu_w"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(plus:V2DI
+	  (match_operand:V2DI 1 "register_operand" "0")
+	  (mult:V2DI
+	    (zero_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)])))
+	    (sign_extend:V2DI
+	      (vec_select:V2SI
+		(match_operand:V4SI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.d.wu.w\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwod_w_hu_h"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(plus:V4SI
+	  (match_operand:V4SI 1 "register_operand" "0")
+	  (mult:V4SI
+	    (zero_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)])))
+	    (sign_extend:V4SI
+	      (vec_select:V4HI
+		(match_operand:V8HI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.w.hu.h\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vmaddwod_h_bu_b"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(plus:V8HI
+	  (match_operand:V8HI 1 "register_operand" "0")
+	  (mult:V8HI
+	    (zero_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)])))
+	    (sign_extend:V8HI
+	      (vec_select:V8QI
+		(match_operand:V16QI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)]))))))]
+  "ISA_HAS_LSX"
+  "vmaddwod.h.bu.b\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vmaddwev_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWEV))]
+  "ISA_HAS_LSX"
+  "vmaddwev.q.d\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwod_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWOD))]
+  "ISA_HAS_LSX"
+  "vmaddwod.q.d\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwev_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWEV2))]
+  "ISA_HAS_LSX"
+  "vmaddwev.q.du\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwod_q_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWOD2))]
+  "ISA_HAS_LSX"
+  "vmaddwod.q.du\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwev_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWEV3))]
+  "ISA_HAS_LSX"
+  "vmaddwev.q.du.d\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmaddwod_q_du_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "0")
+		      (match_operand:V2DI 2 "register_operand" "f")
+		      (match_operand:V2DI 3 "register_operand" "f")]
+		     UNSPEC_LSX_VMADDWOD3))]
+  "ISA_HAS_LSX"
+  "vmaddwod.q.du.d\t%w0,%w2,%w3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vrotr_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand:ILSX 2 "register_operand" "f")]
+		     UNSPEC_LSX_VROTR))]
+  "ISA_HAS_LSX"
+  "vrotr.<lsxfmt>\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vadd_q"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VADD_Q))]
+  "ISA_HAS_LSX"
+  "vadd.q\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsub_q"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")
+		      (match_operand:V2DI 2 "register_operand" "f")]
+		     UNSPEC_LSX_VSUB_Q))]
+  "ISA_HAS_LSX"
+  "vsub.q\t%w0,%w1,%w2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vmskgez_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f")]
+		      UNSPEC_LSX_VMSKGEZ))]
+  "ISA_HAS_LSX"
+  "vmskgez.b\t%w0,%w1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vmsknz_b"
+  [(set (match_operand:V16QI 0 "register_operand" "=f")
+	(unspec:V16QI [(match_operand:V16QI 1 "register_operand" "f")]
+		      UNSPEC_LSX_VMSKNZ))]
+  "ISA_HAS_LSX"
+  "vmsknz.b\t%w0,%w1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V16QI")])
+
+(define_insn "lsx_vexth_h<u>_b<u>"
+  [(set (match_operand:V8HI 0 "register_operand" "=f")
+	(any_extend:V8HI
+	  (vec_select:V8QI
+	    (match_operand:V16QI 1 "register_operand" "f")
+	    (parallel [(const_int 8) (const_int 9)
+		       (const_int 10) (const_int 11)
+		       (const_int 12) (const_int 13)
+		       (const_int 14) (const_int 15)]))))]
+  "ISA_HAS_LSX"
+  "vexth.h<u>.b<u>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8HI")])
+
+(define_insn "lsx_vexth_w<u>_h<u>"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(any_extend:V4SI
+	  (vec_select:V4HI
+	    (match_operand:V8HI 1 "register_operand" "f")
+	    (parallel [(const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LSX"
+  "vexth.w<u>.h<u>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4SI")])
+
+(define_insn "lsx_vexth_d<u>_w<u>"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(any_extend:V2DI
+	  (vec_select:V2SI
+	    (match_operand:V4SI 1 "register_operand" "f")
+	    (parallel [(const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LSX"
+  "vexth.d<u>.w<u>\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vexth_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTH_Q_D))]
+  "ISA_HAS_LSX"
+  "vexth.q.d\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vexth_qu_du"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTH_QU_DU))]
+  "ISA_HAS_LSX"
+  "vexth.qu.du\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vrotri_<lsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(rotatert:ILSX (match_operand:ILSX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm>_operand" "")))]
+  "ISA_HAS_LSX"
+  "vrotri.<lsxfmt>\t%w0,%w1,%2"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vextl_q_d"
+  [(set (match_operand:V2DI 0 "register_operand" "=f")
+	(unspec:V2DI [(match_operand:V2DI 1 "register_operand" "f")]
+		     UNSPEC_LSX_VEXTL_Q_D))]
+  "ISA_HAS_LSX"
+  "vextl.q.d\t%w0,%w1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V2DI")])
+
+(define_insn "lsx_vsrlni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSRLNI))]
+  "ISA_HAS_LSX"
+  "vsrlni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrlrni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSRLRNI))]
+  "ISA_HAS_LSX"
+  "vsrlrni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrlni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRLNI))]
+  "ISA_HAS_LSX"
+  "vssrlni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrlni_<lsxfmt_u>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRLNI2))]
+  "ISA_HAS_LSX"
+  "vssrlni.<lsxfmt_u>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrlrni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRLRNI))]
+  "ISA_HAS_LSX"
+  "vssrlrni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrlrni_<lsxfmt_u>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRLRNI2))]
+  "ISA_HAS_LSX"
+  "vssrlrni.<lsxfmt_u>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrani_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSRANI))]
+  "ISA_HAS_LSX"
+  "vsrani.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vsrarni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSRARNI))]
+  "ISA_HAS_LSX"
+  "vsrarni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrani_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		    UNSPEC_LSX_VSSRANI))]
+  "ISA_HAS_LSX"
+  "vssrani.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrani_<lsxfmt_u>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRANI2))]
+  "ISA_HAS_LSX"
+  "vssrani.<lsxfmt_u>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrarni_<lsxfmt>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRARNI))]
+  "ISA_HAS_LSX"
+  "vssrarni.<lsxfmt>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vssrarni_<lsxfmt_u>_<dlsxfmt>"
+  [(set (match_operand:ILSX 0 "register_operand" "=f")
+	(unspec:ILSX [(match_operand:ILSX 1 "register_operand" "0")
+		      (match_operand:ILSX 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VSSRARNI2))]
+  "ISA_HAS_LSX"
+  "vssrarni.<lsxfmt_u>.<dlsxfmt>\t%w0,%w2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lsx_vpermi_w"
+  [(set (match_operand:V4SI 0 "register_operand" "=f")
+	(unspec:V4SI [(match_operand:V4SI 1 "register_operand" "0")
+		      (match_operand:V4SI 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand" "")]
+		     UNSPEC_LSX_VPERMI))]
+  "ISA_HAS_LSX"
+  "vpermi.w\t%w0,%w2,%3"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V4SI")])
diff --git a/gcc/config/loongarch/predicates.md b/gcc/config/loongarch/predicates.md
index 510973aa339..f430629825e 100644
--- a/gcc/config/loongarch/predicates.md
+++ b/gcc/config/loongarch/predicates.md
@@ -87,10 +87,42 @@ (define_predicate "const_immalsl_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 1, 4)")))
 
+(define_predicate "const_lsx_branch_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), -1024, 1023)")))
+
+(define_predicate "const_uimm3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_8_to_11_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 8, 11)")))
+
+(define_predicate "const_12_to_15_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 12, 15)")))
+
+(define_predicate "const_uimm4_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 15)")))
+
 (define_predicate "const_uimm5_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 31)")))
 
+(define_predicate "const_uimm6_operand"
+  (and (match_code "const_int")
+       (match_test "UIMM6_OPERAND (INTVAL (op))")))
+
+(define_predicate "const_uimm7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 127)")))
+
+(define_predicate "const_uimm8_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 255)")))
+
 (define_predicate "const_uimm14_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 16383)")))
@@ -99,10 +131,74 @@ (define_predicate "const_uimm15_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 0, 32767)")))
 
+(define_predicate "const_imm5_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), -16, 15)")))
+
+(define_predicate "const_imm10_operand"
+  (and (match_code "const_int")
+       (match_test "IMM10_OPERAND (INTVAL (op))")))
+
 (define_predicate "const_imm12_operand"
   (and (match_code "const_int")
        (match_test "IMM12_OPERAND (INTVAL (op))")))
 
+(define_predicate "const_imm13_operand"
+  (and (match_code "const_int")
+       (match_test "IMM13_OPERAND (INTVAL (op))")))
+
+(define_predicate "reg_imm10_operand"
+  (ior (match_operand 0 "const_imm10_operand")
+       (match_operand 0 "register_operand")))
+
+(define_predicate "aq8b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "aq8h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 1)")))
+
+(define_predicate "aq8w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 2)")))
+
+(define_predicate "aq8d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 3)")))
+
+(define_predicate "aq10b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 0)")))
+
+(define_predicate "aq10h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 1)")))
+
+(define_predicate "aq10w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 2)")))
+
+(define_predicate "aq10d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 3)")))
+
+(define_predicate "aq12b_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 12, 0)")))
+
+(define_predicate "aq12h_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 11, 1)")))
+
+(define_predicate "aq12w_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 10, 2)")))
+
+(define_predicate "aq12d_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 9, 3)")))
+
 (define_predicate "sle_operand"
   (and (match_code "const_int")
        (match_test "IMM12_OPERAND (INTVAL (op) + 1)")))
@@ -112,29 +208,206 @@ (define_predicate "sleu_operand"
        (match_test "INTVAL (op) + 1 != 0")))
 
 (define_predicate "const_0_operand"
-  (and (match_code "const_int,const_double,const_vector")
+  (and (match_code "const_int,const_wide_int,const_double,const_vector")
        (match_test "op == CONST0_RTX (GET_MODE (op))")))
 
+(define_predicate "const_m1_operand"
+  (and (match_code "const_int,const_wide_int,const_double,const_vector")
+       (match_test "op == CONSTM1_RTX (GET_MODE (op))")))
+
+(define_predicate "reg_or_m1_operand"
+  (ior (match_operand 0 "const_m1_operand")
+       (match_operand 0 "register_operand")))
+
 (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "const_0_operand")
        (match_operand 0 "register_operand")))
 
 (define_predicate "const_1_operand"
-  (and (match_code "const_int,const_double,const_vector")
+  (and (match_code "const_int,const_wide_int,const_double,const_vector")
        (match_test "op == CONST1_RTX (GET_MODE (op))")))
 
 (define_predicate "reg_or_1_operand"
   (ior (match_operand 0 "const_1_operand")
        (match_operand 0 "register_operand")))
 
+;; These are used in vec_merge, hence accept bitmask as const_int.
+(define_predicate "const_exp_2_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 1)")))
+
+(define_predicate "const_exp_4_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 3)")))
+
+(define_predicate "const_exp_8_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 7)")))
+
+(define_predicate "const_exp_16_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 15)")))
+
+(define_predicate "const_exp_32_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (exact_log2 (INTVAL (op)), 0, 31)")))
+
+;; This is used for indexing into vectors, and hence only accepts const_int.
+(define_predicate "const_0_or_1_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 1)")))
+
+(define_predicate "const_0_to_3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 3)")))
+
+(define_predicate "const_0_to_7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_2_or_3_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 2, 3)")))
+
+(define_predicate "const_4_to_7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 4, 7)")))
+
+(define_predicate "const_8_to_15_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "const_16_to_31_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
+(define_predicate "qi_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xff")))
+
+(define_predicate "hi_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xffff")))
+
 (define_predicate "lu52i_mask_operand"
   (and (match_code "const_int")
        (match_test "UINTVAL (op) == 0xfffffffffffff")))
 
+(define_predicate "si_mask_operand"
+  (and (match_code "const_int")
+       (match_test "UINTVAL (op) == 0xffffffff")))
+
 (define_predicate "low_bitmask_operand"
   (and (match_code "const_int")
        (match_test "low_bitmask_len (mode, INTVAL (op)) > 12")))
 
+(define_predicate "d_operand"
+  (and (match_code "reg")
+       (match_test "GP_REG_P (REGNO (op))")))
+
+(define_predicate "db4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 4, 0)")))
+
+(define_predicate "db7_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 7, 0)")))
+
+(define_predicate "db8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) + 1, 8, 0)")))
+
+(define_predicate "ib3_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op) - 1, 3, 0)")))
+
+(define_predicate "sb4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 4, 0)")))
+
+(define_predicate "sb5_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 5, 0)")))
+
+(define_predicate "sb8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "sd8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_signed_immediate_p (INTVAL (op), 8, 3)")))
+
+(define_predicate "ub4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 0)")))
+
+(define_predicate "ub8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 8, 0)")))
+
+(define_predicate "uh4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 1)")))
+
+(define_predicate "uw4_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 4, 2)")))
+
+(define_predicate "uw5_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 5, 2)")))
+
+(define_predicate "uw6_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 6, 2)")))
+
+(define_predicate "uw8_operand"
+  (and (match_code "const_int")
+       (match_test "loongarch_unsigned_immediate_p (INTVAL (op), 8, 2)")))
+
+(define_predicate "addiur2_operand"
+  (and (match_code "const_int")
+	(ior (match_test "INTVAL (op) == -1")
+	     (match_test "INTVAL (op) == 1")
+	     (match_test "INTVAL (op) == 4")
+	     (match_test "INTVAL (op) == 8")
+	     (match_test "INTVAL (op) == 12")
+	     (match_test "INTVAL (op) == 16")
+	     (match_test "INTVAL (op) == 20")
+	     (match_test "INTVAL (op) == 24"))))
+
+(define_predicate "addiusp_operand"
+  (and (match_code "const_int")
+       (ior (match_test "(IN_RANGE (INTVAL (op), 2, 257))")
+	    (match_test "(IN_RANGE (INTVAL (op), -258, -3))"))))
+
+(define_predicate "andi16_operand"
+  (and (match_code "const_int")
+	(ior (match_test "IN_RANGE (INTVAL (op), 1, 4)")
+	     (match_test "IN_RANGE (INTVAL (op), 7, 8)")
+	     (match_test "IN_RANGE (INTVAL (op), 15, 16)")
+	     (match_test "IN_RANGE (INTVAL (op), 31, 32)")
+	     (match_test "IN_RANGE (INTVAL (op), 63, 64)")
+	     (match_test "INTVAL (op) == 255")
+	     (match_test "INTVAL (op) == 32768")
+	     (match_test "INTVAL (op) == 65535"))))
+
+(define_predicate "movep_src_register"
+  (and (match_code "reg")
+       (ior (match_test ("IN_RANGE (REGNO (op), 2, 3)"))
+	    (match_test ("IN_RANGE (REGNO (op), 16, 20)")))))
+
+(define_predicate "movep_src_operand"
+  (ior (match_operand 0 "const_0_operand")
+       (match_operand 0 "movep_src_register")))
+
+(define_predicate "fcc_reload_operand"
+  (and (match_code "reg,subreg")
+       (match_test "FCC_REG_P (true_regnum (op))")))
+
+(define_predicate "muldiv_target_operand"
+		(match_operand 0 "register_operand"))
+
 (define_predicate "const_call_insn_operand"
   (match_code "const,symbol_ref,label_ref")
 {
@@ -303,3 +576,59 @@ (define_predicate "small_data_pattern"
 (define_predicate "non_volatile_mem_operand"
   (and (match_operand 0 "memory_operand")
        (not (match_test "MEM_VOLATILE_P (op)"))))
+
+(define_predicate "const_vector_same_val_operand"
+  (match_code "const_vector")
+{
+  return loongarch_const_vector_same_val_p (op, mode);
+})
+
+(define_predicate "const_vector_same_simm5_operand"
+  (match_code "const_vector")
+{
+  return loongarch_const_vector_same_int_p (op, mode, -16, 15);
+})
+
+(define_predicate "const_vector_same_uimm5_operand"
+  (match_code "const_vector")
+{
+  return loongarch_const_vector_same_int_p (op, mode, 0, 31);
+})
+
+(define_predicate "const_vector_same_ximm5_operand"
+  (match_code "const_vector")
+{
+  return loongarch_const_vector_same_int_p (op, mode, -31, 31);
+})
+
+(define_predicate "const_vector_same_uimm6_operand"
+  (match_code "const_vector")
+{
+  return loongarch_const_vector_same_int_p (op, mode, 0, 63);
+})
+
+(define_predicate "par_const_vector_shf_set_operand"
+  (match_code "parallel")
+{
+  return loongarch_const_vector_shuffle_set_p (op, mode);
+})
+
+(define_predicate "reg_or_vector_same_val_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "const_vector_same_val_operand")))
+
+(define_predicate "reg_or_vector_same_simm5_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "const_vector_same_simm5_operand")))
+
+(define_predicate "reg_or_vector_same_uimm5_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "const_vector_same_uimm5_operand")))
+
+(define_predicate "reg_or_vector_same_ximm5_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "const_vector_same_ximm5_operand")))
+
+(define_predicate "reg_or_vector_same_uimm6_operand"
+  (ior (match_operand 0 "register_operand")
+       (match_operand 0 "const_vector_same_uimm6_operand")))
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 24453693d89..daa318ee3da 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -2892,6 +2892,17 @@ as @code{st.w} and @code{ld.w}.
 A signed 12-bit constant (for arithmetic instructions).
 @item K
 An unsigned 12-bit constant (for logic instructions).
+@item M
+A constant that cannot be loaded using @code{lui}, @code{addiu}
+or @code{ori}.
+@item N
+A constant in the range -65535 to -1 (inclusive).
+@item O
+A signed 15-bit constant.
+@item P
+A constant in the range 1 to 65535 (inclusive).
+@item R
+An address that can be used in a non-macro load or store.
 @item ZB
 An address that is held in a general-purpose register.
 The offset is zero.
diff --git a/gcc/testsuite/g++.dg/torture/vshuf-v16qi.C b/gcc/testsuite/g++.dg/torture/vshuf-v16qi.C
index 56801177583..350b450d05f 100644
--- a/gcc/testsuite/g++.dg/torture/vshuf-v16qi.C
+++ b/gcc/testsuite/g++.dg/torture/vshuf-v16qi.C
@@ -1,5 +1,6 @@
 // { dg-options "-std=c++11" }
 // { dg-do run }
+// { dg-skip-if "LoongArch vshuf/xvshuf insn result is undefined when 6 or 7 bit of vector's element is set." { loongarch*-*-* } }
 
 typedef unsigned char V __attribute__((vector_size(16)));
 typedef V VI;
diff --git a/gcc/testsuite/g++.dg/torture/vshuf-v2df.C b/gcc/testsuite/g++.dg/torture/vshuf-v2df.C
index ba45078ea13..9a7f7c4188a 100644
--- a/gcc/testsuite/g++.dg/torture/vshuf-v2df.C
+++ b/gcc/testsuite/g++.dg/torture/vshuf-v2df.C
@@ -1,5 +1,7 @@
 // { dg-options "-std=c++11" }
 // // { dg-do run }
+// { dg-skip-if "LoongArch vshuf/xvshuf insn result is undefined when 6 or 7 bit of vector's element is set." { loongarch*-*-* } }
+
 #if __SIZEOF_DOUBLE__ == 8 && __SIZEOF_LONG_LONG__ == 8
 typedef double V __attribute__((vector_size(16)));
 typedef unsigned long long VI __attribute__((vector_size(16)));
diff --git a/gcc/testsuite/g++.dg/torture/vshuf-v2di.C b/gcc/testsuite/g++.dg/torture/vshuf-v2di.C
index a4272842a36..26ab98c09cc 100644
--- a/gcc/testsuite/g++.dg/torture/vshuf-v2di.C
+++ b/gcc/testsuite/g++.dg/torture/vshuf-v2di.C
@@ -1,5 +1,6 @@
 // { dg-options "-std=c++11" }
 // // { dg-do run }
+// { dg-skip-if "LoongArch vshuf/xvshuf insn result is undefined when 6 or 7 bit of vector's element is set." { loongarch*-*-* } }
 
 #if __SIZEOF_LONG_LONG__ == 8
 typedef unsigned long long V __attribute__((vector_size(16)));
diff --git a/gcc/testsuite/g++.dg/torture/vshuf-v4sf.C b/gcc/testsuite/g++.dg/torture/vshuf-v4sf.C
index c7d58434409..1617eba3827 100644
--- a/gcc/testsuite/g++.dg/torture/vshuf-v4sf.C
+++ b/gcc/testsuite/g++.dg/torture/vshuf-v4sf.C
@@ -1,6 +1,6 @@
 // { dg-options "-std=c++11" }
 // { dg-do run }
-
+// { dg-skip-if "LoongArch vshuf/xvshuf insn result is undefined when 6 or 7 bit of vector's element is set." { loongarch*-*-* } }
 
 #if __SIZEOF_FLOAT__ == 4
 typedef float V __attribute__((vector_size(16)));
diff --git a/gcc/testsuite/g++.dg/torture/vshuf-v8hi.C b/gcc/testsuite/g++.dg/torture/vshuf-v8hi.C
index 33b20c68a87..61a40f733f6 100644
--- a/gcc/testsuite/g++.dg/torture/vshuf-v8hi.C
+++ b/gcc/testsuite/g++.dg/torture/vshuf-v8hi.C
@@ -1,5 +1,6 @@
 // { dg-options "-std=c++11" }
 // { dg-do run }
+// { dg-skip-if "LoongArch vshuf/xvshuf insn result is undefined when 6 or 7 bit of vector's element is set." { loongarch*-*-* } }
 
 typedef unsigned short V __attribute__((vector_size(16)));
 typedef V VI;
-- 
2.36.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v5 3/6] LoongArch: Add Loongson SX directive builtin function support.
  2023-08-24  3:13 [PATCH v5 0/6] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
  2023-08-24  3:13 ` [PATCH v5 1/6] LoongArch: Add Loongson SX vector directive compilation framework Chenghui Pan
  2023-08-24  3:13 ` [PATCH v5 2/6] LoongArch: Add Loongson SX base instruction support Chenghui Pan
@ 2023-08-24  3:13 ` Chenghui Pan
  2023-08-24  3:13 ` [PATCH v5 4/6] LoongArch: Add Loongson ASX vector directive compilation framework Chenghui Pan
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Chenghui Pan @ 2023-08-24  3:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua

From: Lulu Cheng <chenglulu@loongson.cn>

gcc/ChangeLog:

	* config.gcc: Export the header file lsxintrin.h.
	* config/loongarch/loongarch-builtins.cc (LARCH_FTYPE_NAME4): Add builtin function support.
	(enum loongarch_builtin_type): Ditto.
	(AVAIL_ALL): Ditto.
	(LARCH_BUILTIN): Ditto.
	(LSX_BUILTIN): Ditto.
	(LSX_BUILTIN_TEST_BRANCH): Ditto.
	(LSX_NO_TARGET_BUILTIN): Ditto.
	(CODE_FOR_lsx_vsadd_b): Ditto.
	(CODE_FOR_lsx_vsadd_h): Ditto.
	(CODE_FOR_lsx_vsadd_w): Ditto.
	(CODE_FOR_lsx_vsadd_d): Ditto.
	(CODE_FOR_lsx_vsadd_bu): Ditto.
	(CODE_FOR_lsx_vsadd_hu): Ditto.
	(CODE_FOR_lsx_vsadd_wu): Ditto.
	(CODE_FOR_lsx_vsadd_du): Ditto.
	(CODE_FOR_lsx_vadd_b): Ditto.
	(CODE_FOR_lsx_vadd_h): Ditto.
	(CODE_FOR_lsx_vadd_w): Ditto.
	(CODE_FOR_lsx_vadd_d): Ditto.
	(CODE_FOR_lsx_vaddi_bu): Ditto.
	(CODE_FOR_lsx_vaddi_hu): Ditto.
	(CODE_FOR_lsx_vaddi_wu): Ditto.
	(CODE_FOR_lsx_vaddi_du): Ditto.
	(CODE_FOR_lsx_vand_v): Ditto.
	(CODE_FOR_lsx_vandi_b): Ditto.
	(CODE_FOR_lsx_bnz_v): Ditto.
	(CODE_FOR_lsx_bz_v): Ditto.
	(CODE_FOR_lsx_vbitsel_v): Ditto.
	(CODE_FOR_lsx_vseqi_b): Ditto.
	(CODE_FOR_lsx_vseqi_h): Ditto.
	(CODE_FOR_lsx_vseqi_w): Ditto.
	(CODE_FOR_lsx_vseqi_d): Ditto.
	(CODE_FOR_lsx_vslti_b): Ditto.
	(CODE_FOR_lsx_vslti_h): Ditto.
	(CODE_FOR_lsx_vslti_w): Ditto.
	(CODE_FOR_lsx_vslti_d): Ditto.
	(CODE_FOR_lsx_vslti_bu): Ditto.
	(CODE_FOR_lsx_vslti_hu): Ditto.
	(CODE_FOR_lsx_vslti_wu): Ditto.
	(CODE_FOR_lsx_vslti_du): Ditto.
	(CODE_FOR_lsx_vslei_b): Ditto.
	(CODE_FOR_lsx_vslei_h): Ditto.
	(CODE_FOR_lsx_vslei_w): Ditto.
	(CODE_FOR_lsx_vslei_d): Ditto.
	(CODE_FOR_lsx_vslei_bu): Ditto.
	(CODE_FOR_lsx_vslei_hu): Ditto.
	(CODE_FOR_lsx_vslei_wu): Ditto.
	(CODE_FOR_lsx_vslei_du): Ditto.
	(CODE_FOR_lsx_vdiv_b): Ditto.
	(CODE_FOR_lsx_vdiv_h): Ditto.
	(CODE_FOR_lsx_vdiv_w): Ditto.
	(CODE_FOR_lsx_vdiv_d): Ditto.
	(CODE_FOR_lsx_vdiv_bu): Ditto.
	(CODE_FOR_lsx_vdiv_hu): Ditto.
	(CODE_FOR_lsx_vdiv_wu): Ditto.
	(CODE_FOR_lsx_vdiv_du): Ditto.
	(CODE_FOR_lsx_vfadd_s): Ditto.
	(CODE_FOR_lsx_vfadd_d): Ditto.
	(CODE_FOR_lsx_vftintrz_w_s): Ditto.
	(CODE_FOR_lsx_vftintrz_l_d): Ditto.
	(CODE_FOR_lsx_vftintrz_wu_s): Ditto.
	(CODE_FOR_lsx_vftintrz_lu_d): Ditto.
	(CODE_FOR_lsx_vffint_s_w): Ditto.
	(CODE_FOR_lsx_vffint_d_l): Ditto.
	(CODE_FOR_lsx_vffint_s_wu): Ditto.
	(CODE_FOR_lsx_vffint_d_lu): Ditto.
	(CODE_FOR_lsx_vfsub_s): Ditto.
	(CODE_FOR_lsx_vfsub_d): Ditto.
	(CODE_FOR_lsx_vfmul_s): Ditto.
	(CODE_FOR_lsx_vfmul_d): Ditto.
	(CODE_FOR_lsx_vfdiv_s): Ditto.
	(CODE_FOR_lsx_vfdiv_d): Ditto.
	(CODE_FOR_lsx_vfmax_s): Ditto.
	(CODE_FOR_lsx_vfmax_d): Ditto.
	(CODE_FOR_lsx_vfmin_s): Ditto.
	(CODE_FOR_lsx_vfmin_d): Ditto.
	(CODE_FOR_lsx_vfsqrt_s): Ditto.
	(CODE_FOR_lsx_vfsqrt_d): Ditto.
	(CODE_FOR_lsx_vflogb_s): Ditto.
	(CODE_FOR_lsx_vflogb_d): Ditto.
	(CODE_FOR_lsx_vmax_b): Ditto.
	(CODE_FOR_lsx_vmax_h): Ditto.
	(CODE_FOR_lsx_vmax_w): Ditto.
	(CODE_FOR_lsx_vmax_d): Ditto.
	(CODE_FOR_lsx_vmaxi_b): Ditto.
	(CODE_FOR_lsx_vmaxi_h): Ditto.
	(CODE_FOR_lsx_vmaxi_w): Ditto.
	(CODE_FOR_lsx_vmaxi_d): Ditto.
	(CODE_FOR_lsx_vmax_bu): Ditto.
	(CODE_FOR_lsx_vmax_hu): Ditto.
	(CODE_FOR_lsx_vmax_wu): Ditto.
	(CODE_FOR_lsx_vmax_du): Ditto.
	(CODE_FOR_lsx_vmaxi_bu): Ditto.
	(CODE_FOR_lsx_vmaxi_hu): Ditto.
	(CODE_FOR_lsx_vmaxi_wu): Ditto.
	(CODE_FOR_lsx_vmaxi_du): Ditto.
	(CODE_FOR_lsx_vmin_b): Ditto.
	(CODE_FOR_lsx_vmin_h): Ditto.
	(CODE_FOR_lsx_vmin_w): Ditto.
	(CODE_FOR_lsx_vmin_d): Ditto.
	(CODE_FOR_lsx_vmini_b): Ditto.
	(CODE_FOR_lsx_vmini_h): Ditto.
	(CODE_FOR_lsx_vmini_w): Ditto.
	(CODE_FOR_lsx_vmini_d): Ditto.
	(CODE_FOR_lsx_vmin_bu): Ditto.
	(CODE_FOR_lsx_vmin_hu): Ditto.
	(CODE_FOR_lsx_vmin_wu): Ditto.
	(CODE_FOR_lsx_vmin_du): Ditto.
	(CODE_FOR_lsx_vmini_bu): Ditto.
	(CODE_FOR_lsx_vmini_hu): Ditto.
	(CODE_FOR_lsx_vmini_wu): Ditto.
	(CODE_FOR_lsx_vmini_du): Ditto.
	(CODE_FOR_lsx_vmod_b): Ditto.
	(CODE_FOR_lsx_vmod_h): Ditto.
	(CODE_FOR_lsx_vmod_w): Ditto.
	(CODE_FOR_lsx_vmod_d): Ditto.
	(CODE_FOR_lsx_vmod_bu): Ditto.
	(CODE_FOR_lsx_vmod_hu): Ditto.
	(CODE_FOR_lsx_vmod_wu): Ditto.
	(CODE_FOR_lsx_vmod_du): Ditto.
	(CODE_FOR_lsx_vmul_b): Ditto.
	(CODE_FOR_lsx_vmul_h): Ditto.
	(CODE_FOR_lsx_vmul_w): Ditto.
	(CODE_FOR_lsx_vmul_d): Ditto.
	(CODE_FOR_lsx_vclz_b): Ditto.
	(CODE_FOR_lsx_vclz_h): Ditto.
	(CODE_FOR_lsx_vclz_w): Ditto.
	(CODE_FOR_lsx_vclz_d): Ditto.
	(CODE_FOR_lsx_vnor_v): Ditto.
	(CODE_FOR_lsx_vor_v): Ditto.
	(CODE_FOR_lsx_vori_b): Ditto.
	(CODE_FOR_lsx_vnori_b): Ditto.
	(CODE_FOR_lsx_vpcnt_b): Ditto.
	(CODE_FOR_lsx_vpcnt_h): Ditto.
	(CODE_FOR_lsx_vpcnt_w): Ditto.
	(CODE_FOR_lsx_vpcnt_d): Ditto.
	(CODE_FOR_lsx_vxor_v): Ditto.
	(CODE_FOR_lsx_vxori_b): Ditto.
	(CODE_FOR_lsx_vsll_b): Ditto.
	(CODE_FOR_lsx_vsll_h): Ditto.
	(CODE_FOR_lsx_vsll_w): Ditto.
	(CODE_FOR_lsx_vsll_d): Ditto.
	(CODE_FOR_lsx_vslli_b): Ditto.
	(CODE_FOR_lsx_vslli_h): Ditto.
	(CODE_FOR_lsx_vslli_w): Ditto.
	(CODE_FOR_lsx_vslli_d): Ditto.
	(CODE_FOR_lsx_vsra_b): Ditto.
	(CODE_FOR_lsx_vsra_h): Ditto.
	(CODE_FOR_lsx_vsra_w): Ditto.
	(CODE_FOR_lsx_vsra_d): Ditto.
	(CODE_FOR_lsx_vsrai_b): Ditto.
	(CODE_FOR_lsx_vsrai_h): Ditto.
	(CODE_FOR_lsx_vsrai_w): Ditto.
	(CODE_FOR_lsx_vsrai_d): Ditto.
	(CODE_FOR_lsx_vsrl_b): Ditto.
	(CODE_FOR_lsx_vsrl_h): Ditto.
	(CODE_FOR_lsx_vsrl_w): Ditto.
	(CODE_FOR_lsx_vsrl_d): Ditto.
	(CODE_FOR_lsx_vsrli_b): Ditto.
	(CODE_FOR_lsx_vsrli_h): Ditto.
	(CODE_FOR_lsx_vsrli_w): Ditto.
	(CODE_FOR_lsx_vsrli_d): Ditto.
	(CODE_FOR_lsx_vsub_b): Ditto.
	(CODE_FOR_lsx_vsub_h): Ditto.
	(CODE_FOR_lsx_vsub_w): Ditto.
	(CODE_FOR_lsx_vsub_d): Ditto.
	(CODE_FOR_lsx_vsubi_bu): Ditto.
	(CODE_FOR_lsx_vsubi_hu): Ditto.
	(CODE_FOR_lsx_vsubi_wu): Ditto.
	(CODE_FOR_lsx_vsubi_du): Ditto.
	(CODE_FOR_lsx_vpackod_d): Ditto.
	(CODE_FOR_lsx_vpackev_d): Ditto.
	(CODE_FOR_lsx_vpickod_d): Ditto.
	(CODE_FOR_lsx_vpickev_d): Ditto.
	(CODE_FOR_lsx_vrepli_b): Ditto.
	(CODE_FOR_lsx_vrepli_h): Ditto.
	(CODE_FOR_lsx_vrepli_w): Ditto.
	(CODE_FOR_lsx_vrepli_d): Ditto.
	(CODE_FOR_lsx_vsat_b): Ditto.
	(CODE_FOR_lsx_vsat_h): Ditto.
	(CODE_FOR_lsx_vsat_w): Ditto.
	(CODE_FOR_lsx_vsat_d): Ditto.
	(CODE_FOR_lsx_vsat_bu): Ditto.
	(CODE_FOR_lsx_vsat_hu): Ditto.
	(CODE_FOR_lsx_vsat_wu): Ditto.
	(CODE_FOR_lsx_vsat_du): Ditto.
	(CODE_FOR_lsx_vavg_b): Ditto.
	(CODE_FOR_lsx_vavg_h): Ditto.
	(CODE_FOR_lsx_vavg_w): Ditto.
	(CODE_FOR_lsx_vavg_d): Ditto.
	(CODE_FOR_lsx_vavg_bu): Ditto.
	(CODE_FOR_lsx_vavg_hu): Ditto.
	(CODE_FOR_lsx_vavg_wu): Ditto.
	(CODE_FOR_lsx_vavg_du): Ditto.
	(CODE_FOR_lsx_vavgr_b): Ditto.
	(CODE_FOR_lsx_vavgr_h): Ditto.
	(CODE_FOR_lsx_vavgr_w): Ditto.
	(CODE_FOR_lsx_vavgr_d): Ditto.
	(CODE_FOR_lsx_vavgr_bu): Ditto.
	(CODE_FOR_lsx_vavgr_hu): Ditto.
	(CODE_FOR_lsx_vavgr_wu): Ditto.
	(CODE_FOR_lsx_vavgr_du): Ditto.
	(CODE_FOR_lsx_vssub_b): Ditto.
	(CODE_FOR_lsx_vssub_h): Ditto.
	(CODE_FOR_lsx_vssub_w): Ditto.
	(CODE_FOR_lsx_vssub_d): Ditto.
	(CODE_FOR_lsx_vssub_bu): Ditto.
	(CODE_FOR_lsx_vssub_hu): Ditto.
	(CODE_FOR_lsx_vssub_wu): Ditto.
	(CODE_FOR_lsx_vssub_du): Ditto.
	(CODE_FOR_lsx_vabsd_b): Ditto.
	(CODE_FOR_lsx_vabsd_h): Ditto.
	(CODE_FOR_lsx_vabsd_w): Ditto.
	(CODE_FOR_lsx_vabsd_d): Ditto.
	(CODE_FOR_lsx_vabsd_bu): Ditto.
	(CODE_FOR_lsx_vabsd_hu): Ditto.
	(CODE_FOR_lsx_vabsd_wu): Ditto.
	(CODE_FOR_lsx_vabsd_du): Ditto.
	(CODE_FOR_lsx_vftint_w_s): Ditto.
	(CODE_FOR_lsx_vftint_l_d): Ditto.
	(CODE_FOR_lsx_vftint_wu_s): Ditto.
	(CODE_FOR_lsx_vftint_lu_d): Ditto.
	(CODE_FOR_lsx_vandn_v): Ditto.
	(CODE_FOR_lsx_vorn_v): Ditto.
	(CODE_FOR_lsx_vneg_b): Ditto.
	(CODE_FOR_lsx_vneg_h): Ditto.
	(CODE_FOR_lsx_vneg_w): Ditto.
	(CODE_FOR_lsx_vneg_d): Ditto.
	(CODE_FOR_lsx_vshuf4i_d): Ditto.
	(CODE_FOR_lsx_vbsrl_v): Ditto.
	(CODE_FOR_lsx_vbsll_v): Ditto.
	(CODE_FOR_lsx_vfmadd_s): Ditto.
	(CODE_FOR_lsx_vfmadd_d): Ditto.
	(CODE_FOR_lsx_vfmsub_s): Ditto.
	(CODE_FOR_lsx_vfmsub_d): Ditto.
	(CODE_FOR_lsx_vfnmadd_s): Ditto.
	(CODE_FOR_lsx_vfnmadd_d): Ditto.
	(CODE_FOR_lsx_vfnmsub_s): Ditto.
	(CODE_FOR_lsx_vfnmsub_d): Ditto.
	(CODE_FOR_lsx_vmuh_b): Ditto.
	(CODE_FOR_lsx_vmuh_h): Ditto.
	(CODE_FOR_lsx_vmuh_w): Ditto.
	(CODE_FOR_lsx_vmuh_d): Ditto.
	(CODE_FOR_lsx_vmuh_bu): Ditto.
	(CODE_FOR_lsx_vmuh_hu): Ditto.
	(CODE_FOR_lsx_vmuh_wu): Ditto.
	(CODE_FOR_lsx_vmuh_du): Ditto.
	(CODE_FOR_lsx_vsllwil_h_b): Ditto.
	(CODE_FOR_lsx_vsllwil_w_h): Ditto.
	(CODE_FOR_lsx_vsllwil_d_w): Ditto.
	(CODE_FOR_lsx_vsllwil_hu_bu): Ditto.
	(CODE_FOR_lsx_vsllwil_wu_hu): Ditto.
	(CODE_FOR_lsx_vsllwil_du_wu): Ditto.
	(CODE_FOR_lsx_vssran_b_h): Ditto.
	(CODE_FOR_lsx_vssran_h_w): Ditto.
	(CODE_FOR_lsx_vssran_w_d): Ditto.
	(CODE_FOR_lsx_vssran_bu_h): Ditto.
	(CODE_FOR_lsx_vssran_hu_w): Ditto.
	(CODE_FOR_lsx_vssran_wu_d): Ditto.
	(CODE_FOR_lsx_vssrarn_b_h): Ditto.
	(CODE_FOR_lsx_vssrarn_h_w): Ditto.
	(CODE_FOR_lsx_vssrarn_w_d): Ditto.
	(CODE_FOR_lsx_vssrarn_bu_h): Ditto.
	(CODE_FOR_lsx_vssrarn_hu_w): Ditto.
	(CODE_FOR_lsx_vssrarn_wu_d): Ditto.
	(CODE_FOR_lsx_vssrln_bu_h): Ditto.
	(CODE_FOR_lsx_vssrln_hu_w): Ditto.
	(CODE_FOR_lsx_vssrln_wu_d): Ditto.
	(CODE_FOR_lsx_vssrlrn_bu_h): Ditto.
	(CODE_FOR_lsx_vssrlrn_hu_w): Ditto.
	(CODE_FOR_lsx_vssrlrn_wu_d): Ditto.
	(loongarch_builtin_vector_type): Ditto.
	(loongarch_build_cvpointer_type): Ditto.
	(LARCH_ATYPE_CVPOINTER): Ditto.
	(LARCH_ATYPE_BOOLEAN): Ditto.
	(LARCH_ATYPE_V2SF): Ditto.
	(LARCH_ATYPE_V2HI): Ditto.
	(LARCH_ATYPE_V2SI): Ditto.
	(LARCH_ATYPE_V4QI): Ditto.
	(LARCH_ATYPE_V4HI): Ditto.
	(LARCH_ATYPE_V8QI): Ditto.
	(LARCH_ATYPE_V2DI): Ditto.
	(LARCH_ATYPE_V4SI): Ditto.
	(LARCH_ATYPE_V8HI): Ditto.
	(LARCH_ATYPE_V16QI): Ditto.
	(LARCH_ATYPE_V2DF): Ditto.
	(LARCH_ATYPE_V4SF): Ditto.
	(LARCH_ATYPE_V4DI): Ditto.
	(LARCH_ATYPE_V8SI): Ditto.
	(LARCH_ATYPE_V16HI): Ditto.
	(LARCH_ATYPE_V32QI): Ditto.
	(LARCH_ATYPE_V4DF): Ditto.
	(LARCH_ATYPE_V8SF): Ditto.
	(LARCH_ATYPE_UV2DI): Ditto.
	(LARCH_ATYPE_UV4SI): Ditto.
	(LARCH_ATYPE_UV8HI): Ditto.
	(LARCH_ATYPE_UV16QI): Ditto.
	(LARCH_ATYPE_UV4DI): Ditto.
	(LARCH_ATYPE_UV8SI): Ditto.
	(LARCH_ATYPE_UV16HI): Ditto.
	(LARCH_ATYPE_UV32QI): Ditto.
	(LARCH_ATYPE_UV2SI): Ditto.
	(LARCH_ATYPE_UV4HI): Ditto.
	(LARCH_ATYPE_UV8QI): Ditto.
	(loongarch_builtin_vectorized_function): Ditto.
	(LARCH_GET_BUILTIN): Ditto.
	(loongarch_expand_builtin_insn): Ditto.
	(loongarch_expand_builtin_lsx_test_branch): Ditto.
	(loongarch_expand_builtin): Ditto.
	* config/loongarch/loongarch-ftypes.def (1): Ditto.
	(2): Ditto.
	(3): Ditto.
	(4): Ditto.
	* config/loongarch/lsxintrin.h: New file.
---
 gcc/config.gcc                             |    2 +-
 gcc/config/loongarch/loongarch-builtins.cc | 1498 +++++-
 gcc/config/loongarch/loongarch-ftypes.def  |  397 +-
 gcc/config/loongarch/lsxintrin.h           | 5181 ++++++++++++++++++++
 4 files changed, 7071 insertions(+), 7 deletions(-)
 create mode 100644 gcc/config/loongarch/lsxintrin.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 415e0e1ebc5..d6b809cdb55 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -469,7 +469,7 @@ mips*-*-*)
 	;;
 loongarch*-*-*)
 	cpu_type=loongarch
-	extra_headers="larchintrin.h"
+	extra_headers="larchintrin.h lsxintrin.h"
 	extra_objs="loongarch-c.o loongarch-builtins.o loongarch-cpu.o loongarch-opts.o loongarch-def.o"
 	extra_gcc_objs="loongarch-driver.o loongarch-cpu.o loongarch-opts.o loongarch-def.o"
 	extra_options="${extra_options} g.opt fused-madd.opt"
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index ebe70a986c3..5958f5b7fbe 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -34,14 +34,18 @@ along with GCC; see the file COPYING3.  If not see
 #include "recog.h"
 #include "diagnostic.h"
 #include "fold-const.h"
+#include "explow.h"
 #include "expr.h"
 #include "langhooks.h"
 #include "emit-rtl.h"
+#include "case-cfn-macros.h"
 
 /* Macros to create an enumeration identifier for a function prototype.  */
 #define LARCH_FTYPE_NAME1(A, B) LARCH_##A##_FTYPE_##B
 #define LARCH_FTYPE_NAME2(A, B, C) LARCH_##A##_FTYPE_##B##_##C
 #define LARCH_FTYPE_NAME3(A, B, C, D) LARCH_##A##_FTYPE_##B##_##C##_##D
+#define LARCH_FTYPE_NAME4(A, B, C, D, E) \
+  LARCH_##A##_FTYPE_##B##_##C##_##D##_##E
 
 /* Classifies the prototype of a built-in function.  */
 enum loongarch_function_type
@@ -64,6 +68,12 @@ enum loongarch_builtin_type
      value and the arguments are mapped to operands 0 and above.  */
   LARCH_BUILTIN_DIRECT_NO_TARGET,
 
+  /* For generating LoongArch LSX.  */
+  LARCH_BUILTIN_LSX,
+
+  /* The function corresponds to an LSX conditional branch instruction
+     combined with a compare instruction.  */
+  LARCH_BUILTIN_LSX_TEST_BRANCH,
 };
 
 /* Declare an availability predicate for built-in functions that require
@@ -101,6 +111,7 @@ struct loongarch_builtin_description
 };
 
 AVAIL_ALL (hard_float, TARGET_HARD_FLOAT_ABI)
+AVAIL_ALL (lsx, ISA_HAS_LSX)
 
 /* Construct a loongarch_builtin_description from the given arguments.
 
@@ -120,8 +131,8 @@ AVAIL_ALL (hard_float, TARGET_HARD_FLOAT_ABI)
 #define LARCH_BUILTIN(INSN, NAME, BUILTIN_TYPE, FUNCTION_TYPE, AVAIL) \
   { \
     CODE_FOR_loongarch_##INSN, "__builtin_loongarch_" NAME, \
-      BUILTIN_TYPE, FUNCTION_TYPE, \
-      loongarch_builtin_avail_##AVAIL \
+    BUILTIN_TYPE, FUNCTION_TYPE, \
+    loongarch_builtin_avail_##AVAIL \
   }
 
 /* Define __builtin_loongarch_<INSN>, which is a LARCH_BUILTIN_DIRECT function
@@ -137,6 +148,300 @@ AVAIL_ALL (hard_float, TARGET_HARD_FLOAT_ABI)
   LARCH_BUILTIN (INSN, #INSN, LARCH_BUILTIN_DIRECT_NO_TARGET, \
 		 FUNCTION_TYPE, AVAIL)
 
+/* Define an LSX LARCH_BUILTIN_DIRECT function __builtin_lsx_<INSN>
+   for instruction CODE_FOR_lsx_<INSN>.  FUNCTION_TYPE is a builtin_description
+   field.  */
+#define LSX_BUILTIN(INSN, FUNCTION_TYPE)				\
+  { CODE_FOR_lsx_ ## INSN,						\
+    "__builtin_lsx_" #INSN,  LARCH_BUILTIN_DIRECT,			\
+    FUNCTION_TYPE, loongarch_builtin_avail_lsx }
+
+
+/* Define an LSX LARCH_BUILTIN_LSX_TEST_BRANCH function __builtin_lsx_<INSN>
+   for instruction CODE_FOR_lsx_<INSN>.  FUNCTION_TYPE is a builtin_description
+   field.  */
+#define LSX_BUILTIN_TEST_BRANCH(INSN, FUNCTION_TYPE)			\
+  { CODE_FOR_lsx_ ## INSN,						\
+    "__builtin_lsx_" #INSN, LARCH_BUILTIN_LSX_TEST_BRANCH,		\
+    FUNCTION_TYPE, loongarch_builtin_avail_lsx }
+
+/* Define an LSX LARCH_BUILTIN_DIRECT_NO_TARGET function __builtin_lsx_<INSN>
+   for instruction CODE_FOR_lsx_<INSN>.  FUNCTION_TYPE is a builtin_description
+   field.  */
+#define LSX_NO_TARGET_BUILTIN(INSN, FUNCTION_TYPE)			\
+  { CODE_FOR_lsx_ ## INSN,						\
+    "__builtin_lsx_" #INSN,  LARCH_BUILTIN_DIRECT_NO_TARGET,		\
+    FUNCTION_TYPE, loongarch_builtin_avail_lsx }
+
+/* LoongArch SX define CODE_FOR_lsx_xxx */
+#define CODE_FOR_lsx_vsadd_b CODE_FOR_ssaddv16qi3
+#define CODE_FOR_lsx_vsadd_h CODE_FOR_ssaddv8hi3
+#define CODE_FOR_lsx_vsadd_w CODE_FOR_ssaddv4si3
+#define CODE_FOR_lsx_vsadd_d CODE_FOR_ssaddv2di3
+#define CODE_FOR_lsx_vsadd_bu CODE_FOR_usaddv16qi3
+#define CODE_FOR_lsx_vsadd_hu CODE_FOR_usaddv8hi3
+#define CODE_FOR_lsx_vsadd_wu CODE_FOR_usaddv4si3
+#define CODE_FOR_lsx_vsadd_du CODE_FOR_usaddv2di3
+#define CODE_FOR_lsx_vadd_b CODE_FOR_addv16qi3
+#define CODE_FOR_lsx_vadd_h CODE_FOR_addv8hi3
+#define CODE_FOR_lsx_vadd_w CODE_FOR_addv4si3
+#define CODE_FOR_lsx_vadd_d CODE_FOR_addv2di3
+#define CODE_FOR_lsx_vaddi_bu CODE_FOR_addv16qi3
+#define CODE_FOR_lsx_vaddi_hu CODE_FOR_addv8hi3
+#define CODE_FOR_lsx_vaddi_wu CODE_FOR_addv4si3
+#define CODE_FOR_lsx_vaddi_du CODE_FOR_addv2di3
+#define CODE_FOR_lsx_vand_v CODE_FOR_andv16qi3
+#define CODE_FOR_lsx_vandi_b CODE_FOR_andv16qi3
+#define CODE_FOR_lsx_bnz_v CODE_FOR_lsx_bnz_v_b
+#define CODE_FOR_lsx_bz_v CODE_FOR_lsx_bz_v_b
+#define CODE_FOR_lsx_vbitsel_v CODE_FOR_lsx_vbitsel_b
+#define CODE_FOR_lsx_vseqi_b CODE_FOR_lsx_vseq_b
+#define CODE_FOR_lsx_vseqi_h CODE_FOR_lsx_vseq_h
+#define CODE_FOR_lsx_vseqi_w CODE_FOR_lsx_vseq_w
+#define CODE_FOR_lsx_vseqi_d CODE_FOR_lsx_vseq_d
+#define CODE_FOR_lsx_vslti_b CODE_FOR_lsx_vslt_b
+#define CODE_FOR_lsx_vslti_h CODE_FOR_lsx_vslt_h
+#define CODE_FOR_lsx_vslti_w CODE_FOR_lsx_vslt_w
+#define CODE_FOR_lsx_vslti_d CODE_FOR_lsx_vslt_d
+#define CODE_FOR_lsx_vslti_bu CODE_FOR_lsx_vslt_bu
+#define CODE_FOR_lsx_vslti_hu CODE_FOR_lsx_vslt_hu
+#define CODE_FOR_lsx_vslti_wu CODE_FOR_lsx_vslt_wu
+#define CODE_FOR_lsx_vslti_du CODE_FOR_lsx_vslt_du
+#define CODE_FOR_lsx_vslei_b CODE_FOR_lsx_vsle_b
+#define CODE_FOR_lsx_vslei_h CODE_FOR_lsx_vsle_h
+#define CODE_FOR_lsx_vslei_w CODE_FOR_lsx_vsle_w
+#define CODE_FOR_lsx_vslei_d CODE_FOR_lsx_vsle_d
+#define CODE_FOR_lsx_vslei_bu CODE_FOR_lsx_vsle_bu
+#define CODE_FOR_lsx_vslei_hu CODE_FOR_lsx_vsle_hu
+#define CODE_FOR_lsx_vslei_wu CODE_FOR_lsx_vsle_wu
+#define CODE_FOR_lsx_vslei_du CODE_FOR_lsx_vsle_du
+#define CODE_FOR_lsx_vdiv_b CODE_FOR_divv16qi3
+#define CODE_FOR_lsx_vdiv_h CODE_FOR_divv8hi3
+#define CODE_FOR_lsx_vdiv_w CODE_FOR_divv4si3
+#define CODE_FOR_lsx_vdiv_d CODE_FOR_divv2di3
+#define CODE_FOR_lsx_vdiv_bu CODE_FOR_udivv16qi3
+#define CODE_FOR_lsx_vdiv_hu CODE_FOR_udivv8hi3
+#define CODE_FOR_lsx_vdiv_wu CODE_FOR_udivv4si3
+#define CODE_FOR_lsx_vdiv_du CODE_FOR_udivv2di3
+#define CODE_FOR_lsx_vfadd_s CODE_FOR_addv4sf3
+#define CODE_FOR_lsx_vfadd_d CODE_FOR_addv2df3
+#define CODE_FOR_lsx_vftintrz_w_s CODE_FOR_fix_truncv4sfv4si2
+#define CODE_FOR_lsx_vftintrz_l_d CODE_FOR_fix_truncv2dfv2di2
+#define CODE_FOR_lsx_vftintrz_wu_s CODE_FOR_fixuns_truncv4sfv4si2
+#define CODE_FOR_lsx_vftintrz_lu_d CODE_FOR_fixuns_truncv2dfv2di2
+#define CODE_FOR_lsx_vffint_s_w CODE_FOR_floatv4siv4sf2
+#define CODE_FOR_lsx_vffint_d_l CODE_FOR_floatv2div2df2
+#define CODE_FOR_lsx_vffint_s_wu CODE_FOR_floatunsv4siv4sf2
+#define CODE_FOR_lsx_vffint_d_lu CODE_FOR_floatunsv2div2df2
+#define CODE_FOR_lsx_vfsub_s CODE_FOR_subv4sf3
+#define CODE_FOR_lsx_vfsub_d CODE_FOR_subv2df3
+#define CODE_FOR_lsx_vfmul_s CODE_FOR_mulv4sf3
+#define CODE_FOR_lsx_vfmul_d CODE_FOR_mulv2df3
+#define CODE_FOR_lsx_vfdiv_s CODE_FOR_divv4sf3
+#define CODE_FOR_lsx_vfdiv_d CODE_FOR_divv2df3
+#define CODE_FOR_lsx_vfmax_s CODE_FOR_smaxv4sf3
+#define CODE_FOR_lsx_vfmax_d CODE_FOR_smaxv2df3
+#define CODE_FOR_lsx_vfmin_s CODE_FOR_sminv4sf3
+#define CODE_FOR_lsx_vfmin_d CODE_FOR_sminv2df3
+#define CODE_FOR_lsx_vfsqrt_s CODE_FOR_sqrtv4sf2
+#define CODE_FOR_lsx_vfsqrt_d CODE_FOR_sqrtv2df2
+#define CODE_FOR_lsx_vflogb_s CODE_FOR_logbv4sf2
+#define CODE_FOR_lsx_vflogb_d CODE_FOR_logbv2df2
+#define CODE_FOR_lsx_vmax_b CODE_FOR_smaxv16qi3
+#define CODE_FOR_lsx_vmax_h CODE_FOR_smaxv8hi3
+#define CODE_FOR_lsx_vmax_w CODE_FOR_smaxv4si3
+#define CODE_FOR_lsx_vmax_d CODE_FOR_smaxv2di3
+#define CODE_FOR_lsx_vmaxi_b CODE_FOR_smaxv16qi3
+#define CODE_FOR_lsx_vmaxi_h CODE_FOR_smaxv8hi3
+#define CODE_FOR_lsx_vmaxi_w CODE_FOR_smaxv4si3
+#define CODE_FOR_lsx_vmaxi_d CODE_FOR_smaxv2di3
+#define CODE_FOR_lsx_vmax_bu CODE_FOR_umaxv16qi3
+#define CODE_FOR_lsx_vmax_hu CODE_FOR_umaxv8hi3
+#define CODE_FOR_lsx_vmax_wu CODE_FOR_umaxv4si3
+#define CODE_FOR_lsx_vmax_du CODE_FOR_umaxv2di3
+#define CODE_FOR_lsx_vmaxi_bu CODE_FOR_umaxv16qi3
+#define CODE_FOR_lsx_vmaxi_hu CODE_FOR_umaxv8hi3
+#define CODE_FOR_lsx_vmaxi_wu CODE_FOR_umaxv4si3
+#define CODE_FOR_lsx_vmaxi_du CODE_FOR_umaxv2di3
+#define CODE_FOR_lsx_vmin_b CODE_FOR_sminv16qi3
+#define CODE_FOR_lsx_vmin_h CODE_FOR_sminv8hi3
+#define CODE_FOR_lsx_vmin_w CODE_FOR_sminv4si3
+#define CODE_FOR_lsx_vmin_d CODE_FOR_sminv2di3
+#define CODE_FOR_lsx_vmini_b CODE_FOR_sminv16qi3
+#define CODE_FOR_lsx_vmini_h CODE_FOR_sminv8hi3
+#define CODE_FOR_lsx_vmini_w CODE_FOR_sminv4si3
+#define CODE_FOR_lsx_vmini_d CODE_FOR_sminv2di3
+#define CODE_FOR_lsx_vmin_bu CODE_FOR_uminv16qi3
+#define CODE_FOR_lsx_vmin_hu CODE_FOR_uminv8hi3
+#define CODE_FOR_lsx_vmin_wu CODE_FOR_uminv4si3
+#define CODE_FOR_lsx_vmin_du CODE_FOR_uminv2di3
+#define CODE_FOR_lsx_vmini_bu CODE_FOR_uminv16qi3
+#define CODE_FOR_lsx_vmini_hu CODE_FOR_uminv8hi3
+#define CODE_FOR_lsx_vmini_wu CODE_FOR_uminv4si3
+#define CODE_FOR_lsx_vmini_du CODE_FOR_uminv2di3
+#define CODE_FOR_lsx_vmod_b CODE_FOR_modv16qi3
+#define CODE_FOR_lsx_vmod_h CODE_FOR_modv8hi3
+#define CODE_FOR_lsx_vmod_w CODE_FOR_modv4si3
+#define CODE_FOR_lsx_vmod_d CODE_FOR_modv2di3
+#define CODE_FOR_lsx_vmod_bu CODE_FOR_umodv16qi3
+#define CODE_FOR_lsx_vmod_hu CODE_FOR_umodv8hi3
+#define CODE_FOR_lsx_vmod_wu CODE_FOR_umodv4si3
+#define CODE_FOR_lsx_vmod_du CODE_FOR_umodv2di3
+#define CODE_FOR_lsx_vmul_b CODE_FOR_mulv16qi3
+#define CODE_FOR_lsx_vmul_h CODE_FOR_mulv8hi3
+#define CODE_FOR_lsx_vmul_w CODE_FOR_mulv4si3
+#define CODE_FOR_lsx_vmul_d CODE_FOR_mulv2di3
+#define CODE_FOR_lsx_vclz_b CODE_FOR_clzv16qi2
+#define CODE_FOR_lsx_vclz_h CODE_FOR_clzv8hi2
+#define CODE_FOR_lsx_vclz_w CODE_FOR_clzv4si2
+#define CODE_FOR_lsx_vclz_d CODE_FOR_clzv2di2
+#define CODE_FOR_lsx_vnor_v CODE_FOR_lsx_nor_b
+#define CODE_FOR_lsx_vor_v CODE_FOR_iorv16qi3
+#define CODE_FOR_lsx_vori_b CODE_FOR_iorv16qi3
+#define CODE_FOR_lsx_vnori_b CODE_FOR_lsx_nor_b
+#define CODE_FOR_lsx_vpcnt_b CODE_FOR_popcountv16qi2
+#define CODE_FOR_lsx_vpcnt_h CODE_FOR_popcountv8hi2
+#define CODE_FOR_lsx_vpcnt_w CODE_FOR_popcountv4si2
+#define CODE_FOR_lsx_vpcnt_d CODE_FOR_popcountv2di2
+#define CODE_FOR_lsx_vxor_v CODE_FOR_xorv16qi3
+#define CODE_FOR_lsx_vxori_b CODE_FOR_xorv16qi3
+#define CODE_FOR_lsx_vsll_b CODE_FOR_vashlv16qi3
+#define CODE_FOR_lsx_vsll_h CODE_FOR_vashlv8hi3
+#define CODE_FOR_lsx_vsll_w CODE_FOR_vashlv4si3
+#define CODE_FOR_lsx_vsll_d CODE_FOR_vashlv2di3
+#define CODE_FOR_lsx_vslli_b CODE_FOR_vashlv16qi3
+#define CODE_FOR_lsx_vslli_h CODE_FOR_vashlv8hi3
+#define CODE_FOR_lsx_vslli_w CODE_FOR_vashlv4si3
+#define CODE_FOR_lsx_vslli_d CODE_FOR_vashlv2di3
+#define CODE_FOR_lsx_vsra_b CODE_FOR_vashrv16qi3
+#define CODE_FOR_lsx_vsra_h CODE_FOR_vashrv8hi3
+#define CODE_FOR_lsx_vsra_w CODE_FOR_vashrv4si3
+#define CODE_FOR_lsx_vsra_d CODE_FOR_vashrv2di3
+#define CODE_FOR_lsx_vsrai_b CODE_FOR_vashrv16qi3
+#define CODE_FOR_lsx_vsrai_h CODE_FOR_vashrv8hi3
+#define CODE_FOR_lsx_vsrai_w CODE_FOR_vashrv4si3
+#define CODE_FOR_lsx_vsrai_d CODE_FOR_vashrv2di3
+#define CODE_FOR_lsx_vsrl_b CODE_FOR_vlshrv16qi3
+#define CODE_FOR_lsx_vsrl_h CODE_FOR_vlshrv8hi3
+#define CODE_FOR_lsx_vsrl_w CODE_FOR_vlshrv4si3
+#define CODE_FOR_lsx_vsrl_d CODE_FOR_vlshrv2di3
+#define CODE_FOR_lsx_vsrli_b CODE_FOR_vlshrv16qi3
+#define CODE_FOR_lsx_vsrli_h CODE_FOR_vlshrv8hi3
+#define CODE_FOR_lsx_vsrli_w CODE_FOR_vlshrv4si3
+#define CODE_FOR_lsx_vsrli_d CODE_FOR_vlshrv2di3
+#define CODE_FOR_lsx_vsub_b CODE_FOR_subv16qi3
+#define CODE_FOR_lsx_vsub_h CODE_FOR_subv8hi3
+#define CODE_FOR_lsx_vsub_w CODE_FOR_subv4si3
+#define CODE_FOR_lsx_vsub_d CODE_FOR_subv2di3
+#define CODE_FOR_lsx_vsubi_bu CODE_FOR_subv16qi3
+#define CODE_FOR_lsx_vsubi_hu CODE_FOR_subv8hi3
+#define CODE_FOR_lsx_vsubi_wu CODE_FOR_subv4si3
+#define CODE_FOR_lsx_vsubi_du CODE_FOR_subv2di3
+
+#define CODE_FOR_lsx_vpackod_d CODE_FOR_lsx_vilvh_d
+#define CODE_FOR_lsx_vpackev_d CODE_FOR_lsx_vilvl_d
+#define CODE_FOR_lsx_vpickod_d CODE_FOR_lsx_vilvh_d
+#define CODE_FOR_lsx_vpickev_d CODE_FOR_lsx_vilvl_d
+
+#define CODE_FOR_lsx_vrepli_b CODE_FOR_lsx_vrepliv16qi
+#define CODE_FOR_lsx_vrepli_h CODE_FOR_lsx_vrepliv8hi
+#define CODE_FOR_lsx_vrepli_w CODE_FOR_lsx_vrepliv4si
+#define CODE_FOR_lsx_vrepli_d CODE_FOR_lsx_vrepliv2di
+#define CODE_FOR_lsx_vsat_b CODE_FOR_lsx_vsat_s_b
+#define CODE_FOR_lsx_vsat_h CODE_FOR_lsx_vsat_s_h
+#define CODE_FOR_lsx_vsat_w CODE_FOR_lsx_vsat_s_w
+#define CODE_FOR_lsx_vsat_d CODE_FOR_lsx_vsat_s_d
+#define CODE_FOR_lsx_vsat_bu CODE_FOR_lsx_vsat_u_bu
+#define CODE_FOR_lsx_vsat_hu CODE_FOR_lsx_vsat_u_hu
+#define CODE_FOR_lsx_vsat_wu CODE_FOR_lsx_vsat_u_wu
+#define CODE_FOR_lsx_vsat_du CODE_FOR_lsx_vsat_u_du
+#define CODE_FOR_lsx_vavg_b CODE_FOR_lsx_vavg_s_b
+#define CODE_FOR_lsx_vavg_h CODE_FOR_lsx_vavg_s_h
+#define CODE_FOR_lsx_vavg_w CODE_FOR_lsx_vavg_s_w
+#define CODE_FOR_lsx_vavg_d CODE_FOR_lsx_vavg_s_d
+#define CODE_FOR_lsx_vavg_bu CODE_FOR_lsx_vavg_u_bu
+#define CODE_FOR_lsx_vavg_hu CODE_FOR_lsx_vavg_u_hu
+#define CODE_FOR_lsx_vavg_wu CODE_FOR_lsx_vavg_u_wu
+#define CODE_FOR_lsx_vavg_du CODE_FOR_lsx_vavg_u_du
+#define CODE_FOR_lsx_vavgr_b CODE_FOR_lsx_vavgr_s_b
+#define CODE_FOR_lsx_vavgr_h CODE_FOR_lsx_vavgr_s_h
+#define CODE_FOR_lsx_vavgr_w CODE_FOR_lsx_vavgr_s_w
+#define CODE_FOR_lsx_vavgr_d CODE_FOR_lsx_vavgr_s_d
+#define CODE_FOR_lsx_vavgr_bu CODE_FOR_lsx_vavgr_u_bu
+#define CODE_FOR_lsx_vavgr_hu CODE_FOR_lsx_vavgr_u_hu
+#define CODE_FOR_lsx_vavgr_wu CODE_FOR_lsx_vavgr_u_wu
+#define CODE_FOR_lsx_vavgr_du CODE_FOR_lsx_vavgr_u_du
+#define CODE_FOR_lsx_vssub_b CODE_FOR_lsx_vssub_s_b
+#define CODE_FOR_lsx_vssub_h CODE_FOR_lsx_vssub_s_h
+#define CODE_FOR_lsx_vssub_w CODE_FOR_lsx_vssub_s_w
+#define CODE_FOR_lsx_vssub_d CODE_FOR_lsx_vssub_s_d
+#define CODE_FOR_lsx_vssub_bu CODE_FOR_lsx_vssub_u_bu
+#define CODE_FOR_lsx_vssub_hu CODE_FOR_lsx_vssub_u_hu
+#define CODE_FOR_lsx_vssub_wu CODE_FOR_lsx_vssub_u_wu
+#define CODE_FOR_lsx_vssub_du CODE_FOR_lsx_vssub_u_du
+#define CODE_FOR_lsx_vabsd_b CODE_FOR_lsx_vabsd_s_b
+#define CODE_FOR_lsx_vabsd_h CODE_FOR_lsx_vabsd_s_h
+#define CODE_FOR_lsx_vabsd_w CODE_FOR_lsx_vabsd_s_w
+#define CODE_FOR_lsx_vabsd_d CODE_FOR_lsx_vabsd_s_d
+#define CODE_FOR_lsx_vabsd_bu CODE_FOR_lsx_vabsd_u_bu
+#define CODE_FOR_lsx_vabsd_hu CODE_FOR_lsx_vabsd_u_hu
+#define CODE_FOR_lsx_vabsd_wu CODE_FOR_lsx_vabsd_u_wu
+#define CODE_FOR_lsx_vabsd_du CODE_FOR_lsx_vabsd_u_du
+#define CODE_FOR_lsx_vftint_w_s CODE_FOR_lsx_vftint_s_w_s
+#define CODE_FOR_lsx_vftint_l_d CODE_FOR_lsx_vftint_s_l_d
+#define CODE_FOR_lsx_vftint_wu_s CODE_FOR_lsx_vftint_u_wu_s
+#define CODE_FOR_lsx_vftint_lu_d CODE_FOR_lsx_vftint_u_lu_d
+#define CODE_FOR_lsx_vandn_v CODE_FOR_vandnv16qi3
+#define CODE_FOR_lsx_vorn_v CODE_FOR_vornv16qi3
+#define CODE_FOR_lsx_vneg_b CODE_FOR_vnegv16qi2
+#define CODE_FOR_lsx_vneg_h CODE_FOR_vnegv8hi2
+#define CODE_FOR_lsx_vneg_w CODE_FOR_vnegv4si2
+#define CODE_FOR_lsx_vneg_d CODE_FOR_vnegv2di2
+#define CODE_FOR_lsx_vshuf4i_d CODE_FOR_lsx_vshuf4i_d
+#define CODE_FOR_lsx_vbsrl_v CODE_FOR_lsx_vbsrl_b
+#define CODE_FOR_lsx_vbsll_v CODE_FOR_lsx_vbsll_b
+#define CODE_FOR_lsx_vfmadd_s CODE_FOR_fmav4sf4
+#define CODE_FOR_lsx_vfmadd_d CODE_FOR_fmav2df4
+#define CODE_FOR_lsx_vfmsub_s CODE_FOR_fmsv4sf4
+#define CODE_FOR_lsx_vfmsub_d CODE_FOR_fmsv2df4
+#define CODE_FOR_lsx_vfnmadd_s CODE_FOR_vfnmaddv4sf4_nmadd4
+#define CODE_FOR_lsx_vfnmadd_d CODE_FOR_vfnmaddv2df4_nmadd4
+#define CODE_FOR_lsx_vfnmsub_s CODE_FOR_vfnmsubv4sf4_nmsub4
+#define CODE_FOR_lsx_vfnmsub_d CODE_FOR_vfnmsubv2df4_nmsub4
+
+#define CODE_FOR_lsx_vmuh_b CODE_FOR_lsx_vmuh_s_b
+#define CODE_FOR_lsx_vmuh_h CODE_FOR_lsx_vmuh_s_h
+#define CODE_FOR_lsx_vmuh_w CODE_FOR_lsx_vmuh_s_w
+#define CODE_FOR_lsx_vmuh_d CODE_FOR_lsx_vmuh_s_d
+#define CODE_FOR_lsx_vmuh_bu CODE_FOR_lsx_vmuh_u_bu
+#define CODE_FOR_lsx_vmuh_hu CODE_FOR_lsx_vmuh_u_hu
+#define CODE_FOR_lsx_vmuh_wu CODE_FOR_lsx_vmuh_u_wu
+#define CODE_FOR_lsx_vmuh_du CODE_FOR_lsx_vmuh_u_du
+#define CODE_FOR_lsx_vsllwil_h_b CODE_FOR_lsx_vsllwil_s_h_b
+#define CODE_FOR_lsx_vsllwil_w_h CODE_FOR_lsx_vsllwil_s_w_h
+#define CODE_FOR_lsx_vsllwil_d_w CODE_FOR_lsx_vsllwil_s_d_w
+#define CODE_FOR_lsx_vsllwil_hu_bu CODE_FOR_lsx_vsllwil_u_hu_bu
+#define CODE_FOR_lsx_vsllwil_wu_hu CODE_FOR_lsx_vsllwil_u_wu_hu
+#define CODE_FOR_lsx_vsllwil_du_wu CODE_FOR_lsx_vsllwil_u_du_wu
+#define CODE_FOR_lsx_vssran_b_h CODE_FOR_lsx_vssran_s_b_h
+#define CODE_FOR_lsx_vssran_h_w CODE_FOR_lsx_vssran_s_h_w
+#define CODE_FOR_lsx_vssran_w_d CODE_FOR_lsx_vssran_s_w_d
+#define CODE_FOR_lsx_vssran_bu_h CODE_FOR_lsx_vssran_u_bu_h
+#define CODE_FOR_lsx_vssran_hu_w CODE_FOR_lsx_vssran_u_hu_w
+#define CODE_FOR_lsx_vssran_wu_d CODE_FOR_lsx_vssran_u_wu_d
+#define CODE_FOR_lsx_vssrarn_b_h CODE_FOR_lsx_vssrarn_s_b_h
+#define CODE_FOR_lsx_vssrarn_h_w CODE_FOR_lsx_vssrarn_s_h_w
+#define CODE_FOR_lsx_vssrarn_w_d CODE_FOR_lsx_vssrarn_s_w_d
+#define CODE_FOR_lsx_vssrarn_bu_h CODE_FOR_lsx_vssrarn_u_bu_h
+#define CODE_FOR_lsx_vssrarn_hu_w CODE_FOR_lsx_vssrarn_u_hu_w
+#define CODE_FOR_lsx_vssrarn_wu_d CODE_FOR_lsx_vssrarn_u_wu_d
+#define CODE_FOR_lsx_vssrln_bu_h CODE_FOR_lsx_vssrln_u_bu_h
+#define CODE_FOR_lsx_vssrln_hu_w CODE_FOR_lsx_vssrln_u_hu_w
+#define CODE_FOR_lsx_vssrln_wu_d CODE_FOR_lsx_vssrln_u_wu_d
+#define CODE_FOR_lsx_vssrlrn_bu_h CODE_FOR_lsx_vssrlrn_u_bu_h
+#define CODE_FOR_lsx_vssrlrn_hu_w CODE_FOR_lsx_vssrlrn_u_hu_w
+#define CODE_FOR_lsx_vssrlrn_wu_d CODE_FOR_lsx_vssrlrn_u_wu_d
+
 static const struct loongarch_builtin_description loongarch_builtins[] = {
 #define LARCH_MOVFCSR2GR 0
   DIRECT_BUILTIN (movfcsr2gr, LARCH_USI_FTYPE_UQI, hard_float),
@@ -184,6 +489,727 @@ static const struct loongarch_builtin_description loongarch_builtins[] = {
   DIRECT_NO_TARGET_BUILTIN (asrtgt_d, LARCH_VOID_FTYPE_DI_DI, default),
   DIRECT_NO_TARGET_BUILTIN (syscall, LARCH_VOID_FTYPE_USI, default),
   DIRECT_NO_TARGET_BUILTIN (break, LARCH_VOID_FTYPE_USI, default),
+
+  /* Built-in functions for LSX.  */
+  LSX_BUILTIN (vsll_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsll_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsll_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsll_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vslli_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vslli_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vslli_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vslli_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsra_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsra_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsra_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsra_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsrai_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsrai_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsrai_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsrai_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsrar_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsrar_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrar_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrar_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsrari_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsrari_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsrari_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsrari_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsrl_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsrl_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrl_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrl_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsrli_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsrli_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsrli_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsrli_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsrlr_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsrlr_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrlr_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrlr_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsrlri_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsrlri_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsrlri_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsrlri_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vbitclr_b, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vbitclr_h, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vbitclr_w, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vbitclr_d, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vbitclri_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vbitclri_h, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vbitclri_w, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vbitclri_d, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vbitset_b, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vbitset_h, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vbitset_w, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vbitset_d, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vbitseti_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vbitseti_h, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vbitseti_w, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vbitseti_d, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vbitrev_b, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vbitrev_h, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vbitrev_w, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vbitrev_d, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vbitrevi_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vbitrevi_h, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vbitrevi_w, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vbitrevi_d, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vadd_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vadd_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vadd_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vadd_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vaddi_bu, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vaddi_hu, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vaddi_wu, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vaddi_du, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsub_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsub_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsub_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsub_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsubi_bu, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsubi_hu, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsubi_wu, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsubi_du, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vmax_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmax_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmax_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmax_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmaxi_b, LARCH_V16QI_FTYPE_V16QI_QI),
+  LSX_BUILTIN (vmaxi_h, LARCH_V8HI_FTYPE_V8HI_QI),
+  LSX_BUILTIN (vmaxi_w, LARCH_V4SI_FTYPE_V4SI_QI),
+  LSX_BUILTIN (vmaxi_d, LARCH_V2DI_FTYPE_V2DI_QI),
+  LSX_BUILTIN (vmax_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmax_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmax_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmax_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmaxi_bu, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vmaxi_hu, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vmaxi_wu, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vmaxi_du, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vmin_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmin_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmin_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmin_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmini_b, LARCH_V16QI_FTYPE_V16QI_QI),
+  LSX_BUILTIN (vmini_h, LARCH_V8HI_FTYPE_V8HI_QI),
+  LSX_BUILTIN (vmini_w, LARCH_V4SI_FTYPE_V4SI_QI),
+  LSX_BUILTIN (vmini_d, LARCH_V2DI_FTYPE_V2DI_QI),
+  LSX_BUILTIN (vmin_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmin_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmin_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmin_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmini_bu, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vmini_hu, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vmini_wu, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vmini_du, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vseq_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vseq_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vseq_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vseq_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vseqi_b, LARCH_V16QI_FTYPE_V16QI_QI),
+  LSX_BUILTIN (vseqi_h, LARCH_V8HI_FTYPE_V8HI_QI),
+  LSX_BUILTIN (vseqi_w, LARCH_V4SI_FTYPE_V4SI_QI),
+  LSX_BUILTIN (vseqi_d, LARCH_V2DI_FTYPE_V2DI_QI),
+  LSX_BUILTIN (vslti_b, LARCH_V16QI_FTYPE_V16QI_QI),
+  LSX_BUILTIN (vslt_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vslt_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vslt_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vslt_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vslti_h, LARCH_V8HI_FTYPE_V8HI_QI),
+  LSX_BUILTIN (vslti_w, LARCH_V4SI_FTYPE_V4SI_QI),
+  LSX_BUILTIN (vslti_d, LARCH_V2DI_FTYPE_V2DI_QI),
+  LSX_BUILTIN (vslt_bu, LARCH_V16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vslt_hu, LARCH_V8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vslt_wu, LARCH_V4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vslt_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vslti_bu, LARCH_V16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vslti_hu, LARCH_V8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vslti_wu, LARCH_V4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vslti_du, LARCH_V2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vsle_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsle_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsle_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsle_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vslei_b, LARCH_V16QI_FTYPE_V16QI_QI),
+  LSX_BUILTIN (vslei_h, LARCH_V8HI_FTYPE_V8HI_QI),
+  LSX_BUILTIN (vslei_w, LARCH_V4SI_FTYPE_V4SI_QI),
+  LSX_BUILTIN (vslei_d, LARCH_V2DI_FTYPE_V2DI_QI),
+  LSX_BUILTIN (vsle_bu, LARCH_V16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vsle_hu, LARCH_V8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vsle_wu, LARCH_V4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vsle_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vslei_bu, LARCH_V16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vslei_hu, LARCH_V8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vslei_wu, LARCH_V4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vslei_du, LARCH_V2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vsat_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsat_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsat_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsat_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vsat_bu, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vsat_hu, LARCH_UV8HI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vsat_wu, LARCH_UV4SI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vsat_du, LARCH_UV2DI_FTYPE_UV2DI_UQI),
+  LSX_BUILTIN (vadda_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vadda_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vadda_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vadda_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsadd_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsadd_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsadd_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsadd_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsadd_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vsadd_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vsadd_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vsadd_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vavg_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vavg_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vavg_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vavg_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vavg_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vavg_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vavg_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vavg_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vavgr_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vavgr_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vavgr_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vavgr_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vavgr_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vavgr_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vavgr_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vavgr_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vssub_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vssub_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vssub_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vssub_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssub_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vssub_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vssub_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vssub_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vabsd_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vabsd_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vabsd_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vabsd_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vabsd_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vabsd_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vabsd_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vabsd_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmul_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmul_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmul_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmul_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmadd_b, LARCH_V16QI_FTYPE_V16QI_V16QI_V16QI),
+  LSX_BUILTIN (vmadd_h, LARCH_V8HI_FTYPE_V8HI_V8HI_V8HI),
+  LSX_BUILTIN (vmadd_w, LARCH_V4SI_FTYPE_V4SI_V4SI_V4SI),
+  LSX_BUILTIN (vmadd_d, LARCH_V2DI_FTYPE_V2DI_V2DI_V2DI),
+  LSX_BUILTIN (vmsub_b, LARCH_V16QI_FTYPE_V16QI_V16QI_V16QI),
+  LSX_BUILTIN (vmsub_h, LARCH_V8HI_FTYPE_V8HI_V8HI_V8HI),
+  LSX_BUILTIN (vmsub_w, LARCH_V4SI_FTYPE_V4SI_V4SI_V4SI),
+  LSX_BUILTIN (vmsub_d, LARCH_V2DI_FTYPE_V2DI_V2DI_V2DI),
+  LSX_BUILTIN (vdiv_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vdiv_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vdiv_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vdiv_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vdiv_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vdiv_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vdiv_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vdiv_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vhaddw_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vhaddw_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vhaddw_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vhaddw_hu_bu, LARCH_UV8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vhaddw_wu_hu, LARCH_UV4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vhaddw_du_wu, LARCH_UV2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vhsubw_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vhsubw_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vhsubw_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vhsubw_hu_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vhsubw_wu_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vhsubw_du_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmod_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmod_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmod_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmod_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmod_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmod_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmod_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmod_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vreplve_b, LARCH_V16QI_FTYPE_V16QI_SI),
+  LSX_BUILTIN (vreplve_h, LARCH_V8HI_FTYPE_V8HI_SI),
+  LSX_BUILTIN (vreplve_w, LARCH_V4SI_FTYPE_V4SI_SI),
+  LSX_BUILTIN (vreplve_d, LARCH_V2DI_FTYPE_V2DI_SI),
+  LSX_BUILTIN (vreplvei_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vreplvei_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vreplvei_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vreplvei_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vpickev_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vpickev_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vpickev_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vpickev_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vpickod_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vpickod_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vpickod_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vpickod_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vilvh_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vilvh_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vilvh_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vilvh_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vilvl_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vilvl_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vilvl_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vilvl_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vpackev_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vpackev_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vpackev_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vpackev_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vpackod_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vpackod_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vpackod_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vpackod_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vshuf_h, LARCH_V8HI_FTYPE_V8HI_V8HI_V8HI),
+  LSX_BUILTIN (vshuf_w, LARCH_V4SI_FTYPE_V4SI_V4SI_V4SI),
+  LSX_BUILTIN (vshuf_d, LARCH_V2DI_FTYPE_V2DI_V2DI_V2DI),
+  LSX_BUILTIN (vand_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vandi_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vor_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vori_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vnor_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vnori_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vxor_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vxori_b, LARCH_UV16QI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vbitsel_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI_UV16QI),
+  LSX_BUILTIN (vbitseli_b, LARCH_UV16QI_FTYPE_UV16QI_UV16QI_USI),
+  LSX_BUILTIN (vshuf4i_b, LARCH_V16QI_FTYPE_V16QI_USI),
+  LSX_BUILTIN (vshuf4i_h, LARCH_V8HI_FTYPE_V8HI_USI),
+  LSX_BUILTIN (vshuf4i_w, LARCH_V4SI_FTYPE_V4SI_USI),
+  LSX_BUILTIN (vreplgr2vr_b, LARCH_V16QI_FTYPE_SI),
+  LSX_BUILTIN (vreplgr2vr_h, LARCH_V8HI_FTYPE_SI),
+  LSX_BUILTIN (vreplgr2vr_w, LARCH_V4SI_FTYPE_SI),
+  LSX_BUILTIN (vreplgr2vr_d, LARCH_V2DI_FTYPE_DI),
+  LSX_BUILTIN (vpcnt_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vpcnt_h, LARCH_V8HI_FTYPE_V8HI),
+  LSX_BUILTIN (vpcnt_w, LARCH_V4SI_FTYPE_V4SI),
+  LSX_BUILTIN (vpcnt_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vclo_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vclo_h, LARCH_V8HI_FTYPE_V8HI),
+  LSX_BUILTIN (vclo_w, LARCH_V4SI_FTYPE_V4SI),
+  LSX_BUILTIN (vclo_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vclz_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vclz_h, LARCH_V8HI_FTYPE_V8HI),
+  LSX_BUILTIN (vclz_w, LARCH_V4SI_FTYPE_V4SI),
+  LSX_BUILTIN (vclz_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vpickve2gr_b, LARCH_SI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vpickve2gr_h, LARCH_SI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vpickve2gr_w, LARCH_SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vpickve2gr_d, LARCH_DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vpickve2gr_bu, LARCH_USI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vpickve2gr_hu, LARCH_USI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vpickve2gr_wu, LARCH_USI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vpickve2gr_du, LARCH_UDI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vinsgr2vr_b, LARCH_V16QI_FTYPE_V16QI_SI_UQI),
+  LSX_BUILTIN (vinsgr2vr_h, LARCH_V8HI_FTYPE_V8HI_SI_UQI),
+  LSX_BUILTIN (vinsgr2vr_w, LARCH_V4SI_FTYPE_V4SI_SI_UQI),
+  LSX_BUILTIN (vinsgr2vr_d, LARCH_V2DI_FTYPE_V2DI_DI_UQI),
+  LSX_BUILTIN_TEST_BRANCH (bnz_b, LARCH_SI_FTYPE_UV16QI),
+  LSX_BUILTIN_TEST_BRANCH (bnz_h, LARCH_SI_FTYPE_UV8HI),
+  LSX_BUILTIN_TEST_BRANCH (bnz_w, LARCH_SI_FTYPE_UV4SI),
+  LSX_BUILTIN_TEST_BRANCH (bnz_d, LARCH_SI_FTYPE_UV2DI),
+  LSX_BUILTIN_TEST_BRANCH (bz_b, LARCH_SI_FTYPE_UV16QI),
+  LSX_BUILTIN_TEST_BRANCH (bz_h, LARCH_SI_FTYPE_UV8HI),
+  LSX_BUILTIN_TEST_BRANCH (bz_w, LARCH_SI_FTYPE_UV4SI),
+  LSX_BUILTIN_TEST_BRANCH (bz_d, LARCH_SI_FTYPE_UV2DI),
+  LSX_BUILTIN_TEST_BRANCH (bz_v, LARCH_SI_FTYPE_UV16QI),
+  LSX_BUILTIN_TEST_BRANCH (bnz_v,	LARCH_SI_FTYPE_UV16QI),
+  LSX_BUILTIN (vrepli_b, LARCH_V16QI_FTYPE_HI),
+  LSX_BUILTIN (vrepli_h, LARCH_V8HI_FTYPE_HI),
+  LSX_BUILTIN (vrepli_w, LARCH_V4SI_FTYPE_HI),
+  LSX_BUILTIN (vrepli_d, LARCH_V2DI_FTYPE_HI),
+  LSX_BUILTIN (vfcmp_caf_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_caf_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cor_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cor_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cun_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cun_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cune_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cune_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cueq_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cueq_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_ceq_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_ceq_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cne_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cne_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_clt_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_clt_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cult_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cult_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cle_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cle_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_cule_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_cule_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_saf_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_saf_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sor_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sor_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sun_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sun_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sune_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sune_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sueq_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sueq_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_seq_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_seq_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sne_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sne_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_slt_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_slt_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sult_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sult_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sle_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sle_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcmp_sule_s, LARCH_V4SI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcmp_sule_d, LARCH_V2DI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfadd_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfadd_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfsub_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfsub_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfmul_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfmul_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfdiv_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfdiv_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfcvt_h_s, LARCH_V8HI_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfcvt_s_d, LARCH_V4SF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfmin_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfmin_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfmina_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfmina_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfmax_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfmax_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfmaxa_s, LARCH_V4SF_FTYPE_V4SF_V4SF),
+  LSX_BUILTIN (vfmaxa_d, LARCH_V2DF_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vfclass_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vfclass_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vfsqrt_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfsqrt_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrecip_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrecip_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrint_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrint_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrsqrt_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrsqrt_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vflogb_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vflogb_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfcvth_s_h, LARCH_V4SF_FTYPE_V8HI),
+  LSX_BUILTIN (vfcvth_d_s, LARCH_V2DF_FTYPE_V4SF),
+  LSX_BUILTIN (vfcvtl_s_h, LARCH_V4SF_FTYPE_V8HI),
+  LSX_BUILTIN (vfcvtl_d_s, LARCH_V2DF_FTYPE_V4SF),
+  LSX_BUILTIN (vftint_w_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftint_l_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftint_wu_s, LARCH_UV4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftint_lu_d, LARCH_UV2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftintrz_w_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrz_l_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftintrz_wu_s, LARCH_UV4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrz_lu_d, LARCH_UV2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vffint_s_w, LARCH_V4SF_FTYPE_V4SI),
+  LSX_BUILTIN (vffint_d_l, LARCH_V2DF_FTYPE_V2DI),
+  LSX_BUILTIN (vffint_s_wu, LARCH_V4SF_FTYPE_UV4SI),
+  LSX_BUILTIN (vffint_d_lu, LARCH_V2DF_FTYPE_UV2DI),
+
+  LSX_BUILTIN (vandn_v, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vneg_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vneg_h, LARCH_V8HI_FTYPE_V8HI),
+  LSX_BUILTIN (vneg_w, LARCH_V4SI_FTYPE_V4SI),
+  LSX_BUILTIN (vneg_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vmuh_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmuh_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmuh_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmuh_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmuh_bu, LARCH_UV16QI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmuh_hu, LARCH_UV8HI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmuh_wu, LARCH_UV4SI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmuh_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsllwil_h_b, LARCH_V8HI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vsllwil_w_h, LARCH_V4SI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vsllwil_d_w, LARCH_V2DI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vsllwil_hu_bu, LARCH_UV8HI_FTYPE_UV16QI_UQI),
+  LSX_BUILTIN (vsllwil_wu_hu, LARCH_UV4SI_FTYPE_UV8HI_UQI),
+  LSX_BUILTIN (vsllwil_du_wu, LARCH_UV2DI_FTYPE_UV4SI_UQI),
+  LSX_BUILTIN (vsran_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsran_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsran_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssran_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vssran_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vssran_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssran_bu_h, LARCH_UV16QI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vssran_hu_w, LARCH_UV8HI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vssran_wu_d, LARCH_UV4SI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsrarn_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrarn_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrarn_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssrarn_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vssrarn_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vssrarn_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssrarn_bu_h, LARCH_UV16QI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vssrarn_hu_w, LARCH_UV8HI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vssrarn_wu_d, LARCH_UV4SI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsrln_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrln_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrln_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssrln_bu_h, LARCH_UV16QI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vssrln_hu_w, LARCH_UV8HI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vssrln_wu_d, LARCH_UV4SI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsrlrn_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsrlrn_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsrlrn_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssrlrn_bu_h, LARCH_UV16QI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vssrlrn_hu_w, LARCH_UV8HI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vssrlrn_wu_d, LARCH_UV4SI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vfrstpi_b, LARCH_V16QI_FTYPE_V16QI_V16QI_UQI),
+  LSX_BUILTIN (vfrstpi_h, LARCH_V8HI_FTYPE_V8HI_V8HI_UQI),
+  LSX_BUILTIN (vfrstp_b, LARCH_V16QI_FTYPE_V16QI_V16QI_V16QI),
+  LSX_BUILTIN (vfrstp_h, LARCH_V8HI_FTYPE_V8HI_V8HI_V8HI),
+  LSX_BUILTIN (vshuf4i_d, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vbsrl_v, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vbsll_v, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vextrins_b, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vextrins_h, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vextrins_w, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vextrins_d, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vmskltz_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vmskltz_h, LARCH_V8HI_FTYPE_V8HI),
+  LSX_BUILTIN (vmskltz_w, LARCH_V4SI_FTYPE_V4SI),
+  LSX_BUILTIN (vmskltz_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vsigncov_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsigncov_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsigncov_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsigncov_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vfmadd_s, LARCH_V4SF_FTYPE_V4SF_V4SF_V4SF),
+  LSX_BUILTIN (vfmadd_d, LARCH_V2DF_FTYPE_V2DF_V2DF_V2DF),
+  LSX_BUILTIN (vfmsub_s, LARCH_V4SF_FTYPE_V4SF_V4SF_V4SF),
+  LSX_BUILTIN (vfmsub_d, LARCH_V2DF_FTYPE_V2DF_V2DF_V2DF),
+  LSX_BUILTIN (vfnmadd_s, LARCH_V4SF_FTYPE_V4SF_V4SF_V4SF),
+  LSX_BUILTIN (vfnmadd_d, LARCH_V2DF_FTYPE_V2DF_V2DF_V2DF),
+  LSX_BUILTIN (vfnmsub_s, LARCH_V4SF_FTYPE_V4SF_V4SF_V4SF),
+  LSX_BUILTIN (vfnmsub_d, LARCH_V2DF_FTYPE_V2DF_V2DF_V2DF),
+  LSX_BUILTIN (vftintrne_w_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrne_l_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftintrp_w_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrp_l_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftintrm_w_s, LARCH_V4SI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrm_l_d, LARCH_V2DI_FTYPE_V2DF),
+  LSX_BUILTIN (vftint_w_d, LARCH_V4SI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vffint_s_l, LARCH_V4SF_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vftintrz_w_d, LARCH_V4SI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vftintrp_w_d, LARCH_V4SI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vftintrm_w_d, LARCH_V4SI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vftintrne_w_d, LARCH_V4SI_FTYPE_V2DF_V2DF),
+  LSX_BUILTIN (vftintl_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftinth_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vffinth_d_w, LARCH_V2DF_FTYPE_V4SI),
+  LSX_BUILTIN (vffintl_d_w, LARCH_V2DF_FTYPE_V4SI),
+  LSX_BUILTIN (vftintrzl_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrzh_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrpl_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrph_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrml_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrmh_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrnel_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vftintrneh_l_s, LARCH_V2DI_FTYPE_V4SF),
+  LSX_BUILTIN (vfrintrne_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrintrne_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrintrz_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrintrz_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrintrp_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrintrp_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_BUILTIN (vfrintrm_s, LARCH_V4SF_FTYPE_V4SF),
+  LSX_BUILTIN (vfrintrm_d, LARCH_V2DF_FTYPE_V2DF),
+  LSX_NO_TARGET_BUILTIN (vstelm_b, LARCH_VOID_FTYPE_V16QI_CVPOINTER_SI_UQI),
+  LSX_NO_TARGET_BUILTIN (vstelm_h, LARCH_VOID_FTYPE_V8HI_CVPOINTER_SI_UQI),
+  LSX_NO_TARGET_BUILTIN (vstelm_w, LARCH_VOID_FTYPE_V4SI_CVPOINTER_SI_UQI),
+  LSX_NO_TARGET_BUILTIN (vstelm_d, LARCH_VOID_FTYPE_V2DI_CVPOINTER_SI_UQI),
+  LSX_BUILTIN (vaddwev_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vaddwev_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vaddwev_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vaddwod_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vaddwod_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vaddwod_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vaddwev_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vaddwev_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vaddwev_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vaddwod_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vaddwod_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vaddwod_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vaddwev_d_wu_w, LARCH_V2DI_FTYPE_UV4SI_V4SI),
+  LSX_BUILTIN (vaddwev_w_hu_h, LARCH_V4SI_FTYPE_UV8HI_V8HI),
+  LSX_BUILTIN (vaddwev_h_bu_b, LARCH_V8HI_FTYPE_UV16QI_V16QI),
+  LSX_BUILTIN (vaddwod_d_wu_w, LARCH_V2DI_FTYPE_UV4SI_V4SI),
+  LSX_BUILTIN (vaddwod_w_hu_h, LARCH_V4SI_FTYPE_UV8HI_V8HI),
+  LSX_BUILTIN (vaddwod_h_bu_b, LARCH_V8HI_FTYPE_UV16QI_V16QI),
+  LSX_BUILTIN (vsubwev_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsubwev_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsubwev_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsubwod_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vsubwod_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vsubwod_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vsubwev_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vsubwev_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vsubwev_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vsubwod_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vsubwod_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vsubwod_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vaddwev_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vaddwod_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vaddwev_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vaddwod_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsubwev_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsubwod_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsubwev_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vsubwod_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vaddwev_q_du_d, LARCH_V2DI_FTYPE_UV2DI_V2DI),
+  LSX_BUILTIN (vaddwod_q_du_d, LARCH_V2DI_FTYPE_UV2DI_V2DI),
+
+  LSX_BUILTIN (vmulwev_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmulwev_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmulwev_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmulwod_d_w, LARCH_V2DI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vmulwod_w_h, LARCH_V4SI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vmulwod_h_b, LARCH_V8HI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vmulwev_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmulwev_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmulwev_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmulwod_d_wu, LARCH_V2DI_FTYPE_UV4SI_UV4SI),
+  LSX_BUILTIN (vmulwod_w_hu, LARCH_V4SI_FTYPE_UV8HI_UV8HI),
+  LSX_BUILTIN (vmulwod_h_bu, LARCH_V8HI_FTYPE_UV16QI_UV16QI),
+  LSX_BUILTIN (vmulwev_d_wu_w, LARCH_V2DI_FTYPE_UV4SI_V4SI),
+  LSX_BUILTIN (vmulwev_w_hu_h, LARCH_V4SI_FTYPE_UV8HI_V8HI),
+  LSX_BUILTIN (vmulwev_h_bu_b, LARCH_V8HI_FTYPE_UV16QI_V16QI),
+  LSX_BUILTIN (vmulwod_d_wu_w, LARCH_V2DI_FTYPE_UV4SI_V4SI),
+  LSX_BUILTIN (vmulwod_w_hu_h, LARCH_V4SI_FTYPE_UV8HI_V8HI),
+  LSX_BUILTIN (vmulwod_h_bu_b, LARCH_V8HI_FTYPE_UV16QI_V16QI),
+  LSX_BUILTIN (vmulwev_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmulwod_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vmulwev_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmulwod_q_du, LARCH_V2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmulwev_q_du_d, LARCH_V2DI_FTYPE_UV2DI_V2DI),
+  LSX_BUILTIN (vmulwod_q_du_d, LARCH_V2DI_FTYPE_UV2DI_V2DI),
+  LSX_BUILTIN (vhaddw_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vhaddw_qu_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vhsubw_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vhsubw_qu_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI),
+  LSX_BUILTIN (vmaddwev_d_w, LARCH_V2DI_FTYPE_V2DI_V4SI_V4SI),
+  LSX_BUILTIN (vmaddwev_w_h, LARCH_V4SI_FTYPE_V4SI_V8HI_V8HI),
+  LSX_BUILTIN (vmaddwev_h_b, LARCH_V8HI_FTYPE_V8HI_V16QI_V16QI),
+  LSX_BUILTIN (vmaddwev_d_wu, LARCH_UV2DI_FTYPE_UV2DI_UV4SI_UV4SI),
+  LSX_BUILTIN (vmaddwev_w_hu, LARCH_UV4SI_FTYPE_UV4SI_UV8HI_UV8HI),
+  LSX_BUILTIN (vmaddwev_h_bu, LARCH_UV8HI_FTYPE_UV8HI_UV16QI_UV16QI),
+  LSX_BUILTIN (vmaddwod_d_w, LARCH_V2DI_FTYPE_V2DI_V4SI_V4SI),
+  LSX_BUILTIN (vmaddwod_w_h, LARCH_V4SI_FTYPE_V4SI_V8HI_V8HI),
+  LSX_BUILTIN (vmaddwod_h_b, LARCH_V8HI_FTYPE_V8HI_V16QI_V16QI),
+  LSX_BUILTIN (vmaddwod_d_wu, LARCH_UV2DI_FTYPE_UV2DI_UV4SI_UV4SI),
+  LSX_BUILTIN (vmaddwod_w_hu, LARCH_UV4SI_FTYPE_UV4SI_UV8HI_UV8HI),
+  LSX_BUILTIN (vmaddwod_h_bu, LARCH_UV8HI_FTYPE_UV8HI_UV16QI_UV16QI),
+  LSX_BUILTIN (vmaddwev_d_wu_w, LARCH_V2DI_FTYPE_V2DI_UV4SI_V4SI),
+  LSX_BUILTIN (vmaddwev_w_hu_h, LARCH_V4SI_FTYPE_V4SI_UV8HI_V8HI),
+  LSX_BUILTIN (vmaddwev_h_bu_b, LARCH_V8HI_FTYPE_V8HI_UV16QI_V16QI),
+  LSX_BUILTIN (vmaddwod_d_wu_w, LARCH_V2DI_FTYPE_V2DI_UV4SI_V4SI),
+  LSX_BUILTIN (vmaddwod_w_hu_h, LARCH_V4SI_FTYPE_V4SI_UV8HI_V8HI),
+  LSX_BUILTIN (vmaddwod_h_bu_b, LARCH_V8HI_FTYPE_V8HI_UV16QI_V16QI),
+  LSX_BUILTIN (vmaddwev_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI_V2DI),
+  LSX_BUILTIN (vmaddwod_q_d, LARCH_V2DI_FTYPE_V2DI_V2DI_V2DI),
+  LSX_BUILTIN (vmaddwev_q_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI_UV2DI),
+  LSX_BUILTIN (vmaddwod_q_du, LARCH_UV2DI_FTYPE_UV2DI_UV2DI_UV2DI),
+  LSX_BUILTIN (vmaddwev_q_du_d, LARCH_V2DI_FTYPE_V2DI_UV2DI_V2DI),
+  LSX_BUILTIN (vmaddwod_q_du_d, LARCH_V2DI_FTYPE_V2DI_UV2DI_V2DI),
+  LSX_BUILTIN (vrotr_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vrotr_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vrotr_w, LARCH_V4SI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vrotr_d, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vadd_q, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vsub_q, LARCH_V2DI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vldrepl_b, LARCH_V16QI_FTYPE_CVPOINTER_SI),
+  LSX_BUILTIN (vldrepl_h, LARCH_V8HI_FTYPE_CVPOINTER_SI),
+  LSX_BUILTIN (vldrepl_w, LARCH_V4SI_FTYPE_CVPOINTER_SI),
+  LSX_BUILTIN (vldrepl_d, LARCH_V2DI_FTYPE_CVPOINTER_SI),
+
+  LSX_BUILTIN (vmskgez_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vmsknz_b, LARCH_V16QI_FTYPE_V16QI),
+  LSX_BUILTIN (vexth_h_b, LARCH_V8HI_FTYPE_V16QI),
+  LSX_BUILTIN (vexth_w_h, LARCH_V4SI_FTYPE_V8HI),
+  LSX_BUILTIN (vexth_d_w, LARCH_V2DI_FTYPE_V4SI),
+  LSX_BUILTIN (vexth_q_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vexth_hu_bu, LARCH_UV8HI_FTYPE_UV16QI),
+  LSX_BUILTIN (vexth_wu_hu, LARCH_UV4SI_FTYPE_UV8HI),
+  LSX_BUILTIN (vexth_du_wu, LARCH_UV2DI_FTYPE_UV4SI),
+  LSX_BUILTIN (vexth_qu_du, LARCH_UV2DI_FTYPE_UV2DI),
+  LSX_BUILTIN (vrotri_b, LARCH_V16QI_FTYPE_V16QI_UQI),
+  LSX_BUILTIN (vrotri_h, LARCH_V8HI_FTYPE_V8HI_UQI),
+  LSX_BUILTIN (vrotri_w, LARCH_V4SI_FTYPE_V4SI_UQI),
+  LSX_BUILTIN (vrotri_d, LARCH_V2DI_FTYPE_V2DI_UQI),
+  LSX_BUILTIN (vextl_q_d, LARCH_V2DI_FTYPE_V2DI),
+  LSX_BUILTIN (vsrlni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vsrlni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vsrlni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vsrlni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vsrlrni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vsrlrni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vsrlrni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vsrlrni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrlni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vssrlni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vssrlni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vssrlni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrlni_bu_h, LARCH_UV16QI_FTYPE_UV16QI_V16QI_USI),
+  LSX_BUILTIN (vssrlni_hu_w, LARCH_UV8HI_FTYPE_UV8HI_V8HI_USI),
+  LSX_BUILTIN (vssrlni_wu_d, LARCH_UV4SI_FTYPE_UV4SI_V4SI_USI),
+  LSX_BUILTIN (vssrlni_du_q, LARCH_UV2DI_FTYPE_UV2DI_V2DI_USI),
+  LSX_BUILTIN (vssrlrni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vssrlrni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vssrlrni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vssrlrni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrlrni_bu_h, LARCH_UV16QI_FTYPE_UV16QI_V16QI_USI),
+  LSX_BUILTIN (vssrlrni_hu_w, LARCH_UV8HI_FTYPE_UV8HI_V8HI_USI),
+  LSX_BUILTIN (vssrlrni_wu_d, LARCH_UV4SI_FTYPE_UV4SI_V4SI_USI),
+  LSX_BUILTIN (vssrlrni_du_q, LARCH_UV2DI_FTYPE_UV2DI_V2DI_USI),
+  LSX_BUILTIN (vsrani_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vsrani_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vsrani_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vsrani_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vsrarni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vsrarni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vsrarni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vsrarni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrani_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vssrani_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vssrani_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vssrani_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrani_bu_h, LARCH_UV16QI_FTYPE_UV16QI_V16QI_USI),
+  LSX_BUILTIN (vssrani_hu_w, LARCH_UV8HI_FTYPE_UV8HI_V8HI_USI),
+  LSX_BUILTIN (vssrani_wu_d, LARCH_UV4SI_FTYPE_UV4SI_V4SI_USI),
+  LSX_BUILTIN (vssrani_du_q, LARCH_UV2DI_FTYPE_UV2DI_V2DI_USI),
+  LSX_BUILTIN (vssrarni_b_h, LARCH_V16QI_FTYPE_V16QI_V16QI_USI),
+  LSX_BUILTIN (vssrarni_h_w, LARCH_V8HI_FTYPE_V8HI_V8HI_USI),
+  LSX_BUILTIN (vssrarni_w_d, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vssrarni_d_q, LARCH_V2DI_FTYPE_V2DI_V2DI_USI),
+  LSX_BUILTIN (vssrarni_bu_h, LARCH_UV16QI_FTYPE_UV16QI_V16QI_USI),
+  LSX_BUILTIN (vssrarni_hu_w, LARCH_UV8HI_FTYPE_UV8HI_V8HI_USI),
+  LSX_BUILTIN (vssrarni_wu_d, LARCH_UV4SI_FTYPE_UV4SI_V4SI_USI),
+  LSX_BUILTIN (vssrarni_du_q, LARCH_UV2DI_FTYPE_UV2DI_V2DI_USI),
+  LSX_BUILTIN (vpermi_w, LARCH_V4SI_FTYPE_V4SI_V4SI_USI),
+  LSX_BUILTIN (vld, LARCH_V16QI_FTYPE_CVPOINTER_SI),
+  LSX_NO_TARGET_BUILTIN (vst, LARCH_VOID_FTYPE_V16QI_CVPOINTER_SI),
+  LSX_BUILTIN (vssrlrn_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vssrlrn_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vssrlrn_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vssrln_b_h, LARCH_V16QI_FTYPE_V8HI_V8HI),
+  LSX_BUILTIN (vssrln_h_w, LARCH_V8HI_FTYPE_V4SI_V4SI),
+  LSX_BUILTIN (vssrln_w_d, LARCH_V4SI_FTYPE_V2DI_V2DI),
+  LSX_BUILTIN (vorn_v, LARCH_V16QI_FTYPE_V16QI_V16QI),
+  LSX_BUILTIN (vldi, LARCH_V2DI_FTYPE_HI),
+  LSX_BUILTIN (vshuf_b, LARCH_V16QI_FTYPE_V16QI_V16QI_V16QI),
+  LSX_BUILTIN (vldx, LARCH_V16QI_FTYPE_CVPOINTER_DI),
+  LSX_NO_TARGET_BUILTIN (vstx, LARCH_VOID_FTYPE_V16QI_CVPOINTER_DI),
+  LSX_BUILTIN (vextl_qu_du, LARCH_UV2DI_FTYPE_UV2DI)
 };
 
 /* Index I is the function declaration for loongarch_builtins[I], or null if
@@ -193,11 +1219,46 @@ static GTY (()) tree loongarch_builtin_decls[ARRAY_SIZE (loongarch_builtins)];
    using the instruction code or return null if not defined for the target.  */
 static GTY (()) int loongarch_get_builtin_decl_index[NUM_INSN_CODES];
 
+
+/* MODE is a vector mode whose elements have type TYPE.  Return the type
+   of the vector itself.  */
+
+static tree
+loongarch_builtin_vector_type (tree type, machine_mode mode)
+{
+  static tree types[2 * (int) MAX_MACHINE_MODE];
+  int mode_index;
+
+  mode_index = (int) mode;
+
+  if (TREE_CODE (type) == INTEGER_TYPE && TYPE_UNSIGNED (type))
+    mode_index += MAX_MACHINE_MODE;
+
+  if (types[mode_index] == NULL_TREE)
+    types[mode_index] = build_vector_type_for_mode (type, mode);
+  return types[mode_index];
+}
+
+/* Return a type for 'const volatile void *'.  */
+
+static tree
+loongarch_build_cvpointer_type (void)
+{
+  static tree cache;
+
+  if (cache == NULL_TREE)
+    cache = build_pointer_type (build_qualified_type (void_type_node,
+						      TYPE_QUAL_CONST
+						      | TYPE_QUAL_VOLATILE));
+  return cache;
+}
+
 /* Source-level argument types.  */
 #define LARCH_ATYPE_VOID void_type_node
 #define LARCH_ATYPE_INT integer_type_node
 #define LARCH_ATYPE_POINTER ptr_type_node
-
+#define LARCH_ATYPE_CVPOINTER loongarch_build_cvpointer_type ()
+#define LARCH_ATYPE_BOOLEAN boolean_type_node
 /* Standard mode-based argument types.  */
 #define LARCH_ATYPE_QI intQI_type_node
 #define LARCH_ATYPE_UQI unsigned_intQI_type_node
@@ -210,6 +1271,72 @@ static GTY (()) int loongarch_get_builtin_decl_index[NUM_INSN_CODES];
 #define LARCH_ATYPE_SF float_type_node
 #define LARCH_ATYPE_DF double_type_node
 
+/* Vector argument types.  */
+#define LARCH_ATYPE_V2SF						\
+  loongarch_builtin_vector_type (float_type_node, V2SFmode)
+#define LARCH_ATYPE_V2HI						\
+  loongarch_builtin_vector_type (intHI_type_node, V2HImode)
+#define LARCH_ATYPE_V2SI						\
+  loongarch_builtin_vector_type (intSI_type_node, V2SImode)
+#define LARCH_ATYPE_V4QI						\
+  loongarch_builtin_vector_type (intQI_type_node, V4QImode)
+#define LARCH_ATYPE_V4HI						\
+  loongarch_builtin_vector_type (intHI_type_node, V4HImode)
+#define LARCH_ATYPE_V8QI						\
+  loongarch_builtin_vector_type (intQI_type_node, V8QImode)
+
+#define LARCH_ATYPE_V2DI						\
+  loongarch_builtin_vector_type (long_long_integer_type_node, V2DImode)
+#define LARCH_ATYPE_V4SI						\
+  loongarch_builtin_vector_type (intSI_type_node, V4SImode)
+#define LARCH_ATYPE_V8HI						\
+  loongarch_builtin_vector_type (intHI_type_node, V8HImode)
+#define LARCH_ATYPE_V16QI						\
+  loongarch_builtin_vector_type (intQI_type_node, V16QImode)
+#define LARCH_ATYPE_V2DF						\
+  loongarch_builtin_vector_type (double_type_node, V2DFmode)
+#define LARCH_ATYPE_V4SF						\
+  loongarch_builtin_vector_type (float_type_node, V4SFmode)
+
+/* LoongArch ASX.  */
+#define LARCH_ATYPE_V4DI						\
+  loongarch_builtin_vector_type (long_long_integer_type_node, V4DImode)
+#define LARCH_ATYPE_V8SI						\
+  loongarch_builtin_vector_type (intSI_type_node, V8SImode)
+#define LARCH_ATYPE_V16HI						\
+  loongarch_builtin_vector_type (intHI_type_node, V16HImode)
+#define LARCH_ATYPE_V32QI						\
+  loongarch_builtin_vector_type (intQI_type_node, V32QImode)
+#define LARCH_ATYPE_V4DF						\
+  loongarch_builtin_vector_type (double_type_node, V4DFmode)
+#define LARCH_ATYPE_V8SF						\
+  loongarch_builtin_vector_type (float_type_node, V8SFmode)
+
+#define LARCH_ATYPE_UV2DI					\
+  loongarch_builtin_vector_type (long_long_unsigned_type_node, V2DImode)
+#define LARCH_ATYPE_UV4SI					\
+  loongarch_builtin_vector_type (unsigned_intSI_type_node, V4SImode)
+#define LARCH_ATYPE_UV8HI					\
+  loongarch_builtin_vector_type (unsigned_intHI_type_node, V8HImode)
+#define LARCH_ATYPE_UV16QI					\
+  loongarch_builtin_vector_type (unsigned_intQI_type_node, V16QImode)
+
+#define LARCH_ATYPE_UV4DI					\
+  loongarch_builtin_vector_type (long_long_unsigned_type_node, V4DImode)
+#define LARCH_ATYPE_UV8SI					\
+  loongarch_builtin_vector_type (unsigned_intSI_type_node, V8SImode)
+#define LARCH_ATYPE_UV16HI					\
+  loongarch_builtin_vector_type (unsigned_intHI_type_node, V16HImode)
+#define LARCH_ATYPE_UV32QI					\
+  loongarch_builtin_vector_type (unsigned_intQI_type_node, V32QImode)
+
+#define LARCH_ATYPE_UV2SI					\
+  loongarch_builtin_vector_type (unsigned_intSI_type_node, V2SImode)
+#define LARCH_ATYPE_UV4HI					\
+  loongarch_builtin_vector_type (unsigned_intHI_type_node, V4HImode)
+#define LARCH_ATYPE_UV8QI					\
+  loongarch_builtin_vector_type (unsigned_intQI_type_node, V8QImode)
+
 /* LARCH_FTYPE_ATYPESN takes N LARCH_FTYPES-like type codes and lists
    their associated LARCH_ATYPEs.  */
 #define LARCH_FTYPE_ATYPES1(A, B) LARCH_ATYPE_##A, LARCH_ATYPE_##B
@@ -283,6 +1410,92 @@ loongarch_builtin_decl (unsigned int code, bool initialize_p ATTRIBUTE_UNUSED)
   return loongarch_builtin_decls[code];
 }
 
+/* Implement TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.  */
+
+tree
+loongarch_builtin_vectorized_function (unsigned int fn, tree type_out,
+				       tree type_in)
+{
+  machine_mode in_mode, out_mode;
+  int in_n, out_n;
+
+  if (TREE_CODE (type_out) != VECTOR_TYPE
+      || TREE_CODE (type_in) != VECTOR_TYPE
+      || !ISA_HAS_LSX)
+    return NULL_TREE;
+
+  out_mode = TYPE_MODE (TREE_TYPE (type_out));
+  out_n = TYPE_VECTOR_SUBPARTS (type_out);
+  in_mode = TYPE_MODE (TREE_TYPE (type_in));
+  in_n = TYPE_VECTOR_SUBPARTS (type_in);
+
+  /* INSN is the name of the associated instruction pattern, without
+     the leading CODE_FOR_.  */
+#define LARCH_GET_BUILTIN(INSN) \
+  loongarch_builtin_decls[loongarch_get_builtin_decl_index[CODE_FOR_##INSN]]
+
+  switch (fn)
+    {
+    CASE_CFN_CEIL:
+      if (out_mode == DFmode && in_mode == DFmode)
+    {
+      if (out_n == 2 && in_n == 2)
+	return LARCH_GET_BUILTIN (lsx_vfrintrp_d);
+    }
+      if (out_mode == SFmode && in_mode == SFmode)
+    {
+      if (out_n == 4 && in_n == 4)
+	return LARCH_GET_BUILTIN (lsx_vfrintrp_s);
+    }
+      break;
+
+    CASE_CFN_TRUNC:
+      if (out_mode == DFmode && in_mode == DFmode)
+    {
+      if (out_n == 2 && in_n == 2)
+	return LARCH_GET_BUILTIN (lsx_vfrintrz_d);
+    }
+      if (out_mode == SFmode && in_mode == SFmode)
+    {
+      if (out_n == 4 && in_n == 4)
+	return LARCH_GET_BUILTIN (lsx_vfrintrz_s);
+    }
+      break;
+
+    CASE_CFN_RINT:
+    CASE_CFN_ROUND:
+      if (out_mode == DFmode && in_mode == DFmode)
+    {
+      if (out_n == 2 && in_n == 2)
+	return LARCH_GET_BUILTIN (lsx_vfrint_d);
+    }
+      if (out_mode == SFmode && in_mode == SFmode)
+    {
+      if (out_n == 4 && in_n == 4)
+	return LARCH_GET_BUILTIN (lsx_vfrint_s);
+    }
+      break;
+
+    CASE_CFN_FLOOR:
+      if (out_mode == DFmode && in_mode == DFmode)
+    {
+      if (out_n == 2 && in_n == 2)
+	return LARCH_GET_BUILTIN (lsx_vfrintrm_d);
+    }
+      if (out_mode == SFmode && in_mode == SFmode)
+    {
+      if (out_n == 4 && in_n == 4)
+	return LARCH_GET_BUILTIN (lsx_vfrintrm_s);
+    }
+      break;
+
+    default:
+      break;
+    }
+
+  return NULL_TREE;
+}
+
 /* Take argument ARGNO from EXP's argument list and convert it into
    an expand operand.  Store the operand in *OP.  */
 
@@ -318,7 +1531,236 @@ static rtx
 loongarch_expand_builtin_insn (enum insn_code icode, unsigned int nops,
 			       struct expand_operand *ops, bool has_target_p)
 {
-  if (!maybe_expand_insn (icode, nops, ops))
+  machine_mode imode;
+  int rangelo = 0, rangehi = 0, error_opno = 0;
+
+  switch (icode)
+    {
+    case CODE_FOR_lsx_vaddi_bu:
+    case CODE_FOR_lsx_vaddi_hu:
+    case CODE_FOR_lsx_vaddi_wu:
+    case CODE_FOR_lsx_vaddi_du:
+    case CODE_FOR_lsx_vslti_bu:
+    case CODE_FOR_lsx_vslti_hu:
+    case CODE_FOR_lsx_vslti_wu:
+    case CODE_FOR_lsx_vslti_du:
+    case CODE_FOR_lsx_vslei_bu:
+    case CODE_FOR_lsx_vslei_hu:
+    case CODE_FOR_lsx_vslei_wu:
+    case CODE_FOR_lsx_vslei_du:
+    case CODE_FOR_lsx_vmaxi_bu:
+    case CODE_FOR_lsx_vmaxi_hu:
+    case CODE_FOR_lsx_vmaxi_wu:
+    case CODE_FOR_lsx_vmaxi_du:
+    case CODE_FOR_lsx_vmini_bu:
+    case CODE_FOR_lsx_vmini_hu:
+    case CODE_FOR_lsx_vmini_wu:
+    case CODE_FOR_lsx_vmini_du:
+    case CODE_FOR_lsx_vsubi_bu:
+    case CODE_FOR_lsx_vsubi_hu:
+    case CODE_FOR_lsx_vsubi_wu:
+    case CODE_FOR_lsx_vsubi_du:
+      gcc_assert (has_target_p && nops == 3);
+      /* We only generate a vector of constants iff the second argument
+	 is an immediate.  We also validate the range of the immediate.  */
+      if (CONST_INT_P (ops[2].value))
+	{
+	  rangelo = 0;
+	  rangehi = 31;
+	  if (IN_RANGE (INTVAL (ops[2].value), rangelo, rangehi))
+	    {
+	      ops[2].mode = ops[0].mode;
+	      ops[2].value = loongarch_gen_const_int_vector (ops[2].mode,
+							     INTVAL (ops[2].value));
+	    }
+	  else
+	    error_opno = 2;
+	}
+      break;
+
+    case CODE_FOR_lsx_vseqi_b:
+    case CODE_FOR_lsx_vseqi_h:
+    case CODE_FOR_lsx_vseqi_w:
+    case CODE_FOR_lsx_vseqi_d:
+    case CODE_FOR_lsx_vslti_b:
+    case CODE_FOR_lsx_vslti_h:
+    case CODE_FOR_lsx_vslti_w:
+    case CODE_FOR_lsx_vslti_d:
+    case CODE_FOR_lsx_vslei_b:
+    case CODE_FOR_lsx_vslei_h:
+    case CODE_FOR_lsx_vslei_w:
+    case CODE_FOR_lsx_vslei_d:
+    case CODE_FOR_lsx_vmaxi_b:
+    case CODE_FOR_lsx_vmaxi_h:
+    case CODE_FOR_lsx_vmaxi_w:
+    case CODE_FOR_lsx_vmaxi_d:
+    case CODE_FOR_lsx_vmini_b:
+    case CODE_FOR_lsx_vmini_h:
+    case CODE_FOR_lsx_vmini_w:
+    case CODE_FOR_lsx_vmini_d:
+      gcc_assert (has_target_p && nops == 3);
+      /* We only generate a vector of constants iff the second argument
+	 is an immediate.  We also validate the range of the immediate.  */
+      if (CONST_INT_P (ops[2].value))
+	{
+	  rangelo = -16;
+	  rangehi = 15;
+	  if (IN_RANGE (INTVAL (ops[2].value), rangelo, rangehi))
+	    {
+	      ops[2].mode = ops[0].mode;
+	      ops[2].value = loongarch_gen_const_int_vector (ops[2].mode,
+							     INTVAL (ops[2].value));
+	    }
+	  else
+	    error_opno = 2;
+	}
+      break;
+
+    case CODE_FOR_lsx_vandi_b:
+    case CODE_FOR_lsx_vori_b:
+    case CODE_FOR_lsx_vnori_b:
+    case CODE_FOR_lsx_vxori_b:
+      gcc_assert (has_target_p && nops == 3);
+      if (!CONST_INT_P (ops[2].value))
+	break;
+      ops[2].mode = ops[0].mode;
+      ops[2].value = loongarch_gen_const_int_vector (ops[2].mode,
+						     INTVAL (ops[2].value));
+      break;
+
+    case CODE_FOR_lsx_vbitseli_b:
+      gcc_assert (has_target_p && nops == 4);
+      if (!CONST_INT_P (ops[3].value))
+	break;
+      ops[3].mode = ops[0].mode;
+      ops[3].value = loongarch_gen_const_int_vector (ops[3].mode,
+						     INTVAL (ops[3].value));
+      break;
+
+    case CODE_FOR_lsx_vreplgr2vr_b:
+    case CODE_FOR_lsx_vreplgr2vr_h:
+    case CODE_FOR_lsx_vreplgr2vr_w:
+    case CODE_FOR_lsx_vreplgr2vr_d:
+      /* Map the built-ins to vector fill operations.  We need fix up the mode
+	 for the element being inserted.  */
+      gcc_assert (has_target_p && nops == 2);
+      imode = GET_MODE_INNER (ops[0].mode);
+      ops[1].value = lowpart_subreg (imode, ops[1].value, ops[1].mode);
+      ops[1].mode = imode;
+      break;
+
+    case CODE_FOR_lsx_vilvh_b:
+    case CODE_FOR_lsx_vilvh_h:
+    case CODE_FOR_lsx_vilvh_w:
+    case CODE_FOR_lsx_vilvh_d:
+    case CODE_FOR_lsx_vilvl_b:
+    case CODE_FOR_lsx_vilvl_h:
+    case CODE_FOR_lsx_vilvl_w:
+    case CODE_FOR_lsx_vilvl_d:
+    case CODE_FOR_lsx_vpackev_b:
+    case CODE_FOR_lsx_vpackev_h:
+    case CODE_FOR_lsx_vpackev_w:
+    case CODE_FOR_lsx_vpackod_b:
+    case CODE_FOR_lsx_vpackod_h:
+    case CODE_FOR_lsx_vpackod_w:
+    case CODE_FOR_lsx_vpickev_b:
+    case CODE_FOR_lsx_vpickev_h:
+    case CODE_FOR_lsx_vpickev_w:
+    case CODE_FOR_lsx_vpickod_b:
+    case CODE_FOR_lsx_vpickod_h:
+    case CODE_FOR_lsx_vpickod_w:
+      /* Swap the operands 1 and 2 for interleave operations.  Built-ins follow
+	 convention of ISA, which have op1 as higher component and op2 as lower
+	 component.  However, the VEC_PERM op in tree and vec_concat in RTL
+	 expects first operand to be lower component, because of which this
+	 swap is needed for builtins.  */
+      gcc_assert (has_target_p && nops == 3);
+      std::swap (ops[1], ops[2]);
+      break;
+
+    case CODE_FOR_lsx_vslli_b:
+    case CODE_FOR_lsx_vslli_h:
+    case CODE_FOR_lsx_vslli_w:
+    case CODE_FOR_lsx_vslli_d:
+    case CODE_FOR_lsx_vsrai_b:
+    case CODE_FOR_lsx_vsrai_h:
+    case CODE_FOR_lsx_vsrai_w:
+    case CODE_FOR_lsx_vsrai_d:
+    case CODE_FOR_lsx_vsrli_b:
+    case CODE_FOR_lsx_vsrli_h:
+    case CODE_FOR_lsx_vsrli_w:
+    case CODE_FOR_lsx_vsrli_d:
+      gcc_assert (has_target_p && nops == 3);
+      if (CONST_INT_P (ops[2].value))
+	{
+	  rangelo = 0;
+	  rangehi = GET_MODE_UNIT_BITSIZE (ops[0].mode) - 1;
+	  if (IN_RANGE (INTVAL (ops[2].value), rangelo, rangehi))
+	    {
+	      ops[2].mode = ops[0].mode;
+	      ops[2].value = loongarch_gen_const_int_vector (ops[2].mode,
+							     INTVAL (ops[2].value));
+	    }
+	  else
+	    error_opno = 2;
+	}
+      break;
+
+    case CODE_FOR_lsx_vinsgr2vr_b:
+    case CODE_FOR_lsx_vinsgr2vr_h:
+    case CODE_FOR_lsx_vinsgr2vr_w:
+    case CODE_FOR_lsx_vinsgr2vr_d:
+      /* Map the built-ins to insert operations.  We need to swap operands,
+	 fix up the mode for the element being inserted, and generate
+	 a bit mask for vec_merge.  */
+      gcc_assert (has_target_p && nops == 4);
+      std::swap (ops[1], ops[2]);
+      imode = GET_MODE_INNER (ops[0].mode);
+      ops[1].value = lowpart_subreg (imode, ops[1].value, ops[1].mode);
+      ops[1].mode = imode;
+      rangelo = 0;
+      rangehi = GET_MODE_NUNITS (ops[0].mode) - 1;
+      if (CONST_INT_P (ops[3].value)
+	  && IN_RANGE (INTVAL (ops[3].value), rangelo, rangehi))
+	ops[3].value = GEN_INT (1 << INTVAL (ops[3].value));
+      else
+	error_opno = 2;
+      break;
+
+      /* Map the built-ins to element insert operations.  We need to swap
+	 operands and generate a bit mask.  */
+      gcc_assert (has_target_p && nops == 4);
+      std::swap (ops[1], ops[2]);
+      std::swap (ops[1], ops[3]);
+      rangelo = 0;
+      rangehi = GET_MODE_NUNITS (ops[0].mode) - 1;
+      if (CONST_INT_P (ops[3].value)
+	  && IN_RANGE (INTVAL (ops[3].value), rangelo, rangehi))
+	ops[3].value = GEN_INT (1 << INTVAL (ops[3].value));
+      else
+	error_opno = 2;
+      break;
+
+    case CODE_FOR_lsx_vshuf4i_b:
+    case CODE_FOR_lsx_vshuf4i_h:
+    case CODE_FOR_lsx_vshuf4i_w:
+    case CODE_FOR_lsx_vshuf4i_w_f:
+      gcc_assert (has_target_p && nops == 3);
+      ops[2].value = loongarch_gen_const_int_vector_shuffle (ops[0].mode,
+							     INTVAL (ops[2].value));
+      break;
+
+    default:
+      break;
+  }
+
+  if (error_opno != 0)
+    {
+      error ("argument %d to the built-in must be a constant"
+	     " in range %d to %d", error_opno, rangelo, rangehi);
+      return has_target_p ? gen_reg_rtx (ops[0].mode) : const0_rtx;
+    }
+  else if (!maybe_expand_insn (icode, nops, ops))
     {
       error ("invalid argument to built-in function");
       return has_target_p ? gen_reg_rtx (ops[0].mode) : const0_rtx;
@@ -352,6 +1794,50 @@ loongarch_expand_builtin_direct (enum insn_code icode, rtx target, tree exp,
   return loongarch_expand_builtin_insn (icode, opno, ops, has_target_p);
 }
 
+/* Expand an LSX built-in for a compare and branch instruction specified by
+   ICODE, set a general-purpose register to 1 if the branch was taken,
+   0 otherwise.  */
+
+static rtx
+loongarch_expand_builtin_lsx_test_branch (enum insn_code icode, tree exp)
+{
+  struct expand_operand ops[3];
+  rtx_insn *cbranch;
+  rtx_code_label *true_label, *done_label;
+  rtx cmp_result;
+
+  true_label = gen_label_rtx ();
+  done_label = gen_label_rtx ();
+
+  create_input_operand (&ops[0], true_label, TYPE_MODE (TREE_TYPE (exp)));
+  loongarch_prepare_builtin_arg (&ops[1], exp, 0);
+  create_fixed_operand (&ops[2], const0_rtx);
+
+  /* Make sure that the operand 1 is a REG.  */
+  if (GET_CODE (ops[1].value) != REG)
+    ops[1].value = force_reg (ops[1].mode, ops[1].value);
+
+  if ((cbranch = maybe_gen_insn (icode, 3, ops)) == NULL_RTX)
+    error ("failed to expand built-in function");
+
+  cmp_result = gen_reg_rtx (SImode);
+
+  /* First assume that CMP_RESULT is false.  */
+  loongarch_emit_move (cmp_result, const0_rtx);
+
+  /* Branch to TRUE_LABEL if CBRANCH is taken and DONE_LABEL otherwise.  */
+  emit_jump_insn (cbranch);
+  emit_jump_insn (gen_jump (done_label));
+  emit_barrier ();
+
+  /* Set CMP_RESULT to true if the branch was taken.  */
+  emit_label (true_label);
+  loongarch_emit_move (cmp_result, const1_rtx);
+
+  emit_label (done_label);
+  return cmp_result;
+}
+
 /* Implement TARGET_EXPAND_BUILTIN.  */
 
 rtx
@@ -372,10 +1858,14 @@ loongarch_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
   switch (d->builtin_type)
     {
     case LARCH_BUILTIN_DIRECT:
+    case LARCH_BUILTIN_LSX:
       return loongarch_expand_builtin_direct (d->icode, target, exp, true);
 
     case LARCH_BUILTIN_DIRECT_NO_TARGET:
       return loongarch_expand_builtin_direct (d->icode, target, exp, false);
+
+    case LARCH_BUILTIN_LSX_TEST_BRANCH:
+      return loongarch_expand_builtin_lsx_test_branch (d->icode, exp);
     }
   gcc_unreachable ();
 }
diff --git a/gcc/config/loongarch/loongarch-ftypes.def b/gcc/config/loongarch/loongarch-ftypes.def
index 06d2e0519f7..1ce9d83ccab 100644
--- a/gcc/config/loongarch/loongarch-ftypes.def
+++ b/gcc/config/loongarch/loongarch-ftypes.def
@@ -1,7 +1,7 @@
 /* Definitions of prototypes for LoongArch built-in functions.
    Copyright (C) 2021-2023 Free Software Foundation, Inc.
    Contributed by Loongson Ltd.
-   Based on MIPS target for GNU ckompiler.
+   Based on MIPS target for GNU compiler.
 
 This file is part of GCC.
 
@@ -32,7 +32,7 @@ along with GCC; see the file COPYING3.  If not see
       INT for integer_type_node
       POINTER for ptr_type_node
 
-   (we don't use PTR because that's a ANSI-compatibillity macro).
+   (we don't use PTR because that's a ANSI-compatibility macro).
 
    Please keep this list lexicographically sorted by the LIST argument.  */
 
@@ -63,3 +63,396 @@ DEF_LARCH_FTYPE (3, (VOID, USI, USI, SI))
 DEF_LARCH_FTYPE (3, (VOID, USI, UDI, SI))
 DEF_LARCH_FTYPE (3, (USI, USI, USI, USI))
 DEF_LARCH_FTYPE (3, (UDI, UDI, UDI, USI))
+
+DEF_LARCH_FTYPE (1, (DF, DF))
+DEF_LARCH_FTYPE (2, (DF, DF, DF))
+DEF_LARCH_FTYPE (1, (DF, V2DF))
+
+DEF_LARCH_FTYPE (1, (DI, DI))
+DEF_LARCH_FTYPE (1, (DI, SI))
+DEF_LARCH_FTYPE (1, (DI, UQI))
+DEF_LARCH_FTYPE (2, (DI, DI, DI))
+DEF_LARCH_FTYPE (2, (DI, DI, SI))
+DEF_LARCH_FTYPE (3, (DI, DI, SI, SI))
+DEF_LARCH_FTYPE (3, (DI, DI, USI, USI))
+DEF_LARCH_FTYPE (3, (DI, DI, DI, QI))
+DEF_LARCH_FTYPE (3, (DI, DI, V2HI, V2HI))
+DEF_LARCH_FTYPE (3, (DI, DI, V4QI, V4QI))
+DEF_LARCH_FTYPE (2, (DI, POINTER, SI))
+DEF_LARCH_FTYPE (2, (DI, SI, SI))
+DEF_LARCH_FTYPE (2, (DI, USI, USI))
+
+DEF_LARCH_FTYPE (2, (DI, V2DI, UQI))
+
+DEF_LARCH_FTYPE (2, (INT, DF, DF))
+DEF_LARCH_FTYPE (2, (INT, SF, SF))
+
+DEF_LARCH_FTYPE (2, (INT, V2SF, V2SF))
+DEF_LARCH_FTYPE (4, (INT, V2SF, V2SF, V2SF, V2SF))
+
+DEF_LARCH_FTYPE (1, (SF, SF))
+DEF_LARCH_FTYPE (2, (SF, SF, SF))
+DEF_LARCH_FTYPE (1, (SF, V2SF))
+DEF_LARCH_FTYPE (1, (SF, V4SF))
+
+DEF_LARCH_FTYPE (2, (SI, POINTER, SI))
+DEF_LARCH_FTYPE (1, (SI, SI))
+DEF_LARCH_FTYPE (1, (SI, UDI))
+DEF_LARCH_FTYPE (2, (QI, QI, QI))
+DEF_LARCH_FTYPE (2, (HI, HI, HI))
+DEF_LARCH_FTYPE (3, (SI, SI, SI, SI))
+DEF_LARCH_FTYPE (3, (SI, SI, SI, QI))
+DEF_LARCH_FTYPE (1, (SI, UQI))
+DEF_LARCH_FTYPE (1, (SI, UV16QI))
+DEF_LARCH_FTYPE (1, (SI, UV2DI))
+DEF_LARCH_FTYPE (1, (SI, UV4SI))
+DEF_LARCH_FTYPE (1, (SI, UV8HI))
+DEF_LARCH_FTYPE (2, (SI, V16QI, UQI))
+DEF_LARCH_FTYPE (1, (SI, V2HI))
+DEF_LARCH_FTYPE (2, (SI, V2HI, V2HI))
+DEF_LARCH_FTYPE (1, (SI, V4QI))
+DEF_LARCH_FTYPE (2, (SI, V4QI, V4QI))
+DEF_LARCH_FTYPE (2, (SI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (SI, V8HI, UQI))
+DEF_LARCH_FTYPE (1, (SI, VOID))
+
+DEF_LARCH_FTYPE (2, (UDI, UDI, UDI))
+DEF_LARCH_FTYPE (2, (UDI, UV2SI, UV2SI))
+DEF_LARCH_FTYPE (2, (UDI, V2DI, UQI))
+
+DEF_LARCH_FTYPE (2, (USI, V16QI, UQI))
+DEF_LARCH_FTYPE (2, (USI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (USI, V8HI, UQI))
+DEF_LARCH_FTYPE (1, (USI, VOID))
+
+DEF_LARCH_FTYPE (2, (UV16QI, UV16QI, UQI))
+DEF_LARCH_FTYPE (2, (UV16QI, UV16QI, USI))
+DEF_LARCH_FTYPE (2, (UV16QI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (3, (UV16QI, UV16QI, UV16QI, UQI))
+DEF_LARCH_FTYPE (3, (UV16QI, UV16QI, UV16QI, USI))
+DEF_LARCH_FTYPE (3, (UV16QI, UV16QI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (UV16QI, UV16QI, V16QI))
+
+DEF_LARCH_FTYPE (2, (UV2DI, UV2DI, UQI))
+DEF_LARCH_FTYPE (2, (UV2DI, UV2DI, UV2DI))
+DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, UV2DI, UQI))
+DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, UV2DI, UV2DI))
+DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (UV2DI, UV2DI, V2DI))
+DEF_LARCH_FTYPE (2, (UV2DI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (1, (UV2DI, V2DF))
+
+DEF_LARCH_FTYPE (2, (UV2SI, UV2SI, UQI))
+DEF_LARCH_FTYPE (2, (UV2SI, UV2SI, UV2SI))
+
+DEF_LARCH_FTYPE (2, (UV4HI, UV4HI, UQI))
+DEF_LARCH_FTYPE (2, (UV4HI, UV4HI, USI))
+DEF_LARCH_FTYPE (2, (UV4HI, UV4HI, UV4HI))
+DEF_LARCH_FTYPE (3, (UV4HI, UV4HI, UV4HI, UQI))
+DEF_LARCH_FTYPE (3, (UV4HI, UV4HI, UV4HI, USI))
+DEF_LARCH_FTYPE (1, (UV4HI, UV8QI))
+DEF_LARCH_FTYPE (2, (UV4HI, UV8QI, UV8QI))
+
+DEF_LARCH_FTYPE (2, (UV4SI, UV4SI, UQI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, UV4SI, UQI))
+DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV4SI, V4SI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (1, (UV4SI, V4SF))
+
+DEF_LARCH_FTYPE (2, (UV8HI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV8HI, UQI))
+DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, UV8HI, UQI))
+DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV8HI, V8HI))
+
+
+
+DEF_LARCH_FTYPE (2, (UV8QI, UV4HI, UV4HI))
+DEF_LARCH_FTYPE (1, (UV8QI, UV8QI))
+DEF_LARCH_FTYPE (2, (UV8QI, UV8QI, UV8QI))
+
+DEF_LARCH_FTYPE (2, (V16QI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (2, (V16QI, CVPOINTER, DI))
+DEF_LARCH_FTYPE (1, (V16QI, HI))
+DEF_LARCH_FTYPE (1, (V16QI, SI))
+DEF_LARCH_FTYPE (2, (V16QI, UV16QI, UQI))
+DEF_LARCH_FTYPE (2, (V16QI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (1, (V16QI, V16QI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, QI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, SI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, USI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, UQI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, UQI, SI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, UQI, V16QI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, V16QI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, V16QI, SI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, V16QI, UQI))
+DEF_LARCH_FTYPE (4, (V16QI, V16QI, V16QI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, V16QI, USI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, V16QI, V16QI))
+
+
+DEF_LARCH_FTYPE (1, (V2DF, DF))
+DEF_LARCH_FTYPE (1, (V2DF, UV2DI))
+DEF_LARCH_FTYPE (1, (V2DF, V2DF))
+DEF_LARCH_FTYPE (2, (V2DF, V2DF, V2DF))
+DEF_LARCH_FTYPE (3, (V2DF, V2DF, V2DF, V2DF))
+DEF_LARCH_FTYPE (2, (V2DF, V2DF, V2DI))
+DEF_LARCH_FTYPE (1, (V2DF, V2DI))
+DEF_LARCH_FTYPE (1, (V2DF, V4SF))
+DEF_LARCH_FTYPE (1, (V2DF, V4SI))
+
+DEF_LARCH_FTYPE (2, (V2DI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (1, (V2DI, DI))
+DEF_LARCH_FTYPE (1, (V2DI, HI))
+DEF_LARCH_FTYPE (2, (V2DI, UV2DI, UQI))
+DEF_LARCH_FTYPE (2, (V2DI, UV2DI, UV2DI))
+DEF_LARCH_FTYPE (2, (V2DI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (1, (V2DI, V2DF))
+DEF_LARCH_FTYPE (2, (V2DI, V2DF, V2DF))
+DEF_LARCH_FTYPE (1, (V2DI, V2DI))
+DEF_LARCH_FTYPE (1, (UV2DI, UV2DI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, QI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, SI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, UQI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, USI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UQI, DI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UQI, V2DI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, V2DI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V2DI, SI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V2DI, UQI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V2DI, USI))
+DEF_LARCH_FTYPE (4, (V2DI, V2DI, V2DI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V2DI, V2DI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V2DI, V4SI, V4SI))
+
+DEF_LARCH_FTYPE (1, (V2HI, SI))
+DEF_LARCH_FTYPE (2, (V2HI, SI, SI))
+DEF_LARCH_FTYPE (3, (V2HI, SI, SI, SI))
+DEF_LARCH_FTYPE (1, (V2HI, V2HI))
+DEF_LARCH_FTYPE (2, (V2HI, V2HI, SI))
+DEF_LARCH_FTYPE (2, (V2HI, V2HI, V2HI))
+DEF_LARCH_FTYPE (1, (V2HI, V4QI))
+DEF_LARCH_FTYPE (2, (V2HI, V4QI, V2HI))
+
+DEF_LARCH_FTYPE (2, (V2SF, SF, SF))
+DEF_LARCH_FTYPE (1, (V2SF, V2SF))
+DEF_LARCH_FTYPE (2, (V2SF, V2SF, V2SF))
+DEF_LARCH_FTYPE (3, (V2SF, V2SF, V2SF, INT))
+DEF_LARCH_FTYPE (4, (V2SF, V2SF, V2SF, V2SF, V2SF))
+
+DEF_LARCH_FTYPE (2, (V2SI, V2SI, UQI))
+DEF_LARCH_FTYPE (2, (V2SI, V2SI, V2SI))
+DEF_LARCH_FTYPE (2, (V2SI, V4HI, V4HI))
+
+DEF_LARCH_FTYPE (2, (V4HI, V2SI, V2SI))
+DEF_LARCH_FTYPE (2, (V4HI, V4HI, UQI))
+DEF_LARCH_FTYPE (2, (V4HI, V4HI, USI))
+DEF_LARCH_FTYPE (2, (V4HI, V4HI, V4HI))
+DEF_LARCH_FTYPE (3, (V4HI, V4HI, V4HI, UQI))
+DEF_LARCH_FTYPE (3, (V4HI, V4HI, V4HI, USI))
+
+DEF_LARCH_FTYPE (1, (V4QI, SI))
+DEF_LARCH_FTYPE (2, (V4QI, V2HI, V2HI))
+DEF_LARCH_FTYPE (1, (V4QI, V4QI))
+DEF_LARCH_FTYPE (2, (V4QI, V4QI, SI))
+DEF_LARCH_FTYPE (2, (V4QI, V4QI, V4QI))
+
+DEF_LARCH_FTYPE (1, (V4SF, SF))
+DEF_LARCH_FTYPE (1, (V4SF, UV4SI))
+DEF_LARCH_FTYPE (2, (V4SF, V2DF, V2DF))
+DEF_LARCH_FTYPE (1, (V4SF, V4SF))
+DEF_LARCH_FTYPE (2, (V4SF, V4SF, V4SF))
+DEF_LARCH_FTYPE (3, (V4SF, V4SF, V4SF, V4SF))
+DEF_LARCH_FTYPE (2, (V4SF, V4SF, V4SI))
+DEF_LARCH_FTYPE (1, (V4SF, V4SI))
+DEF_LARCH_FTYPE (1, (V4SF, V8HI))
+
+DEF_LARCH_FTYPE (2, (V4SI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (1, (V4SI, HI))
+DEF_LARCH_FTYPE (1, (V4SI, SI))
+DEF_LARCH_FTYPE (2, (V4SI, UV4SI, UQI))
+DEF_LARCH_FTYPE (2, (V4SI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (V4SI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (V4SI, V2DF, V2DF))
+DEF_LARCH_FTYPE (1, (V4SI, V4SF))
+DEF_LARCH_FTYPE (2, (V4SI, V4SF, V4SF))
+DEF_LARCH_FTYPE (1, (V4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, QI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, SI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, USI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UQI, SI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UQI, V4SI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, V4SI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V4SI, SI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V4SI, UQI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V4SI, USI))
+DEF_LARCH_FTYPE (4, (V4SI, V4SI, V4SI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V4SI, V4SI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V8HI, V8HI))
+DEF_LARCH_FTYPE (2, (V4SI, V8HI, V8HI))
+
+DEF_LARCH_FTYPE (2, (V8HI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (1, (V8HI, HI))
+DEF_LARCH_FTYPE (1, (V8HI, SI))
+DEF_LARCH_FTYPE (2, (V8HI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (V8HI, UV8HI, UQI))
+DEF_LARCH_FTYPE (2, (V8HI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (V8HI, V16QI, V16QI))
+DEF_LARCH_FTYPE (2, (V8HI, V4SF, V4SF))
+DEF_LARCH_FTYPE (1, (V8HI, V8HI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, QI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, SI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, SI, UQI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, UQI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, USI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, UQI, SI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, UQI, V8HI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, V16QI, V16QI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, V8HI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, V8HI, SI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, V8HI, UQI))
+DEF_LARCH_FTYPE (4, (V8HI, V8HI, V8HI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, V8HI, USI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, V8HI, V8HI))
+
+DEF_LARCH_FTYPE (2, (V8QI, V4HI, V4HI))
+DEF_LARCH_FTYPE (1, (V8QI, V8QI))
+DEF_LARCH_FTYPE (2, (V8QI, V8QI, V8QI))
+
+DEF_LARCH_FTYPE (2, (VOID, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (VOID, SI, SI))
+DEF_LARCH_FTYPE (2, (VOID, UQI, SI))
+DEF_LARCH_FTYPE (2, (VOID, USI, UQI))
+DEF_LARCH_FTYPE (1, (VOID, UHI))
+DEF_LARCH_FTYPE (3, (VOID, V16QI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V16QI, CVPOINTER, DI))
+DEF_LARCH_FTYPE (3, (VOID, V2DF, POINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V2DI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (2, (VOID, V2HI, V2HI))
+DEF_LARCH_FTYPE (2, (VOID, V4QI, V4QI))
+DEF_LARCH_FTYPE (3, (VOID, V4SF, POINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V4SI, CVPOINTER, SI))
+DEF_LARCH_FTYPE (3, (VOID, V8HI, CVPOINTER, SI))
+
+DEF_LARCH_FTYPE (1, (V8HI, V16QI))
+DEF_LARCH_FTYPE (1, (V4SI, V16QI))
+DEF_LARCH_FTYPE (1, (V2DI, V16QI))
+DEF_LARCH_FTYPE (1, (V4SI, V8HI))
+DEF_LARCH_FTYPE (1, (V2DI, V8HI))
+DEF_LARCH_FTYPE (1, (V2DI, V4SI))
+DEF_LARCH_FTYPE (1, (UV8HI, V16QI))
+DEF_LARCH_FTYPE (1, (UV4SI, V16QI))
+DEF_LARCH_FTYPE (1, (UV2DI, V16QI))
+DEF_LARCH_FTYPE (1, (UV4SI, V8HI))
+DEF_LARCH_FTYPE (1, (UV2DI, V8HI))
+DEF_LARCH_FTYPE (1, (UV2DI, V4SI))
+DEF_LARCH_FTYPE (1, (UV8HI, UV16QI))
+DEF_LARCH_FTYPE (1, (UV4SI, UV16QI))
+DEF_LARCH_FTYPE (1, (UV2DI, UV16QI))
+DEF_LARCH_FTYPE (1, (UV4SI, UV8HI))
+DEF_LARCH_FTYPE (1, (UV2DI, UV8HI))
+DEF_LARCH_FTYPE (1, (UV2DI, UV4SI))
+DEF_LARCH_FTYPE (2, (UV8HI, V16QI, V16QI))
+DEF_LARCH_FTYPE (2, (UV4SI, V8HI, V8HI))
+DEF_LARCH_FTYPE (2, (UV2DI, V4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V8HI, V16QI, UQI))
+DEF_LARCH_FTYPE (2, (V4SI, V8HI, UQI))
+DEF_LARCH_FTYPE (2, (V2DI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV16QI, UQI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV8HI, UQI))
+DEF_LARCH_FTYPE (2, (UV2DI, UV4SI, UQI))
+DEF_LARCH_FTYPE (2, (V16QI, V8HI, V8HI))
+DEF_LARCH_FTYPE (2, (V8HI, V4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V4SI, V2DI, V2DI))
+DEF_LARCH_FTYPE (2, (UV16QI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV2DI, UV2DI))
+DEF_LARCH_FTYPE (2, (V16QI, V8HI, UQI))
+DEF_LARCH_FTYPE (2, (V8HI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (V4SI, V2DI, UQI))
+DEF_LARCH_FTYPE (2, (UV16QI, UV8HI, UQI))
+DEF_LARCH_FTYPE (2, (UV8HI, UV4SI, UQI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV2DI, UQI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, DI))
+DEF_LARCH_FTYPE (2, (V16QI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UQI, UQI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UQI, UQI))
+DEF_LARCH_FTYPE (2, (V4SF, V2DI, V2DI))
+DEF_LARCH_FTYPE (1, (V2DI, V4SF))
+DEF_LARCH_FTYPE (2, (V2DI, UQI, USI))
+DEF_LARCH_FTYPE (2, (V2DI, UQI, UQI))
+DEF_LARCH_FTYPE (4, (VOID, SI, UQI, V16QI, CVPOINTER))
+DEF_LARCH_FTYPE (4, (VOID, SI, UQI, V8HI, CVPOINTER))
+DEF_LARCH_FTYPE (4, (VOID, SI, UQI, V4SI, CVPOINTER))
+DEF_LARCH_FTYPE (4, (VOID, SI, UQI, V2DI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V16QI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V8HI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V4SI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V2DI, SI, CVPOINTER))
+DEF_LARCH_FTYPE (2, (V8HI, UV16QI, V16QI))
+DEF_LARCH_FTYPE (2, (V16QI, V16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (UV16QI, V16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (V8HI, V8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (UV8HI, V8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (V4SI, V4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (UV4SI, V4SI, UV4SI))
+DEF_LARCH_FTYPE (2, (V4SI, V16QI, V16QI))
+DEF_LARCH_FTYPE (2, (V4SI, UV16QI, V16QI))
+DEF_LARCH_FTYPE (2, (UV4SI, UV16QI, UV16QI))
+DEF_LARCH_FTYPE (2, (V2DI, V2DI, UV2DI))
+DEF_LARCH_FTYPE (2, (UV2DI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (2, (V4SI, UV8HI, V8HI))
+DEF_LARCH_FTYPE (2, (V2DI, UV4SI, V4SI))
+DEF_LARCH_FTYPE (2, (V2DI, UV2DI, V2DI))
+DEF_LARCH_FTYPE (2, (V2DI, V8HI, V8HI))
+DEF_LARCH_FTYPE (2, (V2DI, UV8HI, V8HI))
+DEF_LARCH_FTYPE (2, (UV2DI, V2DI, UV2DI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UV8HI, V8HI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UV2DI, V2DI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UV4SI, V4SI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, V8HI, V8HI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, UV8HI, V8HI))
+DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, UV8HI, UV8HI))
+DEF_LARCH_FTYPE (3, (V8HI, V8HI, UV16QI, V16QI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, V16QI, V16QI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, UV16QI, V16QI))
+DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, UV16QI, UV16QI))
+
+DEF_LARCH_FTYPE(4,(VOID,V16QI,CVPOINTER,SI,UQI))
+DEF_LARCH_FTYPE(4,(VOID,V8HI,CVPOINTER,SI,UQI))
+DEF_LARCH_FTYPE(4,(VOID,V4SI,CVPOINTER,SI,UQI))
+DEF_LARCH_FTYPE(4,(VOID,V2DI,CVPOINTER,SI,UQI))
+
+DEF_LARCH_FTYPE (2, (DI, V16QI, UQI))
+DEF_LARCH_FTYPE (2, (DI, V8HI, UQI))
+DEF_LARCH_FTYPE (2, (DI, V4SI, UQI))
+DEF_LARCH_FTYPE (2, (UDI, V16QI, UQI))
+DEF_LARCH_FTYPE (2, (UDI, V8HI, UQI))
+DEF_LARCH_FTYPE (2, (UDI, V4SI, UQI))
+
+DEF_LARCH_FTYPE (3, (UV16QI, UV16QI, V16QI, USI))
+DEF_LARCH_FTYPE (3, (UV8HI, UV8HI, V8HI, USI))
+DEF_LARCH_FTYPE (3, (UV4SI, UV4SI, V4SI, USI))
+DEF_LARCH_FTYPE (3, (UV2DI, UV2DI, V2DI, USI))
+
+DEF_LARCH_FTYPE (1, (BOOLEAN,V16QI))
+DEF_LARCH_FTYPE(2,(V16QI,CVPOINTER,CVPOINTER))
+DEF_LARCH_FTYPE(3,(VOID,V16QI,CVPOINTER,CVPOINTER))
+
+DEF_LARCH_FTYPE (3, (V16QI, V16QI, SI, UQI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, SI, UQI))
+DEF_LARCH_FTYPE (3, (V2DI, V2DI, DI, UQI))
+DEF_LARCH_FTYPE (3, (V4SI, V4SI, SI, UQI))
diff --git a/gcc/config/loongarch/lsxintrin.h b/gcc/config/loongarch/lsxintrin.h
new file mode 100644
index 00000000000..ec42069904d
--- /dev/null
+++ b/gcc/config/loongarch/lsxintrin.h
@@ -0,0 +1,5181 @@
+/* LARCH Loongson SX intrinsics include file.
+
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _GCC_LOONGSON_SXINTRIN_H
+#define _GCC_LOONGSON_SXINTRIN_H 1
+
+#if defined(__loongarch_sx)
+typedef signed char v16i8 __attribute__ ((vector_size(16), aligned(16)));
+typedef signed char v16i8_b __attribute__ ((vector_size(16), aligned(1)));
+typedef unsigned char v16u8 __attribute__ ((vector_size(16), aligned(16)));
+typedef unsigned char v16u8_b __attribute__ ((vector_size(16), aligned(1)));
+typedef short v8i16 __attribute__ ((vector_size(16), aligned(16)));
+typedef short v8i16_h __attribute__ ((vector_size(16), aligned(2)));
+typedef unsigned short v8u16 __attribute__ ((vector_size(16), aligned(16)));
+typedef unsigned short v8u16_h __attribute__ ((vector_size(16), aligned(2)));
+typedef int v4i32 __attribute__ ((vector_size(16), aligned(16)));
+typedef int v4i32_w __attribute__ ((vector_size(16), aligned(4)));
+typedef unsigned int v4u32 __attribute__ ((vector_size(16), aligned(16)));
+typedef unsigned int v4u32_w __attribute__ ((vector_size(16), aligned(4)));
+typedef long long v2i64 __attribute__ ((vector_size(16), aligned(16)));
+typedef long long v2i64_d __attribute__ ((vector_size(16), aligned(8)));
+typedef unsigned long long v2u64 __attribute__ ((vector_size(16), aligned(16)));
+typedef unsigned long long v2u64_d __attribute__ ((vector_size(16), aligned(8)));
+typedef float v4f32 __attribute__ ((vector_size(16), aligned(16)));
+typedef float v4f32_w __attribute__ ((vector_size(16), aligned(4)));
+typedef double v2f64 __attribute__ ((vector_size(16), aligned(16)));
+typedef double v2f64_d __attribute__ ((vector_size(16), aligned(8)));
+
+typedef long long __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
+typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
+typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsll_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsll_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsll_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsll_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsll_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsll_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsll_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsll_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vslli_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vslli_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vslli_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vslli_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vslli_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslli_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vslli_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vslli_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsra_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsra_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsra_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsra_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsra_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsra_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsra_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsra_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsrai_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsrai_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsrai_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsrai_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsrai_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsrai_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsrai_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsrai_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrar_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrar_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrar_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrar_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrar_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrar_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrar_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrar_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsrari_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsrari_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsrari_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsrari_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsrari_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsrari_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsrari_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsrari_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrl_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrl_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrl_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrl_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrl_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrl_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrl_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrl_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsrli_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsrli_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsrli_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsrli_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsrli_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsrli_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsrli_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsrli_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlr_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlr_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlr_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlr_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlr_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlr_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlr_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlr_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsrlri_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsrlri_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsrlri_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsrlri_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsrlri_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsrlri_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsrlri_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsrlri_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitclr_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitclr_b ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitclr_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitclr_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitclr_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitclr_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitclr_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitclr_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vbitclri_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vbitclri_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vbitclri_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vbitclri_h ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vbitclri_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vbitclri_w ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vbitclri_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vbitclri_d ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitset_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitset_b ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitset_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitset_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitset_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitset_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitset_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitset_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vbitseti_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vbitseti_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vbitseti_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vbitseti_h ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vbitseti_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vbitseti_w ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vbitseti_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vbitseti_d ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitrev_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitrev_b ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitrev_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitrev_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitrev_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitrev_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitrev_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vbitrev_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vbitrevi_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vbitrevi_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vbitrevi_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vbitrevi_h ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vbitrevi_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vbitrevi_w ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vbitrevi_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vbitrevi_d ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadd_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadd_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadd_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadd_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadd_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadd_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadd_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadd_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vaddi_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vaddi_bu ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vaddi_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vaddi_hu ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vaddi_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vaddi_wu ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vaddi_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vaddi_du ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsub_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsub_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsub_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsub_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsub_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsub_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsub_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsub_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsubi_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsubi_bu ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsubi_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsubi_hu ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsubi_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsubi_wu ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsubi_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsubi_du ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V16QI, V16QI, QI.  */
+#define __lsx_vmaxi_b(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V8HI, V8HI, QI.  */
+#define __lsx_vmaxi_h(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V4SI, V4SI, QI.  */
+#define __lsx_vmaxi_w(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V2DI, V2DI, QI.  */
+#define __lsx_vmaxi_d(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmax_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmax_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vmaxi_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vmaxi_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vmaxi_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vmaxi_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmaxi_du ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V16QI, V16QI, QI.  */
+#define __lsx_vmini_b(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V8HI, V8HI, QI.  */
+#define __lsx_vmini_h(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V4SI, V4SI, QI.  */
+#define __lsx_vmini_w(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V2DI, V2DI, QI.  */
+#define __lsx_vmini_d(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmin_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmin_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vmini_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vmini_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vmini_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vmini_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vmini_du ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vseq_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vseq_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vseq_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vseq_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vseq_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vseq_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vseq_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vseq_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V16QI, V16QI, QI.  */
+#define __lsx_vseqi_b(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vseqi_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V8HI, V8HI, QI.  */
+#define __lsx_vseqi_h(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vseqi_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V4SI, V4SI, QI.  */
+#define __lsx_vseqi_w(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vseqi_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V2DI, V2DI, QI.  */
+#define __lsx_vseqi_d(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vseqi_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V16QI, V16QI, QI.  */
+#define __lsx_vslti_b(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V8HI, V8HI, QI.  */
+#define __lsx_vslti_h(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V4SI, V4SI, QI.  */
+#define __lsx_vslti_w(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V2DI, V2DI, QI.  */
+#define __lsx_vslti_d(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vslt_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vslt_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, UV16QI, UQI.  */
+#define __lsx_vslti_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, UV8HI, UQI.  */
+#define __lsx_vslti_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, UV4SI, UQI.  */
+#define __lsx_vslti_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UQI.  */
+#define __lsx_vslti_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslti_du ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V16QI, V16QI, QI.  */
+#define __lsx_vslei_b(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V8HI, V8HI, QI.  */
+#define __lsx_vslei_h(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V4SI, V4SI, QI.  */
+#define __lsx_vslei_w(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, si5.  */
+/* Data types in instruction templates:  V2DI, V2DI, QI.  */
+#define __lsx_vslei_d(/*__m128i*/ _1, /*si5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsle_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsle_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, UV16QI, UQI.  */
+#define __lsx_vslei_bu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, UV8HI, UQI.  */
+#define __lsx_vslei_hu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, UV4SI, UQI.  */
+#define __lsx_vslei_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UQI.  */
+#define __lsx_vslei_du(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vslei_du ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vsat_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vsat_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vsat_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vsat_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vsat_bu(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UQI.  */
+#define __lsx_vsat_hu(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UQI.  */
+#define __lsx_vsat_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UQI.  */
+#define __lsx_vsat_du(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vsat_du ((v2u64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadda_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadda_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadda_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadda_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadda_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadda_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadda_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadda_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsadd_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsadd_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavg_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavg_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vavgr_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vavgr_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssub_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssub_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vabsd_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vabsd_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmul_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmul_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmul_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmul_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmul_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmul_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmul_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmul_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmadd_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmadd_b ((v16i8)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmadd_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmadd_h ((v8i16)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmadd_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmadd_w ((v4i32)_1, (v4i32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmadd_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmadd_d ((v2i64)_1, (v2i64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmsub_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmsub_b ((v16i8)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmsub_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmsub_h ((v8i16)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmsub_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmsub_w ((v4i32)_1, (v4i32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmsub_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmsub_d ((v2i64)_1, (v2i64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vdiv_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vdiv_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_hu_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_hu_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_wu_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_wu_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_du_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_du_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_hu_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_hu_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_wu_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_wu_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_du_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_du_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmod_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmod_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, rk.  */
+/* Data types in instruction templates:  V16QI, V16QI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplve_b (__m128i _1, int _2)
+{
+  return (__m128i)__builtin_lsx_vreplve_b ((v16i8)_1, (int)_2);
+}
+
+/* Assembly instruction format:	vd, vj, rk.  */
+/* Data types in instruction templates:  V8HI, V8HI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplve_h (__m128i _1, int _2)
+{
+  return (__m128i)__builtin_lsx_vreplve_h ((v8i16)_1, (int)_2);
+}
+
+/* Assembly instruction format:	vd, vj, rk.  */
+/* Data types in instruction templates:  V4SI, V4SI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplve_w (__m128i _1, int _2)
+{
+  return (__m128i)__builtin_lsx_vreplve_w ((v4i32)_1, (int)_2);
+}
+
+/* Assembly instruction format:	vd, vj, rk.  */
+/* Data types in instruction templates:  V2DI, V2DI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplve_d (__m128i _1, int _2)
+{
+  return (__m128i)__builtin_lsx_vreplve_d ((v2i64)_1, (int)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vreplvei_b(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vreplvei_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vreplvei_h(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vreplvei_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui2.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vreplvei_w(/*__m128i*/ _1, /*ui2*/ _2) \
+  ((__m128i)__builtin_lsx_vreplvei_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui1.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vreplvei_d(/*__m128i*/ _1, /*ui1*/ _2) \
+  ((__m128i)__builtin_lsx_vreplvei_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickev_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickev_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickev_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickev_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickev_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickev_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickev_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickev_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickod_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickod_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickod_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickod_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickod_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickod_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpickod_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpickod_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvh_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvh_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvh_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvh_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvh_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvh_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvh_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvh_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvl_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvl_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvl_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvl_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvl_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvl_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vilvl_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vilvl_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackev_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackev_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackev_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackev_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackev_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackev_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackev_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackev_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackod_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackod_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackod_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackod_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackod_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackod_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpackod_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vpackod_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vshuf_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vshuf_h ((v8i16)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vshuf_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vshuf_w ((v4i32)_1, (v4i32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vshuf_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vshuf_d ((v2i64)_1, (v2i64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vand_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vand_v ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vandi_b(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vandi_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vor_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vor_v ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vori_b(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vori_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vnor_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vnor_v ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vnori_b(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vnori_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vxor_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vxor_v ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UQI.  */
+#define __lsx_vxori_b(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vxori_b ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vbitsel_v (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vbitsel_v ((v16u8)_1, (v16u8)_2, (v16u8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI, USI.  */
+#define __lsx_vbitseli_b(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vbitseli_b ((v16u8)(_1), (v16u8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V16QI, V16QI, USI.  */
+#define __lsx_vshuf4i_b(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vshuf4i_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V8HI, V8HI, USI.  */
+#define __lsx_vshuf4i_h(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vshuf4i_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V4SI, V4SI, USI.  */
+#define __lsx_vshuf4i_w(/*__m128i*/ _1, /*ui8*/ _2) \
+  ((__m128i)__builtin_lsx_vshuf4i_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj.  */
+/* Data types in instruction templates:  V16QI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplgr2vr_b (int _1)
+{
+  return (__m128i)__builtin_lsx_vreplgr2vr_b ((int)_1);
+}
+
+/* Assembly instruction format:	vd, rj.  */
+/* Data types in instruction templates:  V8HI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplgr2vr_h (int _1)
+{
+  return (__m128i)__builtin_lsx_vreplgr2vr_h ((int)_1);
+}
+
+/* Assembly instruction format:	vd, rj.  */
+/* Data types in instruction templates:  V4SI, SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplgr2vr_w (int _1)
+{
+  return (__m128i)__builtin_lsx_vreplgr2vr_w ((int)_1);
+}
+
+/* Assembly instruction format:	vd, rj.  */
+/* Data types in instruction templates:  V2DI, DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vreplgr2vr_d (long int _1)
+{
+  return (__m128i)__builtin_lsx_vreplgr2vr_d ((long int)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpcnt_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vpcnt_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpcnt_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vpcnt_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpcnt_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vpcnt_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vpcnt_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vpcnt_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclo_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclo_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclo_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclo_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclo_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclo_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclo_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclo_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclz_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclz_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclz_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclz_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclz_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclz_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vclz_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vclz_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	rd, vj, ui4.  */
+/* Data types in instruction templates:  SI, V16QI, UQI.  */
+#define __lsx_vpickve2gr_b(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((int)__builtin_lsx_vpickve2gr_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui3.  */
+/* Data types in instruction templates:  SI, V8HI, UQI.  */
+#define __lsx_vpickve2gr_h(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((int)__builtin_lsx_vpickve2gr_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui2.  */
+/* Data types in instruction templates:  SI, V4SI, UQI.  */
+#define __lsx_vpickve2gr_w(/*__m128i*/ _1, /*ui2*/ _2) \
+  ((int)__builtin_lsx_vpickve2gr_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui1.  */
+/* Data types in instruction templates:  DI, V2DI, UQI.  */
+#define __lsx_vpickve2gr_d(/*__m128i*/ _1, /*ui1*/ _2) \
+  ((long int)__builtin_lsx_vpickve2gr_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui4.  */
+/* Data types in instruction templates:  USI, V16QI, UQI.  */
+#define __lsx_vpickve2gr_bu(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((unsigned int)__builtin_lsx_vpickve2gr_bu ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui3.  */
+/* Data types in instruction templates:  USI, V8HI, UQI.  */
+#define __lsx_vpickve2gr_hu(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((unsigned int)__builtin_lsx_vpickve2gr_hu ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui2.  */
+/* Data types in instruction templates:  USI, V4SI, UQI.  */
+#define __lsx_vpickve2gr_wu(/*__m128i*/ _1, /*ui2*/ _2) \
+  ((unsigned int)__builtin_lsx_vpickve2gr_wu ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	rd, vj, ui1.  */
+/* Data types in instruction templates:  UDI, V2DI, UQI.  */
+#define __lsx_vpickve2gr_du(/*__m128i*/ _1, /*ui1*/ _2) \
+  ((unsigned long int)__builtin_lsx_vpickve2gr_du ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, SI, UQI.  */
+#define __lsx_vinsgr2vr_b(/*__m128i*/ _1, /*int*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vinsgr2vr_b ((v16i8)(_1), (int)(_2), (_3)))
+
+/* Assembly instruction format:	vd, rj, ui3.  */
+/* Data types in instruction templates:  V8HI, V8HI, SI, UQI.  */
+#define __lsx_vinsgr2vr_h(/*__m128i*/ _1, /*int*/ _2, /*ui3*/ _3) \
+  ((__m128i)__builtin_lsx_vinsgr2vr_h ((v8i16)(_1), (int)(_2), (_3)))
+
+/* Assembly instruction format:	vd, rj, ui2.  */
+/* Data types in instruction templates:  V4SI, V4SI, SI, UQI.  */
+#define __lsx_vinsgr2vr_w(/*__m128i*/ _1, /*int*/ _2, /*ui2*/ _3) \
+  ((__m128i)__builtin_lsx_vinsgr2vr_w ((v4i32)(_1), (int)(_2), (_3)))
+
+/* Assembly instruction format:	vd, rj, ui1.  */
+/* Data types in instruction templates:  V2DI, V2DI, DI, UQI.  */
+#define __lsx_vinsgr2vr_d(/*__m128i*/ _1, /*long int*/ _2, /*ui1*/ _3) \
+  ((__m128i)__builtin_lsx_vinsgr2vr_d ((v2i64)(_1), (long int)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfadd_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfadd_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfadd_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfadd_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfsub_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfsub_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfsub_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfsub_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmul_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfmul_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmul_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfmul_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfdiv_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfdiv_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfdiv_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfdiv_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcvt_h_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcvt_h_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfcvt_s_d (__m128d _1, __m128d _2)
+{
+  return (__m128)__builtin_lsx_vfcvt_s_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmin_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfmin_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmin_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfmin_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmina_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfmina_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmina_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfmina_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmax_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfmax_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmax_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfmax_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmaxa_s (__m128 _1, __m128 _2)
+{
+  return (__m128)__builtin_lsx_vfmaxa_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmaxa_d (__m128d _1, __m128d _2)
+{
+  return (__m128d)__builtin_lsx_vfmaxa_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfclass_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vfclass_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfclass_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vfclass_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfsqrt_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfsqrt_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfsqrt_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfsqrt_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrecip_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrecip_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrecip_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrecip_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrint_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrint_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrint_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrint_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrsqrt_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrsqrt_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrsqrt_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrsqrt_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vflogb_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vflogb_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vflogb_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vflogb_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfcvth_s_h (__m128i _1)
+{
+  return (__m128)__builtin_lsx_vfcvth_s_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfcvth_d_s (__m128 _1)
+{
+  return (__m128d)__builtin_lsx_vfcvth_d_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfcvtl_s_h (__m128i _1)
+{
+  return (__m128)__builtin_lsx_vfcvtl_s_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfcvtl_d_s (__m128 _1)
+{
+  return (__m128d)__builtin_lsx_vfcvtl_d_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftint_w_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftint_w_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftint_l_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftint_l_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftint_wu_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftint_wu_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftint_lu_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftint_lu_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrz_w_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrz_w_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrz_l_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftintrz_l_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrz_wu_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrz_wu_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrz_lu_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftintrz_lu_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vffint_s_w (__m128i _1)
+{
+  return (__m128)__builtin_lsx_vffint_s_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vffint_d_l (__m128i _1)
+{
+  return (__m128d)__builtin_lsx_vffint_d_l ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SF, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vffint_s_wu (__m128i _1)
+{
+  return (__m128)__builtin_lsx_vffint_s_wu ((v4u32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vffint_d_lu (__m128i _1)
+{
+  return (__m128d)__builtin_lsx_vffint_d_lu ((v2u64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vandn_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vandn_v ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vneg_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vneg_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vneg_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vneg_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vneg_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vneg_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vneg_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vneg_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmuh_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmuh_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V8HI, V16QI, UQI.  */
+#define __lsx_vsllwil_h_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_h_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V4SI, V8HI, UQI.  */
+#define __lsx_vsllwil_w_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_w_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V2DI, V4SI, UQI.  */
+#define __lsx_vsllwil_d_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_d_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  UV8HI, UV16QI, UQI.  */
+#define __lsx_vsllwil_hu_bu(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_hu_bu ((v16u8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV4SI, UV8HI, UQI.  */
+#define __lsx_vsllwil_wu_hu(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_wu_hu ((v8u16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV2DI, UV4SI, UQI.  */
+#define __lsx_vsllwil_du_wu(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vsllwil_du_wu ((v4u32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsran_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsran_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsran_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsran_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsran_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsran_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_bu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_bu_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_hu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_hu_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssran_wu_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssran_wu_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrarn_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrarn_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrarn_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrarn_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrarn_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrarn_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_bu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_bu_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_hu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_hu_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrarn_wu_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrarn_wu_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrln_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrln_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrln_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrln_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrln_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrln_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_bu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_bu_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_hu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_hu_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_wu_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_wu_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlrn_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlrn_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlrn_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlrn_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsrlrn_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsrlrn_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV16QI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_bu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_bu_h ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_hu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_hu_w ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_wu_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_wu_d ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, UQI.  */
+#define __lsx_vfrstpi_b(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vfrstpi_b ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, UQI.  */
+#define __lsx_vfrstpi_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vfrstpi_h ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfrstp_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vfrstp_b ((v16i8)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfrstp_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vfrstp_h ((v8i16)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vshuf4i_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vshuf4i_d ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vbsrl_v(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vbsrl_v ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vbsll_v(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vbsll_v ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vextrins_b(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vextrins_b ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vextrins_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vextrins_h ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vextrins_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vextrins_w ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vextrins_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vextrins_d ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmskltz_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmskltz_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmskltz_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmskltz_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmskltz_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmskltz_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmskltz_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmskltz_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsigncov_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsigncov_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsigncov_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsigncov_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsigncov_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsigncov_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsigncov_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsigncov_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmadd_s (__m128 _1, __m128 _2, __m128 _3)
+{
+  return (__m128)__builtin_lsx_vfmadd_s ((v4f32)_1, (v4f32)_2, (v4f32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmadd_d (__m128d _1, __m128d _2, __m128d _3)
+{
+  return (__m128d)__builtin_lsx_vfmadd_d ((v2f64)_1, (v2f64)_2, (v2f64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfmsub_s (__m128 _1, __m128 _2, __m128 _3)
+{
+  return (__m128)__builtin_lsx_vfmsub_s ((v4f32)_1, (v4f32)_2, (v4f32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfmsub_d (__m128d _1, __m128d _2, __m128d _3)
+{
+  return (__m128d)__builtin_lsx_vfmsub_d ((v2f64)_1, (v2f64)_2, (v2f64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfnmadd_s (__m128 _1, __m128 _2, __m128 _3)
+{
+  return (__m128)__builtin_lsx_vfnmadd_s ((v4f32)_1, (v4f32)_2, (v4f32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfnmadd_d (__m128d _1, __m128d _2, __m128d _3)
+{
+  return (__m128d)__builtin_lsx_vfnmadd_d ((v2f64)_1, (v2f64)_2, (v2f64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V4SF, V4SF, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfnmsub_s (__m128 _1, __m128 _2, __m128 _3)
+{
+  return (__m128)__builtin_lsx_vfnmsub_s ((v4f32)_1, (v4f32)_2, (v4f32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V2DF, V2DF, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfnmsub_d (__m128d _1, __m128d _2, __m128d _3)
+{
+  return (__m128d)__builtin_lsx_vfnmsub_d ((v2f64)_1, (v2f64)_2, (v2f64)_3);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrne_w_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrne_w_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrne_l_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftintrne_l_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrp_w_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrp_w_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrp_l_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftintrp_l_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrm_w_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrm_w_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrm_l_d (__m128d _1)
+{
+  return (__m128i)__builtin_lsx_vftintrm_l_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftint_w_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vftint_w_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SF, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vffint_s_l (__m128i _1, __m128i _2)
+{
+  return (__m128)__builtin_lsx_vffint_s_l ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrz_w_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vftintrz_w_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrp_w_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vftintrp_w_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrm_w_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vftintrm_w_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrne_w_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vftintrne_w_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintl_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintl_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftinth_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftinth_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vffinth_d_w (__m128i _1)
+{
+  return (__m128d)__builtin_lsx_vffinth_d_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DF, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vffintl_d_w (__m128i _1)
+{
+  return (__m128d)__builtin_lsx_vffintl_d_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrzl_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrzl_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrzh_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrzh_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrpl_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrpl_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrph_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrph_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrml_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrml_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrmh_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrmh_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrnel_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrnel_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vftintrneh_l_s (__m128 _1)
+{
+  return (__m128i)__builtin_lsx_vftintrneh_l_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrintrne_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrintrne_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrintrne_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrintrne_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrintrz_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrintrz_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrintrz_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrintrz_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrintrp_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrintrp_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrintrp_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrintrp_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrintrm_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrintrm_s ((v4f32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrintrm_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrintrm_d ((v2f64)_1);
+}
+
+/* Assembly instruction format:	vd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V16QI, CVPOINTER, SI, UQI.  */
+#define __lsx_vstelm_b(/*__m128i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lsx_vstelm_b ((v16i8)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	vd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V8HI, CVPOINTER, SI, UQI.  */
+#define __lsx_vstelm_h(/*__m128i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lsx_vstelm_h ((v8i16)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	vd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V4SI, CVPOINTER, SI, UQI.  */
+#define __lsx_vstelm_w(/*__m128i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lsx_vstelm_w ((v4i32)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	vd, rj, si8, idx.  */
+/* Data types in instruction templates:  VOID, V2DI, CVPOINTER, SI, UQI.  */
+#define __lsx_vstelm_d(/*__m128i*/ _1, /*void **/ _2, /*si8*/ _3, /*idx*/ _4) \
+  ((void)__builtin_lsx_vstelm_d ((v2i64)(_1), (void *)(_2), (_3), (_4)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_d_wu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_d_wu_w ((v4u32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_w_hu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_w_hu_h ((v8u16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_h_bu_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_h_bu_b ((v16u8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_d_wu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_d_wu_w ((v4u32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_w_hu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_w_hu_h ((v8u16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_h_bu_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_h_bu_b ((v16u8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwev_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwev_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsubwod_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsubwod_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwev_q_du_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwev_q_du_d ((v2u64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vaddwod_q_du_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vaddwod_q_du_d ((v2u64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_d_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_d_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_w_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_w_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_h_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_h_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_d_wu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_d_wu ((v4u32)_1, (v4u32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_w_hu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_w_hu ((v8u16)_1, (v8u16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_h_bu (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_h_bu ((v16u8)_1, (v16u8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_d_wu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_d_wu_w ((v4u32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_w_hu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_w_hu_h ((v8u16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_h_bu_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_h_bu_b ((v16u8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_d_wu_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_d_wu_w ((v4u32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_w_hu_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_w_hu_h ((v8u16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_h_bu_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_h_bu_b ((v16u8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_q_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_q_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwev_q_du_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwev_q_du_d ((v2u64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmulwod_q_du_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vmulwod_q_du_d ((v2u64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhaddw_qu_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhaddw_qu_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_q_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_q_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vhsubw_qu_du (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vhsubw_qu_du ((v2u64)_1, (v2u64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_d_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_d_w ((v2i64)_1, (v4i32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_w_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_w_h ((v4i32)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_h_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_h_b ((v8i16)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_d_wu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_d_wu ((v2u64)_1, (v4u32)_2, (v4u32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_w_hu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_w_hu ((v4u32)_1, (v8u16)_2, (v8u16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_h_bu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_h_bu ((v8u16)_1, (v16u8)_2, (v16u8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_d_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_d_w ((v2i64)_1, (v4i32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_w_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_w_h ((v4i32)_1, (v8i16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_h_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_h_b ((v8i16)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV4SI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_d_wu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_d_wu ((v2u64)_1, (v4u32)_2, (v4u32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, UV8HI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_w_hu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_w_hu ((v4u32)_1, (v8u16)_2, (v8u16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, UV16QI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_h_bu (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_h_bu ((v8u16)_1, (v16u8)_2, (v16u8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_d_wu_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_d_wu_w ((v2i64)_1, (v4u32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_w_hu_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_w_hu_h ((v4i32)_1, (v8u16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_h_bu_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_h_bu_b ((v8i16)_1, (v16u8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, UV4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_d_wu_w (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_d_wu_w ((v2i64)_1, (v4u32)_2, (v4i32)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, UV8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_w_hu_h (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_w_hu_h ((v4i32)_1, (v8u16)_2, (v8i16)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, UV16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_h_bu_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_h_bu_b ((v8i16)_1, (v16u8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_q_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_q_d ((v2i64)_1, (v2i64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_q_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_q_d ((v2i64)_1, (v2i64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_q_du (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_q_du ((v2u64)_1, (v2u64)_2, (v2u64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_q_du (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_q_du ((v2u64)_1, (v2u64)_2, (v2u64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwev_q_du_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwev_q_du_d ((v2i64)_1, (v2u64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, UV2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmaddwod_q_du_d (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vmaddwod_q_du_d ((v2i64)_1, (v2u64)_2, (v2i64)_3);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vrotr_b (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vrotr_b ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vrotr_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vrotr_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vrotr_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vrotr_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vrotr_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vrotr_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vadd_q (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vadd_q ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vsub_q (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vsub_q ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, rj, si12.  */
+/* Data types in instruction templates:  V16QI, CVPOINTER, SI.  */
+#define __lsx_vldrepl_b(/*void **/ _1, /*si12*/ _2) \
+  ((__m128i)__builtin_lsx_vldrepl_b ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj, si11.  */
+/* Data types in instruction templates:  V8HI, CVPOINTER, SI.  */
+#define __lsx_vldrepl_h(/*void **/ _1, /*si11*/ _2) \
+  ((__m128i)__builtin_lsx_vldrepl_h ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj, si10.  */
+/* Data types in instruction templates:  V4SI, CVPOINTER, SI.  */
+#define __lsx_vldrepl_w(/*void **/ _1, /*si10*/ _2) \
+  ((__m128i)__builtin_lsx_vldrepl_w ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj, si9.  */
+/* Data types in instruction templates:  V2DI, CVPOINTER, SI.  */
+#define __lsx_vldrepl_d(/*void **/ _1, /*si9*/ _2) \
+  ((__m128i)__builtin_lsx_vldrepl_d ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmskgez_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmskgez_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vmsknz_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vmsknz_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V8HI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_h_b (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_h_b ((v16i8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V4SI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_w_h (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_w_h ((v8i16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_d_w (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_d_w ((v4i32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_q_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_q_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV8HI, UV16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_hu_bu (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_hu_bu ((v16u8)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV4SI, UV8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_wu_hu (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_wu_hu ((v8u16)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV2DI, UV4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_du_wu (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_du_wu ((v4u32)_1);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vexth_qu_du (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vexth_qu_du ((v2u64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, ui3.  */
+/* Data types in instruction templates:  V16QI, V16QI, UQI.  */
+#define __lsx_vrotri_b(/*__m128i*/ _1, /*ui3*/ _2) \
+  ((__m128i)__builtin_lsx_vrotri_b ((v16i8)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V8HI, V8HI, UQI.  */
+#define __lsx_vrotri_h(/*__m128i*/ _1, /*ui4*/ _2) \
+  ((__m128i)__builtin_lsx_vrotri_h ((v8i16)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V4SI, V4SI, UQI.  */
+#define __lsx_vrotri_w(/*__m128i*/ _1, /*ui5*/ _2) \
+  ((__m128i)__builtin_lsx_vrotri_w ((v4i32)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V2DI, V2DI, UQI.  */
+#define __lsx_vrotri_d(/*__m128i*/ _1, /*ui6*/ _2) \
+  ((__m128i)__builtin_lsx_vrotri_d ((v2i64)(_1), (_2)))
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vextl_q_d (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vextl_q_d ((v2i64)_1);
+}
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vsrlni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vsrlni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vsrlni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vsrlni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vsrlrni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlrni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vsrlrni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlrni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vsrlrni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlrni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vsrlrni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vsrlrni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vssrlni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vssrlni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vssrlni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vssrlni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, V16QI, USI.  */
+#define __lsx_vssrlni_bu_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_bu_h ((v16u8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, V8HI, USI.  */
+#define __lsx_vssrlni_hu_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_hu_w ((v8u16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, V4SI, USI.  */
+#define __lsx_vssrlni_wu_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_wu_d ((v4u32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, V2DI, USI.  */
+#define __lsx_vssrlni_du_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlni_du_q ((v2u64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vssrlrni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vssrlrni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vssrlrni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vssrlrni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, V16QI, USI.  */
+#define __lsx_vssrlrni_bu_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_bu_h ((v16u8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, V8HI, USI.  */
+#define __lsx_vssrlrni_hu_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_hu_w ((v8u16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, V4SI, USI.  */
+#define __lsx_vssrlrni_wu_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_wu_d ((v4u32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, V2DI, USI.  */
+#define __lsx_vssrlrni_du_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrlrni_du_q ((v2u64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vsrani_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vsrani_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vsrani_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vsrani_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vsrani_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vsrani_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vsrani_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vsrani_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vsrarni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vsrarni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vsrarni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vsrarni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vsrarni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vsrarni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vsrarni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vsrarni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vssrani_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vssrani_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vssrani_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vssrani_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, V16QI, USI.  */
+#define __lsx_vssrani_bu_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_bu_h ((v16u8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, V8HI, USI.  */
+#define __lsx_vssrani_hu_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_hu_w ((v8u16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, V4SI, USI.  */
+#define __lsx_vssrani_wu_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_wu_d ((v4u32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, V2DI, USI.  */
+#define __lsx_vssrani_du_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrani_du_q ((v2u64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, USI.  */
+#define __lsx_vssrarni_b_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_b_h ((v16i8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  V8HI, V8HI, V8HI, USI.  */
+#define __lsx_vssrarni_h_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_h_w ((v8i16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vssrarni_w_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_w_d ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  V2DI, V2DI, V2DI, USI.  */
+#define __lsx_vssrarni_d_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_d_q ((v2i64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui4.  */
+/* Data types in instruction templates:  UV16QI, UV16QI, V16QI, USI.  */
+#define __lsx_vssrarni_bu_h(/*__m128i*/ _1, /*__m128i*/ _2, /*ui4*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_bu_h ((v16u8)(_1), (v16i8)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui5.  */
+/* Data types in instruction templates:  UV8HI, UV8HI, V8HI, USI.  */
+#define __lsx_vssrarni_hu_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui5*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_hu_w ((v8u16)(_1), (v8i16)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui6.  */
+/* Data types in instruction templates:  UV4SI, UV4SI, V4SI, USI.  */
+#define __lsx_vssrarni_wu_d(/*__m128i*/ _1, /*__m128i*/ _2, /*ui6*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_wu_d ((v4u32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui7.  */
+/* Data types in instruction templates:  UV2DI, UV2DI, V2DI, USI.  */
+#define __lsx_vssrarni_du_q(/*__m128i*/ _1, /*__m128i*/ _2, /*ui7*/ _3) \
+  ((__m128i)__builtin_lsx_vssrarni_du_q ((v2u64)(_1), (v2i64)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, ui8.  */
+/* Data types in instruction templates:  V4SI, V4SI, V4SI, USI.  */
+#define __lsx_vpermi_w(/*__m128i*/ _1, /*__m128i*/ _2, /*ui8*/ _3) \
+  ((__m128i)__builtin_lsx_vpermi_w ((v4i32)(_1), (v4i32)(_2), (_3)))
+
+/* Assembly instruction format:	vd, rj, si12.  */
+/* Data types in instruction templates:  V16QI, CVPOINTER, SI.  */
+#define __lsx_vld(/*void **/ _1, /*si12*/ _2) \
+  ((__m128i)__builtin_lsx_vld ((void *)(_1), (_2)))
+
+/* Assembly instruction format:	vd, rj, si12.  */
+/* Data types in instruction templates:  VOID, V16QI, CVPOINTER, SI.  */
+#define __lsx_vst(/*__m128i*/ _1, /*void **/ _2, /*si12*/ _3) \
+  ((void)__builtin_lsx_vst ((v16i8)(_1), (void *)(_2), (_3)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrlrn_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrlrn_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V8HI, V8HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_b_h (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_b_h ((v8i16)_1, (v8i16)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V8HI, V4SI, V4SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_h_w (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_h_w ((v4i32)_1, (v4i32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V2DI, V2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vssrln_w_d (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vssrln_w_d ((v2i64)_1, (v2i64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vorn_v (__m128i _1, __m128i _2)
+{
+  return (__m128i)__builtin_lsx_vorn_v ((v16i8)_1, (v16i8)_2);
+}
+
+/* Assembly instruction format:	vd, i13.  */
+/* Data types in instruction templates:  V2DI, HI.  */
+#define __lsx_vldi(/*i13*/ _1) \
+  ((__m128i)__builtin_lsx_vldi ((_1)))
+
+/* Assembly instruction format:	vd, vj, vk, va.  */
+/* Data types in instruction templates:  V16QI, V16QI, V16QI, V16QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vshuf_b (__m128i _1, __m128i _2, __m128i _3)
+{
+  return (__m128i)__builtin_lsx_vshuf_b ((v16i8)_1, (v16i8)_2, (v16i8)_3);
+}
+
+/* Assembly instruction format:	vd, rj, rk.  */
+/* Data types in instruction templates:  V16QI, CVPOINTER, DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vldx (void * _1, long int _2)
+{
+  return (__m128i)__builtin_lsx_vldx ((void *)_1, (long int)_2);
+}
+
+/* Assembly instruction format:	vd, rj, rk.  */
+/* Data types in instruction templates:  VOID, V16QI, CVPOINTER, DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+void __lsx_vstx (__m128i _1, void * _2, long int _3)
+{
+  return (void)__builtin_lsx_vstx ((v16i8)_1, (void *)_2, (long int)_3);
+}
+
+/* Assembly instruction format:	vd, vj.  */
+/* Data types in instruction templates:  UV2DI, UV2DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vextl_qu_du (__m128i _1)
+{
+  return (__m128i)__builtin_lsx_vextl_qu_du ((v2u64)_1);
+}
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV16QI.  */
+#define __lsx_bnz_b(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bnz_b ((v16u8)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV2DI.  */
+#define __lsx_bnz_d(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bnz_d ((v2u64)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV8HI.  */
+#define __lsx_bnz_h(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bnz_h ((v8u16)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV16QI.  */
+#define __lsx_bnz_v(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bnz_v ((v16u8)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV4SI.  */
+#define __lsx_bnz_w(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bnz_w ((v4u32)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV16QI.  */
+#define __lsx_bz_b(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bz_b ((v16u8)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV2DI.  */
+#define __lsx_bz_d(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bz_d ((v2u64)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV8HI.  */
+#define __lsx_bz_h(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bz_h ((v8u16)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV16QI.  */
+#define __lsx_bz_v(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bz_v ((v16u8)(_1)))
+
+/* Assembly instruction format:	cd, vj.  */
+/* Data types in instruction templates:  SI, UV4SI.  */
+#define __lsx_bz_w(/*__m128i*/ _1) \
+  ((int)__builtin_lsx_bz_w ((v4u32)(_1)))
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_caf_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_caf_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_caf_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_caf_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_ceq_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_ceq_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_ceq_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_ceq_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cle_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cle_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cle_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cle_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_clt_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_clt_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_clt_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_clt_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cne_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cne_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cne_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cne_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cor_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cor_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cor_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cor_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cueq_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cueq_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cueq_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cueq_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cule_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cule_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cule_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cule_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cult_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cult_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cult_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cult_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cun_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cun_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cune_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cune_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cune_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cune_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_cun_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_cun_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_saf_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_saf_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_saf_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_saf_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_seq_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_seq_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_seq_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_seq_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sle_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sle_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sle_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sle_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_slt_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_slt_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_slt_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_slt_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sne_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sne_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sne_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sne_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sor_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sor_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sor_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sor_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sueq_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sueq_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sueq_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sueq_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sule_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sule_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sule_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sule_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sult_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sult_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sult_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sult_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sun_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sun_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V2DI, V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sune_d (__m128d _1, __m128d _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sune_d ((v2f64)_1, (v2f64)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sune_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sune_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, vj, vk.  */
+/* Data types in instruction templates:  V4SI, V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128i __lsx_vfcmp_sun_s (__m128 _1, __m128 _2)
+{
+  return (__m128i)__builtin_lsx_vfcmp_sun_s ((v4f32)_1, (v4f32)_2);
+}
+
+/* Assembly instruction format:	vd, si10.  */
+/* Data types in instruction templates:  V16QI, HI.  */
+#define __lsx_vrepli_b(/*si10*/ _1) \
+  ((__m128i)__builtin_lsx_vrepli_b ((_1)))
+
+/* Assembly instruction format:	vd, si10.  */
+/* Data types in instruction templates:  V2DI, HI.  */
+#define __lsx_vrepli_d(/*si10*/ _1) \
+  ((__m128i)__builtin_lsx_vrepli_d ((_1)))
+
+/* Assembly instruction format:	vd, si10.  */
+/* Data types in instruction templates:  V8HI, HI.  */
+#define __lsx_vrepli_h(/*si10*/ _1) \
+  ((__m128i)__builtin_lsx_vrepli_h ((_1)))
+
+/* Assembly instruction format:	vd, si10.  */
+/* Data types in instruction templates:  V4SI, HI.  */
+#define __lsx_vrepli_w(/*si10*/ _1) \
+  ((__m128i)__builtin_lsx_vrepli_w ((_1)))
+
+#endif /* defined(__loongarch_sx) */
+#endif /* _GCC_LOONGSON_SXINTRIN_H */
-- 
2.36.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v5 4/6] LoongArch: Add Loongson ASX vector directive compilation framework.
  2023-08-24  3:13 [PATCH v5 0/6] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
                   ` (2 preceding siblings ...)
  2023-08-24  3:13 ` [PATCH v5 3/6] LoongArch: Add Loongson SX directive builtin function support Chenghui Pan
@ 2023-08-24  3:13 ` Chenghui Pan
  2023-08-24  3:13 ` [PATCH v5 5/6] LoongArch: Add Loongson ASX base instruction support Chenghui Pan
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Chenghui Pan @ 2023-08-24  3:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua

From: Lulu Cheng <chenglulu@loongson.cn>

gcc/ChangeLog:

	* config/loongarch/genopts/loongarch-strings: Add compilation framework.
	* config/loongarch/genopts/loongarch.opt.in: Ditto.
	* config/loongarch/loongarch-c.cc (loongarch_cpu_cpp_builtins): Ditto.
	* config/loongarch/loongarch-def.c: Ditto.
	* config/loongarch/loongarch-def.h (N_ISA_EXT_TYPES): Ditto.
	(ISA_EXT_SIMD_LASX): Ditto.
	(N_SWITCH_TYPES): Ditto.
	(SW_LASX): Ditto.
	* config/loongarch/loongarch-driver.cc (driver_get_normalized_m_opts): Ditto.
	* config/loongarch/loongarch-driver.h (driver_get_normalized_m_opts): Ditto.
	* config/loongarch/loongarch-opts.cc (isa_str): Ditto.
	* config/loongarch/loongarch-opts.h (ISA_HAS_LSX): Ditto.
	(ISA_HAS_LASX): Ditto.
	* config/loongarch/loongarch-str.h (OPTSTR_LASX): Ditto.
	* config/loongarch/loongarch.opt: Ditto.
---
 gcc/config/loongarch/genopts/loongarch-strings |  1 +
 gcc/config/loongarch/genopts/loongarch.opt.in  |  4 ++++
 gcc/config/loongarch/loongarch-c.cc            | 11 +++++++++++
 gcc/config/loongarch/loongarch-def.c           |  4 +++-
 gcc/config/loongarch/loongarch-def.h           |  6 ++++--
 gcc/config/loongarch/loongarch-driver.cc       |  2 +-
 gcc/config/loongarch/loongarch-driver.h        |  1 +
 gcc/config/loongarch/loongarch-opts.cc         |  9 ++++++++-
 gcc/config/loongarch/loongarch-opts.h          |  4 +++-
 gcc/config/loongarch/loongarch-str.h           |  1 +
 gcc/config/loongarch/loongarch.opt             |  4 ++++
 11 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings b/gcc/config/loongarch/genopts/loongarch-strings
index 24a5025061f..35d08f5967d 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -42,6 +42,7 @@ OPTSTR_DOUBLE_FLOAT   double-float
 
 # SIMD extensions
 OPTSTR_LSX	lsx
+OPTSTR_LASX	lasx
 
 # -mabi=
 OPTSTR_ABI_BASE	      abi
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in b/gcc/config/loongarch/genopts/loongarch.opt.in
index 338d77a7e40..afde23c9661 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -80,6 +80,10 @@ m@@OPTSTR_LSX@@
 Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(m@@OPTSTR_LSX@@)
 Enable LoongArch SIMD Extension (LSX).
 
+m@@OPTSTR_LASX@@
+Target RejectNegative Var(la_opt_switches) Mask(LASX) Negative(m@@OPTSTR_LASX@@)
+Enable LoongArch Advanced SIMD Extension (LASX).
+
 ;; Base target models (implies ISA & tune parameters)
 Enum
 Name(cpu_type) Type(int)
diff --git a/gcc/config/loongarch/loongarch-c.cc b/gcc/config/loongarch/loongarch-c.cc
index b065921adc3..2747fb9e472 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -104,8 +104,19 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
       builtin_define ("__loongarch_simd");
       builtin_define ("__loongarch_sx");
       builtin_define ("__loongarch_sx_width=128");
+
+      if (!ISA_HAS_LASX)
+	builtin_define ("__loongarch_simd_width=128");
     }
 
+  if (ISA_HAS_LASX)
+    {
+      builtin_define ("__loongarch_asx");
+      builtin_define ("__loongarch_asx_width=256");
+      builtin_define ("__loongarch_simd_width=256");
+    }
+
+
   /* Native Data Sizes.  */
   builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
   builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
diff --git a/gcc/config/loongarch/loongarch-def.c b/gcc/config/loongarch/loongarch-def.c
index 28e24c62249..bff92c86532 100644
--- a/gcc/config/loongarch/loongarch-def.c
+++ b/gcc/config/loongarch/loongarch-def.c
@@ -54,7 +54,7 @@ loongarch_cpu_default_isa[N_ARCH_TYPES] = {
   [CPU_LA464] = {
       .base = ISA_BASE_LA64V100,
       .fpu = ISA_EXT_FPU64,
-      .simd = ISA_EXT_SIMD_LSX,
+      .simd = ISA_EXT_SIMD_LASX,
   },
 };
 
@@ -150,6 +150,7 @@ loongarch_isa_ext_strings[N_ISA_EXT_TYPES] = {
   [ISA_EXT_FPU32] = STR_ISA_EXT_FPU32,
   [ISA_EXT_NOFPU] = STR_ISA_EXT_NOFPU,
   [ISA_EXT_SIMD_LSX] = OPTSTR_LSX,
+  [ISA_EXT_SIMD_LASX] = OPTSTR_LASX,
 };
 
 const char*
@@ -180,6 +181,7 @@ loongarch_switch_strings[] = {
   [SW_SINGLE_FLOAT]	  = OPTSTR_SINGLE_FLOAT,
   [SW_DOUBLE_FLOAT]	  = OPTSTR_DOUBLE_FLOAT,
   [SW_LSX]		  = OPTSTR_LSX,
+  [SW_LASX]		  = OPTSTR_LASX,
 };
 
 
diff --git a/gcc/config/loongarch/loongarch-def.h b/gcc/config/loongarch/loongarch-def.h
index f34cffcfb9b..0bbcdb03d22 100644
--- a/gcc/config/loongarch/loongarch-def.h
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -64,7 +64,8 @@ extern const char* loongarch_isa_ext_strings[];
 #define ISA_EXT_FPU64	      2
 #define N_ISA_EXT_FPU_TYPES   3
 #define ISA_EXT_SIMD_LSX      3
-#define N_ISA_EXT_TYPES	      4
+#define ISA_EXT_SIMD_LASX     4
+#define N_ISA_EXT_TYPES	      5
 
 /* enum abi_base */
 extern const char* loongarch_abi_base_strings[];
@@ -99,7 +100,8 @@ extern const char* loongarch_switch_strings[];
 #define SW_SINGLE_FLOAT	      1
 #define SW_DOUBLE_FLOAT	      2
 #define SW_LSX		      3
-#define N_SWITCH_TYPES	      4
+#define SW_LASX		      4
+#define N_SWITCH_TYPES	      5
 
 /* The common default value for variables whose assignments
    are triggered by command-line options.  */
diff --git a/gcc/config/loongarch/loongarch-driver.cc b/gcc/config/loongarch/loongarch-driver.cc
index aa5011bd86a..3b9605de35f 100644
--- a/gcc/config/loongarch/loongarch-driver.cc
+++ b/gcc/config/loongarch/loongarch-driver.cc
@@ -181,7 +181,7 @@ driver_get_normalized_m_opts (int argc, const char **argv)
 
   if (la_target.isa.simd)
     {
-      APPEND_LTR (" %<m" OPTSTR_LSX " -m");
+      APPEND_LTR (" %<m" OPTSTR_LSX " %<m" OPTSTR_LASX " -m");
       APPEND_VAL (loongarch_isa_ext_strings[la_target.isa.simd]);
     }
 
diff --git a/gcc/config/loongarch/loongarch-driver.h b/gcc/config/loongarch/loongarch-driver.h
index db663818b7c..0c6b4157261 100644
--- a/gcc/config/loongarch/loongarch-driver.h
+++ b/gcc/config/loongarch/loongarch-driver.h
@@ -52,6 +52,7 @@ driver_get_normalized_m_opts (int argc, const char **argv);
   LA_SET_FLAG_SPEC (SINGLE_FLOAT)			      \
   LA_SET_FLAG_SPEC (DOUBLE_FLOAT)			      \
   LA_SET_FLAG_SPEC (LSX)				      \
+  LA_SET_FLAG_SPEC (LASX)				      \
   " %:get_normalized_m_opts()"
 
 #define DRIVER_SELF_SPECS \
diff --git a/gcc/config/loongarch/loongarch-opts.cc b/gcc/config/loongarch/loongarch-opts.cc
index 9753cf1290b..5986a2dd456 100644
--- a/gcc/config/loongarch/loongarch-opts.cc
+++ b/gcc/config/loongarch/loongarch-opts.cc
@@ -84,6 +84,7 @@ const int loongarch_switch_mask[N_SWITCH_TYPES] = {
   /* SW_SINGLE_FLOAT */  M(FORCE_F32),
   /* SW_DOUBLE_FLOAT */  M(FORCE_F64),
   /* SW_LSX */		 M(LSX),
+  /* SW_LASX */		 M(LASX),
 };
 #undef M
 
@@ -254,8 +255,9 @@ config_target_isa:
      t.isa.fpu : DEFAULT_ISA_EXT_FPU);
 
   /* LoongArch SIMD extensions.  */
+  /* Note: LASX implies LSX, so we put "on (LASX)" first.  */
   int simd_switch;
-  if (on (LSX))
+  if (on (LASX) || on (LSX))
     {
       constrained.simd = 1;
       switch (on_switch)
@@ -264,6 +266,10 @@ config_target_isa:
 	    t.isa.simd = ISA_EXT_SIMD_LSX;
 	    break;
 
+	  case SW_LASX:
+	    t.isa.simd = ISA_EXT_SIMD_LASX;
+	    break;
+
 	  default:
 	    gcc_unreachable ();
 	}
@@ -603,6 +609,7 @@ isa_str (const struct loongarch_isa *isa, char separator)
   switch (isa->simd)
     {
       case ISA_EXT_SIMD_LSX:
+      case ISA_EXT_SIMD_LASX:
 	APPEND1 (separator);
 	APPEND_STRING (loongarch_isa_ext_strings[isa->simd]);
 	break;
diff --git a/gcc/config/loongarch/loongarch-opts.h b/gcc/config/loongarch/loongarch-opts.h
index d067c05dfc9..59a383ec5ca 100644
--- a/gcc/config/loongarch/loongarch-opts.h
+++ b/gcc/config/loongarch/loongarch-opts.h
@@ -66,7 +66,9 @@ loongarch_config_target (struct loongarch_target *target,
 				   || la_target.abi.base == ABI_BASE_LP64F \
 				   || la_target.abi.base == ABI_BASE_LP64S)
 
-#define ISA_HAS_LSX		  (la_target.isa.simd == ISA_EXT_SIMD_LSX)
+#define ISA_HAS_LSX		  (la_target.isa.simd == ISA_EXT_SIMD_LSX \
+				   || la_target.isa.simd == ISA_EXT_SIMD_LASX)
+#define ISA_HAS_LASX		  (la_target.isa.simd == ISA_EXT_SIMD_LASX)
 #define TARGET_ARCH_NATIVE	  (la_target.cpu_arch == CPU_NATIVE)
 #define LARCH_ACTUAL_ARCH	  (TARGET_ARCH_NATIVE \
 				   ? (la_target.cpu_native < N_ARCH_TYPES \
diff --git a/gcc/config/loongarch/loongarch-str.h b/gcc/config/loongarch/loongarch-str.h
index 6fa1b1571c5..951f35a3c24 100644
--- a/gcc/config/loongarch/loongarch-str.h
+++ b/gcc/config/loongarch/loongarch-str.h
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTSTR_DOUBLE_FLOAT "double-float"
 
 #define OPTSTR_LSX "lsx"
+#define OPTSTR_LASX "lasx"
 
 #define OPTSTR_ABI_BASE "abi"
 #define STR_ABI_BASE_LP64D "lp64d"
diff --git a/gcc/config/loongarch/loongarch.opt b/gcc/config/loongarch/loongarch.opt
index 5c7e6d37220..611629b4203 100644
--- a/gcc/config/loongarch/loongarch.opt
+++ b/gcc/config/loongarch/loongarch.opt
@@ -87,6 +87,10 @@ mlsx
 Target RejectNegative Var(la_opt_switches) Mask(LSX) Negative(mlsx)
 Enable LoongArch SIMD Extension (LSX).
 
+mlasx
+Target RejectNegative Var(la_opt_switches) Mask(LASX) Negative(mlasx)
+Enable LoongArch Advanced SIMD Extension (LASX).
+
 ;; Base target models (implies ISA & tune parameters)
 Enum
 Name(cpu_type) Type(int)
-- 
2.36.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v5 5/6] LoongArch: Add Loongson ASX base instruction support.
  2023-08-24  3:13 [PATCH v5 0/6] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
                   ` (3 preceding siblings ...)
  2023-08-24  3:13 ` [PATCH v5 4/6] LoongArch: Add Loongson ASX vector directive compilation framework Chenghui Pan
@ 2023-08-24  3:13 ` Chenghui Pan
  2023-08-24  3:13 ` [PATCH v5 6/6] LoongArch: Add Loongson ASX directive builtin function support Chenghui Pan
  2023-08-24  3:40 ` [PATCH v5 0/6] Add Loongson SX/ASX instruction support to LoongArch target Xi Ruoyao
  6 siblings, 0 replies; 9+ messages in thread
From: Chenghui Pan @ 2023-08-24  3:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua

From: Lulu Cheng <chenglulu@loongson.cn>

gcc/ChangeLog:

	* config/loongarch/loongarch-modes.def
	(VECTOR_MODES): Add Loongson ASX instruction support.
	* config/loongarch/loongarch-protos.h (loongarch_split_256bit_move): Ditto.
	(loongarch_split_256bit_move_p): Ditto.
	(loongarch_expand_vector_group_init): Ditto.
	(loongarch_expand_vec_perm_1): Ditto.
	* config/loongarch/loongarch.cc (loongarch_symbol_insns): Ditto.
	(loongarch_valid_offset_p): Ditto.
	(loongarch_valid_index_p): Ditto.
	(loongarch_address_insns): Ditto.
	(loongarch_const_insns): Ditto.
	(loongarch_legitimize_move): Ditto.
	(loongarch_builtin_vectorization_cost): Ditto.
	(loongarch_split_move_p): Ditto.
	(loongarch_split_move): Ditto.
	(loongarch_output_move_index): Ditto.
	(loongarch_output_move_index_float): Ditto.
	(loongarch_split_256bit_move_p): Ditto.
	(loongarch_split_256bit_move): Ditto.
	(loongarch_output_move): Ditto.
	(loongarch_print_operand_reloc): Ditto.
	(loongarch_print_operand): Ditto.
	(loongarch_hard_regno_mode_ok_uncached): Ditto.
	(loongarch_hard_regno_nregs): Ditto.
	(loongarch_class_max_nregs): Ditto.
	(loongarch_can_change_mode_class): Ditto.
	(loongarch_mode_ok_for_mov_fmt_p): Ditto.
	(loongarch_vector_mode_supported_p): Ditto.
	(loongarch_preferred_simd_mode): Ditto.
	(loongarch_autovectorize_vector_modes): Ditto.
	(loongarch_lsx_output_division): Ditto.
	(loongarch_expand_lsx_shuffle): Ditto.
	(loongarch_expand_vec_perm): Ditto.
	(loongarch_expand_vec_perm_interleave): Ditto.
	(loongarch_try_expand_lsx_vshuf_const): Ditto.
	(loongarch_expand_vec_perm_even_odd_1): Ditto.
	(loongarch_expand_vec_perm_even_odd): Ditto.
	(loongarch_expand_vec_perm_1): Ditto.
	(loongarch_expand_vec_perm_const_1): Ditto.
	(loongarch_is_quad_duplicate): Ditto.
	(loongarch_is_double_duplicate): Ditto.
	(loongarch_is_odd_extraction): Ditto.
	(loongarch_is_even_extraction): Ditto.
	(loongarch_is_extraction_permutation): Ditto.
	(loongarch_is_center_extraction): Ditto.
	(loongarch_is_reversing_permutation): Ditto.
	(loongarch_is_di_misalign_extract): Ditto.
	(loongarch_is_si_misalign_extract): Ditto.
	(loongarch_is_lasx_lowpart_interleave): Ditto.
	(loongarch_is_lasx_lowpart_interleave_2): Ditto.
	(COMPARE_SELECTOR): Ditto.
	(loongarch_is_lasx_lowpart_extract): Ditto.
	(loongarch_is_lasx_highpart_interleave): Ditto.
	(loongarch_is_lasx_highpart_interleave_2): Ditto.
	(loongarch_is_elem_duplicate): Ditto.
	(loongarch_is_op_reverse_perm): Ditto.
	(loongarch_is_single_op_perm): Ditto.
	(loongarch_is_divisible_perm): Ditto.
	(loongarch_is_triple_stride_extract): Ditto.
	(loongarch_expand_vec_perm_const_2): Ditto.
	(loongarch_sched_reassociation_width): Ditto.
	(loongarch_expand_vector_extract): Ditto.
	(emit_reduc_half): Ditto.
	(loongarch_expand_vec_unpack): Ditto.
	(loongarch_expand_vector_group_init): Ditto.
	(loongarch_expand_vector_init): Ditto.
	(loongarch_expand_lsx_cmp): Ditto.
	(loongarch_builtin_support_vector_misalignment): Ditto.
	* config/loongarch/loongarch.h (UNITS_PER_LASX_REG): Ditto.
	(BITS_PER_LASX_REG): Ditto.
	(STRUCTURE_SIZE_BOUNDARY): Ditto.
	(LASX_REG_FIRST): Ditto.
	(LASX_REG_LAST): Ditto.
	(LASX_REG_NUM): Ditto.
	(LASX_REG_P): Ditto.
	(LASX_REG_RTX_P): Ditto.
	(LASX_SUPPORTED_MODE_P): Ditto.
	* config/loongarch/loongarch.md: Ditto.
	* config/loongarch/lasx.md: New file.

gcc/testsuite/ChangeLog:

	* g++.dg/torture/vshuf-v16hi.C: Skip loongarch*-*-* because of
	  xvshuf insn's undefined result when 6 or 7 bit of vector's
	  element is set.
---
 gcc/config/loongarch/lasx.md               | 5104 ++++++++++++++++++++
 gcc/config/loongarch/loongarch-modes.def   |    1 +
 gcc/config/loongarch/loongarch-protos.h    |    4 +
 gcc/config/loongarch/loongarch.cc          | 2490 +++++++++-
 gcc/config/loongarch/loongarch.h           |   60 +-
 gcc/config/loongarch/loongarch.md          |   20 +-
 gcc/testsuite/g++.dg/torture/vshuf-v16hi.C |    1 +
 7 files changed, 7561 insertions(+), 119 deletions(-)
 create mode 100644 gcc/config/loongarch/lasx.md

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
new file mode 100644
index 00000000000..8111c8bb79a
--- /dev/null
+++ b/gcc/config/loongarch/lasx.md
@@ -0,0 +1,5104 @@
+;; Machine Description for LARCH Loongson ASX ASE
+;;
+;; Copyright (C) 2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+;;
+
+(define_c_enum "unspec" [
+  UNSPEC_LASX_XVABSD_S
+  UNSPEC_LASX_XVABSD_U
+  UNSPEC_LASX_XVAVG_S
+  UNSPEC_LASX_XVAVG_U
+  UNSPEC_LASX_XVAVGR_S
+  UNSPEC_LASX_XVAVGR_U
+  UNSPEC_LASX_XVBITCLR
+  UNSPEC_LASX_XVBITCLRI
+  UNSPEC_LASX_XVBITREV
+  UNSPEC_LASX_XVBITREVI
+  UNSPEC_LASX_XVBITSET
+  UNSPEC_LASX_XVBITSETI
+  UNSPEC_LASX_XVFCMP_CAF
+  UNSPEC_LASX_XVFCLASS
+  UNSPEC_LASX_XVFCMP_CUNE
+  UNSPEC_LASX_XVFCVT
+  UNSPEC_LASX_XVFCVTH
+  UNSPEC_LASX_XVFCVTL
+  UNSPEC_LASX_XVFLOGB
+  UNSPEC_LASX_XVFRECIP
+  UNSPEC_LASX_XVFRINT
+  UNSPEC_LASX_XVFRSQRT
+  UNSPEC_LASX_XVFCMP_SAF
+  UNSPEC_LASX_XVFCMP_SEQ
+  UNSPEC_LASX_XVFCMP_SLE
+  UNSPEC_LASX_XVFCMP_SLT
+  UNSPEC_LASX_XVFCMP_SNE
+  UNSPEC_LASX_XVFCMP_SOR
+  UNSPEC_LASX_XVFCMP_SUEQ
+  UNSPEC_LASX_XVFCMP_SULE
+  UNSPEC_LASX_XVFCMP_SULT
+  UNSPEC_LASX_XVFCMP_SUN
+  UNSPEC_LASX_XVFCMP_SUNE
+  UNSPEC_LASX_XVFTINT_S
+  UNSPEC_LASX_XVFTINT_U
+  UNSPEC_LASX_XVCLO
+  UNSPEC_LASX_XVSAT_S
+  UNSPEC_LASX_XVSAT_U
+  UNSPEC_LASX_XVREPLVE0
+  UNSPEC_LASX_XVREPL128VEI
+  UNSPEC_LASX_XVSRAR
+  UNSPEC_LASX_XVSRARI
+  UNSPEC_LASX_XVSRLR
+  UNSPEC_LASX_XVSRLRI
+  UNSPEC_LASX_XVSHUF
+  UNSPEC_LASX_XVSHUF_B
+  UNSPEC_LASX_BRANCH
+  UNSPEC_LASX_BRANCH_V
+
+  UNSPEC_LASX_XVMUH_S
+  UNSPEC_LASX_XVMUH_U
+  UNSPEC_LASX_MXVEXTW_U
+  UNSPEC_LASX_XVSLLWIL_S
+  UNSPEC_LASX_XVSLLWIL_U
+  UNSPEC_LASX_XVSRAN
+  UNSPEC_LASX_XVSSRAN_S
+  UNSPEC_LASX_XVSSRAN_U
+  UNSPEC_LASX_XVSRARN
+  UNSPEC_LASX_XVSSRARN_S
+  UNSPEC_LASX_XVSSRARN_U
+  UNSPEC_LASX_XVSRLN
+  UNSPEC_LASX_XVSSRLN_U
+  UNSPEC_LASX_XVSRLRN
+  UNSPEC_LASX_XVSSRLRN_U
+  UNSPEC_LASX_XVFRSTPI
+  UNSPEC_LASX_XVFRSTP
+  UNSPEC_LASX_XVSHUF4I
+  UNSPEC_LASX_XVBSRL_V
+  UNSPEC_LASX_XVBSLL_V
+  UNSPEC_LASX_XVEXTRINS
+  UNSPEC_LASX_XVMSKLTZ
+  UNSPEC_LASX_XVSIGNCOV
+  UNSPEC_LASX_XVFTINTRNE_W_S
+  UNSPEC_LASX_XVFTINTRNE_L_D
+  UNSPEC_LASX_XVFTINTRP_W_S
+  UNSPEC_LASX_XVFTINTRP_L_D
+  UNSPEC_LASX_XVFTINTRM_W_S
+  UNSPEC_LASX_XVFTINTRM_L_D
+  UNSPEC_LASX_XVFTINT_W_D
+  UNSPEC_LASX_XVFFINT_S_L
+  UNSPEC_LASX_XVFTINTRZ_W_D
+  UNSPEC_LASX_XVFTINTRP_W_D
+  UNSPEC_LASX_XVFTINTRM_W_D
+  UNSPEC_LASX_XVFTINTRNE_W_D
+  UNSPEC_LASX_XVFTINTH_L_S
+  UNSPEC_LASX_XVFTINTL_L_S
+  UNSPEC_LASX_XVFFINTH_D_W
+  UNSPEC_LASX_XVFFINTL_D_W
+  UNSPEC_LASX_XVFTINTRZH_L_S
+  UNSPEC_LASX_XVFTINTRZL_L_S
+  UNSPEC_LASX_XVFTINTRPH_L_S
+  UNSPEC_LASX_XVFTINTRPL_L_S
+  UNSPEC_LASX_XVFTINTRMH_L_S
+  UNSPEC_LASX_XVFTINTRML_L_S
+  UNSPEC_LASX_XVFTINTRNEL_L_S
+  UNSPEC_LASX_XVFTINTRNEH_L_S
+  UNSPEC_LASX_XVFRINTRNE_S
+  UNSPEC_LASX_XVFRINTRNE_D
+  UNSPEC_LASX_XVFRINTRZ_S
+  UNSPEC_LASX_XVFRINTRZ_D
+  UNSPEC_LASX_XVFRINTRP_S
+  UNSPEC_LASX_XVFRINTRP_D
+  UNSPEC_LASX_XVFRINTRM_S
+  UNSPEC_LASX_XVFRINTRM_D
+  UNSPEC_LASX_XVREPLVE0_Q
+  UNSPEC_LASX_XVPERM_W
+  UNSPEC_LASX_XVPERMI_Q
+  UNSPEC_LASX_XVPERMI_D
+
+  UNSPEC_LASX_XVADDWEV
+  UNSPEC_LASX_XVADDWEV2
+  UNSPEC_LASX_XVADDWEV3
+  UNSPEC_LASX_XVSUBWEV
+  UNSPEC_LASX_XVSUBWEV2
+  UNSPEC_LASX_XVMULWEV
+  UNSPEC_LASX_XVMULWEV2
+  UNSPEC_LASX_XVMULWEV3
+  UNSPEC_LASX_XVADDWOD
+  UNSPEC_LASX_XVADDWOD2
+  UNSPEC_LASX_XVADDWOD3
+  UNSPEC_LASX_XVSUBWOD
+  UNSPEC_LASX_XVSUBWOD2
+  UNSPEC_LASX_XVMULWOD
+  UNSPEC_LASX_XVMULWOD2
+  UNSPEC_LASX_XVMULWOD3
+  UNSPEC_LASX_XVMADDWEV
+  UNSPEC_LASX_XVMADDWEV2
+  UNSPEC_LASX_XVMADDWEV3
+  UNSPEC_LASX_XVMADDWOD
+  UNSPEC_LASX_XVMADDWOD2
+  UNSPEC_LASX_XVMADDWOD3
+  UNSPEC_LASX_XVHADDW_Q_D
+  UNSPEC_LASX_XVHSUBW_Q_D
+  UNSPEC_LASX_XVHADDW_QU_DU
+  UNSPEC_LASX_XVHSUBW_QU_DU
+  UNSPEC_LASX_XVROTR
+  UNSPEC_LASX_XVADD_Q
+  UNSPEC_LASX_XVSUB_Q
+  UNSPEC_LASX_XVREPLVE
+  UNSPEC_LASX_XVSHUF4
+  UNSPEC_LASX_XVMSKGEZ
+  UNSPEC_LASX_XVMSKNZ
+  UNSPEC_LASX_XVEXTH_Q_D
+  UNSPEC_LASX_XVEXTH_QU_DU
+  UNSPEC_LASX_XVEXTL_Q_D
+  UNSPEC_LASX_XVSRLNI
+  UNSPEC_LASX_XVSRLRNI
+  UNSPEC_LASX_XVSSRLNI
+  UNSPEC_LASX_XVSSRLNI2
+  UNSPEC_LASX_XVSSRLRNI
+  UNSPEC_LASX_XVSSRLRNI2
+  UNSPEC_LASX_XVSRANI
+  UNSPEC_LASX_XVSRARNI
+  UNSPEC_LASX_XVSSRANI
+  UNSPEC_LASX_XVSSRANI2
+  UNSPEC_LASX_XVSSRARNI
+  UNSPEC_LASX_XVSSRARNI2
+  UNSPEC_LASX_XVPERMI
+  UNSPEC_LASX_XVINSVE0
+  UNSPEC_LASX_XVPICKVE
+  UNSPEC_LASX_XVSSRLN
+  UNSPEC_LASX_XVSSRLRN
+  UNSPEC_LASX_XVEXTL_QU_DU
+  UNSPEC_LASX_XVLDI
+  UNSPEC_LASX_XVLDX
+  UNSPEC_LASX_XVSTX
+])
+
+;; All vector modes with 256 bits.
+(define_mode_iterator LASX [V4DF V8SF V4DI V8SI V16HI V32QI])
+
+;; Same as LASX.  Used by vcond to iterate two modes.
+(define_mode_iterator LASX_2 [V4DF V8SF V4DI V8SI V16HI V32QI])
+
+;; Only used for splitting insert_d and copy_{u,s}.d.
+(define_mode_iterator LASX_D [V4DI V4DF])
+
+;; Only used for splitting insert_d and copy_{u,s}.d.
+(define_mode_iterator LASX_WD [V4DI V4DF V8SI V8SF])
+
+;; Only used for copy256_{u,s}.w.
+(define_mode_iterator LASX_W    [V8SI V8SF])
+
+;; Only integer modes in LASX.
+(define_mode_iterator ILASX [V4DI V8SI V16HI V32QI])
+
+;; As ILASX but excludes V32QI.
+(define_mode_iterator ILASX_DWH [V4DI V8SI V16HI])
+
+;; As LASX but excludes V32QI.
+(define_mode_iterator LASX_DWH [V4DF V8SF V4DI V8SI V16HI])
+
+;; As ILASX but excludes V4DI.
+(define_mode_iterator ILASX_WHB [V8SI V16HI V32QI])
+
+;; Only integer modes equal or larger than a word.
+(define_mode_iterator ILASX_DW  [V4DI V8SI])
+
+;; Only integer modes smaller than a word.
+(define_mode_iterator ILASX_HB  [V16HI V32QI])
+
+;; Only floating-point modes in LASX.
+(define_mode_iterator FLASX  [V4DF V8SF])
+
+;; Only used for immediate set shuffle elements instruction.
+(define_mode_iterator LASX_WHB_W [V8SI V16HI V32QI V8SF])
+
+;; The attribute gives the integer vector mode with same size in Loongson ASX.
+(define_mode_attr VIMODE256
+  [(V4DF "V4DI")
+   (V8SF "V8SI")
+   (V4DI "V4DI")
+   (V8SI "V8SI")
+   (V16HI "V16HI")
+   (V32QI "V32QI")])
+
+;;attribute gives half modes for vector modes.
+;;attribute gives half modes (Same Size) for vector modes.
+(define_mode_attr VHSMODE256
+  [(V16HI "V32QI")
+   (V8SI "V16HI")
+   (V4DI "V8SI")])
+
+;;attribute gives half modes  for vector modes.
+(define_mode_attr VHMODE256
+  [(V32QI "V16QI")
+   (V16HI "V8HI")
+   (V8SI "V4SI")
+   (V4DI "V2DI")])
+
+;;attribute gives half float modes for vector modes.
+(define_mode_attr VFHMODE256
+   [(V8SF "V4SF")
+   (V4DF "V2DF")])
+
+;; The attribute gives double modes for vector modes in LASX.
+(define_mode_attr VDMODE256
+  [(V8SI "V4DI")
+   (V16HI "V8SI")
+   (V32QI "V16HI")])
+
+;; extended from VDMODE256
+(define_mode_attr VDMODEEXD256
+  [(V4DI "V4DI")
+   (V8SI "V4DI")
+   (V16HI "V8SI")
+   (V32QI "V16HI")])
+
+;; The attribute gives half modes with same number of elements for vector modes.
+(define_mode_attr VTRUNCMODE256
+  [(V16HI "V16QI")
+   (V8SI "V8HI")
+   (V4DI "V4SI")])
+
+;; Double-sized Vector MODE with same elemet type. "Vector, Enlarged-MODE"
+(define_mode_attr VEMODE256
+  [(V8SF "V16SF")
+   (V8SI "V16SI")
+   (V4DF "V8DF")
+   (V4DI "V8DI")])
+
+;; This attribute gives the mode of the result for "copy_s_b, copy_u_b" etc.
+(define_mode_attr VRES256
+  [(V4DF "DF")
+   (V8SF "SF")
+   (V4DI "DI")
+   (V8SI "SI")
+   (V16HI "SI")
+   (V32QI "SI")])
+
+;; Only used with LASX_D iterator.
+(define_mode_attr lasx_d
+  [(V4DI "reg_or_0")
+   (V4DF "register")])
+
+;; This attribute gives the 256 bit integer vector mode with same size.
+(define_mode_attr mode256_i
+  [(V4DF "v4di")
+   (V8SF "v8si")
+   (V4DI "v4di")
+   (V8SI "v8si")
+   (V16HI "v16hi")
+   (V32QI "v32qi")])
+
+
+;; This attribute gives the 256 bit float vector mode with same size.
+(define_mode_attr mode256_f
+  [(V4DF "v4df")
+   (V8SF "v8sf")
+   (V4DI "v4df")
+   (V8SI "v8sf")])
+
+ ;; This attribute gives suffix for LASX instructions.  HOW?
+(define_mode_attr lasxfmt
+  [(V4DF "d")
+   (V8SF "w")
+   (V4DI "d")
+   (V8SI "w")
+   (V16HI "h")
+   (V32QI "b")])
+
+(define_mode_attr flasxfmt
+  [(V4DF "d")
+   (V8SF "s")])
+
+(define_mode_attr lasxfmt_u
+  [(V4DF "du")
+   (V8SF "wu")
+   (V4DI "du")
+   (V8SI "wu")
+   (V16HI "hu")
+   (V32QI "bu")])
+
+(define_mode_attr ilasxfmt
+  [(V4DF "l")
+   (V8SF "w")])
+
+(define_mode_attr ilasxfmt_u
+  [(V4DF "lu")
+   (V8SF "wu")])
+
+;; This attribute gives suffix for integers in VHMODE256.
+(define_mode_attr hlasxfmt
+  [(V4DI "w")
+   (V8SI "h")
+   (V16HI "b")])
+
+(define_mode_attr hlasxfmt_u
+  [(V4DI "wu")
+   (V8SI "hu")
+   (V16HI "bu")])
+
+;; This attribute gives suffix for integers in VHSMODE256.
+(define_mode_attr hslasxfmt
+  [(V4DI "w")
+   (V8SI "h")
+   (V16HI "b")])
+
+;; This attribute gives define_insn suffix for LASX instructions that need
+;; distinction between integer and floating point.
+(define_mode_attr lasxfmt_f
+  [(V4DF "d_f")
+   (V8SF "w_f")
+   (V4DI "d")
+   (V8SI "w")
+   (V16HI "h")
+   (V32QI "b")])
+
+(define_mode_attr flasxfmt_f
+  [(V4DF "d_f")
+   (V8SF "s_f")
+   (V4DI "d")
+   (V8SI "w")
+   (V16HI "h")
+   (V32QI "b")])
+
+;; This attribute gives define_insn suffix for LASX instructions that need
+;; distinction between integer and floating point.
+(define_mode_attr lasxfmt_f_wd
+  [(V4DF "d_f")
+   (V8SF "w_f")
+   (V4DI "d")
+   (V8SI "w")])
+
+;; This attribute gives suffix for integers in VHMODE256.
+(define_mode_attr dlasxfmt
+  [(V8SI "d")
+   (V16HI "w")
+   (V32QI "h")])
+
+(define_mode_attr dlasxfmt_u
+  [(V8SI "du")
+   (V16HI "wu")
+   (V32QI "hu")])
+
+;; for VDMODEEXD256
+(define_mode_attr dlasxqfmt
+  [(V4DI "q")
+   (V8SI "d")
+   (V16HI "w")
+   (V32QI "h")])
+
+;; This is used to form an immediate operand constraint using
+;; "const_<indeximm256>_operand".
+(define_mode_attr indeximm256
+  [(V4DF "0_to_3")
+   (V8SF "0_to_7")
+   (V4DI "0_to_3")
+   (V8SI "0_to_7")
+   (V16HI "uimm4")
+   (V32QI "uimm5")])
+
+;; This is used to form an immediate operand constraint using to ref high half
+;; "const_<indeximm_hi>_operand".
+(define_mode_attr indeximm_hi
+  [(V4DF "2_or_3")
+   (V8SF "4_to_7")
+   (V4DI "2_or_3")
+   (V8SI "4_to_7")
+   (V16HI "8_to_15")
+   (V32QI "16_to_31")])
+
+;; This is used to form an immediate operand constraint using to ref low half
+;; "const_<indeximm_lo>_operand".
+(define_mode_attr indeximm_lo
+  [(V4DF "0_or_1")
+   (V8SF "0_to_3")
+   (V4DI "0_or_1")
+   (V8SI "0_to_3")
+   (V16HI "uimm3")
+   (V32QI "uimm4")])
+
+;; This attribute represents bitmask needed for vec_merge using in lasx
+;; "const_<bitmask256>_operand".
+(define_mode_attr bitmask256
+  [(V4DF "exp_4")
+   (V8SF "exp_8")
+   (V4DI "exp_4")
+   (V8SI "exp_8")
+   (V16HI "exp_16")
+   (V32QI "exp_32")])
+
+;; This attribute represents bitmask needed for vec_merge using to ref low half
+;; "const_<bitmask_lo>_operand".
+(define_mode_attr bitmask_lo
+  [(V4DF "exp_2")
+   (V8SF "exp_4")
+   (V4DI "exp_2")
+   (V8SI "exp_4")
+   (V16HI "exp_8")
+   (V32QI "exp_16")])
+
+
+;; This attribute is used to form an immediate operand constraint using
+;; "const_<bitimm256>_operand".
+(define_mode_attr bitimm256
+  [(V32QI "uimm3")
+   (V16HI  "uimm4")
+   (V8SI  "uimm5")
+   (V4DI  "uimm6")])
+
+
+(define_mode_attr d2lasxfmt
+  [(V8SI "q")
+   (V16HI "d")
+   (V32QI "w")])
+
+(define_mode_attr d2lasxfmt_u
+  [(V8SI "qu")
+   (V16HI "du")
+   (V32QI "wu")])
+
+(define_mode_attr VD2MODE256
+  [(V8SI "V4DI")
+   (V16HI "V4DI")
+   (V32QI "V8SI")])
+
+(define_mode_attr lasxfmt_wd
+  [(V4DI "d")
+   (V8SI "w")
+   (V16HI "w")
+   (V32QI "w")])
+
+(define_int_iterator FRINT256_S [UNSPEC_LASX_XVFRINTRP_S
+			       UNSPEC_LASX_XVFRINTRZ_S
+			       UNSPEC_LASX_XVFRINT
+			       UNSPEC_LASX_XVFRINTRM_S])
+
+(define_int_iterator FRINT256_D [UNSPEC_LASX_XVFRINTRP_D
+			       UNSPEC_LASX_XVFRINTRZ_D
+			       UNSPEC_LASX_XVFRINT
+			       UNSPEC_LASX_XVFRINTRM_D])
+
+(define_int_attr frint256_pattern_s
+  [(UNSPEC_LASX_XVFRINTRP_S  "ceil")
+   (UNSPEC_LASX_XVFRINTRZ_S  "btrunc")
+   (UNSPEC_LASX_XVFRINT	     "rint")
+   (UNSPEC_LASX_XVFRINTRM_S  "floor")])
+
+(define_int_attr frint256_pattern_d
+  [(UNSPEC_LASX_XVFRINTRP_D  "ceil")
+   (UNSPEC_LASX_XVFRINTRZ_D  "btrunc")
+   (UNSPEC_LASX_XVFRINT	     "rint")
+   (UNSPEC_LASX_XVFRINTRM_D  "floor")])
+
+(define_int_attr frint256_suffix
+  [(UNSPEC_LASX_XVFRINTRP_S  "rp")
+   (UNSPEC_LASX_XVFRINTRP_D  "rp")
+   (UNSPEC_LASX_XVFRINTRZ_S  "rz")
+   (UNSPEC_LASX_XVFRINTRZ_D  "rz")
+   (UNSPEC_LASX_XVFRINT	     "")
+   (UNSPEC_LASX_XVFRINTRM_S  "rm")
+   (UNSPEC_LASX_XVFRINTRM_D  "rm")])
+
+(define_expand "vec_init<mode><unitmode>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand:LASX 1 "")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vector_init (operands[0], operands[1]);
+  DONE;
+})
+
+(define_expand "vec_initv32qiv16qi"
+ [(match_operand:V32QI 0 "register_operand")
+  (match_operand:V16QI 1 "")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vector_group_init (operands[0], operands[1]);
+  DONE;
+})
+
+;; FIXME: Delete.
+(define_insn "vec_pack_trunc_<mode>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(vec_concat:<VHSMODE256>
+	  (truncate:<VTRUNCMODE256>
+	    (match_operand:ILASX_DWH 1 "register_operand" "f"))
+	  (truncate:<VTRUNCMODE256>
+	    (match_operand:ILASX_DWH 2 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvpickev.<hslasxfmt>\t%u0,%u2,%u1\n\txvpermi.d\t%u0,%u0,0xd8"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "8")])
+
+(define_expand "vec_unpacks_hi_v8sf"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	  (vec_select:V4SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_dup 2))))]
+  "ISA_HAS_LASX"
+{
+  operands[2] = loongarch_lsx_vec_parallel_const_half (V8SFmode,
+						       true/*high_p*/);
+})
+
+(define_expand "vec_unpacks_lo_v8sf"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	  (vec_select:V4SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_dup 2))))]
+  "ISA_HAS_LASX"
+{
+  operands[2] = loongarch_lsx_vec_parallel_const_half (V8SFmode,
+						       false/*high_p*/);
+})
+
+(define_expand "vec_unpacks_hi_<mode>"
+  [(match_operand:<VDMODE256> 0 "register_operand")
+   (match_operand:ILASX_WHB 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/,
+			       true/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacks_lo_<mode>"
+  [(match_operand:<VDMODE256> 0 "register_operand")
+   (match_operand:ILASX_WHB 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/, false/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacku_hi_<mode>"
+  [(match_operand:<VDMODE256> 0 "register_operand")
+   (match_operand:ILASX_WHB 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vec_unpack (operands, true/*unsigned_p*/, true/*high_p*/);
+  DONE;
+})
+
+(define_expand "vec_unpacku_lo_<mode>"
+  [(match_operand:<VDMODE256> 0 "register_operand")
+   (match_operand:ILASX_WHB 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vec_unpack (operands, true/*unsigned_p*/, false/*high_p*/);
+  DONE;
+})
+
+(define_insn "lasx_xvinsgr2vr_<lasxfmt_f_wd>"
+  [(set (match_operand:ILASX_DW 0 "register_operand" "=f")
+	(vec_merge:ILASX_DW
+	  (vec_duplicate:ILASX_DW
+	    (match_operand:<UNITMODE> 1 "reg_or_0_operand" "rJ"))
+	  (match_operand:ILASX_DW 2 "register_operand" "0")
+	  (match_operand 3 "const_<bitmask256>_operand" "")))]
+  "ISA_HAS_LASX"
+{
+#if 0
+  if (!TARGET_64BIT && (<MODE>mode == V4DImode || <MODE>mode == V4DFmode))
+    return "#";
+  else
+#endif
+    return "xvinsgr2vr.<lasxfmt>\t%u0,%z1,%y3";
+}
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "vec_concatv4di"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(vec_concat:V4DI
+	  (match_operand:V2DI 1 "register_operand" "0")
+	  (match_operand:V2DI 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "vec_concatv8si"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_concat:V8SI
+	  (match_operand:V4SI 1 "register_operand" "0")
+	  (match_operand:V4SI 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "vec_concatv16hi"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_concat:V16HI
+	  (match_operand:V8HI 1 "register_operand" "0")
+	  (match_operand:V8HI 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "vec_concatv32qi"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_concat:V32QI
+	  (match_operand:V16QI 1 "register_operand" "0")
+	  (match_operand:V16QI 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "vec_concatv4df"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(vec_concat:V4DF
+	  (match_operand:V2DF 1 "register_operand" "0")
+	  (match_operand:V2DF 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "vec_concatv8sf"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_concat:V8SF
+	  (match_operand:V4SF 1 "register_operand" "0")
+	  (match_operand:V4SF 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return "xvpermi.q\t%u0,%u2,0x20";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+;; xshuf.w
+(define_insn "lasx_xvperm_<lasxfmt_f_wd>"
+  [(set (match_operand:LASX_W 0 "register_operand" "=f")
+	(unspec:LASX_W
+	  [(match_operand:LASX_W 1 "nonimmediate_operand" "f")
+	   (match_operand:V8SI 2 "register_operand" "f")]
+	  UNSPEC_LASX_XVPERM_W))]
+  "ISA_HAS_LASX"
+  "xvperm.w\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+;; xvpermi.d
+(define_insn "lasx_xvpermi_d_<LASX:mode>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	  (unspec:LASX
+	    [(match_operand:LASX 1 "register_operand" "f")
+	     (match_operand:SI     2 "const_uimm8_operand")]
+	    UNSPEC_LASX_XVPERMI_D))]
+  "ISA_HAS_LASX"
+  "xvpermi.d\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpermi_d_<mode>_1"
+  [(set (match_operand:LASX_D 0 "register_operand" "=f")
+	(vec_select:LASX_D
+	 (match_operand:LASX_D 1 "register_operand" "f")
+	 (parallel [(match_operand 2 "const_0_to_3_operand")
+		    (match_operand 3 "const_0_to_3_operand")
+		    (match_operand 4 "const_0_to_3_operand")
+		    (match_operand 5 "const_0_to_3_operand")])))]
+  "ISA_HAS_LASX"
+{
+  int mask = 0;
+  mask |= INTVAL (operands[2]) << 0;
+  mask |= INTVAL (operands[3]) << 2;
+  mask |= INTVAL (operands[4]) << 4;
+  mask |= INTVAL (operands[5]) << 6;
+  operands[2] = GEN_INT (mask);
+  return "xvpermi.d\t%u0,%u1,%2";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+;; xvpermi.q
+(define_insn "lasx_xvpermi_q_<LASX:mode>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(unspec:LASX
+	  [(match_operand:LASX 1 "register_operand" "0")
+	   (match_operand:LASX 2 "register_operand" "f")
+	   (match_operand     3 "const_uimm8_operand")]
+	  UNSPEC_LASX_XVPERMI_Q))]
+  "ISA_HAS_LASX"
+  "xvpermi.q\t%u0,%u2,%3"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpickve2gr_d<u>"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(any_extend:DI
+	  (vec_select:DI
+	    (match_operand:V4DI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_0_to_3_operand" "")]))))]
+  "ISA_HAS_LASX"
+  "xvpickve2gr.d<u>\t%0,%u1,%2"
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "V4DI")])
+
+(define_expand "vec_set<mode>"
+  [(match_operand:ILASX_DW 0 "register_operand")
+   (match_operand:<UNITMODE> 1 "reg_or_0_operand")
+   (match_operand 2 "const_<indeximm256>_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lasx_xvinsgr2vr_<lasxfmt_f_wd> (operands[0], operands[1],
+                      operands[0], index));
+  DONE;
+})
+
+(define_expand "vec_set<mode>"
+  [(match_operand:FLASX 0 "register_operand")
+   (match_operand:<UNITMODE> 1 "reg_or_0_operand")
+   (match_operand 2 "const_<indeximm256>_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lasx_xvinsve0_<lasxfmt_f>_scalar (operands[0], operands[1],
+                      operands[0], index));
+  DONE;
+})
+
+(define_expand "vec_extract<mode><unitmode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:LASX 1 "register_operand")
+   (match_operand 2 "const_<indeximm256>_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vector_extract (operands[0], operands[1],
+      INTVAL (operands[2]));
+  DONE;
+})
+
+(define_expand "vec_perm<mode>"
+ [(match_operand:LASX 0 "register_operand")
+  (match_operand:LASX 1 "register_operand")
+  (match_operand:LASX 2 "register_operand")
+  (match_operand:<VIMODE256> 3 "register_operand")]
+  "ISA_HAS_LASX"
+{
+   loongarch_expand_vec_perm_1 (operands);
+   DONE;
+})
+
+;; FIXME: 256??
+(define_expand "vcondu<LASX:mode><ILASX:mode>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand:LASX 1 "reg_or_m1_operand")
+   (match_operand:LASX 2 "reg_or_0_operand")
+   (match_operator 3 ""
+    [(match_operand:ILASX 4 "register_operand")
+     (match_operand:ILASX 5 "register_operand")])]
+  "ISA_HAS_LASX
+   && (GET_MODE_NUNITS (<LASX:MODE>mode)
+       == GET_MODE_NUNITS (<ILASX:MODE>mode))"
+{
+  loongarch_expand_vec_cond_expr (<LASX:MODE>mode, <LASX:VIMODE256>mode,
+				  operands);
+  DONE;
+})
+
+;; FIXME: 256??
+(define_expand "vcond<LASX:mode><LASX_2:mode>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand:LASX 1 "reg_or_m1_operand")
+   (match_operand:LASX 2 "reg_or_0_operand")
+   (match_operator 3 ""
+     [(match_operand:LASX_2 4 "register_operand")
+      (match_operand:LASX_2 5 "register_operand")])]
+  "ISA_HAS_LASX
+   && (GET_MODE_NUNITS (<LASX:MODE>mode)
+       == GET_MODE_NUNITS (<LASX_2:MODE>mode))"
+{
+  loongarch_expand_vec_cond_expr (<LASX:MODE>mode, <LASX:VIMODE256>mode,
+				  operands);
+  DONE;
+})
+
+;; Same as vcond_
+(define_expand "vcond_mask_<ILASX:mode><ILASX:mode>"
+  [(match_operand:ILASX 0 "register_operand")
+   (match_operand:ILASX 1 "reg_or_m1_operand")
+   (match_operand:ILASX 2 "reg_or_0_operand")
+   (match_operand:ILASX 3 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  loongarch_expand_vec_cond_mask_expr (<ILASX:MODE>mode,
+				      <ILASX:VIMODE256>mode, operands);
+  DONE;
+})
+
+(define_expand "lasx_xvrepli<mode>"
+  [(match_operand:ILASX 0 "register_operand")
+   (match_operand 1 "const_imm10_operand")]
+  "ISA_HAS_LASX"
+{
+  if (<MODE>mode == V32QImode)
+    operands[1] = GEN_INT (trunc_int_for_mode (INTVAL (operands[1]),
+					       <UNITMODE>mode));
+  emit_move_insn (operands[0],
+  loongarch_gen_const_int_vector (<MODE>mode, INTVAL (operands[1])));
+  DONE;
+})
+
+(define_expand "mov<mode>"
+  [(set (match_operand:LASX 0)
+	(match_operand:LASX 1))]
+  "ISA_HAS_LASX"
+{
+  if (loongarch_legitimize_move (<MODE>mode, operands[0], operands[1]))
+    DONE;
+})
+
+
+(define_expand "movmisalign<mode>"
+  [(set (match_operand:LASX 0)
+	(match_operand:LASX 1))]
+  "ISA_HAS_LASX"
+{
+  if (loongarch_legitimize_move (<MODE>mode, operands[0], operands[1]))
+    DONE;
+})
+
+;; 256-bit LASX modes can only exist in LASX registers or memory.
+(define_insn "mov<mode>_lasx"
+  [(set (match_operand:LASX 0 "nonimmediate_operand" "=f,f,R,*r,*f")
+	(match_operand:LASX 1 "move_operand" "fYGYI,R,f,*f,*r"))]
+  "ISA_HAS_LASX"
+  { return loongarch_output_move (operands[0], operands[1]); }
+  [(set_attr "type" "simd_move,simd_load,simd_store,simd_copy,simd_insert")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "8,4,4,4,4")])
+
+
+(define_split
+  [(set (match_operand:LASX 0 "nonimmediate_operand")
+	(match_operand:LASX 1 "move_operand"))]
+  "reload_completed && ISA_HAS_LASX
+   && loongarch_split_move_insn_p (operands[0], operands[1])"
+  [(const_int 0)]
+{
+  loongarch_split_move_insn (operands[0], operands[1], curr_insn);
+  DONE;
+})
+
+;; Offset load
+(define_expand "lasx_mxld_<lasxfmt_f>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq10<lasxfmt>_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+				      INTVAL (operands[2]));
+  loongarch_emit_move (operands[0], gen_rtx_MEM (<MODE>mode, addr));
+  DONE;
+})
+
+;; Offset store
+(define_expand "lasx_mxst_<lasxfmt_f>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq10<lasxfmt>_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (gen_rtx_MEM (<MODE>mode, addr), operands[0]);
+  DONE;
+})
+
+;; LASX
+(define_insn "add<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f,f")
+	(plus:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_ximm5_operand" "f,Unv5,Uuv5")))]
+  "ISA_HAS_LASX"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "xvadd.<lasxfmt>\t%u0,%u1,%u2";
+    case 1:
+      {
+	HOST_WIDE_INT val = INTVAL (CONST_VECTOR_ELT (operands[2], 0));
+
+	operands[2] = GEN_INT (-val);
+	return "xvsubi.<lasxfmt_u>\t%u0,%u1,%d2";
+      }
+    case 2:
+      return "xvaddi.<lasxfmt_u>\t%u0,%u1,%E2";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "alu_type" "simd_add")
+   (set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(minus:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvsub.<lasxfmt>\t%u0,%u1,%u2
+   xvsubi.<lasxfmt_u>\t%u0,%u1,%E2"
+  [(set_attr "alu_type" "simd_add")
+   (set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(mult:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		    (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvmul.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvmadd_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(plus:ILASX (mult:ILASX (match_operand:ILASX 2 "register_operand" "f")
+				(match_operand:ILASX 3 "register_operand" "f"))
+		    (match_operand:ILASX 1 "register_operand" "0")))]
+  "ISA_HAS_LASX"
+  "xvmadd.<lasxfmt>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+
+
+(define_insn "lasx_xvmsub_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(minus:ILASX (match_operand:ILASX 1 "register_operand" "0")
+		     (mult:ILASX (match_operand:ILASX 2 "register_operand" "f")
+				 (match_operand:ILASX 3 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvmsub.<lasxfmt>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_mul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "div<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(div:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		   (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_lsx_output_division ("xvdiv.<lasxfmt>\t%u0,%u1,%u2",
+					operands);
+}
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "udiv<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(udiv:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		    (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_lsx_output_division ("xvdiv.<lasxfmt_u>\t%u0,%u1,%u2",
+					operands);
+}
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mod<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(mod:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		   (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_lsx_output_division ("xvmod.<lasxfmt>\t%u0,%u1,%u2",
+					operands);
+}
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umod<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(umod:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		    (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_lsx_output_division ("xvmod.<lasxfmt_u>\t%u0,%u1,%u2",
+					operands);
+}
+  [(set_attr "type" "simd_div")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "xor<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f,f")
+	(xor:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  "ISA_HAS_LASX"
+  "@
+   xvxor.v\t%u0,%u1,%u2
+   xvbitrevi.%v0\t%u0,%u1,%V2
+   xvxori.b\t%u0,%u1,%B2"
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "ior<mode>3"
+  [(set (match_operand:LASX 0 "register_operand" "=f,f,f")
+	(ior:LASX
+	  (match_operand:LASX 1 "register_operand" "f,f,f")
+	  (match_operand:LASX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  "ISA_HAS_LASX"
+  "@
+   xvor.v\t%u0,%u1,%u2
+   xvbitseti.%v0\t%u0,%u1,%V2
+   xvori.b\t%u0,%u1,%B2"
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "and<mode>3"
+  [(set (match_operand:LASX 0 "register_operand" "=f,f,f")
+	(and:LASX
+	  (match_operand:LASX 1 "register_operand" "f,f,f")
+	  (match_operand:LASX 2 "reg_or_vector_same_val_operand" "f,YZ,Urv8")))]
+  "ISA_HAS_LASX"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "xvand.v\t%u0,%u1,%u2";
+    case 1:
+      {
+	rtx elt0 = CONST_VECTOR_ELT (operands[2], 0);
+	unsigned HOST_WIDE_INT val = ~UINTVAL (elt0);
+	operands[2] = loongarch_gen_const_int_vector (<MODE>mode, val & (-val));
+	return "xvbitclri.%v0\t%u0,%u1,%V2";
+      }
+    case 2:
+      return "xvandi.b\t%u0,%u1,%B2";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "simd_logic,simd_bit,simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "one_cmpl<mode>2"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(not:ILASX (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvnor.v\t%u0,%u1,%u1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V32QI")])
+
+;; LASX
+(define_insn "vlshr<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(lshiftrt:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LASX"
+  "@
+   xvsrl.<lasxfmt>\t%u0,%u1,%u2
+   xvsrli.<lasxfmt>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; LASX ">>"
+(define_insn "vashr<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(ashiftrt:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LASX"
+  "@
+   xvsra.<lasxfmt>\t%u0,%u1,%u2
+   xvsrai.<lasxfmt>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; LASX "<<"
+(define_insn "vashl<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(ashift:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
+  "ISA_HAS_LASX"
+  "@
+   xvsll.<lasxfmt>\t%u0,%u1,%u2
+   xvslli.<lasxfmt>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "add<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(plus:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		    (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfadd.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sub<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(minus:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		     (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfsub.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "mul<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(mult:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		    (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfmul.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fmul")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "div<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(div:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		   (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfdiv.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fma<mode>4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(fma:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		   (match_operand:FLASX 2 "register_operand" "f")
+		   (match_operand:FLASX 3 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfmadd.<flasxfmt>\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fnma<mode>4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(fma:FLASX (neg:FLASX (match_operand:FLASX 1 "register_operand" "f"))
+		   (match_operand:FLASX 2 "register_operand" "f")
+		   (match_operand:FLASX 3 "register_operand" "0")))]
+  "ISA_HAS_LASX"
+  "xvfnmsub.<flasxfmt>\t%u0,%u1,%u2,%u0"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "sqrt<mode>2"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(sqrt:FLASX (match_operand:FLASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfsqrt.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvadda_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(plus:ILASX (abs:ILASX (match_operand:ILASX 1 "register_operand" "f"))
+		    (abs:ILASX (match_operand:ILASX 2 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvadda.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "ssadd<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(ss_plus:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvsadd.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "usadd<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(us_plus:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvsadd.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvabsd_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVABSD_S))]
+  "ISA_HAS_LASX"
+  "xvabsd.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvabsd_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVABSD_U))]
+  "ISA_HAS_LASX"
+  "xvabsd.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvavg_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVAVG_S))]
+  "ISA_HAS_LASX"
+  "xvavg.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvavg_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVAVG_U))]
+  "ISA_HAS_LASX"
+  "xvavg.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvavgr_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVAVGR_S))]
+  "ISA_HAS_LASX"
+  "xvavgr.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvavgr_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVAVGR_U))]
+  "ISA_HAS_LASX"
+  "xvavgr.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitclr_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVBITCLR))]
+  "ISA_HAS_LASX"
+  "xvbitclr.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitclri_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		      UNSPEC_LASX_XVBITCLRI))]
+  "ISA_HAS_LASX"
+  "xvbitclri.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitrev_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVBITREV))]
+  "ISA_HAS_LASX"
+  "xvbitrev.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitrevi_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		     UNSPEC_LASX_XVBITREVI))]
+  "ISA_HAS_LASX"
+  "xvbitrevi.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitsel_<lasxfmt_f>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(ior:LASX (and:LASX (not:LASX
+			      (match_operand:LASX 3 "register_operand" "f"))
+			      (match_operand:LASX 1 "register_operand" "f"))
+		  (and:LASX (match_dup 3)
+			    (match_operand:LASX 2 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvbitsel.v\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_bitmov")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitseli_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(ior:V32QI (and:V32QI (not:V32QI
+				(match_operand:V32QI 1 "register_operand" "0"))
+			      (match_operand:V32QI 2 "register_operand" "f"))
+		   (and:V32QI (match_dup 1)
+			      (match_operand:V32QI 3 "const_vector_same_val_operand" "Urv8"))))]
+  "ISA_HAS_LASX"
+  "xvbitseli.b\t%u0,%u2,%B3"
+  [(set_attr "type" "simd_bitmov")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvbitset_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVBITSET))]
+  "ISA_HAS_LASX"
+  "xvbitset.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbitseti_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		      UNSPEC_LASX_XVBITSETI))]
+  "ISA_HAS_LASX"
+  "xvbitseti.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvs<ICC:icc>_<ILASX:lasxfmt><cmpi_1>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(ICC:ILASX
+	  (match_operand:ILASX 1 "register_operand" "f,f")
+	  (match_operand:ILASX 2 "reg_or_vector_same_<ICC:cmpi>imm5_operand" "f,U<ICC:cmpi>v5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvs<ICC:icc>.<ILASX:lasxfmt><cmpi_1>\t%u0,%u1,%u2
+   xvs<ICC:icci>.<ILASX:lasxfmt><cmpi_1>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "vec_cmp<mode><mode256_i>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand")
+	(match_operator 1 ""
+	  [(match_operand:LASX 2 "register_operand")
+	   (match_operand:LASX 3 "register_operand")]))]
+  "ISA_HAS_LASX"
+{
+  bool ok = loongarch_expand_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmpu<ILASX:mode><mode256_i>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand")
+	(match_operator 1 ""
+	  [(match_operand:ILASX 2 "register_operand")
+	   (match_operand:ILASX 3 "register_operand")]))]
+  "ISA_HAS_LASX"
+{
+  bool ok = loongarch_expand_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_insn "lasx_xvfclass_<flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")]
+			    UNSPEC_LASX_XVFCLASS))]
+  "ISA_HAS_LASX"
+  "xvfclass.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fclass")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfcmp_caf_<flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")
+			     (match_operand:FLASX 2 "register_operand" "f")]
+			    UNSPEC_LASX_XVFCMP_CAF))]
+  "ISA_HAS_LASX"
+  "xvfcmp.caf.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfcmp_cune_<FLASX:flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")
+			     (match_operand:FLASX 2 "register_operand" "f")]
+			    UNSPEC_LASX_XVFCMP_CUNE))]
+  "ISA_HAS_LASX"
+  "xvfcmp.cune.<FLASX:flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+
+
+(define_int_iterator FSC256_UNS [UNSPEC_LASX_XVFCMP_SAF UNSPEC_LASX_XVFCMP_SUN
+				 UNSPEC_LASX_XVFCMP_SOR UNSPEC_LASX_XVFCMP_SEQ
+				 UNSPEC_LASX_XVFCMP_SNE UNSPEC_LASX_XVFCMP_SUEQ
+				 UNSPEC_LASX_XVFCMP_SUNE UNSPEC_LASX_XVFCMP_SULE
+				 UNSPEC_LASX_XVFCMP_SULT UNSPEC_LASX_XVFCMP_SLE
+				 UNSPEC_LASX_XVFCMP_SLT])
+
+(define_int_attr fsc256
+  [(UNSPEC_LASX_XVFCMP_SAF  "saf")
+   (UNSPEC_LASX_XVFCMP_SUN  "sun")
+   (UNSPEC_LASX_XVFCMP_SOR  "sor")
+   (UNSPEC_LASX_XVFCMP_SEQ  "seq")
+   (UNSPEC_LASX_XVFCMP_SNE  "sne")
+   (UNSPEC_LASX_XVFCMP_SUEQ "sueq")
+   (UNSPEC_LASX_XVFCMP_SUNE "sune")
+   (UNSPEC_LASX_XVFCMP_SULE "sule")
+   (UNSPEC_LASX_XVFCMP_SULT "sult")
+   (UNSPEC_LASX_XVFCMP_SLE  "sle")
+   (UNSPEC_LASX_XVFCMP_SLT  "slt")])
+
+(define_insn "lasx_xvfcmp_<vfcond:fcc>_<FLASX:flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(vfcond:<VIMODE256> (match_operand:FLASX 1 "register_operand" "f")
+			    (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfcmp.<vfcond:fcc>.<FLASX:flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "lasx_xvfcmp_<fsc256>_<FLASX:flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")
+			     (match_operand:FLASX 2 "register_operand" "f")]
+			    FSC256_UNS))]
+  "ISA_HAS_LASX"
+  "xvfcmp.<fsc256>.<FLASX:flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcmp")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_mode_attr fint256
+  [(V8SF "v8si")
+   (V4DF "v4di")])
+
+(define_mode_attr FINTCNV256
+  [(V8SF "I2S")
+   (V4DF "I2D")])
+
+(define_mode_attr FINTCNV256_2
+  [(V8SF "S2I")
+   (V4DF "D2I")])
+
+(define_insn "float<fint256><FLASX:mode>2"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(float:FLASX (match_operand:<VIMODE256> 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvffint.<flasxfmt>.<ilasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "floatuns<fint256><FLASX:mode>2"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unsigned_float:FLASX
+	  (match_operand:<VIMODE256> 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvffint.<flasxfmt>.<ilasxfmt_u>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256>")
+   (set_attr "mode" "<MODE>")])
+
+(define_mode_attr FFQ256
+  [(V4SF "V16HI")
+   (V2DF "V8SI")])
+
+(define_insn "lasx_xvreplgr2vr_<lasxfmt_f>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(vec_duplicate:ILASX
+	  (match_operand:<UNITMODE> 1 "reg_or_0_operand" "r,J")))]
+  "ISA_HAS_LASX"
+{
+  if (which_alternative == 1)
+    return "xvldi.b\t%u0,0" ;
+
+  if (!TARGET_64BIT && (<MODE>mode == V2DImode || <MODE>mode == V2DFmode))
+    return "#";
+  else
+    return "xvreplgr2vr.<lasxfmt>\t%u0,%z1";
+}
+  [(set_attr "type" "simd_fill")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "8")])
+
+(define_insn "logb<mode>2"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVFLOGB))]
+  "ISA_HAS_LASX"
+  "xvflogb.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_flog2")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "smax<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(smax:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		    (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfmax.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfmaxa_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(if_then_else:FLASX
+	   (gt (abs:FLASX (match_operand:FLASX 1 "register_operand" "f"))
+	       (abs:FLASX (match_operand:FLASX 2 "register_operand" "f")))
+	   (match_dup 1)
+	   (match_dup 2)))]
+  "ISA_HAS_LASX"
+  "xvfmaxa.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(smin:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		    (match_operand:FLASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfmin.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfmina_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(if_then_else:FLASX
+	   (lt (abs:FLASX (match_operand:FLASX 1 "register_operand" "f"))
+	       (abs:FLASX (match_operand:FLASX 2 "register_operand" "f")))
+	   (match_dup 1)
+	   (match_dup 2)))]
+  "ISA_HAS_LASX"
+  "xvfmina.<flasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fminmax")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfrecip_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVFRECIP))]
+  "ISA_HAS_LASX"
+  "xvfrecip.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfrint_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVFRINT))]
+  "ISA_HAS_LASX"
+  "xvfrint.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfrsqrt_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVFRSQRT))]
+  "ISA_HAS_LASX"
+  "xvfrsqrt.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvftint_s_<ilasxfmt>_<flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")]
+			    UNSPEC_LASX_XVFTINT_S))]
+  "ISA_HAS_LASX"
+  "xvftint.<ilasxfmt>.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvftint_u_<ilasxfmt_u>_<flasxfmt>"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")]
+			    UNSPEC_LASX_XVFTINT_U))]
+  "ISA_HAS_LASX"
+  "xvftint.<ilasxfmt_u>.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256_2>")
+   (set_attr "mode" "<MODE>")])
+
+
+
+(define_insn "fix_trunc<FLASX:mode><mode256_i>2"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(fix:<VIMODE256> (match_operand:FLASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvftintrz.<ilasxfmt>.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256_2>")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "fixuns_trunc<FLASX:mode><mode256_i>2"
+  [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
+	(unsigned_fix:<VIMODE256> (match_operand:FLASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvftintrz.<ilasxfmt_u>.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "cnv_mode" "<FINTCNV256_2>")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvh<optab>w_h<u>_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(addsub:V16HI
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)
+			 (const_int 17) (const_int 19)
+			 (const_int 21) (const_int 23)
+			 (const_int 25) (const_int 27)
+			 (const_int 29) (const_int 31)])))
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)
+			 (const_int 16) (const_int 18)
+			 (const_int 20) (const_int 22)
+			 (const_int 24) (const_int 26)
+			 (const_int 28) (const_int 30)])))))]
+  "ISA_HAS_LASX"
+  "xvh<optab>w.h<u>.b<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvh<optab>w_w<u>_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(addsub:V8SI
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LASX"
+  "xvh<optab>w.w<u>.h<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvh<optab>w_d<u>_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(addsub:V4DI
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 1 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LASX"
+  "xvh<optab>w.d<u>.w<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvpackev_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 0)  (const_int 32)
+		     (const_int 2)  (const_int 34)
+		     (const_int 4)  (const_int 36)
+		     (const_int 6)  (const_int 38)
+		     (const_int 8)  (const_int 40)
+		     (const_int 10)  (const_int 42)
+		     (const_int 12)  (const_int 44)
+		     (const_int 14)  (const_int 46)
+		     (const_int 16)  (const_int 48)
+		     (const_int 18)  (const_int 50)
+		     (const_int 20)  (const_int 52)
+		     (const_int 22)  (const_int 54)
+		     (const_int 24)  (const_int 56)
+		     (const_int 26)  (const_int 58)
+		     (const_int 28)  (const_int 60)
+		     (const_int 30)  (const_int 62)])))]
+  "ISA_HAS_LASX"
+  "xvpackev.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+
+(define_insn "lasx_xvpackev_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 0)  (const_int 16)
+		     (const_int 2)  (const_int 18)
+		     (const_int 4)  (const_int 20)
+		     (const_int 6)  (const_int 22)
+		     (const_int 8)  (const_int 24)
+		     (const_int 10) (const_int 26)
+		     (const_int 12) (const_int 28)
+		     (const_int 14) (const_int 30)])))]
+  "ISA_HAS_LASX"
+  "xvpackev.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvpackev_w"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_select:V8SI
+	  (vec_concat:V16SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (match_operand:V8SI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 2) (const_int 10)
+		     (const_int 4) (const_int 12)
+		     (const_int 6) (const_int 14)])))]
+  "ISA_HAS_LASX"
+  "xvpackev.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvpackev_w_f"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_select:V8SF
+	  (vec_concat:V16SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_operand:V8SF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 2) (const_int 10)
+		     (const_int 4) (const_int 12)
+		     (const_int 6) (const_int 14)])))]
+  "ISA_HAS_LASX"
+  "xvpackev.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvilvh_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 8) (const_int 40)
+		     (const_int 9) (const_int 41)
+		     (const_int 10) (const_int 42)
+		     (const_int 11) (const_int 43)
+		     (const_int 12) (const_int 44)
+		     (const_int 13) (const_int 45)
+		     (const_int 14) (const_int 46)
+		     (const_int 15) (const_int 47)
+		     (const_int 24) (const_int 56)
+		     (const_int 25) (const_int 57)
+		     (const_int 26) (const_int 58)
+		     (const_int 27) (const_int 59)
+		     (const_int 28) (const_int 60)
+		     (const_int 29) (const_int 61)
+		     (const_int 30) (const_int 62)
+		     (const_int 31) (const_int 63)])))]
+  "ISA_HAS_LASX"
+  "xvilvh.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvilvh_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 4) (const_int 20)
+		     (const_int 5) (const_int 21)
+		     (const_int 6) (const_int 22)
+		     (const_int 7) (const_int 23)
+		     (const_int 12) (const_int 28)
+		     (const_int 13) (const_int 29)
+		     (const_int 14) (const_int 30)
+		     (const_int 15) (const_int 31)])))]
+  "ISA_HAS_LASX"
+  "xvilvh.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+(define_mode_attr xvilvh_suffix
+  [(V8SI "") (V8SF "_f")
+   (V4DI "") (V4DF "_f")])
+
+(define_insn "lasx_xvilvh_w<xvilvh_suffix>"
+  [(set (match_operand:LASX_W 0 "register_operand" "=f")
+	(vec_select:LASX_W
+	  (vec_concat:<VEMODE256>
+	    (match_operand:LASX_W 1 "register_operand" "f")
+	    (match_operand:LASX_W 2 "register_operand" "f"))
+	  (parallel [(const_int 2) (const_int 10)
+		     (const_int 3) (const_int 11)
+		     (const_int 6) (const_int 14)
+		     (const_int 7) (const_int 15)])))]
+  "ISA_HAS_LASX"
+  "xvilvh.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvilvh_d<xvilvh_suffix>"
+  [(set (match_operand:LASX_D 0 "register_operand" "=f")
+	(vec_select:LASX_D
+	  (vec_concat:<VEMODE256>
+	    (match_operand:LASX_D 1 "register_operand" "f")
+	    (match_operand:LASX_D 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 5)
+		     (const_int 3) (const_int 7)])))]
+  "ISA_HAS_LASX"
+  "xvilvh.d\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpackod_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 1)  (const_int 33)
+		     (const_int 3)  (const_int 35)
+		     (const_int 5)  (const_int 37)
+		     (const_int 7)  (const_int 39)
+		     (const_int 9)  (const_int 41)
+		     (const_int 11)  (const_int 43)
+		     (const_int 13)  (const_int 45)
+		     (const_int 15)  (const_int 47)
+		     (const_int 17)  (const_int 49)
+		     (const_int 19)  (const_int 51)
+		     (const_int 21)  (const_int 53)
+		     (const_int 23)  (const_int 55)
+		     (const_int 25)  (const_int 57)
+		     (const_int 27)  (const_int 59)
+		     (const_int 29)  (const_int 61)
+		     (const_int 31)  (const_int 63)])))]
+  "ISA_HAS_LASX"
+  "xvpackod.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+
+(define_insn "lasx_xvpackod_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 17)
+		     (const_int 3) (const_int 19)
+		     (const_int 5) (const_int 21)
+		     (const_int 7) (const_int 23)
+		     (const_int 9) (const_int 25)
+		     (const_int 11) (const_int 27)
+		     (const_int 13) (const_int 29)
+		     (const_int 15) (const_int 31)])))]
+  "ISA_HAS_LASX"
+  "xvpackod.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+
+(define_insn "lasx_xvpackod_w"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_select:V8SI
+	  (vec_concat:V16SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (match_operand:V8SI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 9)
+		     (const_int 3) (const_int 11)
+		     (const_int 5) (const_int 13)
+		     (const_int 7) (const_int 15)])))]
+  "ISA_HAS_LASX"
+  "xvpackod.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SI")])
+
+
+(define_insn "lasx_xvpackod_w_f"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_select:V8SF
+	  (vec_concat:V16SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_operand:V8SF 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 9)
+		     (const_int 3) (const_int 11)
+		     (const_int 5) (const_int 13)
+		     (const_int 7) (const_int 15)])))]
+  "ISA_HAS_LASX"
+  "xvpackod.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvilvl_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 32)
+		     (const_int 1) (const_int 33)
+		     (const_int 2) (const_int 34)
+		     (const_int 3) (const_int 35)
+		     (const_int 4) (const_int 36)
+		     (const_int 5) (const_int 37)
+		     (const_int 6) (const_int 38)
+		     (const_int 7) (const_int 39)
+		     (const_int 16) (const_int 48)
+		     (const_int 17) (const_int 49)
+		     (const_int 18) (const_int 50)
+		     (const_int 19) (const_int 51)
+		     (const_int 20) (const_int 52)
+		     (const_int 21) (const_int 53)
+		     (const_int 22) (const_int 54)
+		     (const_int 23) (const_int 55)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvilvl_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 16)
+		     (const_int 1) (const_int 17)
+		     (const_int 2) (const_int 18)
+		     (const_int 3) (const_int 19)
+		     (const_int 8) (const_int 24)
+		     (const_int 9) (const_int 25)
+		     (const_int 10) (const_int 26)
+		     (const_int 11) (const_int 27)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvilvl_w"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_select:V8SI
+	  (vec_concat:V16SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (match_operand:V8SI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 1) (const_int 9)
+		     (const_int 4) (const_int 12)
+		     (const_int 5) (const_int 13)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvilvl_w_f"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_select:V8SF
+	  (vec_concat:V16SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_operand:V8SF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 8)
+		     (const_int 1) (const_int 9)
+		     (const_int 4) (const_int 12)
+		     (const_int 5) (const_int 13)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvilvl_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(vec_select:V4DI
+	  (vec_concat:V8DI
+	    (match_operand:V4DI 1 "register_operand" "f")
+	    (match_operand:V4DI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 2) (const_int 6)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.d\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvilvl_d_f"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(vec_select:V4DF
+	  (vec_concat:V8DF
+	    (match_operand:V4DF 1 "register_operand" "f")
+	    (match_operand:V4DF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 4)
+		     (const_int 2) (const_int 6)])))]
+  "ISA_HAS_LASX"
+  "xvilvl.d\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "smax<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(smax:ILASX (match_operand:ILASX 1 "register_operand" "f,f")
+		    (match_operand:ILASX 2 "reg_or_vector_same_simm5_operand" "f,Usv5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvmax.<lasxfmt>\t%u0,%u1,%u2
+   xvmaxi.<lasxfmt>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umax<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(umax:ILASX (match_operand:ILASX 1 "register_operand" "f,f")
+		    (match_operand:ILASX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvmax.<lasxfmt_u>\t%u0,%u1,%u2
+   xvmaxi.<lasxfmt_u>\t%u0,%u1,%B2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "smin<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(smin:ILASX (match_operand:ILASX 1 "register_operand" "f,f")
+		    (match_operand:ILASX 2 "reg_or_vector_same_simm5_operand" "f,Usv5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvmin.<lasxfmt>\t%u0,%u1,%u2
+   xvmini.<lasxfmt>\t%u0,%u1,%E2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "umin<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(umin:ILASX (match_operand:ILASX 1 "register_operand" "f,f")
+		    (match_operand:ILASX 2 "reg_or_vector_same_uimm5_operand" "f,Uuv5")))]
+  "ISA_HAS_LASX"
+  "@
+   xvmin.<lasxfmt_u>\t%u0,%u1,%u2
+   xvmini.<lasxfmt_u>\t%u0,%u1,%B2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvclo_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(clz:ILASX (not:ILASX (match_operand:ILASX 1 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvclo.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "clz<mode>2"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(clz:ILASX (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvclz.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvnor_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f,f")
+	(and:ILASX (not:ILASX (match_operand:ILASX 1 "register_operand" "f,f"))
+		   (not:ILASX (match_operand:ILASX 2 "reg_or_vector_same_val_operand" "f,Urv8"))))]
+  "ISA_HAS_LASX"
+  "@
+   xvnor.v\t%u0,%u1,%u2
+   xvnori.b\t%u0,%u1,%B2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpickev_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)
+		     (const_int 4) (const_int 6)
+		     (const_int 8) (const_int 10)
+		     (const_int 12) (const_int 14)
+		     (const_int 32) (const_int 34)
+		     (const_int 36) (const_int 38)
+		     (const_int 40) (const_int 42)
+		     (const_int 44) (const_int 46)
+		     (const_int 16) (const_int 18)
+		     (const_int 20) (const_int 22)
+		     (const_int 24) (const_int 26)
+		     (const_int 28) (const_int 30)
+		     (const_int 48) (const_int 50)
+		     (const_int 52) (const_int 54)
+		     (const_int 56) (const_int 58)
+		     (const_int 60) (const_int 62)])))]
+  "ISA_HAS_LASX"
+  "xvpickev.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvpickev_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)
+		     (const_int 4) (const_int 6)
+		     (const_int 16) (const_int 18)
+		     (const_int 20) (const_int 22)
+		     (const_int 8) (const_int 10)
+		     (const_int 12) (const_int 14)
+		     (const_int 24) (const_int 26)
+		     (const_int 28) (const_int 30)])))]
+  "ISA_HAS_LASX"
+  "xvpickev.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvpickev_w"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_select:V8SI
+	  (vec_concat:V16SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (match_operand:V8SI 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)
+		     (const_int 8) (const_int 10)
+		     (const_int 4) (const_int 6)
+		     (const_int 12) (const_int 14)])))]
+  "ISA_HAS_LASX"
+  "xvpickev.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvpickev_w_f"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_select:V8SF
+	  (vec_concat:V16SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_operand:V8SF 2 "register_operand" "f"))
+	  (parallel [(const_int 0) (const_int 2)
+		     (const_int 8) (const_int 10)
+		     (const_int 4) (const_int 6)
+		     (const_int 12) (const_int 14)])))]
+  "ISA_HAS_LASX"
+  "xvpickev.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvpickod_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_select:V32QI
+	  (vec_concat:V64QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (match_operand:V32QI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 3)
+		     (const_int 5) (const_int 7)
+		     (const_int 9) (const_int 11)
+		     (const_int 13) (const_int 15)
+		     (const_int 33) (const_int 35)
+		     (const_int 37) (const_int 39)
+		     (const_int 41) (const_int 43)
+		     (const_int 45) (const_int 47)
+		     (const_int 17) (const_int 19)
+		     (const_int 21) (const_int 23)
+		     (const_int 25) (const_int 27)
+		     (const_int 29) (const_int 31)
+		     (const_int 49) (const_int 51)
+		     (const_int 53) (const_int 55)
+		     (const_int 57) (const_int 59)
+		     (const_int 61) (const_int 63)])))]
+  "ISA_HAS_LASX"
+  "xvpickod.b\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvpickod_h"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_select:V16HI
+	  (vec_concat:V32HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (match_operand:V16HI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 3)
+		     (const_int 5) (const_int 7)
+		     (const_int 17) (const_int 19)
+		     (const_int 21) (const_int 23)
+		     (const_int 9) (const_int 11)
+		     (const_int 13) (const_int 15)
+		     (const_int 25) (const_int 27)
+		     (const_int 29) (const_int 31)])))]
+  "ISA_HAS_LASX"
+  "xvpickod.h\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvpickod_w"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_select:V8SI
+	  (vec_concat:V16SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (match_operand:V8SI 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 3)
+		     (const_int 9) (const_int 11)
+		     (const_int 5) (const_int 7)
+		     (const_int 13) (const_int 15)])))]
+  "ISA_HAS_LASX"
+  "xvpickod.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvpickod_w_f"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_select:V8SF
+	  (vec_concat:V16SF
+	    (match_operand:V8SF 1 "register_operand" "f")
+	    (match_operand:V8SF 2 "register_operand" "f"))
+	  (parallel [(const_int 1) (const_int 3)
+		     (const_int 9) (const_int 11)
+		     (const_int 5) (const_int 7)
+		     (const_int 13) (const_int 15)])))]
+  "ISA_HAS_LASX"
+  "xvpickod.w\t%u0,%u2,%u1"
+  [(set_attr "type" "simd_permute")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "popcount<mode>2"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(popcount:ILASX (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvpcnt.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_pcnt")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "lasx_xvsat_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		      (match_operand 2 "const_<bitimm256>_operand" "")]
+		     UNSPEC_LASX_XVSAT_S))]
+  "ISA_HAS_LASX"
+  "xvsat.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_sat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsat_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		      UNSPEC_LASX_XVSAT_U))]
+  "ISA_HAS_LASX"
+  "xvsat.<lasxfmt_u>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_sat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvshuf4i_<lasxfmt_f>"
+  [(set (match_operand:LASX_WHB_W 0 "register_operand" "=f")
+	(unspec:LASX_WHB_W [(match_operand:LASX_WHB_W 1 "register_operand" "f")
+			    (match_operand 2 "const_uimm8_operand")]
+			   UNSPEC_LASX_XVSHUF4I))]
+  "ISA_HAS_LASX"
+  "xvshuf4i.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvshuf4i_<lasxfmt_f>_1"
+  [(set (match_operand:LASX_W 0 "register_operand" "=f")
+    (vec_select:LASX_W
+      (match_operand:LASX_W 1 "nonimmediate_operand" "f")
+      (parallel [(match_operand 2 "const_0_to_3_operand")
+             (match_operand 3 "const_0_to_3_operand")
+             (match_operand 4 "const_0_to_3_operand")
+             (match_operand 5 "const_0_to_3_operand")
+             (match_operand 6 "const_4_to_7_operand")
+             (match_operand 7 "const_4_to_7_operand")
+             (match_operand 8 "const_4_to_7_operand")
+             (match_operand 9 "const_4_to_7_operand")])))]
+  "ISA_HAS_LASX
+   && INTVAL (operands[2]) + 4 == INTVAL (operands[6])
+   && INTVAL (operands[3]) + 4 == INTVAL (operands[7])
+   && INTVAL (operands[4]) + 4 == INTVAL (operands[8])
+   && INTVAL (operands[5]) + 4 == INTVAL (operands[9])"
+{
+  int mask = 0;
+  mask |= INTVAL (operands[2]) << 0;
+  mask |= INTVAL (operands[3]) << 2;
+  mask |= INTVAL (operands[4]) << 4;
+  mask |= INTVAL (operands[5]) << 6;
+  operands[2] = GEN_INT (mask);
+
+  return "xvshuf4i.w\t%u0,%u1,%2";
+}
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrar_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVSRAR))]
+  "ISA_HAS_LASX"
+  "xvsrar.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrari_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		      UNSPEC_LASX_XVSRARI))]
+  "ISA_HAS_LASX"
+  "xvsrari.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrlr_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVSRLR))]
+  "ISA_HAS_LASX"
+  "xvsrlr.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrlri_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")]
+		      UNSPEC_LASX_XVSRLRI))]
+  "ISA_HAS_LASX"
+  "xvsrlri.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssub_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(ss_minus:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvssub.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssub_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(us_minus:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvssub.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvshuf_<lasxfmt_f>"
+  [(set (match_operand:LASX_DWH 0 "register_operand" "=f")
+	(unspec:LASX_DWH [(match_operand:LASX_DWH 1 "register_operand" "0")
+			  (match_operand:LASX_DWH 2 "register_operand" "f")
+			  (match_operand:LASX_DWH 3 "register_operand" "f")]
+			UNSPEC_LASX_XVSHUF))]
+  "ISA_HAS_LASX"
+  "xvshuf.<lasxfmt>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_sld")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvshuf_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(unspec:V32QI [(match_operand:V32QI 1 "register_operand" "f")
+		       (match_operand:V32QI 2 "register_operand" "f")
+		       (match_operand:V32QI 3 "register_operand" "f")]
+		      UNSPEC_LASX_XVSHUF_B))]
+  "ISA_HAS_LASX"
+  "xvshuf.b\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_sld")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvreplve0_<lasxfmt_f>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(vec_duplicate:LASX
+	  (vec_select:<UNITMODE>
+	    (match_operand:LASX 1 "register_operand" "f")
+	    (parallel [(const_int 0)]))))]
+  "ISA_HAS_LASX"
+  "xvreplve0.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvrepl128vei_b_internal"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(vec_duplicate:V32QI
+	  (vec_select:V32QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_uimm4_operand" "")
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_operand 3 "const_16_to_31_operand" "")
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3) (match_dup 3) (match_dup 3)]))))]
+  "ISA_HAS_LASX && ((INTVAL (operands[3]) - INTVAL (operands[2])) == 16)"
+  "xvrepl128vei.b\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvrepl128vei_h_internal"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(vec_duplicate:V16HI
+	  (vec_select:V16HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_uimm3_operand" "")
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_dup 2)
+		       (match_operand 3 "const_8_to_15_operand" "")
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3) (match_dup 3) (match_dup 3)
+		       (match_dup 3)]))))]
+  "ISA_HAS_LASX && ((INTVAL (operands[3]) - INTVAL (operands[2])) == 8)"
+  "xvrepl128vei.h\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvrepl128vei_w_internal"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(vec_duplicate:V8SI
+	  (vec_select:V8SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_0_to_3_operand" "")
+		       (match_dup 2) (match_dup 2) (match_dup 2)
+		       (match_operand 3 "const_4_to_7_operand" "")
+		       (match_dup 3) (match_dup 3) (match_dup 3)]))))]
+  "ISA_HAS_LASX && ((INTVAL (operands[3]) - INTVAL (operands[2])) == 4)"
+  "xvrepl128vei.w\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvrepl128vei_d_internal"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(vec_duplicate:V4DI
+	  (vec_select:V4DI
+	    (match_operand:V4DI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_0_or_1_operand" "")
+		       (match_dup 2)
+		       (match_operand 3 "const_2_or_3_operand" "")
+		       (match_dup 3)]))))]
+  "ISA_HAS_LASX && ((INTVAL (operands[3]) - INTVAL (operands[2])) == 2)"
+  "xvrepl128vei.d\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvrepl128vei_<lasxfmt_f>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(unspec:LASX [(match_operand:LASX 1 "register_operand" "f")
+		      (match_operand 2 "const_<indeximm_lo>_operand" "")]
+		     UNSPEC_LASX_XVREPL128VEI))]
+  "ISA_HAS_LASX"
+  "xvrepl128vei.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvreplve0_<lasxfmt_f>_scalar"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+    (vec_duplicate:FLASX
+      (match_operand:<UNITMODE> 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvreplve0.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvreplve0_q"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(unspec:V32QI [(match_operand:V32QI 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVREPLVE0_Q))]
+  "ISA_HAS_LASX"
+  "xvreplve0.q\t%u0,%u1"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvfcvt_h_s"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(unspec:V16HI [(match_operand:V8SF 1 "register_operand" "f")
+		       (match_operand:V8SF 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVFCVT))]
+  "ISA_HAS_LASX"
+  "xvfcvt.h.s\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvfcvt_s_d"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFCVT))]
+  "ISA_HAS_LASX"
+  "xvfcvt.s.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "vec_pack_trunc_v4df"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(vec_concat:V8SF
+	  (float_truncate:V4SF (match_operand:V4DF 1 "register_operand" "f"))
+	  (float_truncate:V4SF (match_operand:V4DF 2 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvfcvt.s.d\t%u0,%u2,%u1\n\txvpermi.d\t%u0,%u0,0xd8"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8SF")
+   (set_attr "length" "8")])
+
+;; Define for builtin function.
+(define_insn "lasx_xvfcvth_s_h"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V16HI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFCVTH))]
+  "ISA_HAS_LASX"
+  "xvfcvth.s.h\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8SF")])
+
+;; Define for builtin function.
+(define_insn "lasx_xvfcvth_d_s"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	 (vec_select:V4SF
+	  (match_operand:V8SF 1 "register_operand" "f")
+	  (parallel [(const_int 2) (const_int 3)
+		      (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LASX"
+  "xvfcvth.d.s\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DF")
+   (set_attr "length" "12")])
+
+;; Define for gen insn.
+(define_insn "lasx_xvfcvth_d_insn"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	(vec_select:V4SF
+	  (match_operand:V8SF 1 "register_operand" "f")
+	  (parallel [(const_int 4) (const_int 5)
+		     (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LASX"
+  "xvpermi.d\t%u0,%u1,0xfa\n\txvfcvtl.d.s\t%u0,%u0"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DF")
+   (set_attr "length" "12")])
+
+;; Define for builtin function.
+(define_insn "lasx_xvfcvtl_s_h"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V16HI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFCVTL))]
+  "ISA_HAS_LASX"
+  "xvfcvtl.s.h\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8SF")])
+
+;; Define for builtin function.
+(define_insn "lasx_xvfcvtl_d_s"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	(vec_select:V4SF
+	  (match_operand:V8SF 1 "register_operand" "f")
+	  (parallel [(const_int 0) (const_int 1)
+		     (const_int 4) (const_int 5)]))))]
+  "ISA_HAS_LASX"
+  "xvfcvtl.d.s\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DF")
+   (set_attr "length" "8")])
+
+;; Define for gen insn.
+(define_insn "lasx_xvfcvtl_d_insn"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(float_extend:V4DF
+	(vec_select:V4SF
+	  (match_operand:V8SF 1 "register_operand" "f")
+	  (parallel [(const_int 0) (const_int 1)
+		     (const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LASX"
+  "xvpermi.d\t%u0,%u1,0x50\n\txvfcvtl.d.s\t%u0,%u0"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DF")
+   (set_attr "length" "8")])
+
+(define_code_attr lasxbr
+  [(eq "xbz")
+   (ne "xbnz")])
+
+(define_code_attr lasxeq_v
+  [(eq "eqz")
+   (ne "nez")])
+
+(define_code_attr lasxne_v
+  [(eq "nez")
+   (ne "eqz")])
+
+(define_code_attr lasxeq
+  [(eq "anyeqz")
+   (ne "allnez")])
+
+(define_code_attr lasxne
+  [(eq "allnez")
+   (ne "anyeqz")])
+
+(define_insn "lasx_<lasxbr>_<lasxfmt_f>"
+  [(set (pc)
+	(if_then_else
+	  (equality_op
+	    (unspec:SI [(match_operand:LASX 1 "register_operand" "f")]
+		       UNSPEC_LASX_BRANCH)
+	    (match_operand:SI 2 "const_0_operand"))
+	  (label_ref (match_operand 0))
+	  (pc)))
+   (clobber (match_scratch:FCC 3 "=z"))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					 "xvset<lasxeq>.<lasxfmt>\t%Z3%u1\n\tbcnez\t%Z3%0",
+					 "xvset<lasxne>.<lasxfmt>\t%z3%u1\n\tbcnez\t%Z3%0");
+}
+  [(set_attr "type" "simd_branch")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_<lasxbr>_v_<lasxfmt_f>"
+  [(set (pc)
+	(if_then_else
+	  (equality_op
+	    (unspec:SI [(match_operand:LASX 1 "register_operand" "f")]
+		       UNSPEC_LASX_BRANCH_V)
+	    (match_operand:SI 2 "const_0_operand"))
+	  (label_ref (match_operand 0))
+	  (pc)))
+   (clobber (match_scratch:FCC 3 "=z"))]
+  "ISA_HAS_LASX"
+{
+  return loongarch_output_conditional_branch (insn, operands,
+					 "xvset<lasxeq_v>.v\t%Z3%u1\n\tbcnez\t%Z3%0",
+					 "xvset<lasxne_v>.v\t%Z3%u1\n\tbcnez\t%Z3%0");
+}
+  [(set_attr "type" "simd_branch")
+   (set_attr "mode" "<MODE>")])
+
+;; loongson-asx.
+(define_insn "lasx_vext2xv_h<u>_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(any_extend:V16HI
+	  (vec_select:V16QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)
+		       (const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)
+		       (const_int 8) (const_int 9)
+		       (const_int 10) (const_int 11)
+		       (const_int 12) (const_int 13)
+		       (const_int 14) (const_int 15)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.h<u>.b<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_vext2xv_w<u>_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(any_extend:V8SI
+	  (vec_select:V8HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)
+		       (const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.w<u>.h<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_vext2xv_d<u>_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(any_extend:V4DI
+	  (vec_select:V4SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.d<u>.w<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_vext2xv_w<u>_b<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(any_extend:V8SI
+	  (vec_select:V8QI
+	   (match_operand:V32QI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)
+		       (const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.w<u>.b<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_vext2xv_d<u>_h<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(any_extend:V4DI
+	  (vec_select:V4HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.d<u>.h<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_vext2xv_d<u>_b<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(any_extend:V4DI
+	  (vec_select:V4QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	    (parallel [(const_int 0) (const_int 1)
+		       (const_int 2) (const_int 3)]))))]
+  "ISA_HAS_LASX"
+  "vext2xv.d<u>.b<u>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DI")])
+
+;; Extend loongson-sx to loongson-asx.
+(define_insn "xvandn<mode>3"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(and:LASX (not:LASX (match_operand:LASX 1 "register_operand" "f"))
+			    (match_operand:LASX 2 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvandn.v\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "abs<mode>2"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(abs:ILASX (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvsigncov.<lasxfmt>\t%u0,%u1,%u1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "neg<mode>2"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(neg:ILASX (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvneg.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvmuh_s_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVMUH_S))]
+  "ISA_HAS_LASX"
+  "xvmuh.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvmuh_u_<lasxfmt_u>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVMUH_U))]
+  "ISA_HAS_LASX"
+  "xvmuh.<lasxfmt_u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsllwil_s_<dlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VDMODE256> 0 "register_operand" "=f")
+	(unspec:<VDMODE256> [(match_operand:ILASX_WHB 1 "register_operand" "f")
+			     (match_operand 2 "const_<bitimm256>_operand" "")]
+			    UNSPEC_LASX_XVSLLWIL_S))]
+  "ISA_HAS_LASX"
+  "xvsllwil.<dlasxfmt>.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsllwil_u_<dlasxfmt_u>_<lasxfmt_u>"
+  [(set (match_operand:<VDMODE256> 0 "register_operand" "=f")
+	(unspec:<VDMODE256> [(match_operand:ILASX_WHB 1 "register_operand" "f")
+			     (match_operand 2 "const_<bitimm256>_operand" "")]
+			    UNSPEC_LASX_XVSLLWIL_U))]
+  "ISA_HAS_LASX"
+  "xvsllwil.<dlasxfmt_u>.<lasxfmt_u>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsran_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSRAN))]
+  "ISA_HAS_LASX"
+  "xvsran.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssran_s_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRAN_S))]
+  "ISA_HAS_LASX"
+  "xvssran.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssran_u_<hlasxfmt_u>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRAN_U))]
+  "ISA_HAS_LASX"
+  "xvssran.<hlasxfmt_u>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrarn_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSRARN))]
+  "ISA_HAS_LASX"
+  "xvsrarn.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrarn_s_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRARN_S))]
+  "ISA_HAS_LASX"
+  "xvssrarn.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrarn_u_<hlasxfmt_u>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRARN_U))]
+  "ISA_HAS_LASX"
+  "xvssrarn.<hlasxfmt_u>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrln_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSRLN))]
+  "ISA_HAS_LASX"
+  "xvsrln.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrln_u_<hlasxfmt_u>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRLN_U))]
+  "ISA_HAS_LASX"
+  "xvssrln.<hlasxfmt_u>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrlrn_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSRLRN))]
+  "ISA_HAS_LASX"
+  "xvsrlrn.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlrn_u_<hlasxfmt_u>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRLRN_U))]
+  "ISA_HAS_LASX"
+  "xvssrlrn.<hlasxfmt_u>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfrstpi_<lasxfmt>"
+  [(set (match_operand:ILASX_HB 0 "register_operand" "=f")
+	(unspec:ILASX_HB [(match_operand:ILASX_HB 1 "register_operand" "0")
+			  (match_operand:ILASX_HB 2 "register_operand" "f")
+			  (match_operand 3 "const_uimm5_operand" "")]
+			 UNSPEC_LASX_XVFRSTPI))]
+  "ISA_HAS_LASX"
+  "xvfrstpi.<lasxfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvfrstp_<lasxfmt>"
+  [(set (match_operand:ILASX_HB 0 "register_operand" "=f")
+	(unspec:ILASX_HB [(match_operand:ILASX_HB 1 "register_operand" "0")
+			  (match_operand:ILASX_HB 2 "register_operand" "f")
+			  (match_operand:ILASX_HB 3 "register_operand" "f")]
+			 UNSPEC_LASX_XVFRSTP))]
+  "ISA_HAS_LASX"
+  "xvfrstp.<lasxfmt>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvshuf4i_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand 3 "const_uimm8_operand")]
+		     UNSPEC_LASX_XVSHUF4I))]
+  "ISA_HAS_LASX"
+  "xvshuf4i.d\t%u0,%u2,%3"
+  [(set_attr "type" "simd_sld")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvbsrl_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_uimm5_operand" "")]
+		      UNSPEC_LASX_XVBSRL_V))]
+  "ISA_HAS_LASX"
+  "xvbsrl.v\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvbsll_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_uimm5_operand" "")]
+		      UNSPEC_LASX_XVBSLL_V))]
+  "ISA_HAS_LASX"
+  "xvbsll.v\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvextrins_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVEXTRINS))]
+  "ISA_HAS_LASX"
+  "xvextrins.<lasxfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvmskltz_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVMSKLTZ))]
+  "ISA_HAS_LASX"
+  "xvmskltz.<lasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsigncov_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVSIGNCOV))]
+  "ISA_HAS_LASX"
+  "xvsigncov.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "copysign<mode>3"
+  [(set (match_dup 4)
+	(and:FLASX
+	  (not:FLASX (match_dup 3))
+	  (match_operand:FLASX 1 "register_operand")))
+   (set (match_dup 5)
+	(and:FLASX (match_dup 3)
+		   (match_operand:FLASX 2 "register_operand")))
+   (set (match_operand:FLASX 0 "register_operand")
+	(ior:FLASX (match_dup 4) (match_dup 5)))]
+  "ISA_HAS_LASX"
+{
+  operands[3] = loongarch_build_signbit_mask (<MODE>mode, 1, 0);
+
+  operands[4] = gen_reg_rtx (<MODE>mode);
+  operands[5] = gen_reg_rtx (<MODE>mode);
+})
+
+
+(define_insn "absv4df2"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(abs:V4DF (match_operand:V4DF 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvbitclri.d\t%u0,%u1,63"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "absv8sf2"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(abs:V8SF (match_operand:V8SF 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvbitclri.w\t%u0,%u1,31"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "negv4df2"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(neg:V4DF (match_operand:V4DF 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvbitrevi.d\t%u0,%u1,63"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "negv8sf2"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(neg:V8SF (match_operand:V8SF 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvbitrevi.w\t%u0,%u1,31"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "xvfmadd<mode>4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(fma:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		   (match_operand:FLASX 2 "register_operand" "f")
+		   (match_operand:FLASX 3 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvfmadd.<flasxfmt>\t%u0,%u1,$u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "fms<mode>4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(fma:FLASX (match_operand:FLASX 1 "register_operand" "f")
+		   (match_operand:FLASX 2 "register_operand" "f")
+		   (neg:FLASX (match_operand:FLASX 3 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvfmsub.<flasxfmt>\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "xvfnmsub<mode>4_nmsub4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(neg:FLASX
+	  (fma:FLASX
+	    (match_operand:FLASX 1 "register_operand" "f")
+	    (match_operand:FLASX 2 "register_operand" "f")
+	    (neg:FLASX (match_operand:FLASX 3 "register_operand" "f")))))]
+  "ISA_HAS_LASX"
+  "xvfnmsub.<flasxfmt>\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+
+(define_insn "xvfnmadd<mode>4_nmadd4"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(neg:FLASX
+	  (fma:FLASX
+	    (match_operand:FLASX 1 "register_operand" "f")
+	    (match_operand:FLASX 2 "register_operand" "f")
+	    (match_operand:FLASX 3 "register_operand" "f"))))]
+  "ISA_HAS_LASX"
+  "xvfnmadd.<flasxfmt>\t%u0,%u1,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvftintrne_w_s"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRNE_W_S))]
+  "ISA_HAS_LASX"
+  "xvftintrne.w.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrne_l_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRNE_L_D))]
+  "ISA_HAS_LASX"
+  "xvftintrne.l.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftintrp_w_s"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRP_W_S))]
+  "ISA_HAS_LASX"
+  "xvftintrp.w.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrp_l_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRP_L_D))]
+  "ISA_HAS_LASX"
+  "xvftintrp.l.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftintrm_w_s"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRM_W_S))]
+  "ISA_HAS_LASX"
+  "xvftintrm.w.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrm_l_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRM_L_D))]
+  "ISA_HAS_LASX"
+  "xvftintrm.l.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftint_w_d"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINT_W_D))]
+  "ISA_HAS_LASX"
+  "xvftint.w.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvffint_s_l"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFFINT_S_L))]
+  "ISA_HAS_LASX"
+  "xvffint.s.l\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvftintrz_w_d"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRZ_W_D))]
+  "ISA_HAS_LASX"
+  "xvftintrz.w.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftintrp_w_d"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRP_W_D))]
+  "ISA_HAS_LASX"
+  "xvftintrp.w.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftintrm_w_d"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRM_W_D))]
+  "ISA_HAS_LASX"
+  "xvftintrm.w.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftintrne_w_d"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(unspec:V8SI [(match_operand:V4DF 1 "register_operand" "f")
+		      (match_operand:V4DF 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRNE_W_D))]
+  "ISA_HAS_LASX"
+  "xvftintrne.w.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvftinth_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTH_L_S))]
+  "ISA_HAS_LASX"
+  "xvftinth.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintl_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTL_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintl.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvffinth_d_w"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V8SI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFFINTH_D_W))]
+  "ISA_HAS_LASX"
+  "xvffinth.d.w\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvffintl_d_w"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V8SI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFFINTL_D_W))]
+  "ISA_HAS_LASX"
+  "xvffintl.d.w\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvftintrzh_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRZH_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrzh.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrzl_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRZL_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrzl.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lasx_xvftintrph_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRPH_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrph.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4SF")])
+
+(define_insn "lasx_xvftintrpl_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRPL_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrpl.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrmh_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRMH_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrmh.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrml_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRML_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrml.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrneh_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRNEH_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrneh.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvftintrnel_l_s"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFTINTRNEL_L_S))]
+  "ISA_HAS_LASX"
+  "xvftintrnel.l.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvfrintrne_s"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRNE_S))]
+  "ISA_HAS_LASX"
+  "xvfrintrne.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvfrintrne_d"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRNE_D))]
+  "ISA_HAS_LASX"
+  "xvfrintrne.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvfrintrz_s"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRZ_S))]
+  "ISA_HAS_LASX"
+  "xvfrintrz.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvfrintrz_d"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRZ_D))]
+  "ISA_HAS_LASX"
+  "xvfrintrz.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvfrintrp_s"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRP_S))]
+  "ISA_HAS_LASX"
+  "xvfrintrp.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvfrintrp_d"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRP_D))]
+  "ISA_HAS_LASX"
+  "xvfrintrp.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+(define_insn "lasx_xvfrintrm_s"
+  [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V8SF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRM_S))]
+  "ISA_HAS_LASX"
+  "xvfrintrm.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "lasx_xvfrintrm_d"
+  [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVFRINTRM_D))]
+  "ISA_HAS_LASX"
+  "xvfrintrm.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+;; Vector versions of the floating-point frint patterns.
+;; Expands to btrunc, ceil, floor, rint.
+(define_insn "<FRINT256_S:frint256_pattern_s>v8sf2"
+ [(set (match_operand:V8SF 0 "register_operand" "=f")
+	(unspec:V8SF [(match_operand:V8SF 1 "register_operand" "f")]
+			 FRINT256_S))]
+  "ISA_HAS_LASX"
+  "xvfrint<FRINT256_S:frint256_suffix>.s\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V8SF")])
+
+(define_insn "<FRINT256_D:frint256_pattern_d>v4df2"
+ [(set (match_operand:V4DF 0 "register_operand" "=f")
+	(unspec:V4DF [(match_operand:V4DF 1 "register_operand" "f")]
+			 FRINT256_D))]
+  "ISA_HAS_LASX"
+  "xvfrint<FRINT256_D:frint256_suffix>.d\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "V4DF")])
+
+;; Expands to round.
+(define_insn "round<mode>2"
+ [(set (match_operand:FLASX 0 "register_operand" "=f")
+	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+			 UNSPEC_LASX_XVFRINT))]
+  "ISA_HAS_LASX"
+  "xvfrint.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+;; Offset load and broadcast
+(define_expand "lasx_xvldrepl_<lasxfmt_f>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand 2 "aq12<lasxfmt>_operand")
+   (match_operand 1 "pmode_register_operand")]
+  "ISA_HAS_LASX"
+{
+  emit_insn (gen_lasx_xvldrepl_<lasxfmt_f>_insn
+	     (operands[0], operands[1], operands[2]));
+  DONE;
+})
+
+(define_insn "lasx_xvldrepl_<lasxfmt_f>_insn"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(vec_duplicate:LASX
+	  (mem:<UNITMODE> (plus:DI (match_operand:DI 1 "register_operand" "r")
+				   (match_operand 2 "aq12<lasxfmt>_operand")))))]
+  "ISA_HAS_LASX"
+{
+  return "xvldrepl.<lasxfmt>\t%u0,%1,%2";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+;; Offset is "0"
+(define_insn "lasx_xvldrepl_<lasxfmt_f>_insn_0"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+    (vec_duplicate:LASX
+      (mem:<UNITMODE> (match_operand:DI 1 "register_operand" "r"))))]
+  "ISA_HAS_LASX"
+{
+    return "xvldrepl.<lasxfmt>\t%u0,%1,0";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+;;XVADDWEV.H.B   XVSUBWEV.H.B   XVMULWEV.H.B
+;;XVADDWEV.H.BU  XVSUBWEV.H.BU  XVMULWEV.H.BU
+(define_insn "lasx_xv<optab>wev_h_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(addsubmul:V16HI
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)
+			 (const_int 16) (const_int 18)
+			 (const_int 20) (const_int 22)
+			 (const_int 24) (const_int 26)
+			 (const_int 28) (const_int 30)])))
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)
+			 (const_int 16) (const_int 18)
+			 (const_int 20) (const_int 22)
+			 (const_int 24) (const_int 26)
+			 (const_int 28) (const_int 30)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.h.b<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V16HI")])
+
+;;XVADDWEV.W.H   XVSUBWEV.W.H   XVMULWEV.W.H
+;;XVADDWEV.W.HU  XVSUBWEV.W.HU  XVMULWEV.W.HU
+(define_insn "lasx_xv<optab>wev_w_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(addsubmul:V8SI
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.w.h<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8SI")])
+
+;;XVADDWEV.D.W   XVSUBWEV.D.W   XVMULWEV.D.W
+;;XVADDWEV.D.WU  XVSUBWEV.D.WU  XVMULWEV.D.WU
+(define_insn "lasx_xv<optab>wev_d_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(addsubmul:V4DI
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.d.w<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWEV.Q.D
+;;TODO2
+(define_insn "lasx_xvaddwev_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWEV))]
+  "ISA_HAS_LASX"
+  "xvaddwev.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSUBWEV.Q.D
+;;TODO2
+(define_insn "lasx_xvsubwev_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVSUBWEV))]
+  "ISA_HAS_LASX"
+  "xvsubwev.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWEV.Q.D
+;;TODO2
+(define_insn "lasx_xvmulwev_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWEV))]
+  "ISA_HAS_LASX"
+  "xvmulwev.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+
+;;XVADDWOD.H.B   XVSUBWOD.H.B   XVMULWOD.H.B
+;;XVADDWOD.H.BU  XVSUBWOD.H.BU  XVMULWOD.H.BU
+(define_insn "lasx_xv<optab>wod_h_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(addsubmul:V16HI
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)
+			 (const_int 17) (const_int 19)
+			 (const_int 21) (const_int 23)
+			 (const_int 25) (const_int 27)
+			 (const_int 29) (const_int 31)])))
+	  (any_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)
+			 (const_int 17) (const_int 19)
+			 (const_int 21) (const_int 23)
+			 (const_int 25) (const_int 27)
+			 (const_int 29) (const_int 31)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.h.b<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V16HI")])
+
+;;XVADDWOD.W.H   XVSUBWOD.W.H   XVMULWOD.W.H
+;;XVADDWOD.W.HU  XVSUBWOD.W.HU  XVMULWOD.W.HU
+(define_insn "lasx_xv<optab>wod_w_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(addsubmul:V8SI
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (any_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.w.h<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8SI")])
+
+
+;;XVADDWOD.D.W   XVSUBWOD.D.W   XVMULWOD.D.W
+;;XVADDWOD.D.WU  XVSUBWOD.D.WU  XVMULWOD.D.WU
+(define_insn "lasx_xv<optab>wod_d_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(addsubmul:V4DI
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (any_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.d.w<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWOD.Q.D
+;;TODO2
+(define_insn "lasx_xvaddwod_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWOD))]
+  "ISA_HAS_LASX"
+  "xvaddwod.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSUBWOD.Q.D
+;;TODO2
+(define_insn "lasx_xvsubwod_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVSUBWOD))]
+  "ISA_HAS_LASX"
+  "xvsubwod.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWOD.Q.D
+;;TODO2
+(define_insn "lasx_xvmulwod_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWOD))]
+  "ISA_HAS_LASX"
+  "xvmulwod.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWEV.Q.DU
+;;TODO2
+(define_insn "lasx_xvaddwev_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWEV2))]
+  "ISA_HAS_LASX"
+  "xvaddwev.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSUBWEV.Q.DU
+;;TODO2
+(define_insn "lasx_xvsubwev_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVSUBWEV2))]
+  "ISA_HAS_LASX"
+  "xvsubwev.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWEV.Q.DU
+;;TODO2
+(define_insn "lasx_xvmulwev_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWEV2))]
+  "ISA_HAS_LASX"
+  "xvmulwev.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWOD.Q.DU
+;;TODO2
+(define_insn "lasx_xvaddwod_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWOD2))]
+  "ISA_HAS_LASX"
+  "xvaddwod.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSUBWOD.Q.DU
+;;TODO2
+(define_insn "lasx_xvsubwod_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVSUBWOD2))]
+  "ISA_HAS_LASX"
+  "xvsubwod.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWOD.Q.DU
+;;TODO2
+(define_insn "lasx_xvmulwod_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWOD2))]
+  "ISA_HAS_LASX"
+  "xvmulwod.q.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWEV.H.BU.B   XVMULWEV.H.BU.B
+(define_insn "lasx_xv<optab>wev_h_bu_b"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(addmul:V16HI
+	  (zero_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)
+			 (const_int 16) (const_int 18)
+			 (const_int 20) (const_int 22)
+			 (const_int 24) (const_int 26)
+			 (const_int 28) (const_int 30)])))
+	  (sign_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)
+			 (const_int 16) (const_int 18)
+			 (const_int 20) (const_int 22)
+			 (const_int 24) (const_int 26)
+			 (const_int 28) (const_int 30)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.h.bu.b\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V16HI")])
+
+;;XVADDWEV.W.HU.H   XVMULWEV.W.HU.H
+(define_insn "lasx_xv<optab>wev_w_hu_h"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(addmul:V8SI
+	  (zero_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))
+	  (sign_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)
+			 (const_int 8) (const_int 10)
+			 (const_int 12) (const_int 14)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.w.hu.h\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8SI")])
+
+;;XVADDWEV.D.WU.W   XVMULWEV.D.WU.W
+(define_insn "lasx_xv<optab>wev_d_wu_w"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(addmul:V4DI
+	  (zero_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 1 "register_operand" "%f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))
+	  (sign_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 2 "register_operand" "f")
+	      (parallel [(const_int 0) (const_int 2)
+			 (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wev.d.wu.w\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWOD.H.BU.B   XVMULWOD.H.BU.B
+(define_insn "lasx_xv<optab>wod_h_bu_b"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(addmul:V16HI
+	  (zero_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)
+			 (const_int 17) (const_int 19)
+			 (const_int 21) (const_int 23)
+			 (const_int 25) (const_int 27)
+			 (const_int 29) (const_int 31)])))
+	  (sign_extend:V16HI
+	    (vec_select:V16QI
+	      (match_operand:V32QI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)
+			 (const_int 17) (const_int 19)
+			 (const_int 21) (const_int 23)
+			 (const_int 25) (const_int 27)
+			 (const_int 29) (const_int 31)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.h.bu.b\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V16HI")])
+
+;;XVADDWOD.W.HU.H   XVMULWOD.W.HU.H
+(define_insn "lasx_xv<optab>wod_w_hu_h"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(addmul:V8SI
+	  (zero_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))
+	  (sign_extend:V8SI
+	    (vec_select:V8HI
+	      (match_operand:V16HI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)
+			 (const_int 9) (const_int 11)
+			 (const_int 13) (const_int 15)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.w.hu.h\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V8SI")])
+
+;;XVADDWOD.D.WU.W   XVMULWOD.D.WU.W
+(define_insn "lasx_xv<optab>wod_d_wu_w"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(addmul:V4DI
+	  (zero_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 1 "register_operand" "%f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))
+	  (sign_extend:V4DI
+	    (vec_select:V4SI
+	      (match_operand:V8SI 2 "register_operand" "f")
+	      (parallel [(const_int 1) (const_int 3)
+			 (const_int 5) (const_int 7)])))))]
+  "ISA_HAS_LASX"
+  "xv<optab>wod.d.wu.w\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWEV.H.B   XVMADDWEV.H.BU
+(define_insn "lasx_xvmaddwev_h_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(plus:V16HI
+	  (match_operand:V16HI 1 "register_operand" "0")
+	  (mult:V16HI
+	    (any_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)
+			   (const_int 16) (const_int 18)
+			   (const_int 20) (const_int 22)
+			   (const_int 24) (const_int 26)
+			   (const_int 28) (const_int 30)])))
+	    (any_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)
+			   (const_int 16) (const_int 18)
+			   (const_int 20) (const_int 22)
+			   (const_int 24) (const_int 26)
+			   (const_int 28) (const_int 30)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.h.b<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V16HI")])
+
+;;XVMADDWEV.W.H   XVMADDWEV.W.HU
+(define_insn "lasx_xvmaddwev_w_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(plus:V8SI
+	  (match_operand:V8SI 1 "register_operand" "0")
+	  (mult:V8SI
+	    (any_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)])))
+	    (any_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.w.h<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8SI")])
+
+;;XVMADDWEV.D.W   XVMADDWEV.D.WU
+(define_insn "lasx_xvmaddwev_d_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(plus:V4DI
+	  (match_operand:V4DI 1 "register_operand" "0")
+	  (mult:V4DI
+	    (any_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)])))
+	    (any_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.d.w<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWEV.Q.D
+;;TODO2
+(define_insn "lasx_xvmaddwev_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWEV))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.q.d\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWOD.H.B   XVMADDWOD.H.BU
+(define_insn "lasx_xvmaddwod_h_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(plus:V16HI
+	  (match_operand:V16HI 1 "register_operand" "0")
+	  (mult:V16HI
+	    (any_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)
+			   (const_int 17) (const_int 19)
+			   (const_int 21) (const_int 23)
+			   (const_int 25) (const_int 27)
+			   (const_int 29) (const_int 31)])))
+	    (any_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)
+			   (const_int 17) (const_int 19)
+			   (const_int 21) (const_int 23)
+			   (const_int 25) (const_int 27)
+			   (const_int 29) (const_int 31)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.h.b<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V16HI")])
+
+;;XVMADDWOD.W.H   XVMADDWOD.W.HU
+(define_insn "lasx_xvmaddwod_w_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(plus:V8SI
+	  (match_operand:V8SI 1 "register_operand" "0")
+	  (mult:V8SI
+	    (any_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)])))
+	    (any_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.w.h<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8SI")])
+
+;;XVMADDWOD.D.W   XVMADDWOD.D.WU
+(define_insn "lasx_xvmaddwod_d_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(plus:V4DI
+	  (match_operand:V4DI 1 "register_operand" "0")
+	  (mult:V4DI
+	    (any_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)])))
+	    (any_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.d.w<u>\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWOD.Q.D
+;;TODO2
+(define_insn "lasx_xvmaddwod_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWOD))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.q.d\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWEV.Q.DU
+;;TODO2
+(define_insn "lasx_xvmaddwev_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWEV2))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.q.du\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWOD.Q.DU
+;;TODO2
+(define_insn "lasx_xvmaddwod_q_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWOD2))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.q.du\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWEV.H.BU.B
+(define_insn "lasx_xvmaddwev_h_bu_b"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(plus:V16HI
+	  (match_operand:V16HI 1 "register_operand" "0")
+	  (mult:V16HI
+	    (zero_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)
+			   (const_int 16) (const_int 18)
+			   (const_int 20) (const_int 22)
+			   (const_int 24) (const_int 26)
+			   (const_int 28) (const_int 30)])))
+	    (sign_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)
+			   (const_int 16) (const_int 18)
+			   (const_int 20) (const_int 22)
+			   (const_int 24) (const_int 26)
+			   (const_int 28) (const_int 30)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.h.bu.b\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V16HI")])
+
+;;XVMADDWEV.W.HU.H
+(define_insn "lasx_xvmaddwev_w_hu_h"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(plus:V8SI
+	  (match_operand:V8SI 1 "register_operand" "0")
+	  (mult:V8SI
+	    (zero_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)])))
+	    (sign_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)
+			   (const_int 8) (const_int 10)
+			   (const_int 12) (const_int 14)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.w.hu.h\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8SI")])
+
+;;XVMADDWEV.D.WU.W
+(define_insn "lasx_xvmaddwev_d_wu_w"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(plus:V4DI
+	  (match_operand:V4DI 1 "register_operand" "0")
+	  (mult:V4DI
+	    (zero_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 2 "register_operand" "%f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)])))
+	    (sign_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 3 "register_operand" "f")
+		(parallel [(const_int 0) (const_int 2)
+			   (const_int 4) (const_int 6)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.d.wu.w\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWEV.Q.DU.D
+;;TODO2
+(define_insn "lasx_xvmaddwev_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWEV3))]
+  "ISA_HAS_LASX"
+  "xvmaddwev.q.du.d\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWOD.H.BU.B
+(define_insn "lasx_xvmaddwod_h_bu_b"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(plus:V16HI
+	  (match_operand:V16HI 1 "register_operand" "0")
+	  (mult:V16HI
+	    (zero_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)
+			   (const_int 17) (const_int 19)
+			   (const_int 21) (const_int 23)
+			   (const_int 25) (const_int 27)
+			   (const_int 29) (const_int 31)])))
+	    (sign_extend:V16HI
+	      (vec_select:V16QI
+		(match_operand:V32QI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)
+			   (const_int 17) (const_int 19)
+			   (const_int 21) (const_int 23)
+			   (const_int 25) (const_int 27)
+			   (const_int 29) (const_int 31)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.h.bu.b\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V16HI")])
+
+;;XVMADDWOD.W.HU.H
+(define_insn "lasx_xvmaddwod_w_hu_h"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(plus:V8SI
+	  (match_operand:V8SI 1 "register_operand" "0")
+	  (mult:V8SI
+	    (zero_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)])))
+	    (sign_extend:V8SI
+	      (vec_select:V8HI
+		(match_operand:V16HI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)
+			   (const_int 9) (const_int 11)
+			   (const_int 13) (const_int 15)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.w.hu.h\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V8SI")])
+
+;;XVMADDWOD.D.WU.W
+(define_insn "lasx_xvmaddwod_d_wu_w"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(plus:V4DI
+	  (match_operand:V4DI 1 "register_operand" "0")
+	  (mult:V4DI
+	    (zero_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 2 "register_operand" "%f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)])))
+	    (sign_extend:V4DI
+	      (vec_select:V4SI
+		(match_operand:V8SI 3 "register_operand" "f")
+		(parallel [(const_int 1) (const_int 3)
+			   (const_int 5) (const_int 7)]))))))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.d.wu.w\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_fmadd")
+   (set_attr "mode" "V4DI")])
+
+;;XVMADDWOD.Q.DU.D
+;;TODO2
+(define_insn "lasx_xvmaddwod_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "0")
+		      (match_operand:V4DI 2 "register_operand" "f")
+		      (match_operand:V4DI 3 "register_operand" "f")]
+		     UNSPEC_LASX_XVMADDWOD3))]
+  "ISA_HAS_LASX"
+  "xvmaddwod.q.du.d\t%u0,%u2,%u3"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVHADDW.Q.D
+;;TODO2
+(define_insn "lasx_xvhaddw_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVHADDW_Q_D))]
+  "ISA_HAS_LASX"
+  "xvhaddw.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVHSUBW.Q.D
+;;TODO2
+(define_insn "lasx_xvhsubw_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVHSUBW_Q_D))]
+  "ISA_HAS_LASX"
+  "xvhsubw.q.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVHADDW.QU.DU
+;;TODO2
+(define_insn "lasx_xvhaddw_qu_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVHADDW_QU_DU))]
+  "ISA_HAS_LASX"
+  "xvhaddw.qu.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVHSUBW.QU.DU
+;;TODO2
+(define_insn "lasx_xvhsubw_qu_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVHSUBW_QU_DU))]
+  "ISA_HAS_LASX"
+  "xvhsubw.qu.du\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVROTR.B   XVROTR.H   XVROTR.W   XVROTR.D
+;;TODO-478
+(define_insn "lasx_xvrotr_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand:ILASX 2 "register_operand" "f")]
+		      UNSPEC_LASX_XVROTR))]
+  "ISA_HAS_LASX"
+  "xvrotr.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+;;XVADD.Q
+;;TODO2
+(define_insn "lasx_xvadd_q"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADD_Q))]
+  "ISA_HAS_LASX"
+  "xvadd.q\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSUB.Q
+;;TODO2
+(define_insn "lasx_xvsub_q"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVSUB_Q))]
+  "ISA_HAS_LASX"
+  "xvsub.q\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVSSRLN.B.H   XVSSRLN.H.W   XVSSRLN.W.D
+(define_insn "lasx_xvssrln_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRLN))]
+  "ISA_HAS_LASX"
+  "xvssrln.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+;;XVREPLVE.B   XVREPLVE.H   XVREPLVE.W   XVREPLVE.D
+(define_insn "lasx_xvreplve_<lasxfmt_f>"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+	(unspec:LASX [(match_operand:LASX 1 "register_operand" "f")
+		      (match_operand:SI 2 "register_operand" "r")]
+		     UNSPEC_LASX_XVREPLVE))]
+  "ISA_HAS_LASX"
+  "xvreplve.<lasxfmt>\t%u0,%u1,%z2"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "<MODE>")])
+
+;;XVADDWEV.Q.DU.D
+(define_insn "lasx_xvaddwev_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWEV3))]
+  "ISA_HAS_LASX"
+  "xvaddwev.q.du.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVADDWOD.Q.DU.D
+(define_insn "lasx_xvaddwod_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVADDWOD3))]
+  "ISA_HAS_LASX"
+  "xvaddwod.q.du.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWEV.Q.DU.D
+(define_insn "lasx_xvmulwev_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWEV3))]
+  "ISA_HAS_LASX"
+  "xvmulwev.q.du.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;;XVMULWOD.Q.DU.D
+(define_insn "lasx_xvmulwod_q_du_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")
+		      (match_operand:V4DI 2 "register_operand" "f")]
+		     UNSPEC_LASX_XVMULWOD3))]
+  "ISA_HAS_LASX"
+  "xvmulwod.q.du.d\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvpickve2gr_w<u>"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(any_extend:SI
+	  (vec_select:SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (parallel [(match_operand 2 "const_0_to_7_operand" "")]))))]
+  "ISA_HAS_LASX"
+  "xvpickve2gr.w<u>\t%0,%u1,%2"
+  [(set_attr "type" "simd_copy")
+   (set_attr "mode" "V8SI")])
+
+
+(define_insn "lasx_xvmskgez_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(unspec:V32QI [(match_operand:V32QI 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVMSKGEZ))]
+  "ISA_HAS_LASX"
+  "xvmskgez.b\t%u0,%u1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvmsknz_b"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(unspec:V32QI [(match_operand:V32QI 1 "register_operand" "f")]
+		      UNSPEC_LASX_XVMSKNZ))]
+  "ISA_HAS_LASX"
+  "xvmsknz.b\t%u0,%u1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvexth_h<u>_b<u>"
+  [(set (match_operand:V16HI 0 "register_operand" "=f")
+	(any_extend:V16HI
+	  (vec_select:V16QI
+	    (match_operand:V32QI 1 "register_operand" "f")
+	      (parallel [(const_int 16) (const_int 17)
+			 (const_int 18) (const_int 19)
+			 (const_int 20) (const_int 21)
+			 (const_int 22) (const_int 23)
+			 (const_int 24) (const_int 25)
+			 (const_int 26) (const_int 27)
+			 (const_int 28) (const_int 29)
+			 (const_int 30) (const_int 31)]))))]
+  "ISA_HAS_LASX"
+  "xvexth.h<u>.b<u>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V16HI")])
+
+(define_insn "lasx_xvexth_w<u>_h<u>"
+  [(set (match_operand:V8SI 0 "register_operand" "=f")
+	(any_extend:V8SI
+	  (vec_select:V8HI
+	    (match_operand:V16HI 1 "register_operand" "f")
+	    (parallel [(const_int 8) (const_int 9)
+		       (const_int 10) (const_int 11)
+		       (const_int 12) (const_int 13)
+		       (const_int 14) (const_int 15)]))))]
+  "ISA_HAS_LASX"
+  "xvexth.w<u>.h<u>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V8SI")])
+
+(define_insn "lasx_xvexth_d<u>_w<u>"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(any_extend:V4DI
+	  (vec_select:V4SI
+	    (match_operand:V8SI 1 "register_operand" "f")
+	    (parallel [(const_int 4) (const_int 5)
+		       (const_int 6) (const_int 7)]))))]
+  "ISA_HAS_LASX"
+  "xvexth.d<u>.w<u>\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvexth_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVEXTH_Q_D))]
+  "ISA_HAS_LASX"
+  "xvexth.q.d\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvexth_qu_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVEXTH_QU_DU))]
+  "ISA_HAS_LASX"
+  "xvexth.qu.du\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvrotri_<lasxfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(rotatert:ILASX (match_operand:ILASX 1 "register_operand" "f")
+		       (match_operand 2 "const_<bitimm256>_operand" "")))]
+  "ISA_HAS_LASX"
+  "xvrotri.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvextl_q_d"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVEXTL_Q_D))]
+  "ISA_HAS_LASX"
+  "xvextl.q.d\t%u0,%u1"
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvsrlni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSRLNI))]
+  "ISA_HAS_LASX"
+  "xvsrlni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrlrni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSRLRNI))]
+  "ISA_HAS_LASX"
+  "xvsrlrni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRLNI))]
+  "ISA_HAS_LASX"
+  "xvssrlni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlni_<lasxfmt_u>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRLNI2))]
+  "ISA_HAS_LASX"
+  "xvssrlni.<lasxfmt_u>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlrni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRLRNI))]
+  "ISA_HAS_LASX"
+  "xvssrlrni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlrni_<lasxfmt_u>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRLRNI2))]
+  "ISA_HAS_LASX"
+  "xvssrlrni.<lasxfmt_u>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrani_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSRANI))]
+  "ISA_HAS_LASX"
+  "xvsrani.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvsrarni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSRARNI))]
+  "ISA_HAS_LASX"
+  "xvsrarni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrani_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRANI))]
+  "ISA_HAS_LASX"
+  "xvssrani.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrani_<lasxfmt_u>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRANI2))]
+  "ISA_HAS_LASX"
+  "xvssrani.<lasxfmt_u>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrarni_<lasxfmt>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRARNI))]
+  "ISA_HAS_LASX"
+  "xvssrarni.<lasxfmt>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrarni_<lasxfmt_u>_<dlasxqfmt>"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(unspec:ILASX [(match_operand:ILASX 1 "register_operand" "0")
+		       (match_operand:ILASX 2 "register_operand" "f")
+		       (match_operand 3 "const_uimm8_operand" "")]
+		      UNSPEC_LASX_XVSSRARNI2))]
+  "ISA_HAS_LASX"
+  "xvssrarni.<lasxfmt_u>.<dlasxqfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shift")
+   (set_attr "mode" "<MODE>")])
+
+(define_mode_attr VDOUBLEMODEW256
+  [(V8SI "V16SI")
+   (V8SF "V16SF")])
+
+(define_insn "lasx_xvpermi_<lasxfmt_f_wd>"
+  [(set (match_operand:LASX_W 0 "register_operand" "=f")
+    (unspec:LASX_W [(match_operand:LASX_W 1 "register_operand" "0")
+               (match_operand:LASX_W 2 "register_operand" "f")
+                   (match_operand 3 "const_uimm8_operand" "")]
+             UNSPEC_LASX_XVPERMI))]
+  "ISA_HAS_LASX"
+  "xvpermi.w\t%u0,%u2,%3"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpermi_<lasxfmt_f_wd>_1"
+  [(set (match_operand:LASX_W 0 "register_operand" "=f")
+     (vec_select:LASX_W
+       (vec_concat:<VDOUBLEMODEW256>
+         (match_operand:LASX_W 1 "register_operand" "f")
+         (match_operand:LASX_W 2 "register_operand" "0"))
+       (parallel [(match_operand 3  "const_0_to_3_operand")
+              (match_operand 4  "const_0_to_3_operand"  )
+              (match_operand 5  "const_8_to_11_operand" )
+              (match_operand 6  "const_8_to_11_operand" )
+              (match_operand 7  "const_4_to_7_operand"  )
+              (match_operand 8  "const_4_to_7_operand"  )
+              (match_operand 9  "const_12_to_15_operand")
+              (match_operand 10 "const_12_to_15_operand")])))]
+  "ISA_HAS_LASX
+  && INTVAL (operands[3]) + 4 == INTVAL (operands[7])
+  && INTVAL (operands[4]) + 4 == INTVAL (operands[8])
+  && INTVAL (operands[5]) + 4 == INTVAL (operands[9])
+  && INTVAL (operands[6]) + 4 == INTVAL (operands[10])"
+{
+  int mask = 0;
+  mask |= INTVAL (operands[3]) << 0;
+  mask |= INTVAL (operands[4]) << 2;
+  mask |= (INTVAL (operands[5]) - 8) << 4;
+  mask |= (INTVAL (operands[6]) - 8) << 6;
+  operands[3] = GEN_INT (mask);
+
+  return "xvpermi.w\t%u0,%u1,%3";
+}
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "<MODE>")])
+
+(define_expand "lasx_xvld"
+  [(match_operand:V32QI 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq12b_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (operands[0], gen_rtx_MEM (V32QImode, addr));
+  DONE;
+})
+
+(define_expand "lasx_xvst"
+  [(match_operand:V32QI 0 "register_operand")
+   (match_operand 1 "pmode_register_operand")
+   (match_operand 2 "aq12b_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx addr = plus_constant (GET_MODE (operands[1]), operands[1],
+			    INTVAL (operands[2]));
+  loongarch_emit_move (gen_rtx_MEM (V32QImode, addr), operands[0]);
+  DONE;
+})
+
+(define_expand "lasx_xvstelm_<lasxfmt_f>"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand 3 "const_<indeximm256>_operand")
+   (match_operand 2 "aq8<lasxfmt>_operand")
+   (match_operand 1 "pmode_register_operand")]
+  "ISA_HAS_LASX"
+{
+  emit_insn (gen_lasx_xvstelm_<lasxfmt_f>_insn
+	     (operands[1], operands[2], operands[0], operands[3]));
+  DONE;
+})
+
+(define_insn "lasx_xvstelm_<lasxfmt_f>_insn"
+  [(set (mem:<UNITMODE> (plus:DI (match_operand:DI 0 "register_operand" "r")
+				 (match_operand 1 "aq8<lasxfmt>_operand")))
+	(vec_select:<UNITMODE>
+	  (match_operand:LASX 2 "register_operand" "f")
+	  (parallel [(match_operand 3 "const_<indeximm256>_operand" "")])))]
+  "ISA_HAS_LASX"
+{
+  return "xvstelm.<lasxfmt>\t%u2,%0,%1,%3";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+;; Offset is "0"
+(define_insn "lasx_xvstelm_<lasxfmt_f>_insn_0"
+  [(set (mem:<UNITMODE> (match_operand:DI 0 "register_operand" "r"))
+    (vec_select:<UNITMODE>
+      (match_operand:LASX_WD 1 "register_operand" "f")
+      (parallel [(match_operand:SI 2 "const_<indeximm256>_operand")])))]
+  "ISA_HAS_LASX"
+{
+    return "xvstelm.<lasxfmt>\t%u1,%0,0,%2";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "<MODE>")
+   (set_attr "length" "4")])
+
+(define_insn "lasx_xvinsve0_<lasxfmt_f>"
+  [(set (match_operand:LASX_WD 0 "register_operand" "=f")
+	(unspec:LASX_WD [(match_operand:LASX_WD 1 "register_operand" "0")
+			 (match_operand:LASX_WD 2 "register_operand" "f")
+			 (match_operand 3 "const_<indeximm256>_operand" "")]
+			UNSPEC_LASX_XVINSVE0))]
+  "ISA_HAS_LASX"
+  "xvinsve0.<lasxfmt>\t%u0,%u2,%3"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvinsve0_<lasxfmt_f>_scalar"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+    (vec_merge:FLASX
+      (vec_duplicate:FLASX
+        (match_operand:<UNITMODE> 1 "register_operand" "f"))
+      (match_operand:FLASX 2 "register_operand" "0")
+      (match_operand 3 "const_<bitmask256>_operand" "")))]
+  "ISA_HAS_LASX"
+  "xvinsve0.<lasxfmt>\t%u0,%u1,%y3"
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpickve_<lasxfmt_f>"
+  [(set (match_operand:LASX_WD 0 "register_operand" "=f")
+	(unspec:LASX_WD [(match_operand:LASX_WD 1 "register_operand" "f")
+			 (match_operand 2 "const_<indeximm256>_operand" "")]
+			UNSPEC_LASX_XVPICKVE))]
+  "ISA_HAS_LASX"
+  "xvpickve.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvpickve_<lasxfmt_f>_scalar"
+  [(set (match_operand:<UNITMODE> 0 "register_operand" "=f")
+	(vec_select:<UNITMODE>
+	 (match_operand:FLASX 1 "register_operand" "f")
+	 (parallel [(match_operand 2 "const_<indeximm256>_operand" "")])))]
+  "ISA_HAS_LASX"
+  "xvpickve.<lasxfmt>\t%u0,%u1,%2"
+  [(set_attr "type" "simd_shf")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvssrlrn_<hlasxfmt>_<lasxfmt>"
+  [(set (match_operand:<VHSMODE256> 0 "register_operand" "=f")
+	(unspec:<VHSMODE256> [(match_operand:ILASX_DWH 1 "register_operand" "f")
+			      (match_operand:ILASX_DWH 2 "register_operand" "f")]
+			     UNSPEC_LASX_XVSSRLRN))]
+  "ISA_HAS_LASX"
+  "xvssrlrn.<hlasxfmt>.<lasxfmt>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "xvorn<mode>3"
+  [(set (match_operand:ILASX 0 "register_operand" "=f")
+	(ior:ILASX (not:ILASX (match_operand:ILASX 2 "register_operand" "f"))
+		   (match_operand:ILASX 1 "register_operand" "f")))]
+  "ISA_HAS_LASX"
+  "xvorn.v\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_logic")
+   (set_attr "mode" "<MODE>")])
+
+(define_insn "lasx_xvextl_qu_du"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI [(match_operand:V4DI 1 "register_operand" "f")]
+		     UNSPEC_LASX_XVEXTL_QU_DU))]
+  "ISA_HAS_LASX"
+  "xvextl.qu.du\t%u0,%u1"
+  [(set_attr "type" "simd_bit")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvldi"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+	(unspec:V4DI[(match_operand 1 "const_imm13_operand")]
+		    UNSPEC_LASX_XVLDI))]
+  "ISA_HAS_LASX"
+{
+  HOST_WIDE_INT val = INTVAL (operands[1]);
+  if (val < 0)
+    {
+      HOST_WIDE_INT modeVal = (val & 0xf00) >> 8;
+      if (modeVal < 13)
+	return  "xvldi\t%u0,%1";
+      else
+	{
+	  sorry ("imm13 only support 0000 ~ 1100 in bits '12 ~ 9' when bit '13' is 1");
+	  return "#";
+	}
+    }
+  else
+    return "xvldi\t%u0,%1";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "V4DI")])
+
+(define_insn "lasx_xvldx"
+  [(set (match_operand:V32QI 0 "register_operand" "=f")
+	(unspec:V32QI [(match_operand:DI 1 "register_operand" "r")
+		       (match_operand:DI 2 "reg_or_0_operand" "rJ")]
+		      UNSPEC_LASX_XVLDX))]
+  "ISA_HAS_LASX"
+{
+  return "xvldx\t%u0,%1,%z2";
+}
+  [(set_attr "type" "simd_load")
+   (set_attr "mode" "V32QI")])
+
+(define_insn "lasx_xvstx"
+  [(set (mem:V32QI (plus:DI (match_operand:DI 1 "register_operand" "r")
+			    (match_operand:DI 2 "reg_or_0_operand" "rJ")))
+	(unspec: V32QI[(match_operand:V32QI 0 "register_operand" "f")]
+		      UNSPEC_LASX_XVSTX))]
+
+  "ISA_HAS_LASX"
+{
+  return "xvstx\t%u0,%1,%z2";
+}
+  [(set_attr "type" "simd_store")
+   (set_attr "mode" "DI")])
+
+(define_insn "vec_widen_<su>mult_even_v8si"
+  [(set (match_operand:V4DI 0 "register_operand" "=f")
+    (mult:V4DI
+      (any_extend:V4DI
+        (vec_select:V4SI
+          (match_operand:V8SI 1 "register_operand" "%f")
+          (parallel [(const_int 0) (const_int 2)
+                         (const_int 4) (const_int 6)])))
+      (any_extend:V4DI
+        (vec_select:V4SI
+          (match_operand:V8SI 2 "register_operand" "f")
+          (parallel [(const_int 0) (const_int 2)
+             (const_int 4) (const_int 6)])))))]
+  "ISA_HAS_LASX"
+  "xvmulwev.d.w<u>\t%u0,%u1,%u2"
+  [(set_attr "type" "simd_int_arith")
+   (set_attr "mode" "V4DI")])
+
+;; Vector reduction operation
+(define_expand "reduc_plus_scal_v4di"
+  [(match_operand:DI 0 "register_operand")
+   (match_operand:V4DI 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (V4DImode);
+  rtx tmp1 = gen_reg_rtx (V4DImode);
+  rtx vec_res = gen_reg_rtx (V4DImode);
+  emit_insn (gen_lasx_xvhaddw_q_d (tmp, operands[1], operands[1]));
+  emit_insn (gen_lasx_xvpermi_d_v4di (tmp1, tmp, GEN_INT (2)));
+  emit_insn (gen_addv4di3 (vec_res, tmp, tmp1));
+  emit_insn (gen_vec_extractv4didi (operands[0], vec_res, const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_plus_scal_v8si"
+  [(match_operand:SI 0 "register_operand")
+   (match_operand:V8SI 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (V4DImode);
+  rtx tmp1 = gen_reg_rtx (V4DImode);
+  rtx vec_res = gen_reg_rtx (V4DImode);
+  emit_insn (gen_lasx_xvhaddw_d_w (tmp, operands[1], operands[1]));
+  emit_insn (gen_lasx_xvhaddw_q_d (tmp1, tmp, tmp));
+  emit_insn (gen_lasx_xvpermi_d_v4di (tmp, tmp1, GEN_INT (2)));
+  emit_insn (gen_addv4di3 (vec_res, tmp, tmp1));
+  emit_insn (gen_vec_extractv8sisi (operands[0], gen_lowpart (V8SImode,vec_res),
+				    const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_plus_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:FLASX 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_add<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_<optab>_scal_<mode>"
+  [(any_bitwise:<UNITMODE>
+     (match_operand:<UNITMODE> 0 "register_operand")
+     (match_operand:ILASX 1 "register_operand"))]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_<optab><mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_smax_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:LASX 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_smax<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_smin_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:LASX 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_smin<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_umax_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:ILASX 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_umax<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
+
+(define_expand "reduc_umin_scal_<mode>"
+  [(match_operand:<UNITMODE> 0 "register_operand")
+   (match_operand:ILASX 1 "register_operand")]
+  "ISA_HAS_LASX"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  loongarch_expand_vector_reduc (gen_umin<mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><unitmode> (operands[0], tmp,
+					      const0_rtx));
+  DONE;
+})
diff --git a/gcc/config/loongarch/loongarch-modes.def b/gcc/config/loongarch/loongarch-modes.def
index 6f57b60525d..68a829316f4 100644
--- a/gcc/config/loongarch/loongarch-modes.def
+++ b/gcc/config/loongarch/loongarch-modes.def
@@ -33,6 +33,7 @@ VECTOR_MODES (FLOAT, 8);      /*       V4HF V2SF */
 VECTOR_MODES (INT, 16);	      /* V16QI V8HI V4SI V2DI */
 VECTOR_MODES (FLOAT, 16);     /*	    V4SF V2DF */
 
+/* For LARCH LASX 256 bits.  */
 VECTOR_MODES (INT, 32);	      /* V32QI V16HI V8SI V4DI */
 VECTOR_MODES (FLOAT, 32);     /*	     V8SF V4DF */
 
diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h
index fc33527cdcf..f4430d0d418 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -89,6 +89,8 @@ extern bool loongarch_split_move_insn_p (rtx, rtx);
 extern void loongarch_split_move_insn (rtx, rtx, rtx);
 extern void loongarch_split_128bit_move (rtx, rtx);
 extern bool loongarch_split_128bit_move_p (rtx, rtx);
+extern void loongarch_split_256bit_move (rtx, rtx);
+extern bool loongarch_split_256bit_move_p (rtx, rtx);
 extern void loongarch_split_lsx_copy_d (rtx, rtx, rtx, rtx (*)(rtx, rtx, rtx));
 extern void loongarch_split_lsx_insert_d (rtx, rtx, rtx, rtx);
 extern void loongarch_split_lsx_fill_d (rtx, rtx);
@@ -174,9 +176,11 @@ union loongarch_gen_fn_ptrs
 extern void loongarch_expand_atomic_qihi (union loongarch_gen_fn_ptrs,
 					  rtx, rtx, rtx, rtx, rtx);
 
+extern void loongarch_expand_vector_group_init (rtx, rtx);
 extern void loongarch_expand_vector_init (rtx, rtx);
 extern void loongarch_expand_vec_unpack (rtx op[2], bool, bool);
 extern void loongarch_expand_vec_perm (rtx, rtx, rtx, rtx);
+extern void loongarch_expand_vec_perm_1 (rtx[]);
 extern void loongarch_expand_vector_extract (rtx, rtx, int);
 extern void loongarch_expand_vector_reduc (rtx (*)(rtx, rtx, rtx), rtx, rtx);
 
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 7ffa6bbb73d..8406bf6228b 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -1928,7 +1928,7 @@ loongarch_symbol_insns (enum loongarch_symbol_type type, machine_mode mode)
 {
   /* LSX LD.* and ST.* cannot support loading symbols via an immediate
      operand.  */
-  if (LSX_SUPPORTED_MODE_P (mode))
+  if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode))
     return 0;
 
   switch (type)
@@ -2061,6 +2061,11 @@ loongarch_valid_offset_p (rtx x, machine_mode mode)
 					loongarch_ldst_scaled_shift (mode)))
     return false;
 
+  /* LASX XVLD.B and XVST.B supports 10-bit signed offsets without shift.  */
+  if (LASX_SUPPORTED_MODE_P (mode)
+      && !loongarch_signed_immediate_p (INTVAL (x), 10, 0))
+    return false;
+
   return true;
 }
 
@@ -2274,7 +2279,9 @@ loongarch_address_insns (rtx x, machine_mode mode, bool might_split_p)
 {
   struct loongarch_address_info addr;
   int factor;
-  bool lsx_p = !might_split_p && LSX_SUPPORTED_MODE_P (mode);
+  bool lsx_p = (!might_split_p
+		&& (LSX_SUPPORTED_MODE_P (mode)
+		    || LASX_SUPPORTED_MODE_P (mode)));
 
   if (!loongarch_classify_address (&addr, x, mode, false))
     return 0;
@@ -2420,7 +2427,8 @@ loongarch_const_insns (rtx x)
       return loongarch_integer_cost (INTVAL (x));
 
     case CONST_VECTOR:
-      if (LSX_SUPPORTED_MODE_P (GET_MODE (x))
+      if ((LSX_SUPPORTED_MODE_P (GET_MODE (x))
+	   || LASX_SUPPORTED_MODE_P (GET_MODE (x)))
 	  && loongarch_const_vector_same_int_p (x, GET_MODE (x), -512, 511))
 	return 1;
       /* Fall through.  */
@@ -3259,10 +3267,11 @@ loongarch_legitimize_move (machine_mode mode, rtx dest, rtx src)
 
   /* Both src and dest are non-registers;  one special case is supported where
      the source is (const_int 0) and the store can source the zero register.
-     LSX is never able to source the zero register directly in
+     LSX and LASX are never able to source the zero register directly in
      memory operations.  */
   if (!register_operand (dest, mode) && !register_operand (src, mode)
-      && (!const_0_operand (src, mode) || LSX_SUPPORTED_MODE_P (mode)))
+      && (!const_0_operand (src, mode)
+	  || LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode)))
     {
       loongarch_emit_move (dest, force_reg (mode, src));
       return true;
@@ -3844,6 +3853,7 @@ loongarch_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
 				      int misalign ATTRIBUTE_UNUSED)
 {
   unsigned elements;
+  machine_mode mode = vectype != NULL ? TYPE_MODE (vectype) : DImode;
 
   switch (type_of_cost)
     {
@@ -3860,7 +3870,8 @@ loongarch_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
 	return 1;
 
       case vec_perm:
-	return 1;
+	return LASX_SUPPORTED_MODE_P (mode)
+	  && !LSX_SUPPORTED_MODE_P (mode) ? 2 : 1;
 
       case unaligned_load:
       case vector_gather_load:
@@ -3941,6 +3952,10 @@ loongarch_split_move_p (rtx dest, rtx src)
   if (LSX_SUPPORTED_MODE_P (GET_MODE (dest)))
     return loongarch_split_128bit_move_p (dest, src);
 
+  /* Check if LASX moves need splitting.  */
+  if (LASX_SUPPORTED_MODE_P (GET_MODE (dest)))
+    return loongarch_split_256bit_move_p (dest, src);
+
   /* Otherwise split all multiword moves.  */
   return size > UNITS_PER_WORD;
 }
@@ -3956,6 +3971,8 @@ loongarch_split_move (rtx dest, rtx src, rtx insn_)
   gcc_checking_assert (loongarch_split_move_p (dest, src));
   if (LSX_SUPPORTED_MODE_P (GET_MODE (dest)))
     loongarch_split_128bit_move (dest, src);
+  else if (LASX_SUPPORTED_MODE_P (GET_MODE (dest)))
+    loongarch_split_256bit_move (dest, src);
   else if (FP_REG_RTX_P (dest) || FP_REG_RTX_P (src))
     {
       if (!TARGET_64BIT && GET_MODE (dest) == DImode)
@@ -4121,7 +4138,7 @@ const char *
 loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr)
 {
   int index = exact_log2 (GET_MODE_SIZE (mode));
-  if (!IN_RANGE (index, 2, 4))
+  if (!IN_RANGE (index, 2, 5))
     return NULL;
 
   struct loongarch_address_info info;
@@ -4130,17 +4147,19 @@ loongarch_output_move_index_float (rtx x, machine_mode mode, bool ldr)
       || !loongarch_legitimate_address_p (mode, x, false))
     return NULL;
 
-  const char *const insn[][3] =
+  const char *const insn[][4] =
     {
 	{
 	  "fstx.s\t%1,%0",
 	  "fstx.d\t%1,%0",
-	  "vstx\t%w1,%0"
+	  "vstx\t%w1,%0",
+	  "xvstx\t%u1,%0"
 	},
 	{
 	  "fldx.s\t%0,%1",
 	  "fldx.d\t%0,%1",
-	  "vldx\t%w0,%1"
+	  "vldx\t%w0,%1",
+	  "xvldx\t%u0,%1"
 	}
     };
 
@@ -4174,6 +4193,34 @@ loongarch_split_128bit_move_p (rtx dest, rtx src)
   return true;
 }
 
+/* Return true if a 256-bit move from SRC to DEST should be split.  */
+
+bool
+loongarch_split_256bit_move_p (rtx dest, rtx src)
+{
+  /* LSX-to-LSX moves can be done in a single instruction.  */
+  if (FP_REG_RTX_P (src) && FP_REG_RTX_P (dest))
+    return false;
+
+  /* Check for LSX loads and stores.  */
+  if (FP_REG_RTX_P (dest) && MEM_P (src))
+    return false;
+  if (FP_REG_RTX_P (src) && MEM_P (dest))
+    return false;
+
+  /* Check for LSX set to an immediate const vector with valid replicated
+     element.  */
+  if (FP_REG_RTX_P (dest)
+      && loongarch_const_vector_same_int_p (src, GET_MODE (src), -512, 511))
+    return false;
+
+  /* Check for LSX load zero immediate.  */
+  if (FP_REG_RTX_P (dest) && src == CONST0_RTX (GET_MODE (src)))
+    return false;
+
+  return true;
+}
+
 /* Split a 128-bit move from SRC to DEST.  */
 
 void
@@ -4265,6 +4312,97 @@ loongarch_split_128bit_move (rtx dest, rtx src)
     }
 }
 
+/* Split a 256-bit move from SRC to DEST.  */
+
+void
+loongarch_split_256bit_move (rtx dest, rtx src)
+{
+  int byte, index;
+  rtx low_dest, low_src, d, s;
+
+  if (FP_REG_RTX_P (dest))
+    {
+      gcc_assert (!MEM_P (src));
+
+      rtx new_dest = dest;
+      if (!TARGET_64BIT)
+	{
+	  if (GET_MODE (dest) != V8SImode)
+	    new_dest = simplify_gen_subreg (V8SImode, dest, GET_MODE (dest), 0);
+	}
+      else
+	{
+	  if (GET_MODE (dest) != V4DImode)
+	    new_dest = simplify_gen_subreg (V4DImode, dest, GET_MODE (dest), 0);
+	}
+
+      for (byte = 0, index = 0; byte < GET_MODE_SIZE (GET_MODE (dest));
+	   byte += UNITS_PER_WORD, index++)
+	{
+	  s = loongarch_subword_at_byte (src, byte);
+	  if (!TARGET_64BIT)
+	    emit_insn (gen_lasx_xvinsgr2vr_w (new_dest, s, new_dest,
+					      GEN_INT (1 << index)));
+	  else
+	    emit_insn (gen_lasx_xvinsgr2vr_d (new_dest, s, new_dest,
+					      GEN_INT (1 << index)));
+	}
+    }
+  else if (FP_REG_RTX_P (src))
+    {
+      gcc_assert (!MEM_P (dest));
+
+      rtx new_src = src;
+      if (!TARGET_64BIT)
+	{
+	  if (GET_MODE (src) != V8SImode)
+	    new_src = simplify_gen_subreg (V8SImode, src, GET_MODE (src), 0);
+	}
+      else
+	{
+	  if (GET_MODE (src) != V4DImode)
+	    new_src = simplify_gen_subreg (V4DImode, src, GET_MODE (src), 0);
+	}
+
+      for (byte = 0, index = 0; byte < GET_MODE_SIZE (GET_MODE (src));
+	   byte += UNITS_PER_WORD, index++)
+	{
+	  d = loongarch_subword_at_byte (dest, byte);
+	  if (!TARGET_64BIT)
+	    emit_insn (gen_lsx_vpickve2gr_w (d, new_src, GEN_INT (index)));
+	  else
+	    emit_insn (gen_lsx_vpickve2gr_d (d, new_src, GEN_INT (index)));
+	}
+    }
+  else
+    {
+      low_dest = loongarch_subword_at_byte (dest, 0);
+      low_src = loongarch_subword_at_byte (src, 0);
+      gcc_assert (REG_P (low_dest) && REG_P (low_src));
+      /* Make sure the source register is not written before reading.  */
+      if (REGNO (low_dest) <= REGNO (low_src))
+	{
+	  for (byte = 0; byte < GET_MODE_SIZE (TImode);
+	       byte += UNITS_PER_WORD)
+	    {
+	      d = loongarch_subword_at_byte (dest, byte);
+	      s = loongarch_subword_at_byte (src, byte);
+	      loongarch_emit_move (d, s);
+	    }
+	}
+      else
+	{
+	  for (byte = GET_MODE_SIZE (TImode) - UNITS_PER_WORD; byte >= 0;
+	       byte -= UNITS_PER_WORD)
+	    {
+	      d = loongarch_subword_at_byte (dest, byte);
+	      s = loongarch_subword_at_byte (src, byte);
+	      loongarch_emit_move (d, s);
+	    }
+	}
+    }
+}
+
 
 /* Split a COPY_S.D with operands DEST, SRC and INDEX.  GEN is a function
    used to generate subregs.  */
@@ -4352,11 +4490,12 @@ loongarch_output_move (rtx dest, rtx src)
   machine_mode mode = GET_MODE (dest);
   bool dbl_p = (GET_MODE_SIZE (mode) == 8);
   bool lsx_p = LSX_SUPPORTED_MODE_P (mode);
+  bool lasx_p = LASX_SUPPORTED_MODE_P (mode);
 
   if (loongarch_split_move_p (dest, src))
     return "#";
 
-  if ((lsx_p)
+  if ((lsx_p || lasx_p)
       && dest_code == REG && FP_REG_P (REGNO (dest))
       && src_code == CONST_VECTOR
       && CONST_INT_P (CONST_VECTOR_ELT (src, 0)))
@@ -4366,6 +4505,8 @@ loongarch_output_move (rtx dest, rtx src)
 	{
 	case 16:
 	  return "vrepli.%v0\t%w0,%E1";
+	case 32:
+	  return "xvrepli.%v0\t%u0,%E1";
 	default: gcc_unreachable ();
 	}
     }
@@ -4380,13 +4521,15 @@ loongarch_output_move (rtx dest, rtx src)
 
 	  if (FP_REG_P (REGNO (dest)))
 	    {
-	      if (lsx_p)
+	      if (lsx_p || lasx_p)
 		{
 		  gcc_assert (src == CONST0_RTX (GET_MODE (src)));
 		  switch (GET_MODE_SIZE (mode))
 		    {
 		    case 16:
 		      return "vrepli.b\t%w0,0";
+		    case 32:
+		      return "xvrepli.b\t%u0,0";
 		    default:
 		      gcc_unreachable ();
 		    }
@@ -4519,12 +4662,14 @@ loongarch_output_move (rtx dest, rtx src)
     {
       if (dest_code == REG && FP_REG_P (REGNO (dest)))
 	{
-	  if (lsx_p)
+	  if (lsx_p || lasx_p)
 	    {
 	      switch (GET_MODE_SIZE (mode))
 		{
 		case 16:
 		  return "vori.b\t%w0,%w1,0";
+		case 32:
+		  return "xvori.b\t%u0,%u1,0";
 		default:
 		  gcc_unreachable ();
 		}
@@ -4542,12 +4687,14 @@ loongarch_output_move (rtx dest, rtx src)
 	  if (insn)
 	    return insn;
 
-	  if (lsx_p)
+	  if (lsx_p || lasx_p)
 	    {
 	      switch (GET_MODE_SIZE (mode))
 		{
 		case 16:
 		  return "vst\t%w1,%0";
+		case 32:
+		  return "xvst\t%u1,%0";
 		default:
 		  gcc_unreachable ();
 		}
@@ -4568,12 +4715,14 @@ loongarch_output_move (rtx dest, rtx src)
 	  if (insn)
 	    return insn;
 
-	  if (lsx_p)
+	  if (lsx_p || lasx_p)
 	    {
 	      switch (GET_MODE_SIZE (mode))
 		{
 		case 16:
 		  return "vld\t%w0,%1";
+		case 32:
+		  return "xvld\t%u0,%1";
 		default:
 		  gcc_unreachable ();
 		}
@@ -5546,18 +5695,27 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool hi64_part,
    'T'	Print 'f' for (eq:CC ...), 't' for (ne:CC ...),
 	      'z' for (eq:?I ...), 'n' for (ne:?I ...).
    't'	Like 'T', but with the EQ/NE cases reversed
-   'V'	Print exact log2 of CONST_INT OP element 0 of a replicated
-	  CONST_VECTOR in decimal.
+   'F'	Print the FPU branch condition for comparison OP.
+   'W'	Print the inverse of the FPU branch condition for comparison OP.
+   'w'	Print a LSX register.
+   'u'	Print a LASX register.
+   'T'	Print 'f' for (eq:CC ...), 't' for (ne:CC ...),
+	      'z' for (eq:?I ...), 'n' for (ne:?I ...).
+   't'	Like 'T', but with the EQ/NE cases reversed
+   'Y'	Print loongarch_fp_conditions[INTVAL (OP)]
+   'Z'	Print OP and a comma for 8CC, otherwise print nothing.
+   'z'	Print $0 if OP is zero, otherwise print OP normally.
    'v'	Print the insn size suffix b, h, w or d for vector modes V16QI, V8HI,
 	  V4SI, V2SI, and w, d for vector modes V4SF, V2DF respectively.
+   'V'	Print exact log2 of CONST_INT OP element 0 of a replicated
+	  CONST_VECTOR in decimal.
    'W'	Print the inverse of the FPU branch condition for comparison OP.
-   'w'	Print a LSX register.
    'X'	Print CONST_INT OP in hexadecimal format.
    'x'	Print the low 16 bits of CONST_INT OP in hexadecimal format.
    'Y'	Print loongarch_fp_conditions[INTVAL (OP)]
    'y'	Print exact log2 of CONST_INT OP in decimal.
    'Z'	Print OP and a comma for 8CC, otherwise print nothing.
-   'z'	Print $r0 if OP is zero, otherwise print OP normally.  */
+   'z'	Print $0 if OP is zero, otherwise print OP normally.  */
 
 static void
 loongarch_print_operand (FILE *file, rtx op, int letter)
@@ -5699,46 +5857,11 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
 	output_operand_lossage ("invalid use of '%%%c'", letter);
       break;
 
-    case 'v':
-      switch (GET_MODE (op))
-	{
-	case E_V16QImode:
-	case E_V32QImode:
-	  fprintf (file, "b");
-	  break;
-	case E_V8HImode:
-	case E_V16HImode:
-	  fprintf (file, "h");
-	  break;
-	case E_V4SImode:
-	case E_V4SFmode:
-	case E_V8SImode:
-	case E_V8SFmode:
-	  fprintf (file, "w");
-	  break;
-	case E_V2DImode:
-	case E_V2DFmode:
-	case E_V4DImode:
-	case E_V4DFmode:
-	  fprintf (file, "d");
-	  break;
-	default:
-	  output_operand_lossage ("invalid use of '%%%c'", letter);
-	}
-      break;
-
     case 'W':
       loongarch_print_float_branch_condition (file, reverse_condition (code),
 					      letter);
       break;
 
-    case 'w':
-      if (code == REG && LSX_REG_P (REGNO (op)))
-	fprintf (file, "$vr%s", &reg_names[REGNO (op)][2]);
-      else
-	output_operand_lossage ("invalid use of '%%%c'", letter);
-      break;
-
     case 'x':
       if (CONST_INT_P (op))
 	fprintf (file, HOST_WIDE_INT_PRINT_HEX, INTVAL (op) & 0xffff);
@@ -5780,6 +5903,48 @@ loongarch_print_operand (FILE *file, rtx op, int letter)
       fputc (',', file);
       break;
 
+    case 'w':
+      if (code == REG && LSX_REG_P (REGNO (op)))
+	fprintf (file, "$vr%s", &reg_names[REGNO (op)][2]);
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
+    case 'u':
+      if (code == REG && LASX_REG_P (REGNO (op)))
+	fprintf (file, "$xr%s", &reg_names[REGNO (op)][2]);
+      else
+	output_operand_lossage ("invalid use of '%%%c'", letter);
+      break;
+
+    case 'v':
+      switch (GET_MODE (op))
+	{
+	case E_V16QImode:
+	case E_V32QImode:
+	  fprintf (file, "b");
+	  break;
+	case E_V8HImode:
+	case E_V16HImode:
+	  fprintf (file, "h");
+	  break;
+	case E_V4SImode:
+	case E_V4SFmode:
+	case E_V8SImode:
+	case E_V8SFmode:
+	  fprintf (file, "w");
+	  break;
+	case E_V2DImode:
+	case E_V2DFmode:
+	case E_V4DImode:
+	case E_V4DFmode:
+	  fprintf (file, "d");
+	  break;
+	default:
+	  output_operand_lossage ("invalid use of '%%%c'", letter);
+	}
+      break;
+
     default:
       switch (code)
 	{
@@ -6110,13 +6275,18 @@ loongarch_hard_regno_mode_ok_uncached (unsigned int regno, machine_mode mode)
   size = GET_MODE_SIZE (mode);
   mclass = GET_MODE_CLASS (mode);
 
-  if (GP_REG_P (regno) && !LSX_SUPPORTED_MODE_P (mode))
+  if (GP_REG_P (regno) && !LSX_SUPPORTED_MODE_P (mode)
+      && !LASX_SUPPORTED_MODE_P (mode))
     return ((regno - GP_REG_FIRST) & 1) == 0 || size <= UNITS_PER_WORD;
 
   /* For LSX, allow TImode and 128-bit vector modes in all FPR.  */
   if (FP_REG_P (regno) && LSX_SUPPORTED_MODE_P (mode))
     return true;
 
+  /* FIXED ME: For LASX, allow TImode and 256-bit vector modes in all FPR.  */
+  if (FP_REG_P (regno) && LASX_SUPPORTED_MODE_P (mode))
+    return true;
+
   if (FP_REG_P (regno))
     {
       if (mclass == MODE_FLOAT
@@ -6169,6 +6339,9 @@ loongarch_hard_regno_nregs (unsigned int regno, machine_mode mode)
       if (LSX_SUPPORTED_MODE_P (mode))
 	return 1;
 
+      if (LASX_SUPPORTED_MODE_P (mode))
+	return 1;
+
       return (GET_MODE_SIZE (mode) + UNITS_PER_FPREG - 1) / UNITS_PER_FPREG;
     }
 
@@ -6198,7 +6371,10 @@ loongarch_class_max_nregs (enum reg_class rclass, machine_mode mode)
     {
       if (loongarch_hard_regno_mode_ok (FP_REG_FIRST, mode))
 	{
-	  if (LSX_SUPPORTED_MODE_P (mode))
+	  /* Fixed me.  */
+	  if (LASX_SUPPORTED_MODE_P (mode))
+	    size = MIN (size, UNITS_PER_LASX_REG);
+	  else if (LSX_SUPPORTED_MODE_P (mode))
 	    size = MIN (size, UNITS_PER_LSX_REG);
 	  else
 	    size = MIN (size, UNITS_PER_FPREG);
@@ -6216,6 +6392,10 @@ static bool
 loongarch_can_change_mode_class (machine_mode from, machine_mode to,
 				 reg_class_t rclass)
 {
+  /* Allow conversions between different LSX/LASX vector modes.  */
+  if (LASX_SUPPORTED_MODE_P (from) && LASX_SUPPORTED_MODE_P (to))
+    return true;
+
   /* Allow conversions between different LSX vector modes.  */
   if (LSX_SUPPORTED_MODE_P (from) && LSX_SUPPORTED_MODE_P (to))
     return true;
@@ -6239,7 +6419,8 @@ loongarch_mode_ok_for_mov_fmt_p (machine_mode mode)
       return TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT;
 
     default:
-      return LSX_SUPPORTED_MODE_P (mode);
+      return ISA_HAS_LASX ? LASX_SUPPORTED_MODE_P (mode)
+	: LSX_SUPPORTED_MODE_P (mode);
     }
 }
 
@@ -6441,7 +6622,8 @@ loongarch_valid_pointer_mode (scalar_int_mode mode)
 static bool
 loongarch_vector_mode_supported_p (machine_mode mode)
 {
-  return LSX_SUPPORTED_MODE_P (mode);
+  return ISA_HAS_LASX ? LASX_SUPPORTED_MODE_P (mode)
+    : LSX_SUPPORTED_MODE_P (mode);
 }
 
 /* Implement TARGET_SCALAR_MODE_SUPPORTED_P.  */
@@ -6467,19 +6649,19 @@ loongarch_preferred_simd_mode (scalar_mode mode)
   switch (mode)
     {
     case E_QImode:
-      return E_V16QImode;
+      return ISA_HAS_LASX ? E_V32QImode : E_V16QImode;
     case E_HImode:
-      return E_V8HImode;
+      return ISA_HAS_LASX ? E_V16HImode : E_V8HImode;
     case E_SImode:
-      return E_V4SImode;
+      return ISA_HAS_LASX ? E_V8SImode : E_V4SImode;
     case E_DImode:
-      return E_V2DImode;
+      return ISA_HAS_LASX ? E_V4DImode : E_V2DImode;
 
     case E_SFmode:
-      return E_V4SFmode;
+      return ISA_HAS_LASX ? E_V8SFmode : E_V4SFmode;
 
     case E_DFmode:
-      return E_V2DFmode;
+      return ISA_HAS_LASX ? E_V4DFmode : E_V2DFmode;
 
     default:
       break;
@@ -6490,7 +6672,12 @@ loongarch_preferred_simd_mode (scalar_mode mode)
 static unsigned int
 loongarch_autovectorize_vector_modes (vector_modes *modes, bool)
 {
-  if (ISA_HAS_LSX)
+  if (ISA_HAS_LASX)
+    {
+      modes->safe_push (V32QImode);
+      modes->safe_push (V16QImode);
+    }
+  else if (ISA_HAS_LSX)
     {
       modes->safe_push (V16QImode);
     }
@@ -6670,11 +6857,18 @@ const char *
 loongarch_lsx_output_division (const char *division, rtx *operands)
 {
   const char *s;
+  machine_mode mode = GET_MODE (*operands);
 
   s = division;
   if (TARGET_CHECK_ZERO_DIV)
     {
-      if (ISA_HAS_LSX)
+      if (ISA_HAS_LASX && GET_MODE_SIZE (mode) == 32)
+	{
+	  output_asm_insn ("xvsetallnez.%v0\t$fcc7,%u2",operands);
+	  output_asm_insn (s, operands);
+	  output_asm_insn ("bcnez\t$fcc7,1f", operands);
+	}
+      else if (ISA_HAS_LSX)
 	{
 	  output_asm_insn ("vsetallnez.%v0\t$fcc7,%w2",operands);
 	  output_asm_insn (s, operands);
@@ -7502,7 +7696,7 @@ loongarch_expand_lsx_shuffle (struct expand_vec_perm_d *d)
   rtx_insn *insn;
   unsigned i;
 
-  if (!ISA_HAS_LSX)
+  if (!ISA_HAS_LSX && !ISA_HAS_LASX)
     return false;
 
   for (i = 0; i < d->nelt; i++)
@@ -7526,40 +7720,413 @@ loongarch_expand_lsx_shuffle (struct expand_vec_perm_d *d)
   return true;
 }
 
-void
-loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
+/* Try to simplify a two vector permutation using 2 intra-lane interleave
+   insns and cross-lane shuffle for 32-byte vectors.  */
+
+static bool
+loongarch_expand_vec_perm_interleave (struct expand_vec_perm_d *d)
 {
-  machine_mode vmode = GET_MODE (target);
+  unsigned i, nelt;
+  rtx t1,t2,t3;
+  rtx (*gen_high) (rtx, rtx, rtx);
+  rtx (*gen_low) (rtx, rtx, rtx);
+  machine_mode mode = GET_MODE (d->target);
 
-  switch (vmode)
+  if (d->one_vector_p)
+    return false;
+  if (TARGET_LASX && GET_MODE_SIZE (d->vmode) == 32)
+    ;
+  else
+    return false;
+
+  nelt = d->nelt;
+  if (d->perm[0] != 0 && d->perm[0] != nelt / 2)
+    return false;
+  for (i = 0; i < nelt; i += 2)
+    if (d->perm[i] != d->perm[0] + i / 2
+	|| d->perm[i + 1] != d->perm[0] + i / 2 + nelt)
+      return false;
+
+  if (d->testing_p)
+    return true;
+
+  switch (d->vmode)
     {
-    case E_V16QImode:
-      emit_insn (gen_lsx_vshuf_b (target, op1, op0, sel));
+    case E_V32QImode:
+      gen_high = gen_lasx_xvilvh_b;
+      gen_low = gen_lasx_xvilvl_b;
       break;
-    case E_V2DFmode:
-      emit_insn (gen_lsx_vshuf_d_f (target, sel, op1, op0));
+    case E_V16HImode:
+      gen_high = gen_lasx_xvilvh_h;
+      gen_low = gen_lasx_xvilvl_h;
       break;
-    case E_V2DImode:
-      emit_insn (gen_lsx_vshuf_d (target, sel, op1, op0));
+    case E_V8SImode:
+      gen_high = gen_lasx_xvilvh_w;
+      gen_low = gen_lasx_xvilvl_w;
       break;
-    case E_V4SFmode:
-      emit_insn (gen_lsx_vshuf_w_f (target, sel, op1, op0));
+    case E_V4DImode:
+      gen_high = gen_lasx_xvilvh_d;
+      gen_low = gen_lasx_xvilvl_d;
       break;
-    case E_V4SImode:
-      emit_insn (gen_lsx_vshuf_w (target, sel, op1, op0));
+    case E_V8SFmode:
+      gen_high = gen_lasx_xvilvh_w_f;
+      gen_low = gen_lasx_xvilvl_w_f;
       break;
-    case E_V8HImode:
-      emit_insn (gen_lsx_vshuf_h (target, sel, op1, op0));
+    case E_V4DFmode:
+      gen_high = gen_lasx_xvilvh_d_f;
+      gen_low = gen_lasx_xvilvl_d_f;
       break;
     default:
-      break;
+      gcc_unreachable ();
+    }
+
+  t1 = gen_reg_rtx (mode);
+  t2 = gen_reg_rtx (mode);
+  emit_insn (gen_high (t1, d->op0, d->op1));
+  emit_insn (gen_low (t2, d->op0, d->op1));
+  if (mode == V4DFmode || mode == V8SFmode)
+    {
+      t3 = gen_reg_rtx (V4DFmode);
+      if (d->perm[0])
+	emit_insn (gen_lasx_xvpermi_q_v4df (t3, gen_lowpart (V4DFmode, t1),
+					    gen_lowpart (V4DFmode, t2),
+					    GEN_INT (0x31)));
+      else
+	emit_insn (gen_lasx_xvpermi_q_v4df (t3, gen_lowpart (V4DFmode, t1),
+					    gen_lowpart (V4DFmode, t2),
+					    GEN_INT (0x20)));
     }
+  else
+    {
+      t3 = gen_reg_rtx (V4DImode);
+      if (d->perm[0])
+	emit_insn (gen_lasx_xvpermi_q_v4di (t3, gen_lowpart (V4DImode, t1),
+					    gen_lowpart (V4DImode, t2),
+					    GEN_INT (0x31)));
+      else
+	emit_insn (gen_lasx_xvpermi_q_v4di (t3, gen_lowpart (V4DImode, t1),
+					    gen_lowpart (V4DImode, t2),
+					    GEN_INT (0x20)));
+    }
+  emit_move_insn (d->target, gen_lowpart (mode, t3));
+  return true;
 }
 
+/* Implement extract-even and extract-odd permutations.  */
+
 static bool
-loongarch_try_expand_lsx_vshuf_const (struct expand_vec_perm_d *d)
+loongarch_expand_vec_perm_even_odd_1 (struct expand_vec_perm_d *d, unsigned odd)
 {
-  int i;
+  rtx t1;
+  machine_mode mode = GET_MODE (d->target);
+  t1 = gen_reg_rtx (mode);
+
+  if (d->testing_p)
+    return true;
+
+  switch (d->vmode)
+    {
+    case E_V4DFmode:
+      /* Shuffle the lanes around into { 0 4 2 6 } and { 1 5 3 7 }.  */
+      if (odd)
+	emit_insn (gen_lasx_xvilvh_d_f (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvilvl_d_f (t1, d->op0, d->op1));
+
+      /* Shuffle within the 256-bit lanes to produce the result required.
+	 { 0 2 4 6 } | { 1 3 5 7 }.  */
+      emit_insn (gen_lasx_xvpermi_d_v4df (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    case E_V4DImode:
+      if (odd)
+	emit_insn (gen_lasx_xvilvh_d (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvilvl_d (t1, d->op0, d->op1));
+
+      emit_insn (gen_lasx_xvpermi_d_v4di (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    case E_V8SFmode:
+      /* Shuffle the lanes around into:
+	 { 0 2 8 a 4 6 c e } | { 1 3 9 b 5 7 d f }.  */
+      if (odd)
+	emit_insn (gen_lasx_xvpickod_w_f (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvpickev_w_f (t1, d->op0, d->op1));
+
+      /* Shuffle within the 256-bit lanes to produce the result required.
+	 { 0 2 4 6 8 a c e } | { 1 3 5 7 9 b d f }.  */
+      emit_insn (gen_lasx_xvpermi_d_v8sf (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    case E_V8SImode:
+      if (odd)
+	emit_insn (gen_lasx_xvpickod_w (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvpickev_w (t1, d->op0, d->op1));
+
+      emit_insn (gen_lasx_xvpermi_d_v8si (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    case E_V16HImode:
+      if (odd)
+	emit_insn (gen_lasx_xvpickod_h (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvpickev_h (t1, d->op0, d->op1));
+
+      emit_insn (gen_lasx_xvpermi_d_v16hi (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    case E_V32QImode:
+      if (odd)
+	emit_insn (gen_lasx_xvpickod_b (t1, d->op0, d->op1));
+      else
+	emit_insn (gen_lasx_xvpickev_b (t1, d->op0, d->op1));
+
+      emit_insn (gen_lasx_xvpermi_d_v32qi (d->target, t1, GEN_INT (0xd8)));
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return true;
+}
+
+/* Pattern match extract-even and extract-odd permutations.  */
+
+static bool
+loongarch_expand_vec_perm_even_odd (struct expand_vec_perm_d *d)
+{
+  unsigned i, odd, nelt = d->nelt;
+  if (!TARGET_LASX)
+    return false;
+
+  odd = d->perm[0];
+  if (odd != 0 && odd != 1)
+    return false;
+
+  for (i = 1; i < nelt; ++i)
+    if (d->perm[i] != 2 * i + odd)
+      return false;
+
+  return loongarch_expand_vec_perm_even_odd_1 (d, odd);
+}
+
+/* Expand a variable vector permutation for LASX.  */
+
+void
+loongarch_expand_vec_perm_1 (rtx operands[])
+{
+  rtx target = operands[0];
+  rtx op0 = operands[1];
+  rtx op1 = operands[2];
+  rtx mask = operands[3];
+
+  bool one_operand_shuffle = rtx_equal_p (op0, op1);
+  rtx t1 = NULL;
+  rtx t2 = NULL;
+  rtx t3, t4, t5, t6, vt = NULL;
+  rtx vec[32] = {NULL};
+  machine_mode mode = GET_MODE (op0);
+  machine_mode maskmode = GET_MODE (mask);
+  int w, i;
+
+  /* Number of elements in the vector.  */
+  w = GET_MODE_NUNITS (mode);
+
+  if (mode == V4DImode || mode == V4DFmode)
+    {
+      maskmode = mode = V8SImode;
+      w = 8;
+      t1 = gen_reg_rtx (maskmode);
+
+      /* Replicate the low bits of the V4DImode mask into V8SImode:
+	 mask = { A B C D }
+	 t1 = { A A B B C C D D }.  */
+      for (i = 0; i < w / 2; ++i)
+	vec[i*2 + 1] = vec[i*2] = GEN_INT (i * 2);
+      vt = gen_rtx_CONST_VECTOR (maskmode, gen_rtvec_v (w, vec));
+      vt = force_reg (maskmode, vt);
+      mask = gen_lowpart (maskmode, mask);
+      emit_insn (gen_lasx_xvperm_w (t1, mask, vt));
+
+      /* Multiply the shuffle indicies by two.  */
+      t1 = expand_simple_binop (maskmode, PLUS, t1, t1, t1, 1,
+				OPTAB_DIRECT);
+
+      /* Add one to the odd shuffle indicies:
+	 t1 = { A*2, A*2+1, B*2, B*2+1, ... }.  */
+      for (i = 0; i < w / 2; ++i)
+	{
+	  vec[i * 2] = const0_rtx;
+	  vec[i * 2 + 1] = const1_rtx;
+	}
+      vt = gen_rtx_CONST_VECTOR (maskmode, gen_rtvec_v (w, vec));
+      vt = validize_mem (force_const_mem (maskmode, vt));
+      t1 = expand_simple_binop (maskmode, PLUS, t1, vt, t1, 1,
+				OPTAB_DIRECT);
+
+      /* Continue as if V8SImode (resp.  V32QImode) was used initially.  */
+      operands[3] = mask = t1;
+      target = gen_reg_rtx (mode);
+      op0 = gen_lowpart (mode, op0);
+      op1 = gen_lowpart (mode, op1);
+    }
+
+  switch (mode)
+    {
+    case E_V8SImode:
+      if (one_operand_shuffle)
+	{
+	  emit_insn (gen_lasx_xvperm_w (target, op0, mask));
+	  if (target != operands[0])
+	    emit_move_insn (operands[0],
+			    gen_lowpart (GET_MODE (operands[0]), target));
+	}
+      else
+	{
+	  t1 = gen_reg_rtx (V8SImode);
+	  t2 = gen_reg_rtx (V8SImode);
+	  emit_insn (gen_lasx_xvperm_w (t1, op0, mask));
+	  emit_insn (gen_lasx_xvperm_w (t2, op1, mask));
+	  goto merge_two;
+	}
+      return;
+
+    case E_V8SFmode:
+      mask = gen_lowpart (V8SImode, mask);
+      if (one_operand_shuffle)
+	emit_insn (gen_lasx_xvperm_w_f (target, op0, mask));
+      else
+	{
+	  t1 = gen_reg_rtx (V8SFmode);
+	  t2 = gen_reg_rtx (V8SFmode);
+	  emit_insn (gen_lasx_xvperm_w_f (t1, op0, mask));
+	  emit_insn (gen_lasx_xvperm_w_f (t2, op1, mask));
+	  goto merge_two;
+	}
+      return;
+
+    case E_V16HImode:
+      if (one_operand_shuffle)
+	{
+	  t1 = gen_reg_rtx (V16HImode);
+	  t2 = gen_reg_rtx (V16HImode);
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t1, op0, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t2, op0, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_h (target, mask, t2, t1));
+	}
+      else
+	{
+	  t1 = gen_reg_rtx (V16HImode);
+	  t2 = gen_reg_rtx (V16HImode);
+	  t3 = gen_reg_rtx (V16HImode);
+	  t4 = gen_reg_rtx (V16HImode);
+	  t5 = gen_reg_rtx (V16HImode);
+	  t6 = gen_reg_rtx (V16HImode);
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t3, op0, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t4, op0, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_h (t1, mask, t4, t3));
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t5, op1, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v16hi (t6, op1, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_h (t2, mask, t6, t5));
+	  goto merge_two;
+	}
+      return;
+
+    case E_V32QImode:
+      if (one_operand_shuffle)
+	{
+	  t1 = gen_reg_rtx (V32QImode);
+	  t2 = gen_reg_rtx (V32QImode);
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t1, op0, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t2, op0, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_b (target, t2, t1, mask));
+	}
+      else
+	{
+	  t1 = gen_reg_rtx (V32QImode);
+	  t2 = gen_reg_rtx (V32QImode);
+	  t3 = gen_reg_rtx (V32QImode);
+	  t4 = gen_reg_rtx (V32QImode);
+	  t5 = gen_reg_rtx (V32QImode);
+	  t6 = gen_reg_rtx (V32QImode);
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t3, op0, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t4, op0, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_b (t1, t4, t3, mask));
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t5, op1, GEN_INT (0x44)));
+	  emit_insn (gen_lasx_xvpermi_d_v32qi (t6, op1, GEN_INT (0xee)));
+	  emit_insn (gen_lasx_xvshuf_b (t2, t6, t5, mask));
+	  goto merge_two;
+	}
+      return;
+
+    default:
+      gcc_assert (GET_MODE_SIZE (mode) == 32);
+      break;
+    }
+
+merge_two:
+  /* Then merge them together.  The key is whether any given control
+     element contained a bit set that indicates the second word.  */
+  rtx xops[6];
+  mask = operands[3];
+  vt = GEN_INT (w);
+  vt = gen_const_vec_duplicate (maskmode, vt);
+  vt = force_reg (maskmode, vt);
+  mask = expand_simple_binop (maskmode, AND, mask, vt,
+			      NULL_RTX, 0, OPTAB_DIRECT);
+  if (GET_MODE (target) != mode)
+    target = gen_reg_rtx (mode);
+  xops[0] = target;
+  xops[1] = gen_lowpart (mode, t2);
+  xops[2] = gen_lowpart (mode, t1);
+  xops[3] = gen_rtx_EQ (maskmode, mask, vt);
+  xops[4] = mask;
+  xops[5] = vt;
+
+  loongarch_expand_vec_cond_expr (mode, maskmode, xops);
+  if (target != operands[0])
+    emit_move_insn (operands[0],
+		    gen_lowpart (GET_MODE (operands[0]), target));
+}
+
+void
+loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
+{
+  machine_mode vmode = GET_MODE (target);
+
+  switch (vmode)
+    {
+    case E_V16QImode:
+      emit_insn (gen_lsx_vshuf_b (target, op1, op0, sel));
+      break;
+    case E_V2DFmode:
+      emit_insn (gen_lsx_vshuf_d_f (target, sel, op1, op0));
+      break;
+    case E_V2DImode:
+      emit_insn (gen_lsx_vshuf_d (target, sel, op1, op0));
+      break;
+    case E_V4SFmode:
+      emit_insn (gen_lsx_vshuf_w_f (target, sel, op1, op0));
+      break;
+    case E_V4SImode:
+      emit_insn (gen_lsx_vshuf_w (target, sel, op1, op0));
+      break;
+    case E_V8HImode:
+      emit_insn (gen_lsx_vshuf_h (target, sel, op1, op0));
+      break;
+    default:
+      break;
+    }
+}
+
+static bool
+loongarch_try_expand_lsx_vshuf_const (struct expand_vec_perm_d *d)
+{
+  int i;
   rtx target, op0, op1, sel, tmp;
   rtx rperm[MAX_VECT_LEN];
 
@@ -7660,25 +8227,1302 @@ loongarch_expand_vec_perm_const_1 (struct expand_vec_perm_d *d)
 	return true;
     }
 
-  if (loongarch_expand_lsx_shuffle (d))
-    return true;
-  return false;
-}
-
-/* Implementation of constant vector permuatation.  This function identifies
- * recognized pattern of permuation selector argument, and use one or more
- * instruction(s) to finish the permutation job correctly.  For unsupported
- * patterns, it will return false.  */
-
-static bool
-loongarch_expand_vec_perm_const_2 (struct expand_vec_perm_d *d)
-{
-  /* Although we have the LSX vec_perm<mode> template, there's still some
-     128bit vector permuatation operations send to vectorize_vec_perm_const.
-     In this case, we just simpliy wrap them by single vshuf.* instruction,
-     because LSX vshuf.* instruction just have the same behavior that GCC
-     expects.  */
-  return loongarch_try_expand_lsx_vshuf_const (d);
+  if (loongarch_expand_lsx_shuffle (d))
+    return true;
+  if (loongarch_expand_vec_perm_even_odd (d))
+    return true;
+  if (loongarch_expand_vec_perm_interleave (d))
+    return true;
+  return false;
+}
+
+/* Following are the assist function for const vector permutation support.  */
+static bool
+loongarch_is_quad_duplicate (struct expand_vec_perm_d *d)
+{
+  if (d->perm[0] >= d->nelt / 2)
+    return false;
+
+  bool result = true;
+  unsigned char lhs = d->perm[0];
+  unsigned char rhs = d->perm[d->nelt / 2];
+
+  if ((rhs - lhs) != d->nelt / 2)
+    return false;
+
+  for (int i = 1; i < d->nelt; i += 1)
+    {
+      if ((i < d->nelt / 2) && (d->perm[i] != lhs))
+	{
+	  result = false;
+	  break;
+	}
+      if ((i > d->nelt / 2) && (d->perm[i] != rhs))
+	{
+	  result = false;
+	  break;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_double_duplicate (struct expand_vec_perm_d *d)
+{
+  if (!d->one_vector_p)
+    return false;
+
+  if (d->nelt < 8)
+    return false;
+
+  bool result = true;
+  unsigned char buf = d->perm[0];
+
+  for (int i = 1; i < d->nelt; i += 2)
+    {
+      if (d->perm[i] != buf)
+	{
+	  result = false;
+	  break;
+	}
+      if (d->perm[i - 1] != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += d->nelt / 4;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_odd_extraction (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = 1;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 2;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_even_extraction (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = 0;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_extraction_permutation (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = d->perm[0];
+
+  if (buf != 0 || buf != d->nelt)
+    return false;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 2;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_center_extraction (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned buf = d->nelt / 2;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_reversing_permutation (struct expand_vec_perm_d *d)
+{
+  if (!d->one_vector_p)
+    return false;
+
+  bool result = true;
+  unsigned char buf = d->nelt - 1;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (d->perm[i] != buf)
+	{
+	  result = false;
+	  break;
+	}
+
+      buf -= 1;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_di_misalign_extract (struct expand_vec_perm_d *d)
+{
+  if (d->nelt != 4 && d->nelt != 8)
+    return false;
+
+  bool result = true;
+  unsigned char buf;
+
+  if (d->nelt == 4)
+    {
+      buf = 1;
+      for (int i = 0; i < d->nelt; i += 1)
+	{
+	  if (buf != d->perm[i])
+	    {
+	      result = false;
+	      break;
+	    }
+
+	  buf += 1;
+	}
+    }
+  else if (d->nelt == 8)
+    {
+      buf = 2;
+      for (int i = 0; i < d->nelt; i += 1)
+	{
+	  if (buf != d->perm[i])
+	    {
+	      result = false;
+	      break;
+	    }
+
+	  buf += 1;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_si_misalign_extract (struct expand_vec_perm_d *d)
+{
+  if (d->vmode != E_V8SImode && d->vmode != E_V8SFmode)
+    return false;
+  bool result = true;
+  unsigned char buf = 1;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_lasx_lowpart_interleave (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = 0;
+
+  for (int i = 0;i < d->nelt; i += 2)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  if (result)
+    {
+      buf = d->nelt;
+      for (int i = 1; i < d->nelt; i += 2)
+	{
+	  if (buf != d->perm[i])
+	    {
+	      result = false;
+	      break;
+	    }
+	  buf += 1;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_lasx_lowpart_interleave_2 (struct expand_vec_perm_d *d)
+{
+  if (d->vmode != E_V32QImode)
+    return false;
+  bool result = true;
+  unsigned char buf = 0;
+
+#define COMPARE_SELECTOR(INIT, BEGIN, END) \
+  buf = INIT; \
+  for (int i = BEGIN; i < END && result; i += 1) \
+    { \
+      if (buf != d->perm[i]) \
+	{ \
+	  result = false; \
+	  break; \
+	} \
+      buf += 1; \
+    }
+
+  COMPARE_SELECTOR (0, 0, 8);
+  COMPARE_SELECTOR (32, 8, 16);
+  COMPARE_SELECTOR (8, 16, 24);
+  COMPARE_SELECTOR (40, 24, 32);
+
+#undef COMPARE_SELECTOR
+  return result;
+}
+
+static bool
+loongarch_is_lasx_lowpart_extract (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = 0;
+
+  for (int i = 0; i < d->nelt / 2; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  if (result)
+    {
+      buf = d->nelt;
+      for (int i = d->nelt / 2; i < d->nelt; i += 1)
+	{
+	  if (buf != d->perm[i])
+	    {
+	      result = false;
+	      break;
+	    }
+	  buf += 1;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_lasx_highpart_interleave (expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = d->nelt / 2;
+
+  for (int i = 0; i < d->nelt; i += 2)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+      buf += 1;
+    }
+
+  if (result)
+    {
+      buf = d->nelt + d->nelt / 2;
+      for (int i = 1; i < d->nelt;i += 2)
+	{
+	  if (buf != d->perm[i])
+	    {
+	      result = false;
+	      break;
+	    }
+	  buf += 1;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_lasx_highpart_interleave_2 (struct expand_vec_perm_d *d)
+{
+  if (d->vmode != E_V32QImode)
+    return false;
+
+  bool result = true;
+  unsigned char buf = 0;
+
+#define COMPARE_SELECTOR(INIT, BEGIN, END) \
+  buf = INIT; \
+  for (int i = BEGIN; i < END && result; i += 1) \
+    { \
+      if (buf != d->perm[i]) \
+	{ \
+	  result = false; \
+	  break; \
+	} \
+      buf += 1; \
+    }
+
+  COMPARE_SELECTOR (16, 0, 8);
+  COMPARE_SELECTOR (48, 8, 16);
+  COMPARE_SELECTOR (24, 16, 24);
+  COMPARE_SELECTOR (56, 24, 32);
+
+#undef COMPARE_SELECTOR
+  return result;
+}
+
+static bool
+loongarch_is_elem_duplicate (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+  unsigned char buf = d->perm[0];
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (buf != d->perm[i])
+	{
+	  result = false;
+	  break;
+	}
+    }
+
+  return result;
+}
+
+inline bool
+loongarch_is_op_reverse_perm (struct expand_vec_perm_d *d)
+{
+  return (d->vmode == E_V4DFmode)
+    && d->perm[0] == 2 && d->perm[1] == 3
+    && d->perm[2] == 0 && d->perm[3] == 1;
+}
+
+static bool
+loongarch_is_single_op_perm (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+
+  for (int i = 0; i < d->nelt; i += 1)
+    {
+      if (d->perm[i] >= d->nelt)
+	{
+	  result = false;
+	  break;
+	}
+    }
+
+  return result;
+}
+
+static bool
+loongarch_is_divisible_perm (struct expand_vec_perm_d *d)
+{
+  bool result = true;
+
+  for (int i = 0; i < d->nelt / 2; i += 1)
+    {
+      if (d->perm[i] >= d->nelt)
+	{
+	  result = false;
+	  break;
+	}
+    }
+
+  if (result)
+    {
+      for (int i = d->nelt / 2; i < d->nelt; i += 1)
+	{
+	  if (d->perm[i] < d->nelt)
+	    {
+	      result = false;
+	      break;
+	    }
+	}
+    }
+
+  return result;
+}
+
+inline bool
+loongarch_is_triple_stride_extract (struct expand_vec_perm_d *d)
+{
+  return (d->vmode == E_V4DImode || d->vmode == E_V4DFmode)
+    && d->perm[0] == 1 && d->perm[1] == 4
+    && d->perm[2] == 7 && d->perm[3] == 0;
+}
+
+/* In LASX, some permutation insn does not have the behavior that gcc expects
+ * when compiler wants to emit a vector permutation.
+ *
+ * 1. What GCC provides via vectorize_vec_perm_const ()'s paramater:
+ * When GCC wants to performs a vector permutation, it provides two op
+ * reigster, one target register, and a selector.
+ * In const vector permutation case, GCC provides selector as a char array
+ * that contains original value; in variable vector permuatation
+ * (performs via vec_perm<mode> insn template), it provides a vector register.
+ * We assume that nelt is the elements numbers inside single vector in current
+ * 256bit vector mode.
+ *
+ * 2. What GCC expects to perform:
+ * Two op registers (op0, op1) will "combine" into a 512bit temp vector storage
+ * that has 2*nelt elements inside it; the low 256bit is op0, and high 256bit
+ * is op1, then the elements are indexed as below:
+ *		  0 ~ nelt - 1		nelt ~ 2 * nelt - 1
+ *	  |-------------------------|-------------------------|
+ *		Low 256bit (op0)	High 256bit (op1)
+ * For example, the second element in op1 (V8SImode) will be indexed with 9.
+ * Selector is a vector that has the same mode and number of elements  with
+ * op0,op1 and target, it's look like this:
+ *	      0 ~ nelt - 1
+ *	  |-------------------------|
+ *	      256bit (selector)
+ * It describes which element from 512bit temp vector storage will fit into
+ * target's every element slot.
+ * GCC expects that every element in selector can be ANY indices of 512bit
+ * vector storage (Selector can pick literally any element from op0 and op1, and
+ * then fits into any place of target register). This is also what LSX 128bit
+ * vshuf.* instruction do similarly, so we can handle 128bit vector permutation
+ * by single instruction easily.
+ *
+ * 3. What LASX permutation instruction does:
+ * In short, it just execute two independent 128bit vector permuatation, and
+ * it's the reason that we need to do the jobs below.  We will explain it.
+ * op0, op1, target, and selector will be separate into high 128bit and low
+ * 128bit, and do permutation as the description below:
+ *
+ *  a) op0's low 128bit and op1's low 128bit "combines" into a 256bit temp
+ * vector storage (TVS1), elements are indexed as below:
+ *	    0 ~ nelt / 2 - 1	  nelt / 2 ~ nelt - 1
+ *	|---------------------|---------------------| TVS1
+ *	    op0's low 128bit      op1's low 128bit
+ *    op0's high 128bit and op1's high 128bit are "combined" into TVS2 in the
+ *    same way.
+ *	    0 ~ nelt / 2 - 1	  nelt / 2 ~ nelt - 1
+ *	|---------------------|---------------------| TVS2
+ *	    op0's high 128bit	op1's high 128bit
+ *  b) Selector's low 128bit describes which elements from TVS1 will fit into
+ *  target vector's low 128bit.  No TVS2 elements are allowed.
+ *  c) Selector's high 128bit describes which elements from TVS2 will fit into
+ *  target vector's high 128bit.  No TVS1 elements are allowed.
+ *
+ * As we can see, if we want to handle vector permutation correctly, we can
+ * achieve it in three ways:
+ *  a) Modify selector's elements, to make sure that every elements can inform
+ *  correct value that will put into target vector.
+    b) Generate extra instruction before/after permutation instruction, for
+    adjusting op vector or target vector, to make sure target vector's value is
+    what GCC expects.
+    c) Use other instructions to process op and put correct result into target.
+ */
+
+/* Implementation of constant vector permuatation.  This function identifies
+ * recognized pattern of permuation selector argument, and use one or more
+ * instruction(s) to finish the permutation job correctly.  For unsupported
+ * patterns, it will return false.  */
+
+static bool
+loongarch_expand_vec_perm_const_2 (struct expand_vec_perm_d *d)
+{
+  /* Although we have the LSX vec_perm<mode> template, there's still some
+     128bit vector permuatation operations send to vectorize_vec_perm_const.
+     In this case, we just simpliy wrap them by single vshuf.* instruction,
+     because LSX vshuf.* instruction just have the same behavior that GCC
+     expects.  */
+  if (GET_MODE_SIZE (d->vmode) == 16)
+    return loongarch_try_expand_lsx_vshuf_const (d);
+  else
+    return false;
+
+  bool ok = false, reverse_hi_lo = false, extract_ev_od = false,
+       use_alt_op = false;
+  unsigned char idx;
+  int i;
+  rtx target, op0, op1, sel, tmp;
+  rtx op0_alt = NULL_RTX, op1_alt = NULL_RTX;
+  rtx rperm[MAX_VECT_LEN];
+  unsigned int remapped[MAX_VECT_LEN];
+
+  /* Try to figure out whether is a recognized permutation selector pattern, if
+     yes, we will reassign some elements with new value in selector argument,
+     and in some cases we will generate some assist insn to complete the
+     permutation. (Even in some cases, we use other insn to impl permutation
+     instead of xvshuf!)
+
+     Make sure to check d->testing_p is false everytime if you want to emit new
+     insn, unless you want to crash into ICE directly.  */
+  if (loongarch_is_quad_duplicate (d))
+    {
+      /* Selector example: E_V8SImode, { 0, 0, 0, 0, 4, 4, 4, 4 }
+	 copy first elem from original selector to all elem in new selector.  */
+      idx = d->perm[0];
+      for (i = 0; i < d->nelt; i += 1)
+	{
+	  remapped[i] = idx;
+	}
+      /* Selector after: { 0, 0, 0, 0, 0, 0, 0, 0 }.  */
+    }
+  else if (loongarch_is_double_duplicate (d))
+    {
+      /* Selector example: E_V8SImode, { 1, 1, 3, 3, 5, 5, 7, 7 }
+	 one_vector_p == true.  */
+      for (i = 0; i < d->nelt / 2; i += 1)
+	{
+	  idx = d->perm[i];
+	  remapped[i] = idx;
+	  remapped[i + d->nelt / 2] = idx;
+	}
+      /* Selector after: { 1, 1, 3, 3, 1, 1, 3, 3 }.  */
+    }
+  else if (loongarch_is_odd_extraction (d)
+	   || loongarch_is_even_extraction (d))
+    {
+      /* Odd extraction selector sample: E_V4DImode, { 1, 3, 5, 7 }
+	 Selector after: { 1, 3, 1, 3 }.
+	 Even extraction selector sample: E_V4DImode, { 0, 2, 4, 6 }
+	 Selector after: { 0, 2, 0, 2 }.  */
+      for (i = 0; i < d->nelt / 2; i += 1)
+	{
+	  idx = d->perm[i];
+	  remapped[i] = idx;
+	  remapped[i + d->nelt / 2] = idx;
+	}
+      /* Additional insn is required for correct result.  See codes below.  */
+      extract_ev_od = true;
+    }
+  else if (loongarch_is_extraction_permutation (d))
+    {
+      /* Selector sample: E_V8SImode, { 0, 1, 2, 3, 4, 5, 6, 7 }.  */
+      if (d->perm[0] == 0)
+	{
+	  for (i = 0; i < d->nelt / 2; i += 1)
+	    {
+	      remapped[i] = i;
+	      remapped[i + d->nelt / 2] = i;
+	    }
+	}
+      else
+	{
+	  /* { 8, 9, 10, 11, 12, 13, 14, 15 }.  */
+	  for (i = 0; i < d->nelt / 2; i += 1)
+	    {
+	      idx = i + d->nelt / 2;
+	      remapped[i] = idx;
+	      remapped[i + d->nelt / 2] = idx;
+	    }
+	}
+      /* Selector after: { 0, 1, 2, 3, 0, 1, 2, 3 }
+	 { 8, 9, 10, 11, 8, 9, 10, 11 }  */
+    }
+  else if (loongarch_is_center_extraction (d))
+    {
+      /* sample: E_V4DImode, { 2, 3, 4, 5 }
+	 In this condition, we can just copy high 128bit of op0 and low 128bit
+	 of op1 to the target register by using xvpermi.q insn.  */
+      if (!d->testing_p)
+	{
+	  emit_move_insn (d->target, d->op1);
+	  switch (d->vmode)
+	    {
+	      case E_V4DImode:
+		emit_insn (gen_lasx_xvpermi_q_v4di (d->target, d->target,
+						    d->op0, GEN_INT (0x21)));
+		break;
+	      case E_V4DFmode:
+		emit_insn (gen_lasx_xvpermi_q_v4df (d->target, d->target,
+						    d->op0, GEN_INT (0x21)));
+		break;
+	      case E_V8SImode:
+		emit_insn (gen_lasx_xvpermi_q_v8si (d->target, d->target,
+						    d->op0, GEN_INT (0x21)));
+		break;
+	      case E_V8SFmode:
+		emit_insn (gen_lasx_xvpermi_q_v8sf (d->target, d->target,
+						    d->op0, GEN_INT (0x21)));
+		break;
+	      case E_V16HImode:
+		emit_insn (gen_lasx_xvpermi_q_v16hi (d->target, d->target,
+						     d->op0, GEN_INT (0x21)));
+		break;
+	      case E_V32QImode:
+		emit_insn (gen_lasx_xvpermi_q_v32qi (d->target, d->target,
+						     d->op0, GEN_INT (0x21)));
+		break;
+	      default:
+		break;
+	    }
+	}
+      ok = true;
+      /* Finish the funtion directly.  */
+      goto expand_perm_const_2_end;
+    }
+  else if (loongarch_is_reversing_permutation (d))
+    {
+      /* Selector sample: E_V8SImode, { 7, 6, 5, 4, 3, 2, 1, 0 }
+	 one_vector_p == true  */
+      idx = d->nelt / 2 - 1;
+      for (i = 0; i < d->nelt / 2; i += 1)
+	{
+	  remapped[i] = idx;
+	  remapped[i + d->nelt / 2] = idx;
+	  idx -= 1;
+	}
+      /* Selector after: { 3, 2, 1, 0, 3, 2, 1, 0 }
+	 Additional insn will be generated to swap hi and lo 128bit of target
+	 register.  */
+      reverse_hi_lo = true;
+    }
+  else if (loongarch_is_di_misalign_extract (d)
+	   || loongarch_is_si_misalign_extract (d))
+    {
+      /* Selector Sample:
+	 DI misalign: E_V4DImode, { 1, 2, 3, 4 }
+	 SI misalign: E_V8SImode, { 1, 2, 3, 4, 5, 6, 7, 8 }  */
+      if (!d->testing_p)
+	{
+	  /* Copy original op0/op1 value to new temp register.
+	     In some cases, operand register may be used in multiple place, so
+	     we need new regiter instead modify original one, to avoid runtime
+	     crashing or wrong value after execution.  */
+	  use_alt_op = true;
+	  op1_alt = gen_reg_rtx (d->vmode);
+	  emit_move_insn (op1_alt, d->op1);
+
+	  /* Adjust op1 for selecting correct value in high 128bit of target
+	     register.
+	     op1: E_V4DImode, { 4, 5, 6, 7 } -> { 2, 3, 4, 5 }.  */
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, d->op0, 0);
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1, conv_op1,
+					      conv_op0, GEN_INT (0x21)));
+
+	  for (i = 0; i < d->nelt / 2; i += 1)
+	    {
+	      remapped[i] = d->perm[i];
+	      remapped[i + d->nelt / 2] = d->perm[i];
+	    }
+	  /* Selector after:
+	     DI misalign: { 1, 2, 1, 2 }
+	     SI misalign: { 1, 2, 3, 4, 1, 2, 3, 4 }  */
+	}
+    }
+  else if (loongarch_is_lasx_lowpart_interleave (d))
+    {
+      /* Elements from op0's low 18bit and op1's 128bit are inserted into
+	 target register alternately.
+	 sample: E_V4DImode, { 0, 4, 1, 5 }  */
+      if (!d->testing_p)
+	{
+	  /* Prepare temp register instead of modify original op.  */
+	  use_alt_op = true;
+	  op1_alt = gen_reg_rtx (d->vmode);
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  emit_move_insn (op1_alt, d->op1);
+	  emit_move_insn (op0_alt, d->op0);
+
+	  /* Generate subreg for fitting into insn gen function.  */
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+
+	  /* Adjust op value in temp register.
+	     op0 = {0,1,2,3}, op1 = {4,5,0,1}  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1, conv_op1,
+					      conv_op0, GEN_INT (0x02)));
+	  /* op0 = {0,1,4,5}, op1 = {4,5,0,1}  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0, conv_op0,
+					      conv_op1, GEN_INT (0x01)));
+
+	  /* Remap indices in selector based on the location of index inside
+	     selector, and vector element numbers in current vector mode.  */
+
+	  /* Filling low 128bit of new selector.  */
+	  for (i = 0; i < d->nelt / 2; i += 1)
+	    {
+	      /* value in odd-indexed slot of low 128bit part of selector
+		 vector.  */
+	      remapped[i] = i % 2 != 0 ? d->perm[i] - d->nelt / 2 : d->perm[i];
+	    }
+	  /* Then filling the high 128bit.  */
+	  for (i = d->nelt / 2; i < d->nelt; i += 1)
+	    {
+	      /* value in even-indexed slot of high 128bit part of
+		 selector vector.  */
+	      remapped[i] = i % 2 == 0
+		? d->perm[i] + (d->nelt / 2) * 3 : d->perm[i];
+	    }
+	}
+    }
+  else if (loongarch_is_lasx_lowpart_interleave_2 (d))
+    {
+      /* Special lowpart interleave case in V32QI vector mode.  It does the same
+	 thing as we can see in if branch that above this line.
+	 Selector sample: E_V32QImode,
+	 {0, 1, 2, 3, 4, 5, 6, 7, 32, 33, 34, 35, 36, 37, 38, 39, 8,
+	 9, 10, 11, 12, 13, 14, 15, 40, 41, 42, 43, 44, 45, 46, 47}  */
+      if (!d->testing_p)
+	{
+	  /* Solution for this case in very simple - covert op into V4DI mode,
+	     and do same thing as previous if branch.  */
+	  op1_alt = gen_reg_rtx (d->vmode);
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  emit_move_insn (op1_alt, d->op1);
+	  emit_move_insn (op0_alt, d->op0);
+
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+	  rtx conv_target = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
+
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1, conv_op1,
+					      conv_op0, GEN_INT (0x02)));
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0, conv_op0,
+					      conv_op1, GEN_INT (0x01)));
+	  remapped[0] = 0;
+	  remapped[1] = 4;
+	  remapped[2] = 1;
+	  remapped[3] = 5;
+
+	  for (i = 0; i < d->nelt; i += 1)
+	    {
+	      rperm[i] = GEN_INT (remapped[i]);
+	    }
+
+	  sel = gen_rtx_CONST_VECTOR (E_V4DImode, gen_rtvec_v (4, rperm));
+	  sel = force_reg (E_V4DImode, sel);
+	  emit_insn (gen_lasx_xvshuf_d (conv_target, sel,
+					conv_op1, conv_op0));
+	}
+
+      ok = true;
+      goto expand_perm_const_2_end;
+    }
+  else if (loongarch_is_lasx_lowpart_extract (d))
+    {
+      /* Copy op0's low 128bit to target's low 128bit, and copy op1's low
+	 128bit to target's high 128bit.
+	 Selector sample: E_V4DImode, { 0, 1, 4 ,5 }  */
+      if (!d->testing_p)
+	{
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, d->op1, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, d->op0, 0);
+	  rtx conv_target = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
+
+	  /* We can achieve the expectation by using sinple xvpermi.q insn.  */
+	  emit_move_insn (conv_target, conv_op1);
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_target, conv_target,
+					      conv_op0, GEN_INT (0x20)));
+	}
+
+      ok = true;
+      goto expand_perm_const_2_end;
+    }
+  else if (loongarch_is_lasx_highpart_interleave (d))
+    {
+      /* Similar to lowpart interleave, elements from op0's high 128bit and
+	 op1's high 128bit are inserted into target regiter alternately.
+	 Selector sample: E_V8SImode, { 4, 12, 5, 13, 6, 14, 7, 15 }  */
+      if (!d->testing_p)
+	{
+	  /* Prepare temp op register.  */
+	  use_alt_op = true;
+	  op1_alt = gen_reg_rtx (d->vmode);
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  emit_move_insn (op1_alt, d->op1);
+	  emit_move_insn (op0_alt, d->op0);
+
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+	  /* Adjust op value in temp regiter.
+	     op0 = { 0, 1, 2, 3 }, op1 = { 6, 7, 2, 3 }  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1, conv_op1,
+					      conv_op0, GEN_INT (0x13)));
+	  /* op0 = { 2, 3, 6, 7 }, op1 = { 6, 7, 2, 3 }  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0, conv_op0,
+					      conv_op1, GEN_INT (0x01)));
+	  /* Remap indices in selector based on the location of index inside
+	     selector, and vector element numbers in current vector mode.  */
+
+	  /* Filling low 128bit of new selector.  */
+	 for (i = 0; i < d->nelt / 2; i += 1)
+	   {
+	     /* value in even-indexed slot of low 128bit part of selector
+		vector.  */
+	     remapped[i] = i % 2 == 0 ? d->perm[i] - d->nelt / 2 : d->perm[i];
+	   }
+	  /* Then filling the high 128bit.  */
+	 for (i = d->nelt / 2; i < d->nelt; i += 1)
+	   {
+	     /* value in odd-indexed slot of high 128bit part of selector
+		vector.  */
+	      remapped[i] = i % 2 != 0
+		? d->perm[i] - (d->nelt / 2) * 3 : d->perm[i];
+	   }
+	}
+    }
+  else if (loongarch_is_lasx_highpart_interleave_2 (d))
+    {
+      /* Special highpart interleave case in V32QI vector mode.  It does the
+	 same thing as the normal version above.
+	 Selector sample: E_V32QImode,
+	 {16, 17, 18, 19, 20, 21, 22, 23, 48, 49, 50, 51, 52, 53, 54, 55,
+	 24, 25, 26, 27, 28, 29, 30, 31, 56, 57, 58, 59, 60, 61, 62, 63}
+      */
+      if (!d->testing_p)
+	{
+	  /* Convert op into V4DImode and do the things.  */
+	  op1_alt = gen_reg_rtx (d->vmode);
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  emit_move_insn (op1_alt, d->op1);
+	  emit_move_insn (op0_alt, d->op0);
+
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+	  rtx conv_target = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
+
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1, conv_op1,
+					      conv_op0, GEN_INT (0x13)));
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0, conv_op0,
+					      conv_op1, GEN_INT (0x01)));
+	  remapped[0] = 2;
+	  remapped[1] = 6;
+	  remapped[2] = 3;
+	  remapped[3] = 7;
+
+	  for (i = 0; i < d->nelt; i += 1)
+	    {
+	      rperm[i] = GEN_INT (remapped[i]);
+	    }
+
+	  sel = gen_rtx_CONST_VECTOR (E_V4DImode, gen_rtvec_v (4, rperm));
+	  sel = force_reg (E_V4DImode, sel);
+	  emit_insn (gen_lasx_xvshuf_d (conv_target, sel,
+					conv_op1, conv_op0));
+	}
+
+	ok = true;
+	goto expand_perm_const_2_end;
+    }
+  else if (loongarch_is_elem_duplicate (d))
+    {
+      /* Brocast single element (from op0 or op1) to all slot of target
+	 register.
+	 Selector sample:E_V8SImode, { 2, 2, 2, 2, 2, 2, 2, 2 }  */
+      if (!d->testing_p)
+	{
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, d->op1, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, d->op0, 0);
+	  rtx temp_reg = gen_reg_rtx (d->vmode);
+	  rtx conv_temp = gen_rtx_SUBREG (E_V4DImode, temp_reg, 0);
+
+	  emit_move_insn (temp_reg, d->op0);
+
+	  idx = d->perm[0];
+	  /* We will use xvrepl128vei.* insn to achieve the result, but we need
+	     to make the high/low 128bit has the same contents that contain the
+	     value that we need to broardcast, because xvrepl128vei does the
+	     broardcast job from every 128bit of source register to
+	     corresponded part of target register! (A deep sigh.)  */
+	  if (/*idx >= 0 &&*/ idx < d->nelt / 2)
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4di (conv_temp, conv_temp,
+						  conv_op0, GEN_INT (0x0)));
+	    }
+	  else if (idx >= d->nelt / 2 && idx < d->nelt)
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4di (conv_temp, conv_temp,
+						  conv_op0, GEN_INT (0x11)));
+	      idx -= d->nelt / 2;
+	    }
+	  else if (idx >= d->nelt && idx < (d->nelt + d->nelt / 2))
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4di (conv_temp, conv_temp,
+						  conv_op1, GEN_INT (0x0)));
+	    }
+	  else if (idx >= (d->nelt + d->nelt / 2) && idx < d->nelt * 2)
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4di (conv_temp, conv_temp,
+						  conv_op1, GEN_INT (0x11)));
+	      idx -= d->nelt / 2;
+	    }
+
+	  /* Then we can finally generate this insn.  */
+	  switch (d->vmode)
+	    {
+	    case E_V4DImode:
+	      emit_insn (gen_lasx_xvrepl128vei_d (d->target, temp_reg,
+						  GEN_INT (idx)));
+	      break;
+	    case E_V4DFmode:
+	      emit_insn (gen_lasx_xvrepl128vei_d_f (d->target, temp_reg,
+						    GEN_INT (idx)));
+	      break;
+	    case E_V8SImode:
+	      emit_insn (gen_lasx_xvrepl128vei_w (d->target, temp_reg,
+						  GEN_INT (idx)));
+	      break;
+	    case E_V8SFmode:
+	      emit_insn (gen_lasx_xvrepl128vei_w_f (d->target, temp_reg,
+						    GEN_INT (idx)));
+	      break;
+	    case E_V16HImode:
+	      emit_insn (gen_lasx_xvrepl128vei_h (d->target, temp_reg,
+						  GEN_INT (idx)));
+	      break;
+	    case E_V32QImode:
+	      emit_insn (gen_lasx_xvrepl128vei_b (d->target, temp_reg,
+						  GEN_INT (idx)));
+	      break;
+	    default:
+	      gcc_unreachable ();
+	      break;
+	    }
+
+	  /* finish func directly.  */
+	  ok = true;
+	  goto expand_perm_const_2_end;
+	}
+    }
+  else if (loongarch_is_op_reverse_perm (d))
+    {
+      /* reverse high 128bit and low 128bit in op0.
+	 Selector sample: E_V4DFmode, { 2, 3, 0, 1 }
+	 Use xvpermi.q for doing this job.  */
+      if (!d->testing_p)
+	{
+	  if (d->vmode == E_V4DImode)
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4di (d->target, d->target, d->op0,
+						  GEN_INT (0x01)));
+	    }
+	  else if (d->vmode == E_V4DFmode)
+	    {
+	      emit_insn (gen_lasx_xvpermi_q_v4df (d->target, d->target, d->op0,
+						  GEN_INT (0x01)));
+	    }
+	  else
+	    {
+	      gcc_unreachable ();
+	    }
+	}
+
+      ok = true;
+      goto expand_perm_const_2_end;
+    }
+  else if (loongarch_is_single_op_perm (d))
+    {
+      /* Permutation that only select elements from op0.  */
+      if (!d->testing_p)
+	{
+	  /* Prepare temp register instead of modify original op.  */
+	  use_alt_op = true;
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  op1_alt = gen_reg_rtx (d->vmode);
+
+	  emit_move_insn (op0_alt, d->op0);
+	  emit_move_insn (op1_alt, d->op1);
+
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, d->op0, 0);
+	  rtx conv_op0a = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+	  rtx conv_op1a = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+
+	  /* Duplicate op0's low 128bit in op0, then duplicate high 128bit
+	     in op1.  After this, xvshuf.* insn's selector argument can
+	     access all elements we need for correct permutation result.  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0a, conv_op0a, conv_op0,
+					      GEN_INT (0x00)));
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1a, conv_op1a, conv_op0,
+					      GEN_INT (0x11)));
+
+	  /* In this case, there's no need to remap selector's indices.  */
+	  for (i = 0; i < d->nelt; i += 1)
+	    {
+	      remapped[i] = d->perm[i];
+	    }
+	}
+    }
+  else if (loongarch_is_divisible_perm (d))
+    {
+      /* Divisible perm:
+	 Low 128bit of selector only selects elements of op0,
+	 and high 128bit of selector only selects elements of op1.  */
+
+      if (!d->testing_p)
+	{
+	  /* Prepare temp register instead of modify original op.  */
+	  use_alt_op = true;
+	  op0_alt = gen_reg_rtx (d->vmode);
+	  op1_alt = gen_reg_rtx (d->vmode);
+
+	  emit_move_insn (op0_alt, d->op0);
+	  emit_move_insn (op1_alt, d->op1);
+
+	  rtx conv_op0a = gen_rtx_SUBREG (E_V4DImode, op0_alt, 0);
+	  rtx conv_op1a = gen_rtx_SUBREG (E_V4DImode, op1_alt, 0);
+	  rtx conv_op0 = gen_rtx_SUBREG (E_V4DImode, d->op0, 0);
+	  rtx conv_op1 = gen_rtx_SUBREG (E_V4DImode, d->op1, 0);
+
+	  /* Reorganize op0's hi/lo 128bit and op1's hi/lo 128bit, to make sure
+	     that selector's low 128bit can access all op0's elements, and
+	     selector's high 128bit can access all op1's elements.  */
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op0a, conv_op0a, conv_op1,
+					      GEN_INT (0x02)));
+	  emit_insn (gen_lasx_xvpermi_q_v4di (conv_op1a, conv_op1a, conv_op0,
+					      GEN_INT (0x31)));
+
+	  /* No need to modify indices.  */
+	  for (i = 0; i < d->nelt;i += 1)
+	    {
+	      remapped[i] = d->perm[i];
+	    }
+	}
+    }
+  else if (loongarch_is_triple_stride_extract (d))
+    {
+      /* Selector sample: E_V4DFmode, { 1, 4, 7, 0 }.  */
+      if (!d->testing_p)
+	{
+	  /* Resolve it with brute force modification.  */
+	  remapped[0] = 1;
+	  remapped[1] = 2;
+	  remapped[2] = 3;
+	  remapped[3] = 0;
+	}
+    }
+  else
+    {
+      /* When all of the detections above are failed, we will try last
+	 strategy.
+	 The for loop tries to detect following rules based on indices' value,
+	 its position inside of selector vector ,and strange behavior of
+	 xvshuf.* insn; Then we take corresponding action. (Replace with new
+	 value, or give up whole permutation expansion.)  */
+      for (i = 0; i < d->nelt; i += 1)
+	{
+	  /* % (2 * d->nelt)  */
+	  idx = d->perm[i];
+
+	  /* if index is located in low 128bit of selector vector.  */
+	  if (i < d->nelt / 2)
+	    {
+	      /* Fail case 1: index tries to reach element that located in op0's
+		 high 128bit.  */
+	      if (idx >= d->nelt / 2 && idx < d->nelt)
+		{
+		  goto expand_perm_const_2_end;
+		}
+	      /* Fail case 2: index tries to reach element that located in
+		 op1's high 128bit.  */
+	      if (idx >= (d->nelt + d->nelt / 2))
+		{
+		  goto expand_perm_const_2_end;
+		}
+
+	      /* Success case: index tries to reach elements that located in
+		 op1's low 128bit.  Apply - (nelt / 2) offset to original
+		 value.  */
+	      if (idx >= d->nelt && idx < (d->nelt + d->nelt / 2))
+		{
+		  idx -= d->nelt / 2;
+		}
+	    }
+	  /* if index is located in high 128bit of selector vector.  */
+	  else
+	    {
+	      /* Fail case 1: index tries to reach element that located in
+		 op1's low 128bit.  */
+	      if (idx >= d->nelt && idx < (d->nelt + d->nelt / 2))
+		{
+		  goto expand_perm_const_2_end;
+		}
+	      /* Fail case 2: index tries to reach element that located in
+		 op0's low 128bit.  */
+	      if (idx < (d->nelt / 2))
+		{
+		  goto expand_perm_const_2_end;
+		}
+	      /* Success case: index tries to reach element that located in
+		 op0's high 128bit.  */
+	      if (idx >= d->nelt / 2 && idx < d->nelt)
+		{
+		  idx -= d->nelt / 2;
+		}
+	    }
+	  /* No need to process other case that we did not mentioned.  */
+
+	  /* Assign with original or processed value.  */
+	  remapped[i] = idx;
+	}
+    }
+
+  ok = true;
+  /* If testing_p is true, compiler is trying to figure out that backend can
+     handle this permutation, but doesn't want to generate actual insn.  So
+     if true, exit directly.  */
+  if (d->testing_p)
+    {
+      goto expand_perm_const_2_end;
+    }
+
+  /* Convert remapped selector array to RTL array.  */
+  for (i = 0; i < d->nelt; i += 1)
+    {
+      rperm[i] = GEN_INT (remapped[i]);
+    }
+
+  /* Copy selector vector from memory to vector regiter for later insn gen
+     function.
+     If vector's element in floating point value, we cannot fit selector
+     argument into insn gen function directly, because of the insn template
+     definition.  As a solution, generate a integral mode subreg of target,
+     then copy selector vector (that is in integral mode) to this subreg.  */
+  switch (d->vmode)
+    {
+    case E_V4DFmode:
+      sel = gen_rtx_CONST_VECTOR (E_V4DImode, gen_rtvec_v (d->nelt, rperm));
+      tmp = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
+      emit_move_insn (tmp, sel);
+      break;
+    case E_V8SFmode:
+      sel = gen_rtx_CONST_VECTOR (E_V8SImode, gen_rtvec_v (d->nelt, rperm));
+      tmp = gen_rtx_SUBREG (E_V8SImode, d->target, 0);
+      emit_move_insn (tmp, sel);
+      break;
+    default:
+      sel = gen_rtx_CONST_VECTOR (d->vmode, gen_rtvec_v (d->nelt, rperm));
+      emit_move_insn (d->target, sel);
+      break;
+    }
+
+  target = d->target;
+  /* If temp op registers are requested in previous if branch, then use temp
+     register intead of original one.  */
+  if (use_alt_op)
+    {
+      op0 = op0_alt != NULL_RTX ? op0_alt : d->op0;
+      op1 = op1_alt != NULL_RTX ? op1_alt : d->op1;
+    }
+  else
+    {
+      op0 = d->op0;
+      op1 = d->one_vector_p ? d->op0 : d->op1;
+    }
+
+  /* We FINALLY can generate xvshuf.* insn.  */
+  switch (d->vmode)
+    {
+    case E_V4DFmode:
+      emit_insn (gen_lasx_xvshuf_d_f (target, target, op1, op0));
+      break;
+    case E_V4DImode:
+      emit_insn (gen_lasx_xvshuf_d (target, target, op1, op0));
+      break;
+    case E_V8SFmode:
+      emit_insn (gen_lasx_xvshuf_w_f (target, target, op1, op0));
+      break;
+    case E_V8SImode:
+      emit_insn (gen_lasx_xvshuf_w (target, target, op1, op0));
+      break;
+    case E_V16HImode:
+      emit_insn (gen_lasx_xvshuf_h (target, target, op1, op0));
+      break;
+    case E_V32QImode:
+      emit_insn (gen_lasx_xvshuf_b (target, op1, op0, target));
+      break;
+    default:
+      gcc_unreachable ();
+      break;
+    }
+
+  /* Extra insn for swapping the hi/lo 128bit of target vector register.  */
+  if (reverse_hi_lo)
+    {
+      switch (d->vmode)
+	{
+	case E_V4DFmode:
+	  emit_insn (gen_lasx_xvpermi_q_v4df (d->target, d->target,
+					      d->target, GEN_INT (0x1)));
+	  break;
+	case E_V4DImode:
+	  emit_insn (gen_lasx_xvpermi_q_v4di (d->target, d->target,
+					      d->target, GEN_INT (0x1)));
+	  break;
+	case E_V8SFmode:
+	  emit_insn (gen_lasx_xvpermi_q_v8sf (d->target, d->target,
+					      d->target, GEN_INT (0x1)));
+	  break;
+	case E_V8SImode:
+	  emit_insn (gen_lasx_xvpermi_q_v8si (d->target, d->target,
+					      d->target, GEN_INT (0x1)));
+	  break;
+	case E_V16HImode:
+	  emit_insn (gen_lasx_xvpermi_q_v16hi (d->target, d->target,
+					       d->target, GEN_INT (0x1)));
+	  break;
+	case E_V32QImode:
+	  emit_insn (gen_lasx_xvpermi_q_v32qi (d->target, d->target,
+					       d->target, GEN_INT (0x1)));
+	  break;
+	default:
+	  break;
+	}
+    }
+  /* Extra insn required by odd/even extraction.  Swapping the second and third
+     64bit in target vector register.  */
+  else if (extract_ev_od)
+    {
+      rtx converted = gen_rtx_SUBREG (E_V4DImode, d->target, 0);
+      emit_insn (gen_lasx_xvpermi_d_v4di (converted, converted,
+					  GEN_INT (0xD8)));
+    }
+
+expand_perm_const_2_end:
+  return ok;
 }
 
 /* Implement TARGET_VECTORIZE_VEC_PERM_CONST.  */
@@ -7799,7 +9643,7 @@ loongarch_sched_reassociation_width (unsigned int opc, machine_mode mode)
     case CPU_LOONGARCH64:
     case CPU_LA464:
       /* Vector part.  */
-      if (LSX_SUPPORTED_MODE_P (mode))
+      if (LSX_SUPPORTED_MODE_P (mode) || LASX_SUPPORTED_MODE_P (mode))
 	{
 	  /* Integer vector instructions execute in FP unit.
 	     The width of integer/float-point vector instructions is 3.  */
@@ -7839,6 +9683,44 @@ loongarch_expand_vector_extract (rtx target, rtx vec, int elt)
     case E_V16QImode:
       break;
 
+    case E_V32QImode:
+      if (TARGET_LASX)
+	{
+	  if (elt >= 16)
+	    {
+	      tmp = gen_reg_rtx (V32QImode);
+	      emit_insn (gen_lasx_xvpermi_d_v32qi (tmp, vec, GEN_INT (0xe)));
+	      loongarch_expand_vector_extract (target,
+					       gen_lowpart (V16QImode, tmp),
+					       elt & 15);
+	    }
+	  else
+	    loongarch_expand_vector_extract (target,
+					     gen_lowpart (V16QImode, vec),
+					     elt & 15);
+	  return;
+	}
+      break;
+
+    case E_V16HImode:
+      if (TARGET_LASX)
+	{
+	  if (elt >= 8)
+	    {
+	      tmp = gen_reg_rtx (V16HImode);
+	      emit_insn (gen_lasx_xvpermi_d_v16hi (tmp, vec, GEN_INT (0xe)));
+	      loongarch_expand_vector_extract (target,
+					       gen_lowpart (V8HImode, tmp),
+					       elt & 7);
+	    }
+	  else
+	    loongarch_expand_vector_extract (target,
+					     gen_lowpart (V8HImode, vec),
+					     elt & 7);
+	  return;
+	}
+      break;
+
     default:
       break;
     }
@@ -7877,6 +9759,31 @@ emit_reduc_half (rtx dest, rtx src, int i)
     case E_V2DFmode:
       tem = gen_lsx_vbsrl_d_f (dest, src, GEN_INT (8));
       break;
+    case E_V8SFmode:
+      if (i == 256)
+	tem = gen_lasx_xvpermi_d_v8sf (dest, src, GEN_INT (0xe));
+      else
+	tem = gen_lasx_xvshuf4i_w_f (dest, src,
+				     GEN_INT (i == 128 ? 2 + (3 << 2) : 1));
+      break;
+    case E_V4DFmode:
+      if (i == 256)
+	tem = gen_lasx_xvpermi_d_v4df (dest, src, GEN_INT (0xe));
+      else
+	tem = gen_lasx_xvpermi_d_v4df (dest, src, const1_rtx);
+      break;
+    case E_V32QImode:
+    case E_V16HImode:
+    case E_V8SImode:
+    case E_V4DImode:
+      d = gen_reg_rtx (V4DImode);
+      if (i == 256)
+	tem = gen_lasx_xvpermi_d_v4di (d, gen_lowpart (V4DImode, src),
+				       GEN_INT (0xe));
+      else
+	tem = gen_lasx_xvbsrl_d (d, gen_lowpart (V4DImode, src),
+				 GEN_INT (i/16));
+      break;
     case E_V16QImode:
     case E_V8HImode:
     case E_V4SImode:
@@ -7924,10 +9831,57 @@ loongarch_expand_vec_unpack (rtx operands[2], bool unsigned_p, bool high_p)
 {
   machine_mode imode = GET_MODE (operands[1]);
   rtx (*unpack) (rtx, rtx, rtx);
+  rtx (*extend) (rtx, rtx);
   rtx (*cmpFunc) (rtx, rtx, rtx);
+  rtx (*swap_hi_lo) (rtx, rtx, rtx, rtx);
   rtx tmp, dest;
 
-  if (ISA_HAS_LSX)
+  if (ISA_HAS_LASX && GET_MODE_SIZE (imode) == 32)
+    {
+      switch (imode)
+	{
+	case E_V8SImode:
+	  if (unsigned_p)
+	    extend = gen_lasx_vext2xv_du_wu;
+	  else
+	    extend = gen_lasx_vext2xv_d_w;
+	  swap_hi_lo = gen_lasx_xvpermi_q_v8si;
+	  break;
+
+	case E_V16HImode:
+	  if (unsigned_p)
+	    extend = gen_lasx_vext2xv_wu_hu;
+	  else
+	    extend = gen_lasx_vext2xv_w_h;
+	  swap_hi_lo = gen_lasx_xvpermi_q_v16hi;
+	  break;
+
+	case E_V32QImode:
+	  if (unsigned_p)
+	    extend = gen_lasx_vext2xv_hu_bu;
+	  else
+	    extend = gen_lasx_vext2xv_h_b;
+	  swap_hi_lo = gen_lasx_xvpermi_q_v32qi;
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	  break;
+	}
+
+      if (high_p)
+	{
+	  tmp = gen_reg_rtx (imode);
+	  emit_insn (swap_hi_lo (tmp, tmp, operands[1], const1_rtx));
+	  emit_insn (extend (operands[0], tmp));
+	  return;
+	}
+
+      emit_insn (extend (operands[0], operands[1]));
+      return;
+
+    }
+  else if (ISA_HAS_LSX)
     {
       switch (imode)
 	{
@@ -8028,8 +9982,17 @@ loongarch_gen_const_int_vector_shuffle (machine_mode mode, int val)
   return gen_rtx_PARALLEL (VOIDmode, gen_rtvec_v (nunits, elts));
 }
 
+
 /* Expand a vector initialization.  */
 
+void
+loongarch_expand_vector_group_init (rtx target, rtx vals)
+{
+  rtx ops[2] = { XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1) };
+  emit_insn (gen_rtx_SET (target, gen_rtx_VEC_CONCAT (E_V32QImode, ops[0],
+						      ops[1])));
+}
+
 void
 loongarch_expand_vector_init (rtx target, rtx vals)
 {
@@ -8049,6 +10012,285 @@ loongarch_expand_vector_init (rtx target, rtx vals)
 	all_same = false;
     }
 
+  if (ISA_HAS_LASX && GET_MODE_SIZE (vmode) == 32)
+    {
+      if (all_same)
+	{
+	  rtx same = XVECEXP (vals, 0, 0);
+	  rtx temp, temp2;
+
+	  if (CONST_INT_P (same) && nvar == 0
+	      && loongarch_signed_immediate_p (INTVAL (same), 10, 0))
+	    {
+	      switch (vmode)
+		{
+		case E_V32QImode:
+		case E_V16HImode:
+		case E_V8SImode:
+		case E_V4DImode:
+		  temp = gen_rtx_CONST_VECTOR (vmode, XVEC (vals, 0));
+		  emit_move_insn (target, temp);
+		  return;
+
+		default:
+		  gcc_unreachable ();
+		}
+	    }
+
+	  temp = gen_reg_rtx (imode);
+	  if (imode == GET_MODE (same))
+	    temp2 = same;
+	  else if (GET_MODE_SIZE (imode) >= UNITS_PER_WORD)
+	    {
+	      if (GET_CODE (same) == MEM)
+		{
+		  rtx reg_tmp = gen_reg_rtx (GET_MODE (same));
+		  loongarch_emit_move (reg_tmp, same);
+		  temp2 = simplify_gen_subreg (imode, reg_tmp,
+					       GET_MODE (reg_tmp), 0);
+		}
+	      else
+		temp2 = simplify_gen_subreg (imode, same,
+					     GET_MODE (same), 0);
+	    }
+	  else
+	    {
+	      if (GET_CODE (same) == MEM)
+		{
+		  rtx reg_tmp = gen_reg_rtx (GET_MODE (same));
+		  loongarch_emit_move (reg_tmp, same);
+		  temp2 = lowpart_subreg (imode, reg_tmp,
+					  GET_MODE (reg_tmp));
+		}
+	      else
+		temp2 = lowpart_subreg (imode, same, GET_MODE (same));
+	    }
+	  emit_move_insn (temp, temp2);
+
+	  switch (vmode)
+	    {
+	    case E_V32QImode:
+	    case E_V16HImode:
+	    case E_V8SImode:
+	    case E_V4DImode:
+	      loongarch_emit_move (target,
+				   gen_rtx_VEC_DUPLICATE (vmode, temp));
+	      break;
+
+	    case E_V8SFmode:
+	      emit_insn (gen_lasx_xvreplve0_w_f_scalar (target, temp));
+	      break;
+
+	    case E_V4DFmode:
+	      emit_insn (gen_lasx_xvreplve0_d_f_scalar (target, temp));
+	      break;
+
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+      else
+	{
+	  rtvec vec = shallow_copy_rtvec (XVEC (vals, 0));
+
+	  for (i = 0; i < nelt; ++i)
+	    RTVEC_ELT (vec, i) = CONST0_RTX (imode);
+
+	  emit_move_insn (target, gen_rtx_CONST_VECTOR (vmode, vec));
+
+	  machine_mode half_mode = VOIDmode;
+	  rtx target_hi, target_lo;
+
+	  switch (vmode)
+	    {
+	    case E_V32QImode:
+	      half_mode=E_V16QImode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_b_scalar (target_hi,
+							    temp_hi));
+		      emit_insn (gen_lsx_vreplvei_b_scalar (target_lo,
+							    temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv16qi (target_hi, temp_hi,
+						   GEN_INT (i)));
+		      emit_insn (gen_vec_setv16qi (target_lo, temp_lo,
+						   GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    case E_V16HImode:
+	      half_mode=E_V8HImode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_h_scalar (target_hi,
+							    temp_hi));
+		      emit_insn (gen_lsx_vreplvei_h_scalar (target_lo,
+							    temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv8hi (target_hi, temp_hi,
+						  GEN_INT (i)));
+		      emit_insn (gen_vec_setv8hi (target_lo, temp_lo,
+						  GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    case E_V8SImode:
+	      half_mode=V4SImode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_w_scalar (target_hi,
+							    temp_hi));
+		      emit_insn (gen_lsx_vreplvei_w_scalar (target_lo,
+							    temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv4si (target_hi, temp_hi,
+						  GEN_INT (i)));
+		      emit_insn (gen_vec_setv4si (target_lo, temp_lo,
+			  			  GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    case E_V4DImode:
+	      half_mode=E_V2DImode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_d_scalar (target_hi,
+							    temp_hi));
+		      emit_insn (gen_lsx_vreplvei_d_scalar (target_lo,
+							    temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv2di (target_hi, temp_hi,
+						  GEN_INT (i)));
+		      emit_insn (gen_vec_setv2di (target_lo, temp_lo,
+						  GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    case E_V8SFmode:
+	      half_mode=E_V4SFmode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_w_f_scalar (target_hi,
+							      temp_hi));
+		      emit_insn (gen_lsx_vreplvei_w_f_scalar (target_lo,
+							      temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv4sf (target_hi, temp_hi,
+						  GEN_INT (i)));
+		      emit_insn (gen_vec_setv4sf (target_lo, temp_lo,
+						  GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    case E_V4DFmode:
+	      half_mode=E_V2DFmode;
+	      target_hi = gen_reg_rtx (half_mode);
+	      target_lo = gen_reg_rtx (half_mode);
+	      for (i = 0; i < nelt/2; ++i)
+		{
+		  rtx temp_hi = gen_reg_rtx (imode);
+		  rtx temp_lo = gen_reg_rtx (imode);
+		  emit_move_insn (temp_hi, XVECEXP (vals, 0, i+nelt/2));
+		  emit_move_insn (temp_lo, XVECEXP (vals, 0, i));
+		  if (i == 0)
+		    {
+		      emit_insn (gen_lsx_vreplvei_d_f_scalar (target_hi,
+							      temp_hi));
+		      emit_insn (gen_lsx_vreplvei_d_f_scalar (target_lo,
+							      temp_lo));
+		    }
+		  else
+		    {
+		      emit_insn (gen_vec_setv2df (target_hi, temp_hi,
+						  GEN_INT (i)));
+		      emit_insn (gen_vec_setv2df (target_lo, temp_lo,
+						  GEN_INT (i)));
+		    }
+		}
+	      emit_insn (gen_rtx_SET (target,
+				      gen_rtx_VEC_CONCAT (vmode, target_hi,
+							  target_lo)));
+	      break;
+
+	    default:
+	      gcc_unreachable ();
+	    }
+
+	}
+      return;
+    }
+
   if (ISA_HAS_LSX)
     {
       if (all_same)
@@ -8296,6 +10538,38 @@ loongarch_expand_lsx_cmp (rtx dest, enum rtx_code cond, rtx op0, rtx op1)
 	}
       break;
 
+    case E_V8SFmode:
+    case E_V4DFmode:
+      switch (cond)
+	{
+	case UNORDERED:
+	case ORDERED:
+	case EQ:
+	case NE:
+	case UNEQ:
+	case UNLE:
+	case UNLT:
+	  break;
+	case LTGT: cond = NE; break;
+	case UNGE: cond = UNLE; std::swap (op0, op1); break;
+	case UNGT: cond = UNLT; std::swap (op0, op1); break;
+	case LE: unspec = UNSPEC_LASX_XVFCMP_SLE; break;
+	case LT: unspec = UNSPEC_LASX_XVFCMP_SLT; break;
+	case GE: unspec = UNSPEC_LASX_XVFCMP_SLE; std::swap (op0, op1); break;
+	case GT: unspec = UNSPEC_LASX_XVFCMP_SLT; std::swap (op0, op1); break;
+	default:
+		 gcc_unreachable ();
+	}
+      if (unspec < 0)
+	loongarch_emit_binary (cond, dest, op0, op1);
+      else
+	{
+	  rtx x = gen_rtx_UNSPEC (GET_MODE (dest),
+				  gen_rtvec (2, op0, op1), unspec);
+	  emit_insn (gen_rtx_SET (dest, x));
+	}
+      break;
+
     default:
       gcc_unreachable ();
       break;
@@ -8633,7 +10907,7 @@ loongarch_builtin_support_vector_misalignment (machine_mode mode,
 					       int misalignment,
 					       bool is_packed)
 {
-  if (ISA_HAS_LSX && STRICT_ALIGNMENT)
+  if ((ISA_HAS_LSX || ISA_HAS_LASX) && STRICT_ALIGNMENT)
     {
       if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing)
 	return false;
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index e939dd826d1..39852d2bb12 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -186,6 +186,11 @@ along with GCC; see the file COPYING3.  If not see
 /* Width of a LSX vector register in bits.  */
 #define BITS_PER_LSX_REG (UNITS_PER_LSX_REG * BITS_PER_UNIT)
 
+/* Width of a LASX vector register in bytes.  */
+#define UNITS_PER_LASX_REG 32
+/* Width of a LASX vector register in bits.  */
+#define BITS_PER_LASX_REG (UNITS_PER_LASX_REG * BITS_PER_UNIT)
+
 /* For LARCH, width of a floating point register.  */
 #define UNITS_PER_FPREG (TARGET_DOUBLE_FLOAT ? 8 : 4)
 
@@ -248,10 +253,11 @@ along with GCC; see the file COPYING3.  If not see
 #define STRUCTURE_SIZE_BOUNDARY 8
 
 /* There is no point aligning anything to a rounder boundary than
-   LONG_DOUBLE_TYPE_SIZE, unless under LSX the bigggest alignment is
-   BITS_PER_LSX_REG/..  */
+   LONG_DOUBLE_TYPE_SIZE, unless under LSX/LASX the bigggest alignment is
+   BITS_PER_LSX_REG/BITS_PER_LASX_REG/..  */
 #define BIGGEST_ALIGNMENT \
-  (ISA_HAS_LSX ? BITS_PER_LSX_REG : LONG_DOUBLE_TYPE_SIZE)
+  (ISA_HAS_LASX? BITS_PER_LASX_REG \
+   : (ISA_HAS_LSX ? BITS_PER_LSX_REG : LONG_DOUBLE_TYPE_SIZE))
 
 /* All accesses must be aligned.  */
 #define STRICT_ALIGNMENT (TARGET_STRICT_ALIGN)
@@ -391,6 +397,10 @@ along with GCC; see the file COPYING3.  If not see
 #define LSX_REG_LAST  FP_REG_LAST
 #define LSX_REG_NUM   FP_REG_NUM
 
+#define LASX_REG_FIRST FP_REG_FIRST
+#define LASX_REG_LAST  FP_REG_LAST
+#define LASX_REG_NUM   FP_REG_NUM
+
 /* The DWARF 2 CFA column which tracks the return address from a
    signal handler context.  This means that to maintain backwards
    compatibility, no hard register can be assigned this column if it
@@ -409,9 +419,12 @@ along with GCC; see the file COPYING3.  If not see
   ((unsigned int) ((int) (REGNO) - FCC_REG_FIRST) < FCC_REG_NUM)
 #define LSX_REG_P(REGNO) \
   ((unsigned int) ((int) (REGNO) - LSX_REG_FIRST) < LSX_REG_NUM)
+#define LASX_REG_P(REGNO) \
+  ((unsigned int) ((int) (REGNO) - LASX_REG_FIRST) < LASX_REG_NUM)
 
 #define FP_REG_RTX_P(X) (REG_P (X) && FP_REG_P (REGNO (X)))
 #define LSX_REG_RTX_P(X) (REG_P (X) && LSX_REG_P (REGNO (X)))
+#define LASX_REG_RTX_P(X) (REG_P (X) && LASX_REG_P (REGNO (X)))
 
 /* Select a register mode required for caller save of hard regno REGNO.  */
 #define HARD_REGNO_CALLER_SAVE_MODE(REGNO, NREGS, MODE) \
@@ -733,6 +746,13 @@ enum reg_class
    && (GET_MODE_CLASS (MODE) == MODE_VECTOR_INT		\
        || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT))
 
+#define LASX_SUPPORTED_MODE_P(MODE)			\
+  (ISA_HAS_LASX						\
+   && (GET_MODE_SIZE (MODE) == UNITS_PER_LSX_REG	\
+       ||GET_MODE_SIZE (MODE) == UNITS_PER_LASX_REG)	\
+   && (GET_MODE_CLASS (MODE) == MODE_VECTOR_INT		\
+       || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT))
+
 /* 1 if N is a possible register number for function argument passing.
    We have no FP argument registers when soft-float.  */
 
@@ -985,7 +1005,39 @@ typedef struct {
   { "vr28",	28 + FP_REG_FIRST },					\
   { "vr29",	29 + FP_REG_FIRST },					\
   { "vr30",	30 + FP_REG_FIRST },					\
-  { "vr31",	31 + FP_REG_FIRST }					\
+  { "vr31",	31 + FP_REG_FIRST },					\
+  { "xr0",	 0 + FP_REG_FIRST },					\
+  { "xr1",	 1 + FP_REG_FIRST },					\
+  { "xr2",	 2 + FP_REG_FIRST },					\
+  { "xr3",	 3 + FP_REG_FIRST },					\
+  { "xr4",	 4 + FP_REG_FIRST },					\
+  { "xr5",	 5 + FP_REG_FIRST },					\
+  { "xr6",	 6 + FP_REG_FIRST },					\
+  { "xr7",	 7 + FP_REG_FIRST },					\
+  { "xr8",	 8 + FP_REG_FIRST },					\
+  { "xr9",	 9 + FP_REG_FIRST },					\
+  { "xr10",	10 + FP_REG_FIRST },					\
+  { "xr11",	11 + FP_REG_FIRST },					\
+  { "xr12",	12 + FP_REG_FIRST },					\
+  { "xr13",	13 + FP_REG_FIRST },					\
+  { "xr14",	14 + FP_REG_FIRST },					\
+  { "xr15",	15 + FP_REG_FIRST },					\
+  { "xr16",	16 + FP_REG_FIRST },					\
+  { "xr17",	17 + FP_REG_FIRST },					\
+  { "xr18",	18 + FP_REG_FIRST },					\
+  { "xr19",	19 + FP_REG_FIRST },					\
+  { "xr20",	20 + FP_REG_FIRST },					\
+  { "xr21",	21 + FP_REG_FIRST },					\
+  { "xr22",	22 + FP_REG_FIRST },					\
+  { "xr23",	23 + FP_REG_FIRST },					\
+  { "xr24",	24 + FP_REG_FIRST },					\
+  { "xr25",	25 + FP_REG_FIRST },					\
+  { "xr26",	26 + FP_REG_FIRST },					\
+  { "xr27",	27 + FP_REG_FIRST },					\
+  { "xr28",	28 + FP_REG_FIRST },					\
+  { "xr29",	29 + FP_REG_FIRST },					\
+  { "xr30",	30 + FP_REG_FIRST },					\
+  { "xr31",	31 + FP_REG_FIRST }					\
 }
 
 /* Globalizing directive for a label.  */
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 7b8978e2533..30b2cb91e9a 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -163,7 +163,7 @@ (define_attr "alu_type" "unknown,add,sub,not,nor,and,or,xor,simd_add"
 
 ;; Main data type used by the insn
 (define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF,FCC,
-  V2DI,V4SI,V8HI,V16QI,V2DF,V4SF"
+  V2DI,V4SI,V8HI,V16QI,V2DF,V4SF,V4DI,V8SI,V16HI,V32QI,V4DF,V8SF"
   (const_string "unknown"))
 
 ;; True if the main data type is twice the size of a word.
@@ -422,12 +422,14 @@ (define_mode_attr ifmt [(SI "w") (DI "l")])
 ;; floating-point mode or vector mode.
 (define_mode_attr UNITMODE [(SF "SF") (DF "DF") (V2SF "SF") (V4SF "SF")
 			    (V16QI "QI") (V8HI "HI") (V4SI "SI") (V2DI "DI")
-			    (V2DF "DF")])
+			    (V2DF "DF")(V8SF "SF")(V32QI "QI")(V16HI "HI")(V8SI "SI")(V4DI "DI")(V4DF "DF")])
 
 ;; As above, but in lower case.
 (define_mode_attr unitmode [(SF "sf") (DF "df") (V2SF "sf") (V4SF "sf")
 			    (V16QI "qi") (V8QI "qi") (V8HI "hi") (V4HI "hi")
-			    (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")])
+			    (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")
+			    (V8SI "si") (V4DI "di") (V32QI "qi") (V16HI "hi")
+			    (V8SF "sf") (V4DF "df")])
 
 ;; This attribute gives the integer mode that has half the size of
 ;; the controlling mode.
@@ -711,16 +713,17 @@ (define_insn "sub<mode>3"
   [(set_attr "alu_type" "sub")
    (set_attr "mode" "<MODE>")])
 
+
 (define_insn "*subsi3_extended"
-  [(set (match_operand:DI 0 "register_operand" "= r")
+  [(set (match_operand:DI 0 "register_operand" "=r")
 	(sign_extend:DI
-	    (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ")
-		      (match_operand:SI 2 "register_operand" "  r"))))]
+	    (minus:SI (match_operand:SI 1 "reg_or_0_operand" "rJ")
+		      (match_operand:SI 2 "register_operand" "r"))))]
   "TARGET_64BIT"
   "sub.w\t%0,%z1,%2"
   [(set_attr "type" "arith")
    (set_attr "mode" "SI")])
-\f
+
 ;;
 ;;  ....................
 ;;
@@ -3634,6 +3637,9 @@ (define_insn "loongarch_crcc_w_<size>_w"
 ; The LoongArch SX Instructions.
 (include "lsx.md")
 
+; The LoongArch ASX Instructions.
+(include "lasx.md")
+
 (define_c_enum "unspec" [
   UNSPEC_ADDRESS_FIRST
 ])
diff --git a/gcc/testsuite/g++.dg/torture/vshuf-v16hi.C b/gcc/testsuite/g++.dg/torture/vshuf-v16hi.C
index 6277068b859..252de0ab97a 100644
--- a/gcc/testsuite/g++.dg/torture/vshuf-v16hi.C
+++ b/gcc/testsuite/g++.dg/torture/vshuf-v16hi.C
@@ -1,5 +1,6 @@
 // { dg-options "-std=c++11" }
 // { dg-do run }
+// { dg-skip-if "LoongArch vshuf/xvshuf insn result is undefined when 6 or 7 bit of vector's element is set." { loongarch*-*-* } }
 
 typedef unsigned short V __attribute__((vector_size(32)));
 typedef V VI;
-- 
2.36.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v5 6/6] LoongArch: Add Loongson ASX directive builtin function support.
  2023-08-24  3:13 [PATCH v5 0/6] Add Loongson SX/ASX instruction support to LoongArch target Chenghui Pan
                   ` (4 preceding siblings ...)
  2023-08-24  3:13 ` [PATCH v5 5/6] LoongArch: Add Loongson ASX base instruction support Chenghui Pan
@ 2023-08-24  3:13 ` Chenghui Pan
  2023-08-24  3:40 ` [PATCH v5 0/6] Add Loongson SX/ASX instruction support to LoongArch target Xi Ruoyao
  6 siblings, 0 replies; 9+ messages in thread
From: Chenghui Pan @ 2023-08-24  3:13 UTC (permalink / raw)
  To: gcc-patches; +Cc: xry111, i, chenglulu, xuchenghua

From: Lulu Cheng <chenglulu@loongson.cn>

gcc/ChangeLog:

	* config.gcc: Export the header file lasxintrin.h.
	* config/loongarch/loongarch-builtins.cc (enum loongarch_builtin_type):
	Add Loongson ASX builtin functions support.
	(AVAIL_ALL): Ditto.
	(LASX_BUILTIN): Ditto.
	(LASX_NO_TARGET_BUILTIN): Ditto.
	(LASX_BUILTIN_TEST_BRANCH): Ditto.
	(CODE_FOR_lasx_xvsadd_b): Ditto.
	(CODE_FOR_lasx_xvsadd_h): Ditto.
	(CODE_FOR_lasx_xvsadd_w): Ditto.
	(CODE_FOR_lasx_xvsadd_d): Ditto.
	(CODE_FOR_lasx_xvsadd_bu): Ditto.
	(CODE_FOR_lasx_xvsadd_hu): Ditto.
	(CODE_FOR_lasx_xvsadd_wu): Ditto.
	(CODE_FOR_lasx_xvsadd_du): Ditto.
	(CODE_FOR_lasx_xvadd_b): Ditto.
	(CODE_FOR_lasx_xvadd_h): Ditto.
	(CODE_FOR_lasx_xvadd_w): Ditto.
	(CODE_FOR_lasx_xvadd_d): Ditto.
	(CODE_FOR_lasx_xvaddi_bu): Ditto.
	(CODE_FOR_lasx_xvaddi_hu): Ditto.
	(CODE_FOR_lasx_xvaddi_wu): Ditto.
	(CODE_FOR_lasx_xvaddi_du): Ditto.
	(CODE_FOR_lasx_xvand_v): Ditto.
	(CODE_FOR_lasx_xvandi_b): Ditto.
	(CODE_FOR_lasx_xvbitsel_v): Ditto.
	(CODE_FOR_lasx_xvseqi_b): Ditto.
	(CODE_FOR_lasx_xvseqi_h): Ditto.
	(CODE_FOR_lasx_xvseqi_w): Ditto.
	(CODE_FOR_lasx_xvseqi_d): Ditto.
	(CODE_FOR_lasx_xvslti_b): Ditto.
	(CODE_FOR_lasx_xvslti_h): Ditto.
	(CODE_FOR_lasx_xvslti_w): Ditto.
	(CODE_FOR_lasx_xvslti_d): Ditto.
	(CODE_FOR_lasx_xvslti_bu): Ditto.
	(CODE_FOR_lasx_xvslti_hu): Ditto.
	(CODE_FOR_lasx_xvslti_wu): Ditto.
	(CODE_FOR_lasx_xvslti_du): Ditto.
	(CODE_FOR_lasx_xvslei_b): Ditto.
	(CODE_FOR_lasx_xvslei_h): Ditto.
	(CODE_FOR_lasx_xvslei_w): Ditto.
	(CODE_FOR_lasx_xvslei_d): Ditto.
	(CODE_FOR_lasx_xvslei_bu): Ditto.
	(CODE_FOR_lasx_xvslei_hu): Ditto.
	(CODE_FOR_lasx_xvslei_wu): Ditto.
	(CODE_FOR_lasx_xvslei_du): Ditto.
	(CODE_FOR_lasx_xvdiv_b): Ditto.
	(CODE_FOR_lasx_xvdiv_h): Ditto.
	(CODE_FOR_lasx_xvdiv_w): Ditto.
	(CODE_FOR_lasx_xvdiv_d): Ditto.
	(CODE_FOR_lasx_xvdiv_bu): Ditto.
	(CODE_FOR_lasx_xvdiv_hu): Ditto.
	(CODE_FOR_lasx_xvdiv_wu): Ditto.
	(CODE_FOR_lasx_xvdiv_du): Ditto.
	(CODE_FOR_lasx_xvfadd_s): Ditto.
	(CODE_FOR_lasx_xvfadd_d): Ditto.
	(CODE_FOR_lasx_xvftintrz_w_s): Ditto.
	(CODE_FOR_lasx_xvftintrz_l_d): Ditto.
	(CODE_FOR_lasx_xvftintrz_wu_s): Ditto.
	(CODE_FOR_lasx_xvftintrz_lu_d): Ditto.
	(CODE_FOR_lasx_xvffint_s_w): Ditto.
	(CODE_FOR_lasx_xvffint_d_l): Ditto.
	(CODE_FOR_lasx_xvffint_s_wu): Ditto.
	(CODE_FOR_lasx_xvffint_d_lu): Ditto.
	(CODE_FOR_lasx_xvfsub_s): Ditto.
	(CODE_FOR_lasx_xvfsub_d): Ditto.
	(CODE_FOR_lasx_xvfmul_s): Ditto.
	(CODE_FOR_lasx_xvfmul_d): Ditto.
	(CODE_FOR_lasx_xvfdiv_s): Ditto.
	(CODE_FOR_lasx_xvfdiv_d): Ditto.
	(CODE_FOR_lasx_xvfmax_s): Ditto.
	(CODE_FOR_lasx_xvfmax_d): Ditto.
	(CODE_FOR_lasx_xvfmin_s): Ditto.
	(CODE_FOR_lasx_xvfmin_d): Ditto.
	(CODE_FOR_lasx_xvfsqrt_s): Ditto.
	(CODE_FOR_lasx_xvfsqrt_d): Ditto.
	(CODE_FOR_lasx_xvflogb_s): Ditto.
	(CODE_FOR_lasx_xvflogb_d): Ditto.
	(CODE_FOR_lasx_xvmax_b): Ditto.
	(CODE_FOR_lasx_xvmax_h): Ditto.
	(CODE_FOR_lasx_xvmax_w): Ditto.
	(CODE_FOR_lasx_xvmax_d): Ditto.
	(CODE_FOR_lasx_xvmaxi_b): Ditto.
	(CODE_FOR_lasx_xvmaxi_h): Ditto.
	(CODE_FOR_lasx_xvmaxi_w): Ditto.
	(CODE_FOR_lasx_xvmaxi_d): Ditto.
	(CODE_FOR_lasx_xvmax_bu): Ditto.
	(CODE_FOR_lasx_xvmax_hu): Ditto.
	(CODE_FOR_lasx_xvmax_wu): Ditto.
	(CODE_FOR_lasx_xvmax_du): Ditto.
	(CODE_FOR_lasx_xvmaxi_bu): Ditto.
	(CODE_FOR_lasx_xvmaxi_hu): Ditto.
	(CODE_FOR_lasx_xvmaxi_wu): Ditto.
	(CODE_FOR_lasx_xvmaxi_du): Ditto.
	(CODE_FOR_lasx_xvmin_b): Ditto.
	(CODE_FOR_lasx_xvmin_h): Ditto.
	(CODE_FOR_lasx_xvmin_w): Ditto.
	(CODE_FOR_lasx_xvmin_d): Ditto.
	(CODE_FOR_lasx_xvmini_b): Ditto.
	(CODE_FOR_lasx_xvmini_h): Ditto.
	(CODE_FOR_lasx_xvmini_w): Ditto.
	(CODE_FOR_lasx_xvmini_d): Ditto.
	(CODE_FOR_lasx_xvmin_bu): Ditto.
	(CODE_FOR_lasx_xvmin_hu): Ditto.
	(CODE_FOR_lasx_xvmin_wu): Ditto.
	(CODE_FOR_lasx_xvmin_du): Ditto.
	(CODE_FOR_lasx_xvmini_bu): Ditto.
	(CODE_FOR_lasx_xvmini_hu): Ditto.
	(CODE_FOR_lasx_xvmini_wu): Ditto.
	(CODE_FOR_lasx_xvmini_du): Ditto.
	(CODE_FOR_lasx_xvmod_b): Ditto.
	(CODE_FOR_lasx_xvmod_h): Ditto.
	(CODE_FOR_lasx_xvmod_w): Ditto.
	(CODE_FOR_lasx_xvmod_d): Ditto.
	(CODE_FOR_lasx_xvmod_bu): Ditto.
	(CODE_FOR_lasx_xvmod_hu): Ditto.
	(CODE_FOR_lasx_xvmod_wu): Ditto.
	(CODE_FOR_lasx_xvmod_du): Ditto.
	(CODE_FOR_lasx_xvmul_b): Ditto.
	(CODE_FOR_lasx_xvmul_h): Ditto.
	(CODE_FOR_lasx_xvmul_w): Ditto.
	(CODE_FOR_lasx_xvmul_d): Ditto.
	(CODE_FOR_lasx_xvclz_b): Ditto.
	(CODE_FOR_lasx_xvclz_h): Ditto.
	(CODE_FOR_lasx_xvclz_w): Ditto.
	(CODE_FOR_lasx_xvclz_d): Ditto.
	(CODE_FOR_lasx_xvnor_v): Ditto.
	(CODE_FOR_lasx_xvor_v): Ditto.
	(CODE_FOR_lasx_xvori_b): Ditto.
	(CODE_FOR_lasx_xvnori_b): Ditto.
	(CODE_FOR_lasx_xvpcnt_b): Ditto.
	(CODE_FOR_lasx_xvpcnt_h): Ditto.
	(CODE_FOR_lasx_xvpcnt_w): Ditto.
	(CODE_FOR_lasx_xvpcnt_d): Ditto.
	(CODE_FOR_lasx_xvxor_v): Ditto.
	(CODE_FOR_lasx_xvxori_b): Ditto.
	(CODE_FOR_lasx_xvsll_b): Ditto.
	(CODE_FOR_lasx_xvsll_h): Ditto.
	(CODE_FOR_lasx_xvsll_w): Ditto.
	(CODE_FOR_lasx_xvsll_d): Ditto.
	(CODE_FOR_lasx_xvslli_b): Ditto.
	(CODE_FOR_lasx_xvslli_h): Ditto.
	(CODE_FOR_lasx_xvslli_w): Ditto.
	(CODE_FOR_lasx_xvslli_d): Ditto.
	(CODE_FOR_lasx_xvsra_b): Ditto.
	(CODE_FOR_lasx_xvsra_h): Ditto.
	(CODE_FOR_lasx_xvsra_w): Ditto.
	(CODE_FOR_lasx_xvsra_d): Ditto.
	(CODE_FOR_lasx_xvsrai_b): Ditto.
	(CODE_FOR_lasx_xvsrai_h): Ditto.
	(CODE_FOR_lasx_xvsrai_w): Ditto.
	(CODE_FOR_lasx_xvsrai_d): Ditto.
	(CODE_FOR_lasx_xvsrl_b): Ditto.
	(CODE_FOR_lasx_xvsrl_h): Ditto.
	(CODE_FOR_lasx_xvsrl_w): Ditto.
	(CODE_FOR_lasx_xvsrl_d): Ditto.
	(CODE_FOR_lasx_xvsrli_b): Ditto.
	(CODE_FOR_lasx_xvsrli_h): Ditto.
	(CODE_FOR_lasx_xvsrli_w): Ditto.
	(CODE_FOR_lasx_xvsrli_d): Ditto.
	(CODE_FOR_lasx_xvsub_b): Ditto.
	(CODE_FOR_lasx_xvsub_h): Ditto.
	(CODE_FOR_lasx_xvsub_w): Ditto.
	(CODE_FOR_lasx_xvsub_d): Ditto.
	(CODE_FOR_lasx_xvsubi_bu): Ditto.
	(CODE_FOR_lasx_xvsubi_hu): Ditto.
	(CODE_FOR_lasx_xvsubi_wu): Ditto.
	(CODE_FOR_lasx_xvsubi_du): Ditto.
	(CODE_FOR_lasx_xvpackod_d): Ditto.
	(CODE_FOR_lasx_xvpackev_d): Ditto.
	(CODE_FOR_lasx_xvpickod_d): Ditto.
	(CODE_FOR_lasx_xvpickev_d): Ditto.
	(CODE_FOR_lasx_xvrepli_b): Ditto.
	(CODE_FOR_lasx_xvrepli_h): Ditto.
	(CODE_FOR_lasx_xvrepli_w): Ditto.
	(CODE_FOR_lasx_xvrepli_d): Ditto.
	(CODE_FOR_lasx_xvandn_v): Ditto.
	(CODE_FOR_lasx_xvorn_v): Ditto.
	(CODE_FOR_lasx_xvneg_b): Ditto.
	(CODE_FOR_lasx_xvneg_h): Ditto.
	(CODE_FOR_lasx_xvneg_w): Ditto.
	(CODE_FOR_lasx_xvneg_d): Ditto.
	(CODE_FOR_lasx_xvbsrl_v): Ditto.
	(CODE_FOR_lasx_xvbsll_v): Ditto.
	(CODE_FOR_lasx_xvfmadd_s): Ditto.
	(CODE_FOR_lasx_xvfmadd_d): Ditto.
	(CODE_FOR_lasx_xvfmsub_s): Ditto.
	(CODE_FOR_lasx_xvfmsub_d): Ditto.
	(CODE_FOR_lasx_xvfnmadd_s): Ditto.
	(CODE_FOR_lasx_xvfnmadd_d): Ditto.
	(CODE_FOR_lasx_xvfnmsub_s): Ditto.
	(CODE_FOR_lasx_xvfnmsub_d): Ditto.
	(CODE_FOR_lasx_xvpermi_q): Ditto.
	(CODE_FOR_lasx_xvpermi_d): Ditto.
	(CODE_FOR_lasx_xbnz_v): Ditto.
	(CODE_FOR_lasx_xbz_v): Ditto.
	(CODE_FOR_lasx_xvssub_b): Ditto.
	(CODE_FOR_lasx_xvssub_h): Ditto.
	(CODE_FOR_lasx_xvssub_w): Ditto.
	(CODE_FOR_lasx_xvssub_d): Ditto.
	(CODE_FOR_lasx_xvssub_bu): Ditto.
	(CODE_FOR_lasx_xvssub_hu): Ditto.
	(CODE_FOR_lasx_xvssub_wu): Ditto.
	(CODE_FOR_lasx_xvssub_du): Ditto.
	(CODE_FOR_lasx_xvabsd_b): Ditto.
	(CODE_FOR_lasx_xvabsd_h): Ditto.
	(CODE_FOR_lasx_xvabsd_w): Ditto.
	(CODE_FOR_lasx_xvabsd_d): Ditto.
	(CODE_FOR_lasx_xvabsd_bu): Ditto.
	(CODE_FOR_lasx_xvabsd_hu): Ditto.
	(CODE_FOR_lasx_xvabsd_wu): Ditto.
	(CODE_FOR_lasx_xvabsd_du): Ditto.
	(CODE_FOR_lasx_xvavg_b): Ditto.
	(CODE_FOR_lasx_xvavg_h): Ditto.
	(CODE_FOR_lasx_xvavg_w): Ditto.
	(CODE_FOR_lasx_xvavg_d): Ditto.
	(CODE_FOR_lasx_xvavg_bu): Ditto.
	(CODE_FOR_lasx_xvavg_hu): Ditto.
	(CODE_FOR_lasx_xvavg_wu): Ditto.
	(CODE_FOR_lasx_xvavg_du): Ditto.
	(CODE_FOR_lasx_xvavgr_b): Ditto.
	(CODE_FOR_lasx_xvavgr_h): Ditto.
	(CODE_FOR_lasx_xvavgr_w): Ditto.
	(CODE_FOR_lasx_xvavgr_d): Ditto.
	(CODE_FOR_lasx_xvavgr_bu): Ditto.
	(CODE_FOR_lasx_xvavgr_hu): Ditto.
	(CODE_FOR_lasx_xvavgr_wu): Ditto.
	(CODE_FOR_lasx_xvavgr_du): Ditto.
	(CODE_FOR_lasx_xvmuh_b): Ditto.
	(CODE_FOR_lasx_xvmuh_h): Ditto.
	(CODE_FOR_lasx_xvmuh_w): Ditto.
	(CODE_FOR_lasx_xvmuh_d): Ditto.
	(CODE_FOR_lasx_xvmuh_bu): Ditto.
	(CODE_FOR_lasx_xvmuh_hu): Ditto.
	(CODE_FOR_lasx_xvmuh_wu): Ditto.
	(CODE_FOR_lasx_xvmuh_du): Ditto.
	(CODE_FOR_lasx_xvssran_b_h): Ditto.
	(CODE_FOR_lasx_xvssran_h_w): Ditto.
	(CODE_FOR_lasx_xvssran_w_d): Ditto.
	(CODE_FOR_lasx_xvssran_bu_h): Ditto.
	(CODE_FOR_lasx_xvssran_hu_w): Ditto.
	(CODE_FOR_lasx_xvssran_wu_d): Ditto.
	(CODE_FOR_lasx_xvssrarn_b_h): Ditto.
	(CODE_FOR_lasx_xvssrarn_h_w): Ditto.
	(CODE_FOR_lasx_xvssrarn_w_d): Ditto.
	(CODE_FOR_lasx_xvssrarn_bu_h): Ditto.
	(CODE_FOR_lasx_xvssrarn_hu_w): Ditto.
	(CODE_FOR_lasx_xvssrarn_wu_d): Ditto.
	(CODE_FOR_lasx_xvssrln_bu_h): Ditto.
	(CODE_FOR_lasx_xvssrln_hu_w): Ditto.
	(CODE_FOR_lasx_xvssrln_wu_d): Ditto.
	(CODE_FOR_lasx_xvssrlrn_bu_h): Ditto.
	(CODE_FOR_lasx_xvssrlrn_hu_w): Ditto.
	(CODE_FOR_lasx_xvssrlrn_wu_d): Ditto.
	(CODE_FOR_lasx_xvftint_w_s): Ditto.
	(CODE_FOR_lasx_xvftint_l_d): Ditto.
	(CODE_FOR_lasx_xvftint_wu_s): Ditto.
	(CODE_FOR_lasx_xvftint_lu_d): Ditto.
	(CODE_FOR_lasx_xvsllwil_h_b): Ditto.
	(CODE_FOR_lasx_xvsllwil_w_h): Ditto.
	(CODE_FOR_lasx_xvsllwil_d_w): Ditto.
	(CODE_FOR_lasx_xvsllwil_hu_bu): Ditto.
	(CODE_FOR_lasx_xvsllwil_wu_hu): Ditto.
	(CODE_FOR_lasx_xvsllwil_du_wu): Ditto.
	(CODE_FOR_lasx_xvsat_b): Ditto.
	(CODE_FOR_lasx_xvsat_h): Ditto.
	(CODE_FOR_lasx_xvsat_w): Ditto.
	(CODE_FOR_lasx_xvsat_d): Ditto.
	(CODE_FOR_lasx_xvsat_bu): Ditto.
	(CODE_FOR_lasx_xvsat_hu): Ditto.
	(CODE_FOR_lasx_xvsat_wu): Ditto.
	(CODE_FOR_lasx_xvsat_du): Ditto.
	(loongarch_builtin_vectorized_function): Ditto.
	(loongarch_expand_builtin_insn): Ditto.
	(loongarch_expand_builtin): Ditto.
	* config/loongarch/loongarch-ftypes.def (1): Ditto.
	(2): Ditto.
	(3): Ditto.
	(4): Ditto.
	* config/loongarch/lasxintrin.h: New file.
---
 gcc/config.gcc                             |    2 +-
 gcc/config/loongarch/lasxintrin.h          | 5338 ++++++++++++++++++++
 gcc/config/loongarch/loongarch-builtins.cc | 1180 ++++-
 gcc/config/loongarch/loongarch-ftypes.def  |  271 +-
 4 files changed, 6788 insertions(+), 3 deletions(-)
 create mode 100644 gcc/config/loongarch/lasxintrin.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index d6b809cdb55..1cc04a8757e 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -469,7 +469,7 @@ mips*-*-*)
 	;;
 loongarch*-*-*)
 	cpu_type=loongarch
-	extra_headers="larchintrin.h lsxintrin.h"
+	extra_headers="larchintrin.h lsxintrin.h lasxintrin.h"
 	extra_objs="loongarch-c.o loongarch-builtins.o loongarch-cpu.o loongarch-opts.o loongarch-def.o"
 	extra_gcc_objs="loongarch-driver.o loongarch-cpu.o loongarch-opts.o loongarch-def.o"
 	extra_options="${extra_options} g.opt fused-madd.opt"
diff --git a/gcc/config/loongarch/lasxintrin.h b/gcc/config/loongarch/lasxintrin.h
new file mode 100644
index 00000000000..d3937992746
--- /dev/null
+++ b/gcc/config/loongarch/lasxintrin.h
@@ -0,0 +1,5338 @@
+/* LARCH Loongson ASX intrinsics include file.
+
+   Copyright (C) 2018 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _GCC_LOONGSON_ASXINTRIN_H
+#define _GCC_LOONGSON_ASXINTRIN_H 1
+
+#if defined(__loongarch_asx)
+
+typedef signed char v32i8 __attribute__ ((vector_size(32), aligned(32)));
+typedef signed char v32i8_b __attribute__ ((vector_size(32), aligned(1)));
+typedef unsigned char v32u8 __attribute__ ((vector_size(32), aligned(32)));
+typedef unsigned char v32u8_b __attribute__ ((vector_size(32), aligned(1)));
+typedef short v16i16 __attribute__ ((vector_size(32), aligned(32)));
+typedef short v16i16_h __attribute__ ((vector_size(32), aligned(2)));
+typedef unsigned short v16u16 __attribute__ ((vector_size(32), aligned(32)));
+typedef unsigned short v16u16_h __attribute__ ((vector_size(32), aligned(2)));
+typedef int v8i32 __attribute__ ((vector_size(32), aligned(32)));
+typedef int v8i32_w __attribute__ ((vector_size(32), aligned(4)));
+typedef unsigned int v8u32 __attribute__ ((vector_size(32), aligned(32)));
+typedef unsigned int v8u32_w __attribute__ ((vector_size(32), aligned(4)));
+typedef long long v4i64 __attribute__ ((vector_size(32), aligned(32)));
+typedef long long v4i64_d __attribute__ ((vector_size(32), aligned(8)));
+typedef unsigned long long v4u64 __attribute__ ((vector_size(32), aligned(32)));
+typedef unsigned long long v4u64_d __attribute__ ((vector_size(32), aligned(8)));
+typedef float v8f32 __attribute__ ((vector_size(32), aligned(32)));
+typedef float v8f32_w __attribute__ ((vector_size(32), aligned(4)));
+typedef double v4f64 __attribute__ ((vector_size(32), aligned(32)));
+typedef double v4f64_d __attribute__ ((vector_size(32), aligned(8)));
+typedef float __m256 __attribute__ ((__vector_size__ (32),
+				     __may_alias__));
+typedef long long __m256i __attribute__ ((__vector_size__ (32),
+					  __may_alias__));
+typedef double __m256d __attribute__ ((__vector_size__ (32),
+				       __may_alias__));
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsll_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsll_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsll_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsll_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsll_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsll_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsll_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsll_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvslli_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvslli_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvslli_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvslli_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvslli_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvslli_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvslli_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvslli_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsra_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsra_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsra_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsra_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsra_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsra_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V4DI, V4DI, V4DI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsra_d (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsra_d ((v4i64)_1, (v4i64)_2);
+}
+
+/* Assembly instruction format:	xd, xj, ui3.  */
+/* Data types in instruction templates:  V32QI, V32QI, UQI.  */
+#define __lasx_xvsrai_b(/*__m256i*/ _1, /*ui3*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrai_b ((v32i8)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui4.  */
+/* Data types in instruction templates:  V16HI, V16HI, UQI.  */
+#define __lasx_xvsrai_h(/*__m256i*/ _1, /*ui4*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrai_h ((v16i16)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui5.  */
+/* Data types in instruction templates:  V8SI, V8SI, UQI.  */
+#define __lasx_xvsrai_w(/*__m256i*/ _1, /*ui5*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrai_w ((v8i32)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, ui6.  */
+/* Data types in instruction templates:  V4DI, V4DI, UQI.  */
+#define __lasx_xvsrai_d(/*__m256i*/ _1, /*ui6*/ _2) \
+  ((__m256i)__builtin_lasx_xvsrai_d ((v4i64)(_1), (_2)))
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V32QI, V32QI, V32QI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrar_b (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrar_b ((v32i8)_1, (v32i8)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V16HI, V16HI, V16HI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrar_h (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrar_h ((v16i16)_1, (v16i16)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templates:  V8SI, V8SI, V8SI.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256i __lasx_xvsrar_w (__m256i _1, __m256i _2)
+{
+  return (__m256i)__builtin_lasx_xvsrar_w ((v8i32)_1, (v8i32)_2);
+}
+
+/* Assembly instruction format:	xd, xj, xk.  */
+/* Data types in instruction templa